Inadequately handle or store data or (bio)materials
What is this about?
Why is this important?
Data are both factual information (e.g. statistical information, cell counts) and the materials, means and products of scientific inquiry (e.g. tissue samples, written notes).[1] Handling data refers to how data are “maintained, analysed, interpreted, and shared, transmitted or reported to others” (p. 95).[2] According to the Office of Research Integrity (ORI), the following need to be considered when handling data:[1]
· Data storage refers to storing data in such a way that (project) results can be reconstructed (by others).
· Data protection concerns the protection of written and electronic data and research materials from possible physical and electronic damage and from theft or tampering.
· Data retention refers to the length of time data need to be stored after the end of a project. This differs per country, institution and funder. Secure destruction of data also needs to be guaranteed.
· Data sharing is the act of sharing research results with other researchers and the public. Whether and how results should be shared is an important point to consider here.
For whom is this important?
What are the best practices?
Data organization
Data should be organized in a logical and structured way. Within research groups, consensus on how to name and organize data and files can help in structuring data. The University of Cambridge provides a resource that gives a good overview of what to keep in mind when naming files, organizing folders and more. In addition, they have collected various resources that can support data management.
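As an illustration of what an agreed naming convention could look like in practice, here is a minimal Python sketch. The pattern used (date, project, description, version) and the function name build_filename are assumptions made for this example; they are not taken from the University of Cambridge resource, and a research group may well agree on a different pattern.

```python
import re
from datetime import date


def build_filename(project, description, day, version, extension):
    """Build a file name following one hypothetical group convention:
    <ISO date>_<project>_<description>_v<two-digit version>.<extension>
    """
    def slug(text):
        # Lower-case the text and collapse anything that is not a letter
        # or digit into a single hyphen, so names contain no spaces.
        return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")

    return (
        f"{day.isoformat()}_{slug(project)}_{slug(description)}"
        f"_v{version:02d}.{extension.lstrip('.')}"
    )


if __name__ == "__main__":
    print(build_filename("Cell Count", "Plate A", date(2020, 10, 27), 2, ".csv"))
```

Generating names with a shared helper, rather than typing them by hand, is one way to keep the convention consistent across group members.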
Pseudonymization
When performing research involving human subjects, participants should be pseudonymized or anonymized. Pseudonymization removes from a dataset the information that would allow individuals to be identified and replaces it with a code. Only the main researchers have access to the key that links these codes back to the individuals. For example, if John Smith, aged 31, and Jane Doe, aged 25, are in your dataset, you should not pseudonymize them as ‘JS31’ and ‘JD25’, because such codes can still be traced back to the individuals. Correct pseudonymization would be to name them, for example, participants 001 and 002.
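The idea can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a complete or approved procedure: the function name pseudonymize, the field names "name" and "participant_id", and the record layout are all hypothetical. Codes are assigned in random order so that they reveal nothing about the participants, and the key is returned separately so that it can be stored apart from the data, accessible only to the main researchers.

```python
import secrets


def pseudonymize(records, id_field="name"):
    """Replace the identifying field with a neutral code such as '001'.

    Returns (pseudonymized_records, key). The key maps each code back to
    the original identifier and should be stored separately and securely.
    """
    identifiers = sorted({record[id_field] for record in records})
    # Shuffle so that codes do not reflect alphabetical or enrolment order.
    secrets.SystemRandom().shuffle(identifiers)

    width = max(3, len(str(len(identifiers))))
    code_for = {
        ident: f"{number:0{width}d}"
        for number, ident in enumerate(identifiers, start=1)
    }

    pseudonymized = []
    for record in records:
        # Copy every field except the identifying one, then add the code.
        row = {k: v for k, v in record.items() if k != id_field}
        row["participant_id"] = code_for[record[id_field]]
        pseudonymized.append(row)

    key = {code: ident for ident, code in code_for.items()}
    return pseudonymized, key


if __name__ == "__main__":
    participants = [
        {"name": "John Smith", "age": 31},
        {"name": "Jane Doe", "age": 25},
    ]
    data, key = pseudonymize(participants)
    print(data)  # no names, only codes such as '001' and '002'
```

Note that the sketch only replaces the direct identifier. Remaining fields such as age can still identify someone indirectly in a small dataset, so they may need to be generalized or removed as well.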
According to the GDPR, “‘personal data’ means any information relating to an identified or identifiable natural person (‘data subject’); an identifiable natural person is one who can be identified, directly or indirectly, in particular by reference to an identifier such as a name, an identification number, location data, an online identifier or to one or more factors specific to the physical, physiological, genetic, mental, economic, cultural or social identity of that natural person”. [1]
[1] https://gdpr.eu/article-4-definitions/
Iris Lechner, Natalie Evans contributed to this theme. Latest contribution was Oct 27, 2020