Data protection
When it comes to Open Research Data, managing personal and sensitive data can be very challenging. The laws that regulate the protection of personal and sensitive data are the EU General Data Protection Regulation (GDPR) and the Swiss New Federal Act on Data Protection (nFADP). The nFADP, which has been in force since 1 September 2023, was prompted by technological and social developments such as the Internet, smartphones, and social networks. It aims to maintain compatibility with the GDPR to ensure the free flow of data with the EU.
Personal data: Under the GDPR, personal data is any information relating to an […] identifiable natural person. Examples of personal data include surname, first name, pseudonym, date and place of birth; photos, sound recordings of voices; fixed or mobile telephone number, postal address, email address; IP address, computer connection identifier or cookie identifier; license plate number, social security number, ID number; application usage data, comments, etc.
Sensitive data: Sensitive data, on the other hand, refers to data relating to an individual's health, sexual life or sexual orientation; data revealing an alleged racial or ethnic origin, political opinions, religious beliefs, philosophical beliefs or trade union membership; and genetic and biometric data used for the purpose of uniquely identifying an individual.
Data protection rights
According to European and Swiss data protection laws, data subjects have rights such as:
the right to be informed (i.e., to know how their data will be used)
the right of access (i.e., to request copies of their personal data)
the right to rectification (i.e., to correct incorrect or incomplete data)
the right “to be forgotten” (i.e., to have their data deleted)
the right to restrict processing (i.e., to have their data processed only under certain conditions)
rights related to automated decision-making and profiling (i.e., not to be subject to decisions based solely on automated processing, including profiling)
Privacy by Design and Privacy by Default
Compared to the previous law, the nFADP introduced the “Privacy by Design” and “Privacy by Default” principles, which originate in the GDPR.
Privacy by design means that processing must be designed, from the conception phase onward, to respect the data protection principles set forth in Article 5 of the GDPR (such as lawfulness, data minimization, purpose and storage limitation, integrity and confidentiality).
Privacy by default means that when a system or service lets individuals choose how much personal data they share with others, the default settings should be the most privacy-friendly ones.
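As an illustration of privacy by default, here is a minimal Python sketch, assuming a hypothetical settings object for a research participant. The class name, field names, and helper function are invented for this example and do not come from any law or existing tool. The point is only that every sharing option starts switched off, so a participant who changes nothing shares nothing.

```python
from dataclasses import dataclass


@dataclass
class SharingSettings:
    """Hypothetical per-participant sharing choices; all names are illustrative."""

    # Every default is the most privacy-friendly option: nothing is shared
    # unless the participant actively opts in.
    share_audio: bool = False
    share_video: bool = False
    share_transcript: bool = False
    allow_reuse_by_third_parties: bool = False


def shared_items(settings: SharingSettings) -> list[str]:
    """Return the names of the options the participant has switched on."""
    return [name for name, enabled in vars(settings).items() if enabled]


# A participant who changes nothing shares nothing.
default_settings = SharingSettings()

# A participant who explicitly opts in to sharing the transcript only.
opted_in = SharingSettings(share_transcript=True)
```

In a real project the available options would follow from the consent form; the design choice shown here is simply that sharing requires an explicit opt-in rather than an opt-out.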
Best practices for research data
The best practices for collecting and sharing multimodal data are:
Explicit consent
Withdrawal of consent
Data minimization
Anonymization
Encryption
Access control
Blurring of voices and faces (biometric data)
Film people from behind
Store personal information separately from the research data and ensure that only authorized personnel can link the two
Data protection and data collection can take up a lot of time; as much as a year of the project can be lost to them
Planning and piloting are critical for the success of the project (they reduce the number of problems)
When you write the project proposal, also think about the data management plan (DMP); it makes the process easier
Establish contact early with the local ethics committee
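The practice of storing personal information separately from the research data can be sketched as follows. This is a minimal Python example under stated assumptions: the record layout, the field names, the `P-` prefix, and the function name `split_personal_data` are all hypothetical, and the sample records are made up. It replaces directly identifying fields with a random code and keeps the link between code and person in a separate key table.

```python
import secrets


def split_personal_data(records, identifying_fields):
    """Split records into pseudonymised research data and a separate key table.

    `records` is a list of dicts and `identifying_fields` names the keys that
    directly identify a person. Both are illustrative assumptions.
    """
    research_data = []
    key_table = {}
    for record in records:
        # Random code that is not derived from the name or any other attribute.
        pseudonym = "P-" + secrets.token_hex(4)
        while pseudonym in key_table:  # regenerate in the unlikely case of a clash
            pseudonym = "P-" + secrets.token_hex(4)
        # The key table is the only place where code and identity meet.
        key_table[pseudonym] = {field: record[field] for field in identifying_fields}
        row = {k: v for k, v in record.items() if k not in identifying_fields}
        row["participant_id"] = pseudonym
        research_data.append(row)
    return research_data, key_table


# Made-up sample records, for illustration only.
participants = [
    {"name": "Participant One", "email": "one@example.org", "age_group": "30-39"},
    {"name": "Participant Two", "email": "two@example.org", "age_group": "20-29"},
]

research_data, key_table = split_personal_data(participants, ["name", "email"])
```

The key table would then be stored in a different, access-restricted location from the research data. As long as the key table exists, the research data is pseudonymised rather than anonymised, and indirect identifiers (voices, rare attributes, free-text content) may still need separate treatment.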
Technical requirements
Data protection laws also prescribe technical aspects of how personal and sensitive data should be processed and stored. Using a trusted research environment would be one way to fulfill these requirements (see our CLARIN-CH Webinar on the Protection of Personal and Sensitive data: Technical aspects for more information).
See also the section on Data access and security to learn more about how you can store and process your data securely on your own computer (not just personal/sensitive data).
Resources
If you are dealing with questions regarding data protection, chances are high that other researchers have encountered similar issues. We have gathered them here together with answers from our data protection experts:
FAQ
What data protection laws apply to my research project?
In order to know which data protection laws apply in the context of your project, it is useful to ask yourself the following five questions:
Are you processing personal data? If you are handling (collecting, analysing, etc.) data or information that relates to an identifiable or identified natural person, you are indeed processing personal data and data protection laws apply. If not, data protection laws don’t apply. Anonymised data is not considered personal data, whereas pseudonymised data still is.
Where are you established/affiliated? Researchers are subject to the laws of the country in which their affiliated institution is based. This is independent of the place where the data are collected. As a linguist employed by a Swiss institution, you are bound by Swiss data protection laws.
What is your legal status? The legal status of the data controller determines whether federal or cantonal laws apply. If you are a private person unaffiliated with an institution, or employed by a federal body (e.g. ETH), you are bound by federal laws. If you are affiliated with a cantonal body (e.g. a university), you are bound by cantonal data protection laws.
Where is the data collection done (and who is it targeting)? If you collect the data in a country other than the one where you live, the laws of that country apply (i.e. where participants are residents). If you are collecting data from a group of people with a specific nationality or origin, the laws of those countries may apply (e.g. collecting data from German-speaking people in Zurich = data protection laws of Germany apply).
Are you subject to sector-specific laws? Sector-specific laws apply if you are working on diseases or on the structure or function of the human body (regulated by the Human Research Act).
Am I allowed to process personal data?
Yes, you are allowed to process personal data. You can do so either by having a legal basis (e.g. informed consent from participants) or by relying on “research privilege”, where you don’t need informed consent and can also process the data for purposes other than the original ones.
If you want to make use of “research privilege”, the data has to be processed for research purposes only and 1) it has to be anonymized once the purpose of the project is achieved, and 2) results must be published in a form that does not allow the identification of individuals.
Do I always need informed consent?
If you process the data as part of “research privilege”, you don’t need informed consent from participants. However, it is always necessary to inform participants about the purpose of the project in one way or another.
If you cannot comply with the requirements for processing the data under “research privilege” (anonymization, no identification), informed consent is the only legal basis for you to collect personal and sensitive data.
How to ensure that the consent is valid?
To be valid, consent needs to be freely given and informed. That is, the person should not be pressured into participating (e.g. with excessive compensation) and should receive all the necessary information. Consent can be given orally or in writing, and it is always useful to keep proof of the consent given.
This might be particularly difficult to achieve if working with indigenous communities where information about the handling and processing of the data might not be understood by the participants in the same way as it is by the research team. Where sensitive data are concerned, consent must be explicit.
What information do I need to provide in the informed consent?
Minimal requirements include:
information about the researcher in charge of the project
understandable statements describing the purpose of the research
the nature and duration of participation and the research methods
a clear description of possible risks and benefits of participation
nature of the data collected and their usefulness
a description of the protection/security measures
the guarantee of being free to decide not to participate in the project, to withdraw without losing acquired rights and to have the possibility at any time to continue or not to participate
the right to access and rectify the data
the existence of any conflict of interest
description of how data will be preserved, re-used and possibly shared on platforms
contracts with third parties
the possibility of being informed of the results
If you are drafting an informed consent form, get in touch with your institution’s data protection office or ethics committee for possible templates and help.
Can I keep personal or sensitive data I have collected indefinitely?
If you wish to take advantage of research privilege, you need to anonymise personal data once the purpose of the project has been achieved. At that point (once anonymized) the data is no longer considered personal data. If long-term preservation or sharing via repositories is envisaged, participants must have given their consent to do so.
Can I share personal or sensitive data?
Yes, it is possible to share personal and sensitive data. You are allowed to disclose personal data to third parties for research purposes (even without prior consent). Nevertheless, participants have to be informed about this disclosure. You are also allowed to share personal data publicly if participants have given their consent to do so.
If you plan to disclose the data in another country (e.g. upload to a repository abroad), that country needs to have an adequate level of protection (the Federal Data Protection and Information Commissioner maintains a list of countries offering such guarantees). If no adequate protection is provided, contractual measures must be taken or consent (from participants) obtained.
If I collect data in another country, and the law of that country is very different from Swiss law, which one prevails?
It is usually the stricter law that applies to provide as much protection of people’s privacy as possible.
These questions were addressed by Brian Kleiner (FORS) in the Webinar "Protection of personal and sensitive linguistic data: Legal aspects" organized by the CLARIN-CH Working Group on Managing Sensitive and Personal Data. You can access the recording and the presentation slides here:
CLARIN-CH Working Group Kickoff: Outcomes
In September 2023, CLARIN-CH organized an event focusing on data collection, protection and preservation and their associated procedures, with respect to different types of linguistic data (e.g., multimodal, historical, experimental, sociolinguistics, data from social media, data from different age groups). The event was a kickoff of the Working Group: Management of Sensitive and Personal data, Ethical and Legal issues for linguistic data.
Data collection: talk by Dagmar Jung - linguist specialized in the collection of naturalistic data in the field, metadata collection, secure file handling, workflows useful for the archiving process. You will learn why data collection is the key to successful management of sensitive data. Access the slides and the recording*.
Data protection: talk by Violaine Michel Lange - data scientist, neurolinguist and NLP expert specialized in experimental data and developing NLP pipelines for data protection. You will find a discussion about the European GDPR vs. the New Federal Act on Data Protection (nFADP), and an example of an NLP pipeline for data anonymisation. Access the slides and the recording*.
Data protection: talk by Johanna Miecznikowski-Fuenfschilling - professor at the Institute of Italian Studies and the Institute of argumentation, linguistics and semiotics of USI Università della Svizzera italiana, and Nina Profazi - research assistant in the project “Data-sharing skills in corpus-based research on talk-in-interaction”, which is part of the ORD program funded by swissuniversities. You will find a discussion about the process of de-identification of data. Access the slides and the recording*.
Data preservation: talk by Thomas Schmidt - computer scientist specialized in the field of methodology and technology for working with audiovisual language data and in computer-assisted lexicography. You will find a discussion about data management and preservation, a use-case about working with sensitive data (the FOLK project), and a method for data anonymisation. Access the slides and the recording*.
*The recordings are password protected, contact us if you are interested in getting access.
The DARIAH ELDAH Consent Form Wizard is an online tool that enables researchers to quickly generate a GDPR-compliant consent form for collecting personal data for research purposes, but which can also be used, for example, for creating mailing lists or organizing academic events. Currently the tool is available in English, German, Italian and Croatian, although there are plans to have it translated into other languages. The tool was created by the members of the CLARIN Committee for Legal and Ethical Issues and of the DARIAH ELDAH Ethics and Legality in Digital Arts and Humanities Working Group.
The Università della Svizzera italiana (USI) and the University of Neuchâtel (UNINE), both members of CLARIN-CH, have developed a tool that guides researchers through the most relevant legal aspects of research data management and proposes possible solution approaches to copyright and data protection issues.