Data protection

When it comes to Open Research Data, managing personal and sensitive data can be very challenging. The laws that regulate the protection of personal and sensitive data are the European General Data Protection Regulation (GDPR) and the Swiss New Federal Act on Data Protection (nFADP). The nFADP, which has been in effect in Switzerland since 1 September 2023, was prompted by technological and social developments such as the Internet, smartphones, and social networks. It aims to maintain compatibility with the GDPR to ensure the free flow of data with the EU.

Personal data: In the GDPR, personal data is any information relating to an […] identifiable natural person. Examples of personal data include surname, first name, pseudonym, date and place of birth; photos, sound recordings of voices; fixed or mobile telephone number, postal address, email address; IP address, computer connection identifier or cookie identifier; license plate number, social security number, ID number; application usage data, comments, etc.

Sensitive data: Sensitive data, on the other hand, refers to data relating to the health of individuals, their sexual life or sexual orientation, data revealing an alleged racial or ethnic origin, political opinions, religious beliefs, philosophical beliefs or trade union membership, as well as genetic and biometric data used for the purpose of uniquely identifying an individual.

Data protection rights

According to European and Swiss data protection laws, data subjects have rights such as:

the right to be informed (i.e., to know how their data will be used)
the right of access (i.e., to request copies of their personal data)
the right to rectification (i.e., to correct incorrect or incomplete data)
the right “to be forgotten” (i.e., to have their data deleted)
the right to restrict processing (i.e., to have their data processed only under certain conditions)
rights related to automated decision-making and profiling (i.e., not to be subject to decisions based solely on automated processing, including profiling)

Privacy by Design and Privacy by Default

Compared to the previous law, the nFADP introduced the “Privacy by Design” and “Privacy by Default” principles, which originate in the GDPR.

Privacy by design means that processing has to be designed, from the conception phase onward, in such a way as to respect the data protection principles set forth in Article 5 of the GDPR (such as lawfulness, data minimization, purpose and storage limitation, integrity and confidentiality).
Privacy by default means that when a system or service gives individuals a choice about how much personal data they share with others, the default settings should be the most privacy-friendly ones.
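
As a rough illustration of privacy by default, the sketch below models a participant's sharing choices so that every option starts switched off and has to be actively enabled. The class name, field names and helper function are hypothetical and chosen only for this example; they do not come from any law or tool mentioned on this page.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SharingSettings:
    """Hypothetical sharing choices for one participant.

    Every default is the most restrictive option, so nothing is shared
    unless the participant actively opts in.
    """
    share_audio: bool = False
    share_video: bool = False
    share_transcript: bool = False
    allow_reuse_by_third_parties: bool = False


def enabled_options(settings: SharingSettings) -> List[str]:
    """Return the names of the options the participant has switched on."""
    return [name for name, value in vars(settings).items() if value]


# A participant who makes no choice at all shares nothing.
defaults = SharingSettings()

# A participant who explicitly agrees to share transcripts only.
opted_in = SharingSettings(share_transcript=True)

print(enabled_options(defaults))
print(enabled_options(opted_in))
```

The design choice is that the privacy-friendly state requires no action from the individual; any wider sharing is the result of an explicit choice.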

Best practices for research data

The best practices for collecting and sharing multimodal data are:

Explicit consent
Withdrawal of consent
Data minimization
Anonymization
Encryption
Access control
Blurring of voices and faces (biometric data)
Film people from behind
Store personal information separately from the research data and ensure that only authorized personnel can link the two
Data protection and data collection can take up a lot of time; a year of the project can be lost to them
Planning and piloting are critical for the success of the project, as they reduce the number of problems
When you write the project, also think about the data management plan (DMP); this makes the process easier
Establish contact with the local ethics committee early on
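
The practice of storing personal information separately from the research data can be sketched as follows. This is a minimal illustration under stated assumptions, not a complete or legally vetted procedure: the function name, the field names and the "P-" code format are hypothetical, and the sample records are invented. Note that this is pseudonymization rather than anonymization, because the key table still allows re-identification; it would therefore need to be stored separately and be accessible only to authorized personnel.

```python
import secrets


def split_identifiers(records, identifying_fields):
    """Split records into research data and a separate key table.

    records: list of dicts, one per participant.
    identifying_fields: names of the fields that directly identify a person.
    Returns (research_data, key_table); only the key table links a
    participant code back to the identifying information.
    """
    research_data = []
    key_table = {}
    for record in records:
        # Random code that is not derived from the person's data.
        code = "P-" + secrets.token_hex(4)
        while code in key_table:  # regenerate in the unlikely case of a clash
            code = "P-" + secrets.token_hex(4)
        key_table[code] = {
            field: record[field] for field in identifying_fields if field in record
        }
        cleaned = {k: v for k, v in record.items() if k not in identifying_fields}
        cleaned["participant_code"] = code
        research_data.append(cleaned)
    return research_data, key_table


# Invented sample records; an age group is stored instead of a date of birth,
# in the spirit of data minimization.
participants = [
    {"name": "Example Person A", "email": "a@example.org",
     "age_group": "30-39", "dialect": "Bernese"},
    {"name": "Example Person B", "email": "b@example.org",
     "age_group": "50-59", "dialect": "Ticinese"},
]

research_data, key_table = split_identifiers(participants, ["name", "email"])
```

After this step, `research_data` can be used for analysis, while `key_table` is kept in a different, access-controlled location. Indirect identifiers that remain in the research data (for example a rare dialect combined with an age group) may still need further treatment.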

Technical requirements

Data protection laws also prescribe technical aspects of how personal and sensitive data should be processed and stored. Using a trusted research environment would be one way to fulfill these requirements (see our CLARIN-CH Webinar on the Protection of Personal and Sensitive data: Technical aspects for more information).

See also the section on Data access and security to learn more about how you can store and process your data securely on your own computer (not just personal/sensitive data).

Resources

If you are dealing with questions regarding data protection, chances are high that other researchers have encountered similar issues. We have gathered these questions here together with answers from our data protection experts:

FAQ

What data protection laws apply to my research project?

Am I allowed to process personal data?

Do I always need informed consent?

How do I ensure that the consent is valid?

What information do I need to provide in the informed consent?

Can I keep personal or sensitive data I have collected indefinitely?

Can I share personal or sensitive data?

If I collect data in another country, and the law of that country is very different from Swiss law, which one prevails?

These questions were addressed by Brian Kleiner (FORS) in the Webinar "Protection of personal and sensitive linguistic data: Legal aspects" organized by the CLARIN-CH Working Group on Managing Sensitive and Personal Data. You can access the recording and the presentation slides here:

💻 Recording 📄 Slides

CLARIN-CH Working Group Kickoff: Outcomes

In September 2023, CLARIN-CH organized an event focusing on data collection, protection and preservation and their associated procedures, with respect to different types of linguistic data (e.g., multimodal, historical, experimental, sociolinguistic, data from social media, data from different age groups). The event was the kickoff of the Working Group: Management of Sensitive and Personal data, Ethical and Legal issues for linguistic data.

Data collection: talk by Dagmar Jung - linguist specialized in the collection of naturalistic data in the field, metadata collection, secure file handling, and workflows useful for the archiving process. You will learn that data collection is the key to the successful management of sensitive data. Access the slides and the recording*.

Data protection: talk by Violaine Michel Lange - data scientist, neurolinguist and NLP expert specialized in experimental data and developing NLP pipelines for data protection. You will find a discussion about the European GDPR vs. the New Federal Act on Data Protection (nFADP), and an example of an NLP pipeline for data anonymisation. Access the slides and the recording*.

Data protection: talk by Johanna Miecznikowski-Fuenfschilling - professor at the Institute of Italian Studies and the Institute of argumentation, linguistics and semiotics of USI Università della Svizzera italiana, and Nina Profazi - research assistant in the project “Data-sharing skills in corpus-based research on talk-in-interaction”, which is part of the ORD program funded by swissuniversities. You will find a discussion about the process of de-identification of data. Access the slides and the recording*.

Data preservation: talk by Thomas Schmidt - computer scientist specialized in the field of methodology and technology for working with audiovisual language data and in computer-assisted lexicography. You will find a discussion about data management and preservation, a use-case about working with sensitive data (the FOLK project), and a method for data anonymisation. Access the slides and the recording*.

*The recordings are password-protected; contact us if you are interested in getting access.

DARIAH ELDAH Consent Form wizard

The DARIAH ELDAH Consent Form Wizard is an online tool that enables researchers to quickly generate a GDPR-compliant consent form for collecting personal data for research purposes. It can also be used, for example, for creating mailing lists or organizing academic events. Currently the tool is available in English, German, Italian and Croatian, and there are plans to have it translated into other languages. The tool was created by members of the CLARIN Committee for Legal and Ethical Issues and of the DARIAH ELDAH Ethics and Legality in Digital Arts and Humanities Working Group.

DARIAH ELDAH Consent Form Wizard

DMLawTool

The Università della Svizzera italiana (USI) and the University of Neuchâtel (UNINE), both members of CLARIN-CH, have developed a tool that guides researchers through the most relevant legal aspects of research data management and proposes possible approaches to solving copyright and data protection issues.

DMLawTool