CLARIN-CH Day 2024: Open Research Data – Challenges and Opportunities

September 9th, University of Neuchâtel

The event is the first in a series of annual meetings of the CLARIN-CH community. It is organized by the CLARIN-CH consortium in cooperation with its member institutions and aims to support the scientific community in addressing the challenges of Open Research Data. It seeks to foster exchange and to bring researchers together with data management experts.

The 2024 edition aims to bring together experts and researchers to discuss challenges and opportunities, and to open the dialogue on standards and practices of open research data as well as the legal and ethical aspects of processing and sharing linguistic data. The event builds on the work done by two CLARIN-CH Working Groups, which address essential topics related to Open Research Data.

The full-day event will be divided into three parts. To open the day, the two working groups of CLARIN-CH will present the results of their collaborative work and their documentation of best practices to the linguistic community in Switzerland. The morning will continue with pitches by researchers, in which they will present the challenges they face in managing and sharing linguistic data. The afternoon will be dedicated to a World Café Workshop, where researchers and participants can meet experts on four different topics (see 1 to 4 below) to discuss their challenges and explore potential solutions.

Call for participation

For the morning sessions, we welcome submissions for data pitches on one or several of the four topics listed below. Data pitches should address challenges you face(d) related to the chosen theme(s) and, if possible, present solutions. Each data pitch should last 5 to 10 minutes and be held in English. Presentation slides are recommended.

Topics

  1. Copyright describes the rights that creators have over their literary and artistic works, including their data. Researchers can encounter several challenges related to copyright when handling their data. These challenges can impact data sharing and reuse. Some key issues include questions around what can be shared, how to attribute sources, and whether specific data can be used freely or requires permission. Copyright considerations also come into play when deciding how to license the data for sharing.

  2. When it comes to data protection and the management of personal and sensitive data, several critical issues arise. Researchers need to find the right balance between sharing data for research purposes and safeguarding individuals’ privacy. De-identification techniques can help here, but they come with their own risks: it is nearly impossible to render a dataset completely anonymous without also jeopardizing data utility; other datasets and additional information can potentially lead to the re-identification of individuals; and some types of linguistic data are less suited to these techniques. Security measures need to be taken to safeguard personal and sensitive data, which poses additional problems, e.g. in collaborative projects. In addition, linguists collecting data in other countries or from specific target groups need to navigate a complex legal landscape to ensure that the management and sharing of linguistic data comply with all relevant laws and regulations.

  3. With regard to data formats (e.g. audio, video, text) and their technical aspects, linguists can also encounter various challenges. These include integrating diverse data formats for comprehensive linguistic analyses; harmonizing multimodal data; annotating linguistic data consistently across formats; maintaining uniformity in transcription conventions, part-of-speech tagging and semantic labeling; storing large-scale data (especially video) efficiently and having appropriate retrieval solutions; optimizing storage formats, indexing and query performance; and applying standardized formats across various systems.

  4. To store and share their data, linguists further face the issues of selecting appropriate repositories that align with their data type (and the intended audience), of storing the data in appropriate (standardized) formats, and of working collaboratively and effectively with other team members. Additionally, they face the challenges of safeguarding sensitive and personal data while also making it accessible, of taking into account the expectations and value systems of participants, and of upholding the researchers’ responsibility towards the population from which data has been gathered (in line with the CARE principles).

Information on submission

Please submit an abstract of approximately 200 words via our submission form. In the submission form, you need to indicate the topic (see 1-4 above) of your contribution and paste your abstract directly into the required field. Please be aware that abstracts of data pitches accepted for the CLARIN-CH Day will be made available in a book of abstracts.

Call opening: 24 April 2024
Deadline for submission: 14 June 2024
Notification of acceptance: 15 July 2024
Registration opens: 15 July 2024
Registration closes: 01 September 2024

Organizing committee:

  • Anita Auer (UNIL)
  • Cristina Grisot (UZH, CLARIN-CH national coordinator)
  • Martin Hilpert (UNINE)
  • Julia Krasselt (ZHAW)
  • Martin Luginbühl (UNIBAS)
  • Johanna Miecznikowski-Fünfschilling (USI)
  • Seraina Nadig (CLARIN-CH)
  • Melanie Röthlisberger (UZH)
  • Simon van Rekum (ZHAW)

The event is organized with the financial support of the CLARIN-CH Consortium, the Swiss Academy for Humanities and Social Sciences, and the Zurich University of Applied Sciences, and is hosted by the University of Neuchâtel.

clarin-day-2024.txt · Last modified: 2024/05/03 16:12 by Seraina Nadig