Data sharing

How to share your language resources?

When sharing language data, the FAIR principles (Findable, Accessible, Interoperable, Reusable) can guide you in making your resource available to other researchers in a useful way, which in turn facilitates knowledge discovery.

You have several options to increase the FAIRness of your data:

For corpora

1️⃣ Publish and archive the data with LaRS@SWISSUbase

2️⃣ Include the corpus on the Linguistic Corpus Platform (LCP)

3️⃣ Add the corpus to the SSH Open Marketplace

4️⃣ Add your corpus to the webpage of the CLARIN Resource Families


For tools

1️⃣ Add your tool to the CLARIN Switchboard

2️⃣ Add your tool to the SSH Open Marketplace

3️⃣ Add your tool to the webpage of the CLARIN Resource Families


For lexical resources

1️⃣ Add your lexical resource to the SSH Open Marketplace

2️⃣ Add your lexical resource to the webpage of the CLARIN Resource Families


What are the recommended standard data formats?

Using standardized formats ensures that the data can be read and processed with widely used software. This makes your data easier to integrate into existing linguistic analysis tools and workflows, enhancing its accessibility and utility.

Additionally, standardized data formats facilitate collaboration among researchers and institutions by reducing compatibility issues and promoting interoperability. Exchanging linguistic data in a common format fosters a more open and collaborative research environment and helps linguistic research progress faster.

➡️ Researchers are encouraged to prioritize the use of standardized formats to maximize the impact of their work and contribute to the advancement of their field.

You can consult this CLARIN page on format recommendations to check whether you are using one of the standardized formats. More information can be found here: Standard data formats. To convert data or file formats, consult the SSH Conversion Hub to find a suitable tool.


I want to share my data. How can I find a suitable repository?

While there are many options for sharing research data, it makes sense to follow recommendations for repositories that ensure the FAIRness of your data and support open research data practices, such as this list provided by the Swiss National Science Foundation (SNSF).

CLARIN-CH recommends the Language Repository of Switzerland (LaRS@SWISSUbase) and the Linguistic Corpus Platform (LCP), which are specifically tailored to linguistic data and free for members of CLARIN-CH institutions.

➡️ More options can be found here: How to find a suitable repository