documentation-platform:data-sharing [2024/01/12 14:19] – removed - external edit (Unknown date) 127.0.0.1
documentation-platform:data-sharing [2024/04/10 13:38] (current) Seraina Nadig
 +<WRAP twothirds column>
 +====== Data sharing ======
 +</WRAP>
  
 +<WRAP colsmall><WRAP rightalign>[[documentation-platform:start|Back to the overview]]</WRAP></WRAP>
 +<WRAP clear/>
 +
 +==== How to share your language resources? ====
 +
 +When sharing language data, the [[documentation-platform:fair-data|FAIR principles]] can guide you in making your resource available to other researchers in a useful way, which in turn facilitates knowledge discovery.
 +
 +You have several options to increase the FAIR-ness of your data:
 +
 +==== For corpora ====
 +
 +++++ 1️⃣ Publish and archive the data with LaRS@SWISSUbase |
 +
 +**LaRS@SWISSUbase** offers an easy-to-use and reliable platform for sharing your data. It was established in 2022 as a cross-disciplinary and **FAIR-compliant national research data service**. It includes a searchable catalogue with a growing number of studies and research data sets, for which SWISSUbase provides long-term storage.
 +
 +[[https://www.swissubase.ch/en/|{{ :documentation-platform:swissubase.png?nolink&400 |}}]]
 +
 +<WRAP centeralign>➡️ Go to the [[https://www.swissubase.ch/en/|SWISSUbase]] website here.</WRAP>
 +
 +In September 2022, the **Language Repository of Switzerland (LaRS)** was introduced as a discipline-specific data service unit (DSU) of SWISSUbase. Researchers from CLARIN-CH institutions are invited to share their publications and datasets on the platform to benefit from this new infrastructure and make their research accessible to the community.
 +
 +Data can be published at various degrees of openness to account for data that includes [[documentation-platform:data-protection|sensitive information]]. If your corpus cannot be shared openly, you can either publish only the metadata of your corpus or publish the data itself under a closed type of licence.
 +
 +Learn more on the [[https://resources.swissubase.ch/help/benutzungsleitfaden/linguistics-resources/|SWISSUbase page for linguistic resources]], where you can read the User Guide and find out more about the process as well as [[test:documentation-platform:metadata]] requirements. Additionally, the FORS Center provides a detailed [[https://forscenter.ch/wp-content/uploads/2022/11/the-preparation-of-social-science-data-for-swissubase-in-detail_en.pdf|practical guide on sharing SSH data on SWISSUbase]].
 +
 +++++
 +
 +++++ 2️⃣ Include the corpus on the Linguistic Corpus Platform (LCP) |
 +
 +**The Linguistic Corpus Platform (LCP)** is being developed at [[https://www.liri.uzh.ch/en.html|LiRI]] as a tool to make corpora searchable through a web interface:
 +
 +[[https://lcp.linguistik.uzh.ch/|{{ :documentation-platform:lcp.png?nolink&400 |}}]]
 +
 +<WRAP centeralign>➡️ Check out the **pre-release version** of the [[https://lcp.linguistik.uzh.ch/|LCP]] here.</WRAP>
 +
 +**The LCP can be accessed by all CLARIN-CH institutions** and will offer the option to upload your own corpus for data exploration and analysis. The LCP uses its own query language, which allows for powerful, complex queries on **text data and time-aligned multimodal data**, such as video recordings of sign language and interactional data.
 +
 +If you want to find out more about how to use the LCP, have a look at the [[https://liri.linguistik.uzh.ch/wiki/langtech/lcp/start|LCP documentation page]].
 +++++
 +
 +++++ 3️⃣ Add the corpus to the SSH Open Marketplace |
 +
 +The **SSH Open Marketplace** is a European discovery platform for resources from the Social Sciences and Humanities (SSH) field.
 +
 +[[https://marketplace.sshopencloud.eu/|{{ :documentation-platform:ssh-open-marketplace.png?nolink&400 |}}]]
 +
 +<WRAP centeralign>➡️ Discover the [[https://marketplace.sshopencloud.eu/|SSH Open Marketplace]] here.</WRAP>
 +
 +In order to register your corpus, you can follow [[https://marketplace.sshopencloud.eu/contribute/create-an-individual-item|these steps]] (choose the //Dataset// item category).
 +++++
 +
 +++++ 4️⃣ Add your corpus to the webpage of the CLARIN Resource Families |
 +
 +The **CLARIN Resource Families** website provides an overview of the available language resources in the CLARIN infrastructure per data type. The following types of corpora are listed:
 +
 +  * Computer-Mediated Communication Corpora
 +  * Corpora of Academic Texts
 +  * Historical Corpora
 +  * L2 Learner Corpora
 +  * Legal Corpora
 +  * Literary Corpora
 +  * Manually Annotated Corpora
 +  * Multimodal Corpora
 +  * Newspaper Corpora
 +  * Oral History Corpora
 +  * Parallel Corpora
 +  * Parliamentary Corpora
 +  * Reference Corpora
 +  * Sign Language Resources
 +  * Spoken Corpora
 +
 +Discover the <wrap button>[[https://www.clarin.eu/resource-families|CLARIN Resource Families]]</wrap>
 +
 +[[mailto:contact@clarin-ch.ch|Contact us]] if you want to list your corpus in one of these categories!
 +
 +++++
 +
 +----
 +
 +==== For tools ====
 +
 +++++ 1️⃣ Add your tool to the CLARIN Switchboard |
 +
 +The **CLARIN Language Resource Switchboard** is a tool that helps researchers to find a matching language processing web application for their data. After uploading a file or entering a URL, the Switchboard provides a list of available CLARIN tools to perform the task indicated by the researcher (e.g. Named Entity Recognition, lemmatization, POS-tagging).
 +
 +[[https://switchboard.clarin.eu/|{{ :documentation-platform:switchboard.png?nolink&400 |}}]]
 +
 +<WRAP centeralign> ➡️ Discover the [[https://switchboard.clarin.eu/|CLARIN Switchboard]] here. </WRAP>
 +
 +/*There is a list of the [[https://switchboard.clarin.eu/tools|currently listed tools]] on the website.*/
 +Information on **how to add your tool to the Switchboard Tool Registry** is available on the [[https://github.com/clarin-eric/switchboard-tool-registry#how-to-add-a-tool-to-the-switchboard|GitHub page]]. See the CLARIN Switchboard website for a list of the [[https://switchboard.clarin.eu/tools|currently available tools]].
 +++++
 +
 +++++ 2️⃣ Add your tool to the SSH Open Marketplace |
 +
 +The **SSH Open Marketplace** is a European discovery platform for resources from the Social Sciences and Humanities (SSH) field.
 +
 +[[https://marketplace.sshopencloud.eu/|{{ :documentation-platform:ssh-open-marketplace.png?nolink&400 |}}]]
 +
 +<WRAP centeralign> ➡️ Discover the [[https://marketplace.sshopencloud.eu/|SSH Open Marketplace]] here. </WRAP>
 +
 +In order to register your tool, you can follow [[https://marketplace.sshopencloud.eu/contribute/create-an-individual-item|these steps]] (choose the //Tools & services// item category).
 +++++
 +
 +++++ 3️⃣ Add your tool to the webpage of the CLARIN Resource Families |
 +
 +The **CLARIN Resource Families** website provides an overview of the available language resources in the CLARIN infrastructure per data type. The following types of tools are listed:
 +
 +  * Corpus Query Tools
 +  * Normalisation
 +  * Named Entity Recognition
 +  * Part-of-Speech Tagging and Lemmatisation
 +  * Tools for Sentiment Analysis
 +
 +Discover the <wrap button>[[https://www.clarin.eu/resource-families|CLARIN Resource Families]]</wrap>
 +
 +[[mailto:contact@clarin-ch.ch|Contact us]] if you want to list your tool in one of these categories!
 +++++
 +
 +----
 +
 +==== For lexical resources ====
 +
 +++++ 1️⃣ Add your lexical resource to the SSH Open Marketplace |
 +
 +The **SSH Open Marketplace** is a European discovery platform for resources from the Social Sciences and Humanities (SSH) field.
 +
 +[[https://marketplace.sshopencloud.eu/|{{ :documentation-platform:ssh-open-marketplace.png?nolink&400 |}}]]
 +
 +<WRAP centeralign>➡️ Discover the [[https://marketplace.sshopencloud.eu/|SSH Open Marketplace]] here.</WRAP>
 +
 +In order to register your lexical resource, you can follow [[https://marketplace.sshopencloud.eu/contribute/create-an-individual-item|these steps]] (choose the //Dataset// item category).
 +
 +++++
 +
 +
 +++++ 2️⃣ Add your lexical resource to the webpage of the CLARIN Resource Families |
 +
 +The **CLARIN Resource Families** website provides a user-friendly overview per data type of the available language resources in the CLARIN infrastructure. The following types of lexical resources are listed:
 +
 +  * Language Models
 +  * Lexica
 +  * Dictionaries
 +  * Conceptual Resources
 +  * Glossaries
 +  * Wordlists
 +
 +Discover the <wrap button>[[https://www.clarin.eu/resource-families|CLARIN Resource Families]]</wrap>
 +
 +[[mailto:contact@clarin-ch.ch|Contact us]] if you want to list your lexical resource in one of these categories!
 +++++
 +
 +----
 +
 +==== What are the recommended standard data formats? ==== 
 +
 +Using standardized formats ensures that the data can be read and processed with widely used software. This makes your data easier to integrate into existing linguistic analysis tools and workflows, **enhancing the accessibility and utility of your data**.
 +
 +Additionally, standardized data formats **facilitate collaboration** among researchers and institutions by reducing compatibility issues and promoting interoperability. This seamless exchange of linguistic data in a common format fosters a more **open and collaborative research environment**, accelerating the progress of linguistic studies and advancing our understanding of language in diverse contexts.
 +
 +➡️ Researchers are encouraged to **prioritize the use of standardized formats** to maximize the impact of their work and contribute to the advancement of their field.
 +
 +You can consult this [[https://clarin.ids-mannheim.de/standards/views/recommended-formats-with-search.xq|CLARIN page on format recommendations]] to **check whether you are using one of the standardized formats**. More information can be found here: [[documentation-platform:standard-data-formats]]. For converting data or file formats, consider the [[https://conversion-hub.sshopencloud.eu/|SSH Conversion Hub]] in order to find a suitable tool.
 +
 +----
 +
 +==== I want to share my data. How can I find a suitable repository? ====
 +
 +While there are innumerable options for sharing research data, it makes sense to follow recommendations for repositories that ensure the FAIRness of your data and support open research data practices, such as [[https://www.snf.ch/en/WtezJ6qxuTRnSYgF/topic/open-research-data-which-data-repositories-can-be-used|this list]] given by the Swiss National Science Foundation (SNSF).
 +
 +CLARIN-CH recommends the [[https://www.lars.uzh.ch/en.html|Language Repository of Switzerland]] (LaRS@SWISSUbase) and the [[https://lcp.linguistik.uzh.ch/|Linguistic Corpus Platform]] (LCP), which are specifically tailored to linguistic data and free for members of CLARIN-CH institutions.
 +
 +➡️ More options can be found here: <wrap button> [[documentation-platform:data-archiving#I want to archive my research data. How can I find a suitable repository?|How to find a suitable repository]]</wrap>