documentation-platform:data-sharing [2024/01/12 14:19] – removed - external edit (Unknown date) 127.0.0.1 | documentation-platform:data-sharing [2024/04/10 13:38] (current) – Seraina Nadig | ||
---|---|---|---|
Line 1: | Line 1: | ||
+ | <WRAP twothirds column> | ||
+ | ====== Data sharing ====== | ||
+ | </ | ||
+ | <WRAP colsmall>< | ||
+ | <WRAP clear/> | ||
+ | |||
+ | ==== How can you share your language resources? ==== | ||
+ | |||
+ | When sharing | ||
+ | |||
+ | You have several options to increase the FAIRness of your data: | ||
+ | |||
+ | ==== For corpora ==== | ||
+ | |||
+ | ++++ 1️⃣ Publish and archive the data with LaRS@SWISSUbase | | ||
+ | |||
+ | **LaRS@SWISSUbase** offers an easy-to-use and reliable platform for sharing your data. It was established in 2022 as a cross-disciplinary and **FAIR-compliant national research data service**. It includes a searchable catalogue with a growing number of studies and research data sets, for which SWISSUbase provides long-term storage. | ||
+ | |||
+ | [[https:// | ||
+ | |||
+ | <WRAP centeralign> | ||
+ | |||
+ | In September 2022, the **Language Repository of Switzerland (LaRS)** was introduced as a discipline-specific data service unit (DSU) of SWISSUbase. Researchers from CLARIN-CH institutions are invited to share their publications and datasets on the platform to benefit from this new infrastructure and make their research accessible to the community. | ||
+ | |||
+ | Data can be published at various degrees of openness to account for data including [[documentation-platform: | ||
+ | |||
+ | Learn more on the [[https:// | ||
+ | |||
+ | ++++ | ||
+ | |||
+ | ++++ 2️⃣ Include the corpus on the Linguistic Corpus Platform (LCP) | | ||
+ | |||
+ | **The Linguistic Corpus Platform (LCP)** is being developed at [[https:// | ||
+ | |||
+ | [[https:// | ||
+ | |||
+ | <WRAP centeralign> | ||
+ | |||
+ | **The LCP can be accessed by all CLARIN-CH institutions** and will offer the option to upload your own corpus for data exploration and analysis. The LCP uses its own query language, which allows for powerful, complex queries on **text data and time-aligned multimodal data**, such as video recordings of sign language and interactional data. | ||
+ | |||
+ | If you want to find out more about how to use the LCP, have a look at the [[https:// | ||
+ | |||
+ | ++++ | ||
+ | |||
+ | ++++ 3️⃣ Add the corpus to the SSH Open Marketplace | | ||
+ | |||
+ | The **SSH Open Marketplace** is a European discovery platform for resources from the Social Sciences and Humanities (SSH) field. | ||
+ | |||
+ | [[https:// | ||
+ | |||
+ | <WRAP centeralign> | ||
+ | |||
+ | In order to register your corpus, you can follow [[https:// | ||
+ | ++++ | ||
+ | |||
+ | ++++ 4️⃣ Add your corpus to the webpage of the CLARIN Resource Families | | ||
+ | |||
+ | The **CLARIN Resource Families** website provides an overview, per data type, of the language resources available in the CLARIN infrastructure. The following types of corpora are listed: | ||
+ | |||
+ | * Computer-Mediated Communication Corpora | ||
+ | * Corpora of Academic Texts | ||
+ | * Historical Corpora | ||
+ | * L2 Learner Corpora | ||
+ | * Legal Corpora | ||
+ | * Literary Corpora | ||
+ | * Manually Annotated Corpora | ||
+ | * Multimodal Corpora | ||
+ | * Newspaper Corpora | ||
+ | * Oral History Corpora | ||
+ | * Parallel Corpora | ||
+ | * Parliamentary Corpora | ||
+ | * Reference Corpora | ||
+ | * Sign Language Resources | ||
+ | * Spoken Corpora | ||
+ | |||
+ | Discover the <wrap button> | ||
+ | |||
+ | [[mailto: | ||
+ | |||
+ | ++++ | ||
+ | |||
+ | ---- | ||
+ | |||
+ | ==== For tools ==== | ||
+ | |||
+ | ++++ 1️⃣ Add your tool to the CLARIN Switchboard | | ||
+ | |||
+ | The **CLARIN Language Resource Switchboard** is a tool that helps researchers find a suitable language processing web application for their data. Once the researcher has uploaded a file or entered a URL, the Switchboard lists the available CLARIN tools that can perform the task indicated by the researcher (e.g. Named Entity Recognition, | ||
+ | |||
+ | [[https:// | ||
+ | |||
+ | <WRAP centeralign> | ||
+ | |||
+ | /*There is a list of the [[https:// | ||
+ | Information on **how to add your tool to the Switchboard Tool Registry** is available on the [[https:// | ||
+ | ++++ | ||
+ | |||
+ | ++++ 2️⃣ Add your tool to the SSH Open Marketplace | | ||
+ | |||
+ | The **SSH Open Marketplace** is a European discovery platform for resources from the Social Sciences and Humanities (SSH) field. | ||
+ | |||
+ | [[https:// | ||
+ | |||
+ | <WRAP centeralign> | ||
+ | |||
+ | In order to register your tool, you can follow [[https:// | ||
+ | ++++ | ||
+ | |||
+ | ++++ 3️⃣ Add your tool to the webpage of the CLARIN Resource Families | | ||
+ | |||
+ | The **CLARIN Resource Families** website provides an overview, per data type, of the language resources available in the CLARIN infrastructure. The following types of tools are listed: | ||
+ | |||
+ | * Corpus Query Tools | ||
+ | * Normalisation | ||
+ | * Named Entity Recognition | ||
+ | * Part-of-Speech Tagging and Lemmatisation | ||
+ | * Tools for Sentiment Analysis | ||
+ | |||
+ | Discover the <wrap button> | ||
+ | |||
+ | [[mailto: | ||
+ | ++++ | ||
+ | |||
+ | ---- | ||
+ | |||
+ | ==== For lexical resources ==== | ||
+ | |||
+ | ++++ 1️⃣ Add your lexical resource to the SSH Open Marketplace | | ||
+ | |||
+ | The **SSH Open Marketplace** is a European discovery platform for resources from the Social Sciences and Humanities (SSH) field. | ||
+ | |||
+ | [[https:// | ||
+ | |||
+ | <WRAP centeralign> | ||
+ | |||
+ | In order to register your lexical resource, you can follow [[https:// | ||
+ | |||
+ | ++++ | ||
+ | |||
+ | |||
+ | ++++ 2️⃣ Add your lexical resource to the webpage of the CLARIN Resource Families | | ||
+ | |||
+ | The **CLARIN Resource Families** website provides a user-friendly overview per data type of the available language resources in the CLARIN infrastructure. The following types of lexical resources are listed: | ||
+ | |||
+ | * Language Models | ||
+ | * Lexica | ||
+ | * Dictionaries | ||
+ | * Conceptual Resources | ||
+ | * Glossaries | ||
+ | * Wordlists | ||
+ | |||
+ | Discover the <wrap button> | ||
+ | |||
+ | [[mailto: | ||
+ | ++++ | ||
+ | |||
+ | ---- | ||
+ | |||
+ | ==== What are the recommended standard data formats? ==== | ||
+ | |||
+ | Using standardized formats ensures that the data can be read/ | ||
+ | |||
+ | Additionally, | ||
+ | |||
+ | ➡️ Researchers are encouraged to **prioritize the use of standardized formats** to maximize the impact of their work and contribute to the advancement of their field. | ||
+ | |||
+ | You can consult this [[https:// | ||
+ | |||
+ | ---- | ||
+ | |||
+ | ==== I want to share my data. How can I find a suitable repository? ==== | ||
+ | |||
+ | While there are innumerable options for sharing research data, it makes sense to follow recommendations for repositories that ensure the FAIRness of your data and support open research data practices, such as [[https:// | ||
+ | |||
+ | CLARIN-CH recommends the [[https:// | ||
+ | |||
+ | ➡️ More options can be found here: <wrap button> [[documentation-platform: |