LCP workshop at SwissText 2025: Bring your own data!

The LiRI Corpus Platform (LCP) is a new web-based tool designed to handle and analyze linguistic data. It allows users to carry out various tasks on corpora, such as querying and performing analyses across multiple modalities: text (via the interface catchphrase), audio (via the interface soundscript) and video (via the interface videoscope). LCP is designed to support a range of linguistic research needs, from corpus import to complex analysis, offering both user-friendly interfaces and advanced query options for researchers. Users can query corpora directly from their browser and import their own corpora using a command-line interface. In this workshop, researchers from various disciplines who use language data are invited to come with their own corpora and relevant research questions to learn how they can use LCP to extract the information they seek from their corpus.

Workshop by LiRI and CLARIN-CH: Join us at SwissText 2025 and learn how to use LCP

The SwissText 2025 conference takes place on May 14–15, 2025. Our workshop will take place on May 14 from 14:00 to 17:30, with a 30-minute coffee break. You can find the official programme of the conference here.

To participate in the workshop, please register with this sign-up form. Note that CLARIN-CH consortium members benefit from a 20% discount on the conference registration fees.

Intended Audience and Call for Contributions

We invite researchers from different disciplines (linguistics, media science, political sciences, digital humanities, film studies, etc.) working with language and/or multimodal data (audio, video) to submit a data pitch proposal. The proposal should contain a brief description of the data and include tractable questions that the researchers intend to address by querying their data (see below for an example) as well as a sample of the data to be imported into LCP.

Example of a corpus with a related research question: The Text+Berg corpus gathers journal articles about mountain exploration spanning more than 100 years; it can be queried to answer the question “How did the proportion of mentions of male vs. female explorers evolve over time?”

Participants will need to provide their data beforehand, according to technical specifications that will be included in the call for contributions, so that it can be pre-processed and imported into a private group of corpora in LCP, which will then be shared with the other participants. In addition, participants are invited to take copyright and data-sensitivity questions into consideration if sending data that is not open research data (ORD). Participants with and without NLP knowledge are equally welcome: no prior NLP knowledge is required for learning how to use LCP. Participants who have NLP skills will benefit from the workshop by focusing on more advanced aspects of data pre-processing, building data models and querying data. The content of session 1 (presentation of LCP and demo of data upload) will be adapted to the level of participants’ NLP knowledge.