The Language Data Across Disciplines — LaDaD
Mission
The Language Data Across Disciplines (LaDaD) project aims to strengthen Open Research Data (ORD) practices in Switzerland for all researchers working with language data. By bridging disciplinary boundaries, LaDaD promotes FAIR-aligned, ethical, and sustainable approaches to collecting, processing, analyzing, sharing, and preserving language data. The mission is to empower researchers with the competencies, tools, and collaborative networks needed to conduct transparent, reproducible, and innovative research using language-based evidence.
Description
LaDaD is a one-year ORD project (January to December 2026), co-funded through the swissuniversities Open Science II programme, that brings together researchers and data stewards across Switzerland who rely on language data, whether in medicine, business, finance, journalism, history, theology, law, or other disciplines. The project evaluates current skills and practices in handling language data, identifies knowledge gaps, and develops tailored open training materials following the Open-by-design approach.
Combining community building with empirical assessment, the project integrates national surveys, case studies, and focus groups to understand real-world practices, challenges, and expectations. It then transforms these insights into practical resources that support both independent learning and teaching. LaDaD also highlights how NLP and data-driven methods can complement qualitative approaches and open new research perspectives.
Assess Needs and Identify ORD Skill Gaps
- Conduct a national survey and disciplinary case studies
- Map actual vs. required ORD practices in managing language data
- Assess competencies and workflows across disciplines
- Identify challenges in handling text, audio, video, multimodal, and sensitive data
- Produce an inventory of ORD skills, tools, formats, and training needs
Build a National Community of Researchers and Data Stewards
- Connect researchers and data stewards working with language data across Switzerland
- Collaborate with the Swiss Research Data Support Network (SRDSN)
- Organize focus groups to share insights and discuss challenges
- Co-develop recommendations for ORD practices and training priorities
Build Capacities and Develop Training Materials
Open-by-design training resources will cover:
- Data life cycle: collection, cleaning, structuring, annotation, sharing, archiving
- Ethical & legal compliance: consent, data protection, copyright, NDPA, ethics approval
- Tools & platforms: LiRI Corpus Platform, Swiss-AL, Swissdox@LiRI, LaRS
- Methods: combining qualitative and quantitative analyses; introducing NLP workflows
- Discipline-specific workflows for medicine, theology, finance, journalism, history, and more
Target Audience and Disciplines
- Researchers from any field who work with language data
- Data stewards and ORD specialists
- Educators seeking reusable, open teaching materials
- Graduate students and early-career researchers interested in ORD and NLP
- Research support staff and service platform operators
Expected Outcomes
National Landscape
A national overview of existing ORD practices and skill gaps related to language data across Switzerland.
Disciplinary Insights
Deep dives into the specific challenges, workflows, and required competencies of different academic fields.
Networked Community
A robust community of researchers and data stewards engaged in FAIR-aligned language data practices.
Actionable Strategy
Concrete recommendations for future ORD training and policy implementation at higher education institutions.
Open Materials
Open, reusable, and adaptable training materials covering the full data life cycle (FAIR & Open Science).
Ethical Excellence
Strengthened ORD culture with enhanced reproducibility, transparency, and ethical compliance.