Data processing and analysis

Every field has its own ways of analysing data, so best practices for data processing depend heavily on the methods you choose for your research. However, some principles apply to all kinds of research:

  • Keep several copies of your data:
    It is important to have both physical and virtual copies of your research data as backups. It is also advisable to use a systematic versioning scheme.
  • Ensure the integrity of your data:
    Take measures to make sure your data is accurate, consistent and complete, e.g. by using automation to prevent mistakes that arise from manual data entry. Chapter 3 of the CESSDA Data Management Expert Guide contains a detailed guide on this topic: Data entry and integrity.
  • Choose interoperable file formats:
    When processing data, you may have to decide on file formats for the output of your analysis. Make sure to use file formats that have high compatibility and are widely used (see Standard data formats).
  • Be careful with personal/sensitive data:
    If your data contains personal information, use anonymization / de-identification procedures before carrying out data analysis (see Data protection).
  • Implement data security measures:
    Make sure your data is stored securely and can only be accessed by authorized users (see Data access and security).
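
As an illustration of the first two points, a backup copy can be checked against the original with a checksum. The following is only a minimal sketch in Python using the standard library; the function names sha256_of and copy_and_verify, the file name survey.csv and the backup folder are hypothetical and chosen for this example, and your institution may recommend other tools for backup and integrity checks.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_and_verify(source: Path, backup_dir: Path) -> Path:
    """Copy source into backup_dir and confirm the copy is byte-identical."""
    backup_dir.mkdir(parents=True, exist_ok=True)
    target = backup_dir / source.name
    shutil.copy2(source, target)  # copies content and file metadata
    if sha256_of(source) != sha256_of(target):
        raise IOError(f"Backup of {source} does not match the original")
    return target


if __name__ == "__main__":
    # Hypothetical example data, written to a temporary directory.
    with tempfile.TemporaryDirectory() as tmp:
        original = Path(tmp) / "survey.csv"
        original.write_text("id,answer\n1,yes\n2,no\n", encoding="utf-8")
        backup = copy_and_verify(original, Path(tmp) / "backup")
        print("Verified backup:", backup.name)
```

Storing the checksum alongside each copy makes it possible to repeat the comparison later and to notice if a file has been changed or corrupted in the meantime.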

Resources

We recommend familiarizing yourself with the tools that could be useful for processing your research data:

SSH Open Marketplace
The SSH Open Marketplace is a European discovery platform for resources from the Social Sciences and Humanities (SSH). It offers not only language resources but also workflows, each described in a step-by-step guide. For example, you can find a workflow on linguistic annotation of corpora here.


CLARIN Tools

CLARIN centers offer a wide variety of tools that help researchers explore and analyse language data. An interface has been created that combines all these tools:

The CLARIN Language Resource Switchboard is a tool that helps you to find a matching language processing web application for your data. After uploading a file or entering a URL, you can select which task to perform. The Switchboard will then provide you with a list of available CLARIN tools to analyse the input.

Have you developed your own tool which could be useful for other researchers? You can add it to the Switchboard Tool Registry. Find out more about sharing your tools here.


Forschungsdaten.info
This website, designed for researchers from the DACH countries (Germany, Austria and Switzerland), covers many research data management topics in great detail. You might find specific information that is relevant for your research project, for example here:

Useful tools for research data management
Working with large amounts of data
Visualizing data
Data transfer when working with sensitive data
(see also the CLARIN-CH working group on this topic)

documentation-platform/data-processing.txt · Last modified: 2024/04/10 10:17 (external edit)