Webinar: Case study on anonymisation of textual data

Publication Date: 2024-05-07

We are happy to announce the webinar "Case study on anonymisation of textual data", organised within the CLARIN-CH Working Group "Management of Sensitive and Personal Data, Ethical and Legal Issues for Linguistic Data".


Abstract: We describe an anonymisation case study that we implemented for the Swiss Federal Archives (Bundesarchiv, BAR). We automatically redact person names, company names, dates and social security numbers in titles, using an ensemble approach that combines several methods, ranging from logistic regression and conditional random fields to neural networks. Our approach prioritises high recall over precision. At the token level, person name recognition attains over 98% recall at over 92% precision.
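
To illustrate what a recall-oriented ensemble and token-level evaluation can look like, here is a minimal sketch. The abstract does not say how the individual models are combined, so the union rule below (redact a token if any model flags it) is only an assumption, and all names, tokens and model outputs are hypothetical toy data rather than the speakers' implementation.

```python
from typing import List, Sequence, Tuple


def union_ensemble(predictions: Sequence[Sequence[bool]]) -> List[bool]:
    """Flag a token if ANY model flags it (assumed rule; favours recall)."""
    return [any(votes) for votes in zip(*predictions)]


def token_precision_recall(gold: Sequence[bool],
                           pred: Sequence[bool]) -> Tuple[float, float]:
    """Compute token-level precision and recall for binary redaction flags."""
    tp = sum(g and p for g, p in zip(gold, pred))
    fp = sum((not g) and p for g, p in zip(gold, pred))
    fn = sum(g and (not p) for g, p in zip(gold, pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def redact(tokens: Sequence[str], flags: Sequence[bool],
           mask: str = "[REDACTED]") -> List[str]:
    """Replace every flagged token with a mask string."""
    return [mask if flagged else tok for tok, flagged in zip(tokens, flags)]


# Hypothetical title and hypothetical outputs of two person-name detectors.
tokens = ["Letter", "from", "Anna", "Muster", "to", "Acme", "AG"]
gold = [False, False, True, True, False, False, False]
model_a = [False, False, True, False, False, False, False]
model_b = [False, False, False, True, False, True, False]

combined = union_ensemble([model_a, model_b])
precision, recall = token_precision_recall(gold, combined)
redacted = redact(tokens, combined)
```

In this toy example each model misses one token of the name, while the union catches both, at the cost of one false positive ("Acme"). This is the trade-off the abstract describes: recall is maximised and some loss of precision is accepted.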

Speakers: Gerold Schneider is Titular Professor of Computational Linguistics and co-coordinator of LiRI's service area “Natural Language Processing”. Tilia Ellendorff works as a Data Scientist with a focus on NLP and Human Language Technology at UZH's Linguistic Research Infrastructure (LiRI).

The slides and the recording of the webinar can be accessed here: 💻 Recording 📄 Slides

This webinar is part of a monthly series of webinars taking place during spring 2024 and the autumn semester. Discover the list of planned webinars here.