Publication Date: 2024-05-07
We are happy to announce the webinar "Case study on anonymisation of textual data", which is organised within the CLARIN-CH Working Group "Management of Sensitive and Personal data, Ethical and Legal issues for linguistic data".
Abstract: We describe an anonymisation case study that we implemented for the Swiss Federal Archive (Bundesarchiv, BAR). We automatically redact person names, company names, dates and social security numbers in titles, using an ensemble approach that combines several methods, ranging from logistic regression and conditional random fields to neural networks. Our approach gives precedence to high recall, accepting some loss of precision in exchange. At the token level, person name recognition attains above 98% recall at above 92% precision.
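To make the recall-oriented idea concrete, here is a minimal sketch in Python. It is not the system presented in the webinar: the abstract does not say how the individual models are combined, so the union rule below (a token is redacted if any model flags it) is an assumption chosen because it favours recall. The function names, the example title and the three model outputs are all hypothetical.

```python
from typing import List, Sequence, Tuple


def union_ensemble(predictions: Sequence[Sequence[bool]]) -> List[bool]:
    """Flag a token if at least one model flags it (assumed rule, favours recall)."""
    return [any(votes) for votes in zip(*predictions)]


def redact(tokens: Sequence[str], flags: Sequence[bool],
           mask: str = "[REDACTED]") -> List[str]:
    """Replace every flagged token with a mask string."""
    return [mask if flag else token for token, flag in zip(tokens, flags)]


def token_precision_recall(gold: Sequence[bool],
                           pred: Sequence[bool]) -> Tuple[float, float]:
    """Compute token-level precision and recall for one sensitive class."""
    tp = sum(1 for g, p in zip(gold, pred) if g and p)
    fp = sum(1 for g, p in zip(gold, pred) if not g and p)
    fn = sum(1 for g, p in zip(gold, pred) if g and not p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical title and hypothetical per-model outputs (True = person name).
tokens = ["Letter", "from", "Anna", "Muster", "to", "Bern"]
gold = [False, False, True, True, False, False]
model_a = [False, False, True, False, False, False]  # misses "Muster"
model_b = [False, False, False, True, False, True]   # misses "Anna", over-flags "Bern"
model_c = [False, False, True, True, False, False]

flags = union_ensemble([model_a, model_b, model_c])
redacted = redact(tokens, flags)
precision, recall = token_precision_recall(gold, flags)
```

In this toy example the union catches both name tokens, so recall is 1.0, while the one over-flagged token lowers precision to 2/3. This is the same kind of trade-off the abstract describes, where missing a name is treated as more costly than redacting a harmless token.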