Title
Accelerating Clinical Text Annotation in Underrepresented Languages: A Case Study on Text De-Identification
Type
Book chapter
Institution
UNIL/CHUV/Unisanté + partner institutions
Series
Studies in Health Technology and Informatics
Author(s)
Xu, He A.
Author
Loftsson, Valentin
Author
Kulynych, Bogdan
Author
Kaabachi, Bayrem
Author
Raisaro, Jean Louis
Author
Publisher
IOS Press
Book or conference title
Digital Health and Informatics Innovations for Sustainable Health Care Systems
Book ISBN
9781643685335
Editorial status
Published
Publication date
2024-08-22
Volume
316
First page
853
Last page/article number
857
Peer-reviewed
Yes
Language
English
Abstract
Clinical notes contain valuable information for research and for monitoring quality of care. Named Entity Recognition (NER) is the process of identifying relevant pieces of information, such as diagnoses, treatments, and side effects, and bringing them into a more structured form. Although recent advancements in deep learning have facilitated automated recognition, particularly in English, NER can still be challenging due to limited specialized training data. This is exacerbated in hospital settings, where annotations are costly to obtain without appropriate incentives and often depend on local specificities. In this work, we study whether this annotation process can be effectively accelerated by combining two practical strategies. First, we convert usually passive annotation tasks into a proactive contest to motivate human annotators to perform a task often considered tedious and time-consuming. Second, we provide pre-annotations to the participants to evaluate how the recall and precision of the pre-annotations can boost or deteriorate annotation performance. We applied both strategies to a text de-identification task on French clinical notes and discharge summaries at a large Swiss university hospital. Our results show that a proactive contest and average-quality pre-annotations can significantly speed up annotation and increase annotation quality, enabling us to develop a text de-identification model for French clinical notes with high performance (F1 score 0.94).
Serval PID
serval:BIB_9D47A8BCBA47
PMID
Open Access
Yes
Creation date
2024-08-30T08:45:04.010Z
Creation date in IRIS
2025-05-21T01:13:41Z
File(s)
Name
39176927.pdf
Manuscript version
published
Licence
https://creativecommons.org/licenses/by-nc/4.0
Size
189.18 KB
Format
Adobe PDF
Serval PID
serval:BIB_9D47A8BCBA47.P001
URN
urn:nbn:ch:serval-BIB_9D47A8BCBA473
Checksum
(MD5):2f09fe7b7212320b1009123b4ce343eb