Assisted annotation of medical free text using RapTAT.

Gobbel GT, Garvin J, Reeves R, Cronin RM, Heavirland J, Williams J, Weaver A, Jayaramaraja S, Giuse D, Speroff T, Brown SH, Xu H, Matheny ME
J Am Med Inform Assoc. 2014 21 (5): 833-41

PMID: 24431336 · PMCID: PMC4147611 · DOI:10.1136/amiajnl-2013-002255

OBJECTIVE - To determine whether assisted annotation using interactive training can reduce the time required to annotate a clinical document corpus without introducing bias.

MATERIALS AND METHODS - A tool, RapTAT, was designed to assist annotation by iteratively pre-annotating probable phrases of interest within a document, presenting the annotations to a reviewer for correction, and then using the corrected annotations for further machine learning-based training before pre-annotating subsequent documents. Annotators reviewed 404 clinical notes either manually or using RapTAT assistance for concepts related to quality of care during heart failure treatment. Notes were divided into 20 batches of 19-21 documents for iterative annotation and training.
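The pre-annotate, correct, retrain cycle described above can be sketched in a few lines of Python. This is an illustrative sketch only, not RapTAT's implementation: the abstract does not specify the learning algorithm, so `PhraseMemoryModel` is a hypothetical toy learner that simply remembers which concept each phrase was labeled with. The `review` callback (standing in for the human reviewer) and the choice to retrain once per batch are also assumptions.

```python
from collections import Counter, defaultdict


class PhraseMemoryModel:
    """Hypothetical toy learner: remembers the concept each phrase was labeled with."""

    def __init__(self):
        # phrase (lowercased) -> counts of the concepts reviewers assigned to it
        self.counts = defaultdict(Counter)

    def train(self, annotations):
        """Update the model with reviewer-corrected (phrase, concept) pairs."""
        for phrase, concept in annotations:
            self.counts[phrase.lower()][concept] += 1

    def pre_annotate(self, text):
        """Suggest (phrase, concept) pairs for every known phrase found in the text."""
        lowered = text.lower()
        suggestions = []
        for phrase, concepts in self.counts.items():
            if phrase in lowered:
                most_common_concept = concepts.most_common(1)[0][0]
                suggestions.append((phrase, most_common_concept))
        return suggestions


def annotate_in_batches(batches, review):
    """Run the pre-annotate -> correct -> retrain cycle over batches of documents.

    `review(doc, suggestions)` stands in for the human reviewer and must
    return the corrected list of (phrase, concept) annotations.
    """
    model = PhraseMemoryModel()
    corpus = []
    for batch in batches:
        corrected_batch = []
        for doc in batch:
            suggestions = model.pre_annotate(doc)   # machine pre-annotation
            corrected = review(doc, suggestions)    # human correction
            corrected_batch.append((doc, corrected))
        # Assumption: retrain once per batch, before the next batch is pre-annotated.
        for _, corrected in corrected_batch:
            model.train(corrected)
        corpus.extend(corrected_batch)
    return model, corpus
```

With an untrained model the first batch receives no suggestions, so the reviewer annotates it from scratch; later batches benefit from whatever the model has learned so far.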

RESULTS - The number of correct RapTAT pre-annotations increased significantly and annotation time per batch decreased by ~50% over the course of annotation. Annotation rate increased from batch to batch for assisted but not manual reviewers. Pre-annotation F-measure increased from 0.5-0.6 to >0.80 (relative to both assisted reviewer and reference annotations) over the first three batches and more slowly thereafter. Overall inter-annotator agreement was significantly higher between RapTAT-assisted reviewers (0.89) than between manual reviewers (0.85).
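For readers unfamiliar with the metric, the F-measure reported above combines precision and recall of the pre-annotations against a comparison set. The sketch below shows one way to compute it. It assumes the balanced form (beta = 1) and exact matching of annotations; the abstract does not state which variant or matching rule the study used, and the `(start, end, concept)` tuples are made-up illustrative data.

```python
def f_measure(predicted, reference, beta=1.0):
    """F-measure of predicted annotations against a reference set.

    Annotations are compared by exact equality (an assumption; partial-overlap
    matching is also common in annotation studies).
    """
    predicted, reference = set(predicted), set(reference)
    true_positives = len(predicted & reference)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(reference)
    beta_sq = beta ** 2
    return (1 + beta_sq) * precision * recall / (beta_sq * precision + recall)


# Illustrative data: (start offset, end offset, concept) tuples.
pre_annotations = {(0, 5, "A"), (10, 15, "B"), (20, 25, "C")}
reference_annotations = {(0, 5, "A"), (10, 15, "B"), (30, 35, "D"), (40, 45, "E")}

# Two of three predictions are correct (precision 2/3) and two of four
# reference annotations are found (recall 1/2), giving F1 = 4/7.
score = f_measure(pre_annotations, reference_annotations)
```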

DISCUSSION - The tool reduced workload by decreasing the number of annotations needing to be added and helping reviewers to annotate at an increased rate. Agreement between the pre-annotations and reference standard, and agreement between the pre-annotations and assisted annotations, were similar throughout the annotation process, which suggests that pre-annotation did not introduce bias.

CONCLUSIONS - Pre-annotations generated by a tool capable of interactive training can reduce the time required to create an annotated document corpus by up to 50%.

Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to

MeSH Terms (5)

Artificial Intelligence · Electronic Health Records · Heart Failure · Humans · Natural Language Processing
