Applying active learning to assertion classification of concepts in clinical text.
MedLine Citation:
PMID:  22127105     Owner:  NLM     Status:  Publisher    
Abstract/OtherAbstract:
Supervised machine learning methods for clinical natural language processing (NLP) research require a large number of annotated samples, which are very expensive to build because physicians must be involved in the annotation. Active learning, an approach that actively selects which samples to annotate from a large unlabeled pool, provides an alternative solution. Its major goal in classification is to reduce the annotation effort while maintaining the quality of the predictive model. However, few studies have investigated its use in clinical NLP. This paper reports an application of active learning to a clinical text classification task: determining the assertion status of clinical concepts. The annotated corpus for the assertion classification task in the 2010 i2b2/VA Clinical NLP Challenge was used in this study. We implemented several existing and newly developed active learning algorithms and assessed their performance. The outcome is reported as the global ALC score, which is based on the area under the average learning curve of the AUC (Area Under the Curve) score. Results showed that, when the same number of annotated samples was used, active learning strategies could generate better classification models (best ALC of 0.7715) than the passive learning method of random sampling (ALC of 0.7411). Moreover, to achieve the same classification performance, active learning strategies required fewer samples than the random sampling method. For example, to achieve an AUC of 0.79, the random sampling method used 32 samples, while our best active learning algorithm required only 12 samples, a reduction of 62.5% in manual annotation effort.
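The ideas in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not name the algorithms it evaluated, so least-confidence uncertainty sampling stands in here as one common active learning strategy. The learning-curve area below uses a plain trapezoidal rule normalized by the sample range, which may differ from the exact global ALC definition used in the paper. All function names are hypothetical.

```python
from typing import List, Sequence


def select_most_uncertain(positive_probs: Sequence[float], k: int) -> List[int]:
    """Return indices of the k pool samples whose predicted probability
    is closest to 0.5 (assumed strategy: binary uncertainty sampling)."""
    ranked = sorted(range(len(positive_probs)),
                    key=lambda i: abs(positive_probs[i] - 0.5))
    return ranked[:k]


def area_under_learning_curve(sample_counts: Sequence[float],
                              auc_scores: Sequence[float]) -> float:
    """Trapezoidal area under an AUC-versus-sample-count curve, divided by
    the width of the x-range so a constant AUC of 1.0 scores 1.0.
    Assumption: simplified stand-in for the paper's global ALC score."""
    if len(sample_counts) != len(auc_scores) or len(sample_counts) < 2:
        raise ValueError("need two or more matching points")
    area = 0.0
    for i in range(1, len(sample_counts)):
        width = sample_counts[i] - sample_counts[i - 1]
        area += width * (auc_scores[i] + auc_scores[i - 1]) / 2.0
    return area / (sample_counts[-1] - sample_counts[0])


def annotation_reduction(passive_samples: int, active_samples: int) -> float:
    """Percentage of annotations saved relative to passive (random) sampling."""
    return 100.0 * (passive_samples - active_samples) / passive_samples


if __name__ == "__main__":
    # Figures from the abstract: 32 random samples versus 12 active samples.
    print(annotation_reduction(32, 12))  # 62.5
    # Hypothetical pool probabilities, for illustration only.
    print(select_most_uncertain([0.9, 0.55, 0.1, 0.48], 2))
```

The reduction figure follows directly from the reported sample counts: (32 - 12) / 32 = 0.625, or 62.5%.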
Authors:
Yukun Chen; Subramani Mani; Hua Xu
Publication Detail:
Type:  JOURNAL ARTICLE     Date:  2011-11-22
Journal Detail:
Title:  Journal of biomedical informatics     Volume:  -     ISSN:  1532-0480     ISO Abbreviation:  -     Publication Date:  2011 Nov 
Date Detail:
Created Date:  2011-11-30     Completed Date:  -     Revised Date:  -    
Medline Journal Info:
Nlm Unique ID:  100970413     Medline TA:  J Biomed Inform     Country:  -    
Other Details:
Languages:  ENG     Pagination:  -     Citation Subset:  -    
Copyright Information:
Copyright © 2011. Published by Elsevier Inc.
Affiliation:
Department of Biomedical Informatics, Vanderbilt University, School of Medicine, Nashville, TN, USA.
From MEDLINE®/PubMed®, a database of the U.S. National Library of Medicine

