Using rule-based natural language processing to improve disease normalization in biomedical text.
MedLine Citation:
PMID:  23043124     Owner:  NLM     Status:  Publisher    
Abstract/OtherAbstract:
BACKGROUND AND OBJECTIVE: In order for computers to extract useful information from unstructured text, a concept normalization system is needed to link relevant concepts in a text to sources that contain further information about the concept. Popular concept normalization tools in the biomedical field are dictionary-based. In this study we investigate the usefulness of natural language processing (NLP) as an adjunct to dictionary-based concept normalization.
METHODS: We compared the performance of two biomedical concept normalization systems, MetaMap and Peregrine, on the Arizona Disease Corpus, with and without the use of a rule-based NLP module. Performance was assessed for exact and inexact boundary matching of the system annotations with those of the gold standard and for concept identifier matching.
RESULTS: Without the NLP module, MetaMap and Peregrine attained F-scores of 61.0% and 63.9%, respectively, for exact boundary matching, and 55.1% and 56.9% for concept identifier matching. With the aid of the NLP module, the F-scores of MetaMap and Peregrine improved to 73.3% and 78.0% for boundary matching, and to 66.2% and 69.8% for concept identifier matching. For inexact boundary matching, performances further increased to 85.5% and 85.4%, and to 73.6% and 73.3% for concept identifier matching.
CONCLUSIONS: We have shown the added value of NLP for the recognition and normalization of diseases with MetaMap and Peregrine. The NLP module is general and can be applied in combination with any concept normalization system. Whether its use for concept types other than disease is equally advantageous remains to be investigated.
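The evaluation described in METHODS can be illustrated with a minimal sketch of F-score computation under exact and inexact boundary matching. This is not the authors' evaluation code. The abstract does not define "inexact" matching, so the sketch assumes it means any character overlap between a system span and a gold-standard span; the names `Span`, `overlaps`, and `boundary_f_score` are hypothetical, and the spans in the usage lines are made-up offsets, not data from the Arizona Disease Corpus.

```python
from typing import List, Tuple

# A span is a (start, end) pair of character offsets, end exclusive.
# This representation is an assumption made for illustration.
Span = Tuple[int, int]


def overlaps(a: Span, b: Span) -> bool:
    """Return True if the two spans share at least one character."""
    return a[0] < b[1] and b[0] < a[1]


def boundary_f_score(gold: List[Span], system: List[Span], exact: bool = True) -> float:
    """F-score of system spans against gold spans.

    exact=True  : a span counts as matched only if both boundaries agree.
    exact=False : a span counts as matched if it overlaps a span on the
                  other side (assumed definition of inexact matching).
    """
    if exact:
        gold_set, system_set = set(gold), set(system)
        matched_system = sum(1 for s in system if s in gold_set)
        matched_gold = sum(1 for g in gold if g in system_set)
    else:
        matched_system = sum(1 for s in system if any(overlaps(s, g) for g in gold))
        matched_gold = sum(1 for g in gold if any(overlaps(g, s) for s in system))

    precision = matched_system / len(system) if system else 0.0
    recall = matched_gold / len(gold) if gold else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Illustrative spans only (not corpus data).
gold_spans = [(0, 5), (10, 20), (30, 40)]
system_spans = [(0, 5), (12, 20), (50, 55)]
exact_f = boundary_f_score(gold_spans, system_spans, exact=True)
inexact_f = boundary_f_score(gold_spans, system_spans, exact=False)
```

In this toy input the system span (12, 20) misses the gold boundary (10, 20) but overlaps it, so it counts only under inexact matching, which is why inexact scores are never lower than exact ones.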
Authors:
Ning Kang; Bharat Singh; Zubair Afzal; Erik M van Mulligen; Jan A Kors
Publication Detail:
Type:  JOURNAL ARTICLE     Date:  2012-10-6
Journal Detail:
Title:  Journal of the American Medical Informatics Association : JAMIA     Volume:  -     ISSN:  1527-974X     ISO Abbreviation:  J Am Med Inform Assoc     Publication Date:  2012 Oct 
Date Detail:
Created Date:  2012-10-8     Completed Date:  -     Revised Date:  -    
Medline Journal Info:
Nlm Unique ID:  9430800     Medline TA:  J Am Med Inform Assoc     Country:  -    
Other Details:
Languages:  ENG     Pagination:  -     Citation Subset:  -    
Affiliation:
Department of Medical Informatics, Erasmus University Medical Center, Rotterdam, The Netherlands.

From MEDLINE®/PubMed®, a database of the U.S. National Library of Medicine

