Disambiguation in the biomedical domain: the role of ambiguity type.
MedLine Citation:
PMID:  20816855     Owner:  NLM     Status:  In-Process    
Abstract/OtherAbstract:
Word Sense Disambiguation (WSD), the automatic identification of the meanings of ambiguous terms in a document, is an important stage in text processing. We describe a WSD system that has been developed specifically for the types of ambiguities found in biomedical documents. This system uses a range of knowledge sources. It employs both linguistic features, such as local collocations, and features derived from domain-specific knowledge sources, the Unified Medical Language System (UMLS) and Medical Subject Headings (MeSH). This system is applied to three types of ambiguities found in Medline abstracts: ambiguous terms, abbreviations with multiple expansions and names that are ambiguous between genes. The WSD system is applied to the standard NLM-WSD data set, which consists of ambiguous terms from Medline abstracts, and was found to perform well in comparison with previously reported results. The system's performance and the contribution of each knowledge source depend upon the type of lexical ambiguity. 87.9% of the ambiguous terms are correctly disambiguated using a combination of linguistic features and MeSH terms, 99% of abbreviations are disambiguated by combining all knowledge sources, while 97.2% of ambiguous gene names are disambiguated using the MeSH terms alone. Analysis reveals that these differences are caused by the nature of each ambiguity type. These results should be taken into account when deciding which information to use for WSD and the level of performance that can be expected.
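The combination of local collocations with MeSH terms described in the abstract can be illustrated with a minimal sketch. The abstract does not state which learning algorithm the authors used, so the feature-overlap vote below is a stand-in chosen for brevity, not the authors' method. The function names, the window size, the sense labels and the toy training sentences are all hypothetical.

```python
from collections import Counter


def extract_features(tokens, target_index, mesh_terms, window=2):
    """Build a feature set from local collocations and MeSH terms.

    Assumption: a +/-2 token window; the abstract does not give the size.
    """
    features = set()
    # Linguistic features: words at fixed offsets around the ambiguous term.
    for offset in range(-window, window + 1):
        if offset == 0:
            continue
        pos = target_index + offset
        if 0 <= pos < len(tokens):
            features.add(f"colloc[{offset}]={tokens[pos].lower()}")
    # Domain knowledge: MeSH terms assigned to the abstract.
    for term in mesh_terms:
        features.add(f"mesh={term.lower()}")
    return features


def train(examples):
    """Count how often each feature occurs with each sense."""
    model = {}
    for features, sense in examples:
        model.setdefault(sense, Counter()).update(features)
    return model


def disambiguate(model, features):
    """Return the sense whose training features overlap most with the input.

    Ties are broken alphabetically by sense label (an arbitrary choice).
    """
    def score(sense):
        counts = model[sense]
        return sum(counts[f] for f in features)

    return max(sorted(model), key=score)


# Toy, hand-made training data for the ambiguous term "cold" (hypothetical).
training = [
    (extract_features("patients with a cold had fever".split(), 3,
                      ["Common Cold", "Fever"]), "common_cold"),
    (extract_features("samples stored in cold water overnight".split(), 3,
                      ["Cold Temperature"]), "cold_temperature"),
]
model = train(training)

query = extract_features("children with a cold and cough".split(), 3,
                         ["Common Cold"])
predicted = disambiguate(model, query)
```

Passing an empty list as `mesh_terms` gives a collocation-only variant, which is one simple way to compare the contribution of each knowledge source across ambiguity types, in the spirit of the comparison the abstract reports.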
Authors:
Mark Stevenson; Yikun Guo
Publication Detail:
Type:  Journal Article; Research Support, Non-U.S. Gov't     Date:  2010-09-09
Journal Detail:
Title:  Journal of biomedical informatics     Volume:  43     ISSN:  1532-0480     ISO Abbreviation:  J Biomed Inform     Publication Date:  2010 Dec 
Date Detail:
Created Date:  2010-11-24     Completed Date:  -     Revised Date:  -    
Medline Journal Info:
Nlm Unique ID:  100970413     Medline TA:  J Biomed Inform     Country:  United States    
Other Details:
Languages:  eng     Pagination:  972-81     Citation Subset:  IM    
Copyright Information:
Copyright © 2010 Elsevier Inc. All rights reserved.
Affiliation:
Natural Language Processing Group, Department of Computer Science, University of Sheffield, Regent Court, 211 Portobello, Sheffield S1 4DP, United Kingdom. m.stevenson@dcs.shef.ac.uk

From MEDLINE®/PubMed®, a database of the U.S. National Library of Medicine

