Document Detail

Comparing general and medical texts for information retrieval based on natural language processing: an inquiry into lexical disambiguation.
MedLine Citation:
PMID:  11604745     Owner:  NLM     Status:  MEDLINE    
In this paper we compare two types of corpora, focusing on the lexical ambiguity of each. The first corpus consists mainly of general newspaper articles and literature excerpts, while the second belongs to the medical domain. To conduct the study, we used two different disambiguation tools. First, each tool was validated in its respective application area. We then used these systems to assess and compare both the general ambiguity rate and the particularities of each domain. Quantitative results show that medical documents are lexically less ambiguous than unrestricted documents. Our conclusions emphasize the importance of the application area in the design of NLP tools.
P Ruch; R Baud; A Geissbühler; A M Rassinoux
Publication Detail:
Type:  Comparative Study; Journal Article; Research Support, Non-U.S. Gov't    
Journal Detail:
Title:  Studies in health technology and informatics     Volume:  84     ISSN:  0926-9630     ISO Abbreviation:  Stud Health Technol Inform     Publication Date:  2001  
Date Detail:
Created Date:  2001-10-17     Completed Date:  2002-01-08     Revised Date:  2008-07-10    
Medline Journal Info:
Nlm Unique ID:  9214582     Medline TA:  Stud Health Technol Inform     Country:  Netherlands    
Other Details:
Languages:  eng     Pagination:  261-5     Citation Subset:  IM    
Medical Informatics Division, University Hospital of Geneva, 1211 Geneva, Switzerland.
MeSH Terms
Information Storage and Retrieval*
Medical Records Systems, Computerized*
Natural Language Processing*
Vocabulary, Controlled

From MEDLINE®/PubMed®, a database of the U.S. National Library of Medicine
