Comparing syntactic complexity in medical and non-medical corpora.
MedLine Citation:
PMID:  11825160     Owner:  NLM     Status:  MEDLINE    
Abstract/OtherAbstract:
With the growing use of Natural Language Processing (NLP) techniques in Medical Informatics, the need to quickly and efficiently create the knowledge structures used by these systems has grown as well. Automatic discovery of a lexicon for use by an NLP system through machine learning will require information about the syntax of medical language. Understanding the syntactic differences between medical and non-medical corpora may allow more efficient acquisition of a lexicon. Three experiments designed to quantify the syntactic differences between medical and non-medical corpora were conducted. The results indicate that the syntax of medical language varies less than that of non-medical language and is likely simpler. The differences were great enough to call into question the applicability of general language tools to medical language. These differences may reduce the difficulty of some free-text machine learning problems by capitalizing on the simpler nature of narrative medical syntax.
Authors:
D A Campbell; S B Johnson
Publication Detail:
Type:  Journal Article; Research Support, U.S. Gov't, P.H.S.    
Journal Detail:
Title:  Proceedings / AMIA ... Annual Symposium. AMIA Symposium     Volume:  -     ISSN:  1531-605X     ISO Abbreviation:  Proc AMIA Symp     Publication Date:  2001  
Date Detail:
Created Date:  2002-02-04     Completed Date:  2002-05-24     Revised Date:  2009-11-18    
Medline Journal Info:
Nlm Unique ID:  100883449     Medline TA:  Proc AMIA Symp     Country:  United States    
Other Details:
Languages:  eng     Pagination:  90-4     Citation Subset:  IM    
Affiliation:
Department of Medical Informatics, Columbia University, USA.
MeSH Terms
Descriptor/Qualifier:
Artificial Intelligence
Linguistics*
Literature*
Natural Language Processing*
Grant Support
ID/Acronym/Agency:
LM07079/LM/NLM NIH HHS
From MEDLINE®/PubMed®, a database of the U.S. National Library of Medicine

