Document Detail

Development of a medical-text parsing algorithm based on character adjacent probability distribution for Japanese radiology reports.
MedLine Citation:
PMID:  19057808     Owner:  NLM     Status:  MEDLINE    
OBJECTIVES: The objectives of this study were to investigate the transitional probability distribution of medical term boundaries between characters and to develop a parsing algorithm specifically for medical texts.
METHODS: Medical terms in Japanese computed tomography (CT) reports were identified using the ChaSen morphological analysis system. MeSH-based medical terms (51,385 entries), obtained from the Metathesaurus in the Unified Medical Language System (UMLS, 2005AA), were added as a medical dictionary for ChaSen. A radiographer corrected the set of results containing 300 parsed CT reports. In addition, two radiologists checked the medical term parsing of 200 CT sentences.
RESULTS: We obtained modified inter-annotator agreement scores for the text corrected by the radiologists. We retrieved the transitional probability as the conditional probability of a uni-gram, bi-gram, and tri-gram. The highest transitional probability P(C_i | C_{i-2} C_{i-1}) was 1.00. As an example of an anatomical location, the term "pulmonary hilum" was parsed as a tri-gram.
CONCLUSIONS: Retrieval of transitional probability will improve the accuracy of parsing compound medical terms.
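The transitional probability mentioned in the abstract, P(C_i | C_{i-2} C_{i-1}), can be illustrated with a minimal sketch that estimates character n-gram conditional probabilities from raw counts. This is not the authors' implementation: the function name `transition_probabilities` and the three toy strings are hypothetical, chosen only to show how a value of 1.00 can arise when a character always follows the same two-character context.

```python
from collections import Counter


def transition_probabilities(texts, n=3):
    """Estimate P(c_i | preceding n-1 characters) from character counts.

    Hypothetical helper for illustration; not taken from the paper.
    With n=1 the context is empty, so the result is the plain
    relative frequency of each character.
    """
    ngram_counts = Counter()
    context_counts = Counter()
    for text in texts:
        for i in range(len(text) - n + 1):
            ngram = text[i:i + n]
            ngram_counts[ngram] += 1
            # The context is the n-gram without its final character.
            context_counts[ngram[:-1]] += 1
    return {
        ngram: count / context_counts[ngram[:-1]]
        for ngram, count in ngram_counts.items()
    }


# Toy corpus (assumed example strings, not data from the study).
toy_reports = ["肺門部リンパ節", "肺門部腫瘤", "右肺門部"]
probs = transition_probabilities(toy_reports, n=3)

print(probs["肺門部"])  # P(部 | 肺門)
print(probs["門部リ"])  # P(リ | 門部)
```

In this toy corpus every occurrence of the context 肺門 is followed by 部, so the estimated probability is 1.0, whereas 門部 is followed by two different characters and the estimate drops to 0.5. A high transitional probability of this kind suggests the characters belong to one term, while a low one suggests a possible term boundary.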
N Nishimoto; S Terae; M Uesugi; K Ogasawara; T Sakurai
Publication Detail:
Type:  Journal Article    
Journal Detail:
Title:  Methods of information in medicine     Volume:  47     ISSN:  0026-1270     ISO Abbreviation:  Methods Inf Med     Publication Date:  2008  
Date Detail:
Created Date:  2008-12-05     Completed Date:  2009-02-06     Revised Date:  -    
Medline Journal Info:
Nlm Unique ID:  0210453     Medline TA:  Methods Inf Med     Country:  Germany    
Other Details:
Languages:  eng     Pagination:  513-21     Citation Subset:  IM    
Department of Medical Informatics, Graduate School of Medicine, Hokkaido University, Sapporo, Hokkaido, Japan.
MeSH Terms
Access to Information
Markov Chains
Medical Informatics / methods,  organization & administration*
Models, Statistical
Models, Theoretical
Natural Language Processing*
Radiology / methods*,  organization & administration
Terminology as Topic*
Tomography, X-Ray Computed*
Unified Medical Language System*

From MEDLINE®/PubMed®, a database of the U.S. National Library of Medicine
