Automatic classification of sentences to support Evidence Based Medicine.
MedLine Citation:
PMID: 21489224
Owner: NLM
Status: MEDLINE
Abstract/OtherAbstract:
AIM: Given a set of pre-defined medical categories used in Evidence Based Medicine, we aim to automatically annotate sentences in medical abstracts with these labels.
METHOD: We constructed a corpus of 1,000 medical abstracts annotated by hand with specified medical categories (e.g. Intervention, Outcome). We explored the use of various features based on lexical, semantic, structural, and sequential information in the data, using Conditional Random Fields (CRF) for classification.
RESULTS: For the classification tasks over all labels, our systems achieved micro-averaged F-scores of 80.9% and 66.9% over datasets of structured and unstructured abstracts respectively, using sequential features. In labeling only the key sentences, our systems produced F-scores of 89.3% and 74.0% over structured and unstructured abstracts respectively, using the same sequential features. The results over an external dataset were lower (F-scores of 63.1% for all labels, and 83.8% for key sentences).
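NOTE: For readers unfamiliar with the metric, the following is a minimal sketch of how a micro-averaged F-score is commonly computed: true positives, false positives, and false negatives are pooled over all labels before precision and recall are taken. It is not the authors' evaluation code. The function name micro_f_score, the toy data, and the assumption that each sentence carries a set of labels are illustrative only.

```python
def micro_f_score(gold, predicted):
    """Micro-averaged F-score over per-sentence label sets.

    gold, predicted: equal-length lists, one collection of labels per
    sentence. (Hypothetical input layout, not taken from the paper.)
    """
    tp = fp = fn = 0
    for gold_labels, pred_labels in zip(gold, predicted):
        g, p = set(gold_labels), set(pred_labels)
        tp += len(g & p)   # labels predicted and correct
        fp += len(p - g)   # labels predicted but not in gold
        fn += len(g - p)   # gold labels that were missed
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy example with the two category names mentioned in the abstract.
gold = [{"Intervention"}, {"Outcome"}, {"Outcome"}, {"Intervention", "Outcome"}]
pred = [{"Intervention"}, {"Outcome"}, {"Intervention"}, {"Outcome"}]
score = micro_f_score(gold, pred)
```

Because counts are pooled, frequent labels weigh more heavily in the score than rare ones, which is the usual reason for reporting the micro average on skewed label distributions.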
CONCLUSIONS: Of the features we used, the best for classifying any given sentence in an abstract were based on unigrams, section headings, and sequential information from preceding sentences. These features resulted in improved performance over a simple bag-of-words approach, and outperformed feature sets used in previous work.
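NOTE: The three feature types named above (unigrams, section headings, and information from preceding sentences) could be assembled per sentence roughly as sketched below. This is an illustration under stated assumptions, not the authors' implementation: the function name sentence_features, the feature-name strings, the regular-expression tokenizer, and the window parameter are all hypothetical. The output is a list of feature dictionaries, one per sentence, of the kind that sequence-labeling toolkits such as CRF implementations typically accept.

```python
import re


def sentence_features(sentences, headings=None, window=1):
    """Build one feature dict per sentence in an abstract.

    sentences: list of sentence strings, in abstract order.
    headings:  optional list of section headings (e.g. "METHOD"), one per
               sentence; None for unstructured abstracts.
    window:    how many preceding sentences to copy features from
               (an assumed knob, not a value from the paper).
    """
    headings = headings or [None] * len(sentences)

    # Pass 1: lexical (unigram) and structural (heading) features.
    base = []
    for sent, head in zip(sentences, headings):
        feats = {f"w={tok}": 1 for tok in re.findall(r"[a-z0-9]+", sent.lower())}
        if head:
            feats[f"heading={head.lower()}"] = 1
        base.append(feats)

    # Pass 2: sequential features copied from preceding sentences.
    out = []
    for i, feats in enumerate(base):
        combined = dict(feats)
        for k in range(1, window + 1):
            if i - k >= 0:
                for name in base[i - k]:
                    combined[f"prev{k}:{name}"] = 1
            else:
                combined[f"prev{k}:BOS"] = 1  # marks the start of the abstract
        out.append(combined)
    return out


feats = sentence_features(
    ["We enrolled 40 patients.", "Pain scores improved."],
    headings=["METHOD", "RESULTS"],
)
```

For unstructured abstracts, which have no section headings, the headings argument would simply be omitted, leaving only the unigram and preceding-sentence features.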
Authors:
Su Nam Kim; David Martinez; Lawrence Cavedon; Lars Yencken
Publication Detail:
Type: Journal Article; Research Support, N.I.H., Extramural; Research Support, Non-U.S. Gov't
Date: 2011-03-29
Journal Detail:
Title: BMC Bioinformatics
Volume: 12 Suppl 2
ISSN: 1471-2105
ISO Abbreviation: BMC Bioinformatics
Publication Date: 2011
Date Detail:
Created Date: 2011-04-14
Completed Date: 2011-05-23
Revised Date: 2011-07-28
Medline Journal Info:
NLM Unique ID: 100965194
Medline TA: BMC Bioinformatics
Country: England
Other Details:
Languages: eng
Pagination: S5
Citation Subset: IM
Affiliation:
NICTA VRL, The University of Melbourne, 3010, Australia.
MeSH Terms
Descriptor/Qualifier:
Abstracting and Indexing as Topic / methods*
Automatic Data Processing / methods
Evidence-Based Medicine*
Information Storage and Retrieval / methods*
Natural Language Processing
Semantics

From MEDLINE®/PubMed®, a database of the U.S. National Library of Medicine

