| Automatic classification of sentences to support Evidence Based Medicine. | |
| | |
MedLine Citation:
|
PMID: 21489224 Owner: NLM Status: MEDLINE |
Abstract/OtherAbstract:
|
AIM: Given a set of pre-defined medical categories used in Evidence Based Medicine, we aim to automatically annotate sentences in medical abstracts with these labels. METHOD: We constructed a corpus of 1,000 medical abstracts annotated by hand with specified medical categories (e.g. Intervention, Outcome). We explored the use of various features based on lexical, semantic, structural, and sequential information in the data, using Conditional Random Fields (CRF) for classification. RESULTS: For the classification tasks over all labels, our systems achieved micro-averaged f-scores of 80.9% and 66.9% over datasets of structured and unstructured abstracts respectively, using sequential features. In labeling only the key sentences, our systems produced f-scores of 89.3% and 74.0% over structured and unstructured abstracts respectively, using the same sequential features. The results over an external dataset were lower (f-scores of 63.1% for all labels, and 83.8% for key sentences). CONCLUSIONS: Of the features we used, the best for classifying any given sentence in an abstract were based on unigrams, section headings, and sequential information from preceding sentences. These features resulted in improved performance over a simple bag-of-words approach, and outperformed feature sets used in previous work. |
| | |
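The abstract names two ideas that are easier to follow with a concrete instance: per-sentence features built from unigrams, section headings, and the preceding sentence, and a micro-averaged F-score pooled over all labels. The following is a minimal sketch under stated assumptions, not the authors' implementation. The function names, the feature-key format, and the representation of labels as one set per sentence are all hypothetical, and the CRF itself is omitted.

```python
def sentence_features(sentences, headings, i):
    """Build a sparse feature dict for sentence i of one abstract.

    Hypothetical feature layout (an assumption, not taken from the paper):
    unigrams of the sentence, its section heading (None for unstructured
    abstracts), and unigrams of the preceding sentence as sequential context.
    """
    feats = {}
    for tok in sentences[i].lower().split():
        feats["w=" + tok] = 1
    feats["heading=" + (headings[i] or "NONE")] = 1
    if i > 0:
        for tok in sentences[i - 1].lower().split():
            feats["prev_w=" + tok] = 1
    else:
        feats["first_sentence"] = 1
    return feats


def micro_f1(gold, pred):
    """Micro-averaged F-score: pool true/false positives over all labels.

    gold and pred are lists of label sets, one set per sentence
    (assumes a sentence may carry more than one label).
    """
    tp = fp = fn = 0
    for g, p in zip(gold, pred):
        tp += len(g & p)   # labels predicted and correct
        fp += len(p - g)   # labels predicted but not in gold
        fn += len(g - p)   # gold labels that were missed
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

Because the counts are pooled before precision and recall are computed, frequent labels weigh more heavily in the micro-averaged score than rare ones.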
Authors:
|
Su Nam Kim; David Martinez; Lawrence Cavedon; Lars Yencken |
Publication Detail:
|
Type: Journal Article; Research Support, N.I.H., Extramural; Research Support, Non-U.S. Gov't Date: 2011-03-29 |
Journal Detail:
|
Title: BMC bioinformatics Volume: 12 Suppl 2 ISSN: 1471-2105 ISO Abbreviation: BMC Bioinformatics Publication Date: 2011 |
Date Detail:
|
Created Date: 2011-04-14 Completed Date: 2011-05-23 Revised Date: 2011-07-28 |
Medline Journal Info:
|
Nlm Unique ID: 100965194 Medline TA: BMC Bioinformatics Country: England |
Other Details:
|
Languages: eng Pagination: S5 Citation Subset: IM |
Affiliation:
|
NICTA VRL, The University of Melbourne, 3010, Australia. |
| MeSH Terms | |
Descriptor/Qualifier:
|
Abstracting and Indexing as Topic / methods*; Automatic Data Processing / methods; Evidence-Based Medicine*; Information Storage and Retrieval / methods*; Natural Language Processing; Semantics |
From MEDLINE®/PubMed®, a database of the U.S. National Library of Medicine