Anaphoric relations in the clinical narrative: corpus creation.
MedLine Citation:
PMID:  21459927     Owner:  NLM     Status:  Publisher    
Abstract/OtherAbstract:
Objective: The long-term goal of this work is the automated discovery of anaphoric relations from the clinical narrative. The creation of a gold standard set from a cross-institutional corpus of clinical notes and high-level characteristics of that gold standard are described.
Methods: A standard methodology for annotation guideline development, gold standard annotations, and inter-annotator agreement (IAA) was used.
Results: The gold standard annotations resulted in 7214 markables, 5992 pairs, and 1304 chains. Each report averaged 40 anaphoric markables, 33 pairs, and seven chains. The overall IAA is high on the Mayo dataset (0.6607), and moderate on the University of Pittsburgh Medical Center (UPMC) dataset (0.4072). The IAA between each annotator and the gold standard is high (Mayo: 0.7669, 0.7697, and 0.9021; UPMC: 0.6753 and 0.7138). These results imply a quality corpus feasible for system development. They also suggest the complementary nature of the annotations performed by the experts and the importance of an annotator team with diverse knowledge backgrounds.
Limitations: Only one of the annotators had the linguistic background necessary for annotation of the linguistic attributes. The overall generalizability of the guidelines will be further strengthened by annotations of data from additional sites. This will increase the overall corpus size and the representation of each relation type.
Conclusion: The first step toward the development of an anaphoric relation resolver as part of a comprehensive natural language processing system geared specifically for the clinical narrative in the electronic medical record is described. The deidentified annotated corpus will be available to researchers.
Authors:
Guergana K Savova; Wendy W Chapman; Jiaping Zheng; Rebecca S Crowley
Publication Detail:
Type:  JOURNAL ARTICLE     Date:  2011-4-1
Journal Detail:
Title:  Journal of the American Medical Informatics Association : JAMIA     Volume:  -     ISSN:  1527-974X     ISO Abbreviation:  -     Publication Date:  2011 Apr 
Date Detail:
Created Date:  2011-4-4     Completed Date:  -     Revised Date:  -    
Medline Journal Info:
Nlm Unique ID:  9430800     Medline TA:  J Am Med Inform Assoc     Country:  -    
Other Details:
Languages:  ENG     Pagination:  -     Citation Subset:  -    
Affiliation:
Children's Hospital Boston Informatics Program and Harvard Medical School, Boston, Massachusetts, USA.
From MEDLINE®/PubMed®, a database of the U.S. National Library of Medicine

