| Towards a framework for developing semantic relatedness reference standards. | |
| | |
MedLine Citation:
|
PMID: 21044697 Owner: NLM Status: MEDLINE |
Abstract/OtherAbstract:
|
Our objective is to develop a framework for creating reference standards for functional testing of computerized measures of semantic relatedness. Currently, research on computerized approaches to semantic relatedness between biomedical concepts relies on reference standards created for specific purposes using a variety of methods for their analysis. In most cases, these reference standards are not publicly available and the published information provided in manuscripts that evaluate computerized semantic relatedness measurement approaches is not sufficient to reproduce the results. Our proposed framework is based on the experiences of medical informatics and computational linguistics communities and addresses practical and theoretical issues with creating reference standards for semantic relatedness. We demonstrate the use of the framework on a pilot set of 101 medical term pairs rated for semantic relatedness by 13 medical coding experts. While the reliability of this particular reference standard is in the "moderate" range, we show that using clustering and factor analyses offers a data-driven approach to finding systematic differences among raters and identifying groups of potential outliers. We test two ontology-based measures of relatedness and provide both the reference standard containing individual ratings and the R program used to analyze the ratings as open-source. Currently, these resources are intended to be used to reproduce and compare results of studies involving computerized measures of semantic relatedness. Our framework may be extended to the development of reference standards in other research areas in medical informatics including automatic classification, information retrieval from medical records and vocabulary/ontology development. |
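The kind of analysis the abstract describes (checking how well raters agree, then comparing a computerized measure against the pooled ratings) can be illustrated with a minimal Python sketch. This is not the authors' R program, and it does not reproduce their reliability statistic, clustering, or factor analysis. It assumes Spearman rank correlation as a simple stand-in for both steps, and the ratings, `measure_scores`, and all function names are hypothetical toy values chosen for illustration.

```python
from itertools import combinations
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; raises ZeroDivisionError if either input is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def mean_pairwise_agreement(ratings_by_rater):
    """Mean Spearman correlation over all pairs of raters."""
    return mean(spearman(a, b) for a, b in combinations(ratings_by_rater, 2))


# Hypothetical toy data: 3 raters, each rating the same 5 term pairs.
ratings = [
    [1, 2, 3, 4, 5],
    [1, 3, 2, 4, 5],
    [2, 1, 3, 5, 4],
]

# Pool the raters into one score per term pair (simple mean, an assumption).
consensus = [mean(column) for column in zip(*ratings)]

# Hypothetical output of a computerized relatedness measure for the same pairs.
measure_scores = [0.1, 0.2, 0.4, 0.7, 0.9]

print("mean pairwise rater agreement:", mean_pairwise_agreement(ratings))
print("measure vs. consensus:", spearman(consensus, measure_scores))
```

A rater whose correlations with the other raters are consistently low would be a candidate outlier, which is the kind of systematic difference the abstract says clustering and factor analyses can surface in a data-driven way.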
| | |
Authors:
|
Serguei V S Pakhomov; Ted Pedersen; Bridget McInnes; Genevieve B Melton; Alexander Ruggieri; Christopher G Chute |
Publication Detail:
|
Type: Journal Article; Research Support, N.I.H., Extramural Date: 2010-10-31 |
Journal Detail:
|
Title: Journal of biomedical informatics Volume: 44 ISSN: 1532-0480 ISO Abbreviation: J Biomed Inform Publication Date: 2011 Apr |
Date Detail:
|
Created Date: 2011-03-23 Completed Date: 2011-07-22 Revised Date: 2012-09-24 |
Medline Journal Info:
|
Nlm Unique ID: 100970413 Medline TA: J Biomed Inform Country: United States |
Other Details:
|
Languages: eng Pagination: 251-65 Citation Subset: IM |
Copyright Information:
|
Copyright © 2010 Elsevier Inc. All rights reserved. |
Affiliation:
|
College of Pharmacy, University of Minnesota, Twin Cities, Minneapolis, MN 55455, USA. pakh0002@umn.edu |
| MeSH Terms | |
Descriptor/Qualifier:
|
Clinical Coding; Databases, Factual; Medical Informatics / methods*; Medical Records Systems, Computerized / standards*; Reference Standards; Semantics*; Software |
| Grant Support | |
ID/Acronym/Agency:
|
R01 LM009623-01A2/LM/NLM NIH HHS; T15 LM07041-19/LM/NLM NIH HHS |
From MEDLINE®/PubMed®, a database of the U.S. National Library of Medicine