Document Detail

Collaborative biocuration--text-mining development task for document prioritization for curation.
MedLine Citation:
PMID:  23180769     Owner:  NLM     Status:  MEDLINE    
The Critical Assessment of Information Extraction systems in Biology (BioCreAtIvE) challenge evaluation is a community-wide effort for evaluating text mining and information extraction systems for the biological domain. The 'BioCreative Workshop 2012' subcommittee identified three areas, or tracks, that comprised independent but complementary aspects of data curation in which they sought community input: literature triage (Track I); curation workflow (Track II) and text mining/natural language processing (NLP) systems (Track III). Track I participants were invited to develop tools or systems that would effectively triage and prioritize articles for curation and present results in a prototype web interface. Training and test datasets were derived from the Comparative Toxicogenomics Database (CTD) and consisted of manuscripts from which chemical-gene-disease data were manually curated. A total of seven groups participated in Track I. For the triage component, the effectiveness of participant systems was measured by aggregate gene, disease and chemical 'named-entity recognition' (NER) across articles; the effectiveness of 'information retrieval' (IR) was also measured based on 'mean average precision' (MAP). Top recall scores for gene, disease and chemical NER were 49, 65 and 82%, respectively; the top MAP score was 80%. Each participating group also developed a prototype web interface; these interfaces were evaluated based on functionality and ease-of-use by CTD's biocuration project manager. In this article, we present a detailed description of the challenge and a summary of the results.
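The abstract reports NER recall and a MAP score without stating how either was computed. As an illustration only, the sketch below uses the standard textbook definitions of recall and mean average precision; it is not the challenge's scoring code, and the function names, document identifiers and entity identifiers are hypothetical.

```python
from typing import List, Sequence, Set, Tuple


def recall(predicted: Set[str], gold: Set[str]) -> float:
    """Fraction of gold-standard (curated) entities that the system also reported."""
    if not gold:
        return 0.0
    return len(predicted & gold) / len(gold)


def average_precision(ranked: Sequence[str], relevant: Set[str]) -> float:
    """Average of precision@k over the ranks k where a relevant document appears.

    Assumes `ranked` contains no duplicate document identifiers.
    Relevant documents that are never retrieved contribute zero.
    """
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for k, doc_id in enumerate(ranked, start=1):
        if doc_id in relevant:
            hits += 1
            precision_sum += hits / k
    return precision_sum / len(relevant)


def mean_average_precision(runs: List[Tuple[Sequence[str], Set[str]]]) -> float:
    """Mean of average precision over several ranked lists (e.g. one per query)."""
    if not runs:
        return 0.0
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)


# Hypothetical data: two ranked article lists with their curated relevant sets.
runs = [
    (["a", "b", "c", "d"], {"a", "c"}),
    (["x", "y"], {"y"}),
]
map_score = mean_average_precision(runs)

# Hypothetical NER output compared against four curated gene identifiers.
gene_recall = recall({"g1", "g2"}, {"g1", "g2", "g3", "g4"})
```

Under these definitions, a system that ranks every curatable article ahead of every non-curatable one scores a MAP of 1.0, and a system that finds half of the curated entities scores a recall of 0.5.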
Thomas C Wiegers; Allan Peter Davis; Carolyn J Mattingly
Publication Detail:
Type:  Journal Article; Research Support, N.I.H., Extramural     Date:  2012-11-22
Journal Detail:
Title:  Database : the journal of biological databases and curation     Volume:  2012     ISSN:  1758-0463     ISO Abbreviation:  Database (Oxford)     Publication Date:  2012  
Date Detail:
Created Date:  2012-11-27     Completed Date:  2013-04-24     Revised Date:  2013-07-11    
Medline Journal Info:
Nlm Unique ID:  101517697     Medline TA:  Database (Oxford)     Country:  England    
Other Details:
Languages:  eng     Pagination:  bas037     Citation Subset:  IM    
Department of Biology, North Carolina State University, Raleigh, NC 27695-7617, USA.
MeSH Terms
Cooperative Behavior*
Data Mining / methods*
Databases, Genetic*
Documentation / methods*
User-Computer Interface
Grant Support

From MEDLINE®/PubMed®, a database of the U.S. National Library of Medicine
