GENIA corpus--semantically annotated corpus for bio-textmining.
MedLine Citation:
PMID:  12855455     Owner:  NLM     Status:  MEDLINE    
Abstract/OtherAbstract:
MOTIVATION: Natural language processing (NLP) methods are regarded as being useful to raise the potential of text mining from biological literature. The lack of an extensively annotated corpus of this literature, however, causes a major bottleneck for applying NLP techniques. GENIA corpus is being developed to provide reference materials to let NLP techniques work for bio-textmining.
RESULTS: GENIA corpus version 3.0 consisting of 2000 MEDLINE abstracts has been released with more than 400,000 words and almost 100,000 annotations for biological terms.
Authors:
J-D Kim; T Ohta; Y Tateisi; J Tsujii
Publication Detail:
Type:  Journal Article; Research Support, Non-U.S. Gov't    
Journal Detail:
Title:  Bioinformatics (Oxford, England)     Volume:  19 Suppl 1     ISSN:  1367-4803     ISO Abbreviation:  Bioinformatics     Publication Date:  2003  
Date Detail:
Created Date:  2003-07-11     Completed Date:  2004-10-14     Revised Date:  2007-11-15    
Medline Journal Info:
Nlm Unique ID:  9808944     Medline TA:  Bioinformatics     Country:  England    
Other Details:
Languages:  eng     Pagination:  i180-2     Citation Subset:  IM    
Affiliation:
CREST, Japan Science and Technology Corporation, Hongo, Bunkyo-ku, Tokyo, 113-0033, Japan.
MeSH Terms
Descriptor/Qualifier:
Abstracting and Indexing as Topic / methods*
Biology / methods*
Computational Biology / methods
Database Management Systems
Databases, Bibliographic*
Documentation
Information Storage and Retrieval / methods*
MEDLINE
Natural Language Processing*
Periodicals as Topic*
Terminology as Topic*

From MEDLINE®/PubMed®, a database of the U.S. National Library of Medicine

