Knowledge discovery via machine learning for neurodegenerative disease researchers.
MedLine Citation:
PMID: 19623491; Owner: NLM; Status: MEDLINE
Abstract/OtherAbstract:
The ever-increasing size of the biomedical literature makes more precise information retrieval, and the tapping of implicit knowledge in the scientific literature, a necessity. In this chapter, three new variants of the expectation-maximization (EM) method for semisupervised document classification (Machine Learning 39:103-134, 2000) are first introduced to refine biomedical literature meta-searches. The retrieval performance of a multi-mixture per class EM variant with Agglomerative Information Bottleneck clustering (Slonim and Tishby (1999) Agglomerative information bottleneck. In Proceedings of NIPS-12) using the Davies-Bouldin cluster validity index (IEEE Transactions on Pattern Analysis and Machine Intelligence 1:224-227, 1979) rivaled that of state-of-the-art transductive support vector machines (TSVM) (Joachims (1999) Transductive inference for text classification using support vector machines. In Proceedings of the International Conference on Machine Learning (ICML)). Moreover, the multi-mixture per class EM variant refined search results faster, with more than an order of magnitude improvement in execution time over TSVM. A second tool, CRFNER, uses conditional random fields (Lafferty et al. (2001) Conditional random fields: Probabilistic models for segmenting and labeling sequence data. In Proceedings of ICML-2001) to recognize 15 types of named entities in schizophrenia abstracts. It outperforms ABNER (Settles (2004) Biomedical named entity recognition using conditional random fields and rich feature sets. In Proceedings of COLING 2004 International Joint Workshop on Natural Language Processing in Biomedicine and its Applications (NLPBA)) in biological named entity recognition and reaches an F1 score of 82.5% on the second set of named entities.
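The semisupervised EM idea the abstract builds on can be illustrated with a minimal sketch. It shows only the basic scheme from the cited Machine Learning (2000) paper, with one multinomial naive Bayes component per class: fit on the labeled documents, then alternate between soft-labeling the unlabeled documents (E-step) and refitting on everything (M-step). It is not the chapter's multi-mixture variant, and all function names, parameters, and the toy documents below are assumptions made for illustration.

```python
import math
from collections import Counter

def train_nb(docs, weights, n_classes, vocab, alpha=1.0):
    """M-step: fit multinomial naive Bayes from soft class weights.

    docs is a list of token lists; weights holds one probability per class
    for each document. alpha is a Laplace smoothing constant (assumed).
    """
    class_mass = [alpha] * n_classes
    word_mass = [{w: alpha for w in vocab} for _ in range(n_classes)]
    for tokens, w in zip(docs, weights):
        counts = Counter(tokens)
        for c in range(n_classes):
            class_mass[c] += w[c]
            for tok, n in counts.items():
                word_mass[c][tok] += w[c] * n
    total = sum(class_mass)
    log_prior = [math.log(m / total) for m in class_mass]
    log_lik = []
    for c in range(n_classes):
        z = sum(word_mass[c].values())
        log_lik.append({w: math.log(v / z) for w, v in word_mass[c].items()})
    return log_prior, log_lik

def posterior(tokens, log_prior, log_lik):
    """E-step for one document: class posterior under the current model."""
    scores = [lp + sum(ll[t] for t in tokens if t in ll)
              for lp, ll in zip(log_prior, log_lik)]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

def semisupervised_em(labeled, labels, unlabeled, n_classes, n_iter=10):
    """Alternate E- and M-steps; labeled documents keep their hard labels."""
    vocab = {t for d in labeled + unlabeled for t in d}
    hard = [[1.0 if c == y else 0.0 for c in range(n_classes)] for y in labels]
    log_prior, log_lik = train_nb(labeled, hard, n_classes, vocab)
    for _ in range(n_iter):
        soft = [posterior(d, log_prior, log_lik) for d in unlabeled]
        log_prior, log_lik = train_nb(labeled + unlabeled, hard + soft,
                                      n_classes, vocab)
    return log_prior, log_lik

# Toy data, invented for illustration: class 0 = relevant, class 1 = not.
labeled = [["dopamine", "receptor", "schizophrenia"],
           ["stock", "market", "price"]]
labels = [0, 1]
unlabeled = [["schizophrenia", "antipsychotic", "dopamine"],
             ["antipsychotic", "receptor"],
             ["market", "trading", "price"],
             ["trading", "stock"]]

log_prior, log_lik = semisupervised_em(labeled, labels, unlabeled, n_classes=2)
print(posterior(["antipsychotic"], log_prior, log_lik))
```

In this toy run the word "antipsychotic" never appears in a labeled document, yet the model learns to associate it with class 0 because it co-occurs with labeled class 0 vocabulary in the unlabeled documents. That is the mechanism by which unlabeled search results can help refine a classifier trained on only a few labeled ones.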
Authors:
I Burak Ozyurt; Gregory G Brown
Publication Detail:
Type: Journal Article; Research Support, N.I.H., Extramural
Journal Detail:
Title: Methods in molecular biology (Clifton, N.J.); Volume: 569; ISSN: 1064-3745; ISO Abbreviation: Methods Mol. Biol.; Publication Date: 2009
Date Detail:
Created Date: 2009-07-22; Completed Date: 2009-11-09; Revised Date: -
Medline Journal Info:
Nlm Unique ID: 9214969; Medline TA: Methods Mol Biol; Country: United States
Other Details:
Languages: eng; Pagination: 173-96; Citation Subset: IM
Affiliation:
Department of Psychiatry, University of California - San Diego, La Jolla, CA, USA.
MeSH Terms
Descriptor/Qualifier:
Algorithms; Artificial Intelligence*; Cluster Analysis; Computational Biology*; Databases, Factual; Humans; Information Storage and Retrieval; Knowledge Bases; Natural Language Processing; Neurodegenerative Diseases*
Grant Support
ID/Acronym/Agency:
1 U24 RR021992/RR/NCRR NIH HHS
From MEDLINE®/PubMed®, a database of the U.S. National Library of Medicine