Optimal aggregation of binary classifiers for multiclass cancer diagnosis using gene expression profiles.
MedLine Citation:
PMID:  19407356     Owner:  NLM     Status:  MEDLINE    
Abstract/OtherAbstract:
Multiclass classification is one of the fundamental tasks in bioinformatics and typically arises in cancer diagnosis studies by gene expression profiling. There have been many studies of aggregating binary classifiers to construct a multiclass classifier based on one-versus-the-rest (1R), one-versus-one (11), or other coding strategies, as well as some comparison studies between them. However, the studies found that the best coding depends on each situation. Therefore, a new problem, which we call the "optimal coding problem," has arisen: how can we determine which coding is the optimal one in each situation? To approach this optimal coding problem, we propose a novel framework for constructing a multiclass classifier, in which each binary classifier to be aggregated has a weight value to be optimally tuned based on the observed data. Although there is no a priori answer to the optimal coding problem, our weight tuning method can be a consistent answer to the problem. We apply this method to various classification problems including a synthesized data set and some cancer diagnosis data sets from gene expression profiling. The results demonstrate that, in most situations, our method can improve classification accuracy over simple voting heuristics and is better than or comparable to state-of-the-art multiclass predictors.
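Illustrative note (not part of the MEDLINE record): the weighted aggregation idea described in the abstract can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the authors' method: the function names `build_code_matrix` and `weighted_decode`, the +1/-1/0 coding convention, and the example weights and classifier outputs are all hypothetical, and the paper's actual weight-tuning procedure is not reproduced here. Weights are simply supplied by hand to show how they can change the aggregated decision relative to uniform voting.

```python
from itertools import combinations


def build_code_matrix(n_classes):
    """Build a code matrix whose columns are binary problems.

    Columns are the one-versus-the-rest (1R) codes followed by the
    one-versus-one (11) codes. Entries: +1 = positive class,
    -1 = negative class, 0 = class not used by that binary classifier.
    (Assumed convention for this sketch.)
    """
    columns = []
    # 1R: class k against all other classes.
    for k in range(n_classes):
        columns.append([1 if c == k else -1 for c in range(n_classes)])
    # 11: class a against class b, all other classes ignored.
    for a, b in combinations(range(n_classes), 2):
        columns.append(
            [1 if c == a else -1 if c == b else 0 for c in range(n_classes)]
        )
    # Transpose so that each row corresponds to one class.
    return [list(row) for row in zip(*columns)]


def weighted_decode(code_matrix, outputs, weights):
    """Return (predicted class index, per-class scores).

    outputs[j] is the signed output of binary classifier j, and
    weights[j] is its (hypothetical) weight. Uniform weights reduce
    this to plain voting over all binary classifiers.
    """
    scores = [
        sum(w * m * f for w, m, f in zip(weights, row, outputs))
        for row in code_matrix
    ]
    predicted = max(range(len(scores)), key=scores.__getitem__)
    return predicted, scores


if __name__ == "__main__":
    matrix = build_code_matrix(3)  # 3 1R columns + 3 11 columns
    # Hypothetical binary classifier outputs for one sample.
    outputs = [0.2, 0.3, -0.9, -0.8, 0.9, 0.1]
    uniform = [1.0] * 6
    # Hypothetical weights that switch off the (0 vs 1) pairwise classifier.
    tuned = [1.0, 1.0, 1.0, 0.0, 1.0, 1.0]
    print("uniform voting :", weighted_decode(matrix, outputs, uniform)[0])
    print("weighted voting:", weighted_decode(matrix, outputs, tuned)[0])
```

In this toy case the two weightings pick different classes, which is the kind of sensitivity to coding choice that motivates tuning the weights from data rather than fixing them in advance.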
Authors:
Naoto Yukinawa; Shigeyuki Oba; Kikuya Kato; Shin Ishii
Publication Detail:
Type:  Journal Article; Research Support, Non-U.S. Gov't    
Journal Detail:
Title:  IEEE/ACM transactions on computational biology and bioinformatics / IEEE, ACM     Volume:  6     ISSN:  1557-9964     ISO Abbreviation:  IEEE/ACM Trans Comput Biol Bioinform     Publication Date:    2009 Apr-Jun
Date Detail:
Created Date:  2009-05-01     Completed Date:  2009-08-28     Revised Date:  -    
Medline Journal Info:
Nlm Unique ID:  101196755     Medline TA:  IEEE/ACM Trans Comput Biol Bioinform     Country:  United States    
Other Details:
Languages:  eng     Pagination:  333-43     Citation Subset:  IM    
Affiliation:
Graduate School of Information Sciences, Nara Institute of Science and Technology, Ikoma, Nara, Japan. naoto-yu@is.naist.jp
MeSH Terms
Descriptor/Qualifier:
Algorithms*
Artificial Intelligence
Bayes Theorem
Computer Simulation
Esophageal Neoplasms / classification,  diagnosis,  genetics
Gene Expression Profiling*
Humans
Leukemia / classification,  diagnosis,  genetics
Models, Statistical*
Neoplasms / classification,  diagnosis*,  genetics
Reproducibility of Results
Thyroid Neoplasms / classification,  diagnosis,  genetics

From MEDLINE®/PubMed®, a database of the U.S. National Library of Medicine

