Multi-class HingeBoost. Method and Application to the Classification of Cancer Types Using Gene Expression Data.
MedLine Citation:
PMID:  22378240     Owner:  NLM     Status:  Publisher    
Abstract/OtherAbstract:
Background: Multi-class molecular cancer classification has great potential clinical implications. Such applications require statistical methods that accurately classify cancer types using a small subset of the thousands of genes in the data.
Objectives: This paper presents a new functional gradient descent boosting algorithm that directly extends the HingeBoost algorithm from the binary case to the multi-class case without reducing the original problem to multiple binary problems.
Methods: The proposed HingeBoost minimizes a multi-class hinge loss with a boosting technique. It has good theoretical properties: it implements the Bayes decision rule and provides a unifying framework for both equal and unequal misclassification costs. Furthermore, we propose Twin HingeBoost, which has better feature selection behavior than HingeBoost because it reduces the number of ineffective covariates. Simulated data, benchmark data and two cancer gene expression data sets are used to evaluate the performance of the proposed approach.
Results: Simulations and the benchmark data showed that the multi-class HingeBoost generated accurate predictions when compared with the alternative methods, especially with high-dimensional covariates. The multi-class HingeBoost also produced more accurate or comparable predictions in two cancer classification problems using gene expression data.
Conclusions: This work has shown that HingeBoost provides a powerful tool for multi-class classification problems. In many applications, the classification accuracy and feature selection behavior can be further improved by using Twin HingeBoost.
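Illustrative note: the abstract does not give the multi-class loss or the base learner, so the sketch below shows only the general idea it builds on, functional gradient descent on the binary hinge loss max(0, 1 - y*f). It is not the paper's algorithm. The function name `hinge_boost_binary`, the componentwise least-squares base learner, the absence of an intercept, and the default step size are all assumptions made for illustration.

```python
import numpy as np


def hinge_boost_binary(X, y, n_steps=100, step_size=0.1):
    """Hypothetical sketch: boosting on the binary hinge loss max(0, 1 - y*f).

    X: (n, p) array of covariates (assumed roughly centered, no intercept).
    y: (n,) array of labels in {-1, +1}.
    Returns a length-p coefficient vector; covariates never selected stay at 0.
    """
    n, p = X.shape
    coef = np.zeros(p)
    f = np.zeros(n)                      # current fitted scores
    col_ss = (X ** 2).sum(axis=0)        # per-covariate sum of squares
    col_ss[col_ss == 0] = 1.0            # guard against all-zero columns

    for _ in range(n_steps):
        # Negative subgradient of the hinge loss with respect to f:
        # y where the margin y*f is below 1, and 0 elsewhere.
        u = np.where(y * f < 1, y, 0.0)
        if not u.any():
            break                        # every margin is at least 1

        # Componentwise least squares (an assumed base learner): regress u
        # on each covariate separately and keep the single best-fitting one.
        betas = X.T @ u / col_ss
        j = int(np.argmax(betas ** 2 * col_ss))

        # Take a small step along the selected covariate.
        coef[j] += step_size * betas[j]
        f += step_size * betas[j] * X[:, j]

    return coef


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.standard_normal((200, 20))
    y = np.where(X[:, 0] - X[:, 1] > 0, 1.0, -1.0)  # only 2 informative covariates
    coef = hinge_boost_binary(X, y)
    print("selected covariates:", np.flatnonzero(coef))
```

Because each step updates a single covariate, covariates that are never selected keep a zero coefficient, which is one way a boosting procedure can perform the kind of feature selection the abstract refers to.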
Authors:
Z Wang
Publication Detail:
Type:  JOURNAL ARTICLE     Date:  2012-03-01
Journal Detail:
Title:  Methods of information in medicine     Volume:  51     ISSN:  0026-1270     ISO Abbreviation:  -     Publication Date:  2012 Mar 
Date Detail:
Created Date:  2012-3-1     Completed Date:  -     Revised Date:  -    
Medline Journal Info:
Nlm Unique ID:  0210453     Medline TA:  Methods Inf Med     Country:  -    
Other Details:
Languages:  ENG     Pagination:  170-175     Citation Subset:  -    
Affiliation:
Zhu Wang, Department of Research, Connecticut Children's Medical Center, Department of Pediatrics, University of Connecticut School of Medicine, 282 Washington Street, Hartford, CT 06106, USA, E-mail: zwang@ccmckids.org.
From MEDLINE®/PubMed®, a database of the U.S. National Library of Medicine