ROC-based utility function maximization for feature selection and classification with applications to high-dimensional protease data.
MedLine Citation:
PMID:  18363775     Owner:  NLM     Status:  MEDLINE    
Abstract/OtherAbstract:
In medical diagnosis, the diseased and nondiseased classes are usually unbalanced, and one class may be more important than the other depending on the purpose of the diagnosis. Most standard classification methods, however, are designed to maximize overall accuracy and cannot explicitly assign different costs to different classes. In this article, we propose a novel nonparametric method that directly maximizes the weighted specificity and sensitivity of the receiver operating characteristic (ROC) curve. Combining advances in machine learning, optimization theory, and statistics, the proposed method has excellent generalization properties and assigns different error costs to different classes explicitly. We present experiments that compare the proposed algorithm with support vector machines and regularized logistic regression using data from a study on HIV-1 protease as well as six publicly available datasets. Our main conclusion is that the performance of the proposed algorithm is significantly better in most cases than that of the other classifiers tested. A software package in MATLAB is available upon request.
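As a rough illustration of the quantity the abstract refers to, the sketch below computes a weighted combination of sensitivity and specificity and picks the score threshold that maximizes it. This is not the authors' algorithm, which is a nonparametric optimization method with feature selection and is distributed in MATLAB. The utility form w * sensitivity + (1 - w) * specificity, the function names, the weight w, and the toy scores are all assumptions made for illustration.

```python
def weighted_utility(scores, labels, threshold, w=0.5):
    """Return w * sensitivity + (1 - w) * specificity at a threshold.

    A sample is predicted positive when its score is >= threshold.
    The weight w is a hypothetical parameter, not taken from the paper.
    """
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    # True positives: diseased samples scored at or above the threshold.
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    # True negatives: nondiseased samples scored below the threshold.
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < threshold)
    sensitivity = tp / n_pos
    specificity = tn / n_neg
    return w * sensitivity + (1 - w) * specificity


def best_threshold(scores, labels, w=0.5):
    """Scan candidate thresholds and return the one with the highest utility."""
    # Each observed score, plus one value above the maximum (predict all negative).
    candidates = sorted(set(scores)) + [max(scores) + 1.0]
    return max(candidates, key=lambda t: weighted_utility(scores, labels, t, w))


# Toy data (made up): classifier scores and true labels (1 = diseased).
scores = [0.1, 0.3, 0.35, 0.6, 0.8, 0.9]
labels = [0, 0, 1, 0, 1, 1]

# Weighting sensitivity more heavily favors a lower threshold,
# weighting specificity more heavily favors a higher one.
print(best_threshold(scores, labels, w=0.8))
print(best_threshold(scores, labels, w=0.2))
```

Changing w shifts the chosen operating point along the ROC curve, which is the sense in which one class can be treated as more important than the other.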
Authors:
Zhenqiu Liu; Ming Tan
Publication Detail:
Type:  Journal Article; Research Support, N.I.H., Extramural     Date:  2008-03-24
Journal Detail:
Title:  Biometrics     Volume:  64     ISSN:  1541-0420     ISO Abbreviation:  Biometrics     Publication Date:  2008 Dec 
Date Detail:
Created Date:  2008-11-26     Completed Date:  2008-12-31     Revised Date:  2013-05-20    
Medline Journal Info:
Nlm Unique ID:  0370625     Medline TA:  Biometrics     Country:  United States    
Other Details:
Languages:  eng     Pagination:  1155-61     Citation Subset:  IM    
Affiliation:
Division of Biostatistics, University of Maryland Greenebaum Cancer Center, Baltimore, Maryland 21201, USA. zliu@umm.edu
MeSH Terms
Descriptor/Qualifier:
Algorithms
Biometry / methods*
Databases, Factual
Diagnosis*
HIV Protease
Humans
ROC Curve*
Software
Statistics, Nonparametric
Grant Support
ID/Acronym/Agency:
1R03CA128102-01/CA/NCI NIH HHS
Chemical
Reg. No./Substance:
EC 3.4.23.-/HIV Protease

From MEDLINE®/PubMed®, a database of the U.S. National Library of Medicine

