ROC-based utility function maximization for feature selection and classification with applications to high-dimensional protease data. | |
MedLine Citation:
|
PMID: 18363775 Owner: NLM Status: MEDLINE |
Abstract/OtherAbstract:
|
In medical diagnosis, the diseased and nondiseased classes are usually unbalanced, and one class may be more important than the other depending on the purpose of the diagnosis. Most standard classification methods, however, are designed to maximize overall accuracy and cannot explicitly assign different costs to different classes. In this article, we propose a novel nonparametric method to directly maximize the weighted specificity and sensitivity of the receiver operating characteristic curve. Combining advances in machine learning, optimization theory, and statistics, the proposed method has excellent generalization properties and explicitly assigns different error costs to different classes. We present experiments that compare the proposed algorithms with support vector machines and regularized logistic regression using data from a study on HIV-1 protease as well as six publicly available datasets. Our main conclusion is that the performance of the proposed algorithm is significantly better in most cases than that of the other classifiers tested. A software package in MATLAB is available upon request. |
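To make the idea of "weighted specificity and sensitivity" concrete, the following is a minimal Python sketch, not the authors' method or their MATLAB package. It assumes a classifier has already produced a numeric score per sample and simply scans candidate thresholds for the one that maximizes w * sensitivity + (1 - w) * specificity. The function names `weighted_utility` and `best_threshold` and the weight parameter `w` are hypothetical, introduced here for illustration only.

```python
def weighted_utility(y_true, scores, threshold, w=0.5):
    """Return w * sensitivity + (1 - w) * specificity at a threshold.

    Illustrative assumption: label 1 is the diseased class, label 0 the
    nondiseased class, and a sample is predicted diseased when its
    score is >= threshold.
    """
    tp = fn = tn = fp = 0
    for y, s in zip(y_true, scores):
        predicted_positive = s >= threshold
        if y == 1:
            if predicted_positive:
                tp += 1
            else:
                fn += 1
        else:
            if predicted_positive:
                fp += 1
            else:
                tn += 1
    # Guard against a class being absent from the data.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return w * sensitivity + (1 - w) * specificity


def best_threshold(y_true, scores, w=0.5):
    """Scan observed scores (plus one value above the maximum, which
    predicts every sample as nondiseased) and return the threshold
    with the highest weighted utility."""
    candidates = sorted(set(scores))
    candidates.append(max(scores) + 1.0)
    return max(candidates, key=lambda t: weighted_utility(y_true, scores, t, w))
```

Lowering `w` places more weight on specificity, so the selected threshold tends to move upward; raising it favors sensitivity. The method in the article optimizes the classifier itself against such a utility, which this threshold scan over fixed scores does not attempt.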
Authors:
|
Zhenqiu Liu; Ming Tan |
Publication Detail:
|
Type: Journal Article; Research Support, N.I.H., Extramural Date: 2008-03-24 |
Journal Detail:
|
Title: Biometrics Volume: 64 ISSN: 1541-0420 ISO Abbreviation: Biometrics Publication Date: 2008 Dec |
Date Detail:
|
Created Date: 2008-11-26 Completed Date: 2008-12-31 Revised Date: 2013-05-20 |
Medline Journal Info:
|
Nlm Unique ID: 0370625 Medline TA: Biometrics Country: United States |
Other Details:
|
Languages: eng Pagination: 1155-61 Citation Subset: IM |
Affiliation:
|
Division of Biostatistics, University of Maryland Greenebaum Cancer Center, Baltimore, Maryland 21201, USA. zliu@umm.edu |
MeSH Terms | |
Descriptor/Qualifier:
|
Algorithms Biometry / methods* Databases, Factual Diagnosis* HIV Protease Humans ROC Curve* Software Statistics, Nonparametric |
Grant Support | |
ID/Acronym/Agency:
|
1R03CA128102-01/CA/NCI NIH HHS |
Chemical | |
Reg. No./Substance:
|
EC 3.4.23.-/HIV Protease |
From MEDLINE®/PubMed®, a database of the U.S. National Library of Medicine