Multiclass gene selection using Pareto-fronts.
MedLine Citation:
PMID:  23702546     Owner:  NLM     Status:  In-Data-Review    
Abstract/OtherAbstract:
Filter methods are often used to select genes for multiclass sample classification from microarray data. Such techniques tend to be biased toward a few classes that are easily distinguishable from the others, owing to imbalances in strong features and in sample sizes across classes. This bias can lead to the selection of redundant genes while relevant genes are missed, resulting in poor classification of tissue samples. In this manuscript, we propose to decompose multiclass ranking statistics into class-specific statistics and then use Pareto-front analysis to select genes. This alleviates the bias induced by the intrinsic characteristics of dominating classes. The use of Pareto-front analysis is demonstrated on two filter criteria commonly used for gene selection: the F-score and the KW-score. Experiments with both synthetic and real benchmark data sets showed a significant improvement in classification performance and a reduction in redundancy among top-ranked genes.
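The idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the decomposition formula, so the sketch assumes that the class-specific F-score of a gene is that class's term in the between-class sum of squares, divided by the pooled within-class variance. The function names (`class_specific_f_scores`, `pareto_front`, `pareto_rank`) are hypothetical, and the KW-score variant is not shown.

```python
import numpy as np


def class_specific_f_scores(X, y):
    """Split an F-like statistic into one score per (gene, class).

    X: (n_samples, n_genes) expression matrix; y: (n_samples,) class labels.
    Assumed decomposition (not taken from the paper):
    n_k * (class mean - overall mean)^2 / pooled within-class variance.
    Returns an array of shape (n_genes, n_classes).
    """
    classes = np.unique(y)
    n_samples, n_genes = X.shape
    overall_mean = X.mean(axis=0)
    between = np.empty((n_genes, classes.size))
    within = np.zeros(n_genes)
    for j, c in enumerate(classes):
        Xc = X[y == c]
        mean_c = Xc.mean(axis=0)
        between[:, j] = Xc.shape[0] * (mean_c - overall_mean) ** 2
        within += ((Xc - mean_c) ** 2).sum(axis=0)
    pooled_var = within / (n_samples - classes.size)
    # Small constant guards against division by zero for constant genes.
    return between / (pooled_var[:, None] + 1e-12)


def pareto_front(scores):
    """Return indices of non-dominated rows, where larger scores are better.

    Row i is dominated if some other row is >= on every class score
    and strictly > on at least one.
    """
    front = []
    for i in range(scores.shape[0]):
        at_least_as_good = (scores >= scores[i]).all(axis=1)
        strictly_better = (scores > scores[i]).any(axis=1)
        if not (at_least_as_good & strictly_better).any():
            front.append(i)
    return np.array(front, dtype=int)


def pareto_rank(scores):
    """Peel successive Pareto fronts; earlier fronts hold higher-ranked genes."""
    remaining = np.arange(scores.shape[0])
    fronts = []
    while remaining.size:
        local = pareto_front(scores[remaining])
        fronts.append(remaining[local])
        remaining = np.delete(remaining, local)
    return fronts
```

Under these assumptions, calling `pareto_rank(class_specific_f_scores(X, y))` returns a list of index arrays, one per front. A gene that discriminates only a hard-to-separate class can still reach the first front, because no single dominating class decides the ranking.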
Authors:
Jagath C Rajapakse; Piyushkumar A Mundra
Publication Detail:
Type:  Journal Article    
Journal Detail:
Title:  IEEE/ACM transactions on computational biology and bioinformatics / IEEE, ACM     Volume:  10     ISSN:  1557-9964     ISO Abbreviation:  IEEE/ACM Trans Comput Biol Bioinform     Publication Date:    2013 Jan-Feb
Date Detail:
Created Date:  2013-05-24     Completed Date:  -     Revised Date:  -    
Medline Journal Info:
Nlm Unique ID:  101196755     Medline TA:  IEEE/ACM Trans Comput Biol Bioinform     Country:  United States    
Other Details:
Languages:  eng     Pagination:  87-97     Citation Subset:  IM    
Affiliation:
Nanyang Technological University, Singapore, Singapore-MIT Alliance and Massachusetts Institute of Technology, Cambridge.
From MEDLINE®/PubMed®, a database of the U.S. National Library of Medicine

