
Network-based regularization for matched case-control analysis of high-dimensional DNA methylation data.
MedLine Citation:
PMID:  23212810     Owner:  NLM     Status:  MEDLINE    
Matched case-control designs are commonly used to control for potential confounding factors in genetic epidemiology studies, especially epigenetic studies of DNA methylation. Compared with unmatched case-control studies with high-dimensional genomic or epigenetic data, few variable selection methods have been developed for matched sets. In an earlier paper, we proposed a penalized logistic regression model with a network-based penalty for the analysis of unmatched DNA methylation data. However, for the matched designs widely used in epigenetic studies, which compare DNA methylation between tumor and adjacent non-tumor tissues or between pre-treatment and post-treatment conditions, applying ordinary logistic regression that ignores the matching is known to introduce serious bias in estimation. In this paper, we developed a penalized conditional logistic model for the analysis of matched DNA methylation data, using a network-based penalty that encourages a grouping effect of (1) linked Cytosine-phosphate-Guanine (CpG) sites within a gene or (2) linked genes within a genetic pathway. In our simulation studies, we demonstrated the superiority of the conditional logistic model over the unconditional logistic model in high-dimensional variable selection problems for matched case-control data. We further investigated the benefits of utilizing biological group or graph information for matched case-control data. We applied the proposed method to a genome-wide DNA methylation study of hepatocellular carcinoma (HCC), in which we investigated the DNA methylation levels of tumor and adjacent non-tumor tissues from HCC patients using the Illumina Infinium HumanMethylation27 BeadChip. The method identified several new CpG sites and genes that are known to be related to HCC but were missed by the standard method in the original paper.
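
The abstract does not give the form of the penalty or the fitting algorithm, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' implementation. It assumes 1:1 matched pairs, for which the conditional logistic likelihood reduces to an intercept-free logistic loss on case-minus-control differences. It also assumes, purely for illustration, a penalty made of an L1 term plus a graph-Laplacian quadratic term, fitted by proximal gradient descent. The function names, the parameters `lam` and `alpha`, and the use of an unnormalized Laplacian are all hypothetical choices.

```python
import numpy as np


def soft_threshold(z, t):
    """Proximal operator of t * ||.||_1 (elementwise shrinkage toward zero)."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)


def fit_paired_conditional_logistic(x_case, x_control, laplacian,
                                    lam=0.1, alpha=0.5, n_iter=500):
    """Sketch of a network-penalized conditional logistic fit for 1:1 pairs.

    Minimizes (all of this is an illustrative assumption):
        mean_i log(1 + exp(-d_i @ beta))
        + lam * alpha * ||beta||_1
        + lam * (1 - alpha) / 2 * beta @ laplacian @ beta
    where d_i = x_case[i] - x_control[i] and `laplacian` encodes links
    between features (e.g. CpG sites in the same gene).
    """
    d = x_case - x_control
    n, p = d.shape

    # Step size from an upper bound on the curvature of the smooth part.
    lipschitz = (np.linalg.norm(d, 2) ** 2) / (4.0 * n)
    lipschitz += lam * (1.0 - alpha) * np.linalg.eigvalsh(laplacian)[-1]
    step = 1.0 / (lipschitz + 1e-12)

    beta = np.zeros(p)
    for _ in range(n_iter):
        eta = d @ beta
        # sigmoid(-eta), written with tanh for numerical stability.
        weight = 0.5 * (1.0 - np.tanh(eta / 2.0))
        grad_loss = -(d.T @ weight) / n
        grad_graph = lam * (1.0 - alpha) * (laplacian @ beta)
        # Gradient step on the smooth part, then L1 shrinkage.
        beta = soft_threshold(beta - step * (grad_loss + grad_graph),
                              step * lam * alpha)
    return beta
```

In this sketch the Laplacian term pulls the coefficients of linked features toward each other, which is one way to obtain the grouping effect described above, while the L1 term performs variable selection. Because only within-pair differences enter the loss, anything shared by both members of a pair cancels out, which is why the matching is respected without estimating pair-specific intercepts.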
Hokeun Sun; Shuang Wang
Publication Detail:
Type:  Journal Article; Research Support, N.I.H., Extramural     Date:  2012-12-05
Journal Detail:
Title:  Statistics in medicine     Volume:  32     ISSN:  1097-0258     ISO Abbreviation:  Stat Med     Publication Date:  2013 May 
Date Detail:
Created Date:  2013-05-08     Completed Date:  2014-01-02     Revised Date:  2014-06-01    
Medline Journal Info:
Nlm Unique ID:  8215016     Medline TA:  Stat Med     Country:  England    
Other Details:
Languages:  eng     Pagination:  2127-39     Citation Subset:  IM    
Copyright Information:
Copyright © 2012 John Wiley & Sons, Ltd.
MeSH Terms
Carcinoma, Hepatocellular / genetics
Case-Control Studies*
Computer Simulation
CpG Islands*
DNA Methylation*
Data Interpretation, Statistical
Epigenomics / methods*
Liver Neoplasms / genetics
Logistic Models*
Oligonucleotide Array Sequence Analysis / methods*
Grant Support
R01 ES005116-19A1/ES/NIEHS NIH HHS; R03 CA150140/CA/NCI NIH HHS; R03 CA150140-01/CA/NCI NIH HHS

From MEDLINE®/PubMed®, a database of the U.S. National Library of Medicine
