Document Detail

Exploiting Genome Structure in Association Analysis.
MedLine Citation:
PMID:  21548809     Owner:  NLM     Status:  Publisher    
Abstract:  A genome-wide association study examines a large number of single-nucleotide polymorphisms (SNPs) to identify those that are significantly associated with a given phenotype, while trying to reduce the false positive rate. Although haplotype-based association methods have been proposed to accommodate correlation information across nearby SNPs that are in linkage disequilibrium, none of these methods directly incorporates structural information such as recombination events along the chromosome. In this paper, we propose a new approach called stochastic block lasso for association mapping. It exploits prior knowledge on linkage disequilibrium structure in the genome, such as recombination rates and distances between adjacent SNPs, in order to increase the power of detecting true associations while reducing false positives. Following a typical linear regression framework with the genotypes as inputs and the phenotype as output, our proposed method employs a sparsity-enforcing Laplacian prior for the regression coefficients, augmented by a first-order Markov process along the sequence of SNPs that incorporates the prior information on the linkage disequilibrium structure. The Markov-chain prior models the structural dependencies between each pair of adjacent SNPs, and allows us to look for associated SNPs in a coupled manner, combining strength from multiple nearby SNPs. Our results on HapMap-simulated datasets and mouse datasets show that there is a significant advantage in incorporating the prior knowledge on linkage disequilibrium structure for marker identification under whole-genome association.
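The regression framework described in the abstract can be illustrated with a minimal sketch. It covers only the sparsity-enforcing part: under a Laplacian prior on the coefficients, the MAP estimate of a linear model is the ordinary lasso. The first-order Markov prior over adjacent SNPs, which is what distinguishes stochastic block lasso, is not reproduced here because the abstract does not give its form. The simulated genotypes (independent SNPs, no linkage disequilibrium), the effect sizes, the penalty value, and the function names are all assumptions made for illustration.

```python
import numpy as np


def soft_threshold(z, t):
    """Shrink z toward zero by t (proximal operator of the L1 penalty)."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)


def lasso_coordinate_descent(X, y, lam, n_iter=200):
    """Minimize 0.5 * ||y - X b||^2 + lam * ||b||_1 by cyclic coordinate descent.

    This is the MAP estimate under a Laplacian prior on b; it is plain lasso,
    not the stochastic block lasso of the paper.
    """
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0)      # squared norm of each SNP column
    residual = y - X @ beta
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue               # monomorphic SNP carries no signal
            # Correlation of SNP j with the residual that excludes SNP j
            rho = X[:, j] @ residual + col_sq[j] * beta[j]
            new_bj = soft_threshold(rho, lam) / col_sq[j]
            residual += X[:, j] * (beta[j] - new_bj)
            beta[j] = new_bj
    return beta


# Hypothetical toy data: 200 individuals, 50 SNPs coded 0/1/2, two causal SNPs.
rng = np.random.default_rng(0)
n_samples, n_snps = 200, 50
genotypes = rng.binomial(2, 0.3, size=(n_samples, n_snps)).astype(float)
X = genotypes - genotypes.mean(axis=0)   # center each SNP column

true_beta = np.zeros(n_snps)
true_beta[[10, 11]] = 1.0                # two adjacent causal SNPs (assumed)
y = X @ true_beta + rng.normal(0.0, 1.0, size=n_samples)
y = y - y.mean()

beta_hat = lasso_coordinate_descent(X, y, lam=30.0)
selected = np.flatnonzero(beta_hat)
print("SNPs with nonzero coefficients:", selected)
```

In this sketch every SNP is penalized independently, so a weak signal spread over neighbouring SNPs gains nothing from their proximity. The abstract's Markov-chain prior addresses exactly that by coupling adjacent SNPs according to recombination rates and distances.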
Seyoung Kim; Eric P Xing
Publication Detail:
Type:  JOURNAL ARTICLE     Date:  2011-5-6
Journal Detail:
Title:  Journal of computational biology : a journal of computational molecular cell biology     Volume:  -     ISSN:  1557-8666     ISO Abbreviation:  -     Publication Date:  2011 May 
Date Detail:
Created Date:  2011-5-9     Completed Date:  -     Revised Date:  -    
Medline Journal Info:
Nlm Unique ID:  9433358     Medline TA:  J Comput Biol     Country:  -    
Other Details:
Languages:  ENG     Pagination:  -     Citation Subset:  -    
School of Computer Science, Carnegie Mellon University , Pittsburgh, Pennsylvania.

From MEDLINE®/PubMed®, a database of the U.S. National Library of Medicine
