Catching the Genomic Wave in Oligonucleotide Single-Nucleotide Polymorphism Arrays by Modeling Sequence Binding.
MedLine Citation:
PMID:  23763671     Owner:  NLM     Status:  Publisher    
Abstract: The genomic wave has been identified as a major artifact in genome data and is highly correlated with the sequence GC content. Although statistical methods have been developed to filter this artifact, the mechanism underlying the genomic wave has not yet been studied. Understanding the artifact, specifically its sources, may lead to successful separation of biological signals from the artifact and improve array design, modeling, and association studies. We develop an approach to catching the genomic wave in oligonucleotide single-nucleotide polymorphism (SNP) arrays by separating biological signals from the array baseline background through modeling sequence binding with a newly developed probe intensity composite representation (PICR) model. The PICR model decomposes the probe intensity of each SNP probe set into the target sequence concentrations, SNP-specific background (nonsignal), and measurement error, and identifies the biological signals through the target concentration for each allele. We demonstrate with the Affymetrix GeneChip 500K HapMap data and the Wellcome Trust Case-Control Study data that the genomic wave is captured through the SNP-specific background term of the PICR model, and is separated successfully from the allelic target concentrations, that is, the biological signals. We further identify two important sources of the genomic wave, the GC content and the fragment length (FL) of the sequence, and conclude that (1) the genomic wave artifact can be removed from the genome data with the PICR model, and (2) in addition to the GC content, the genomic wave also has a component that is a nonlinear effect of the FL.
Yalu Wen; Ming Li; Wenjiang J Fu
Publication Detail:
Type:  JOURNAL ARTICLE     Date:  2013-6-13
Journal Detail:
Title:  Journal of computational biology : a journal of computational molecular cell biology     Volume:  -     ISSN:  1557-8666     ISO Abbreviation:  J. Comput. Biol.     Publication Date:  2013 Jun 
Date Detail:
Created Date:  2013-6-14     Completed Date:  -     Revised Date:  -    
Medline Journal Info:
Nlm Unique ID:  9433358     Medline TA:  J Comput Biol     Country:  -    
Other Details:
Languages:  ENG     Pagination:  -     Citation Subset:  -    
The Computational Genomics Lab, Department of Epidemiology and Biostatistics, Michigan State University, East Lansing, Michigan.

From MEDLINE®/PubMed®, a database of the U.S. National Library of Medicine
