Modeling the evolution of regulatory elements by simultaneous detection and alignment with phylogenetic pair HMMs.
MedLine Citation:
PMID:  21187896     Owner:  NLM     Status:  MEDLINE    
The computational detection of regulatory elements in DNA is a difficult but important problem impacting our progress in understanding the complex nature of eukaryotic gene regulation. Attempts to utilize cross-species conservation for this task have been hampered both by evolutionary changes of functional sites and poor performance of general-purpose alignment programs when applied to non-coding sequence. We describe a new and flexible framework for modeling binding site evolution in multiple related genomes, based on phylogenetic pair hidden Markov models which explicitly model the gain and loss of binding sites along a phylogeny. We demonstrate the value of this framework for both the alignment of regulatory regions and the inference of precise binding-site locations within those regions. As the underlying formalism is a stochastic, generative model, it can also be used to simulate the evolution of regulatory elements. Our implementation is scalable in terms of numbers of species and sequence lengths and can produce alignments and binding-site predictions with accuracy rivaling or exceeding current systems that specialize in only alignment or only binding-site prediction. We demonstrate the validity and power of various model components on extensive simulations of realistic sequence data and apply a specific model to study Drosophila enhancers in as many as ten related genomes and in the presence of gain and loss of binding sites. Different models and modeling assumptions can be easily specified, thus providing an invaluable tool for the exploration of biological hypotheses that can drive improvements in our understanding of the mechanisms and evolution of gene regulation.
William H Majoros; Uwe Ohler
Publication Detail:
Type:  Journal Article; Research Support, N.I.H., Extramural     Date:  2010-12-16
Journal Detail:
Title:  PLoS computational biology     Volume:  6     ISSN:  1553-7358     ISO Abbreviation:  PLoS Comput. Biol.     Publication Date:  2010  
Date Detail:
Created Date:  2010-12-28     Completed Date:  2011-03-31     Revised Date:  2013-07-03    
Medline Journal Info:
Nlm Unique ID:  101238922     Medline TA:  PLoS Comput Biol     Country:  United States    
Other Details:
Languages:  eng     Pagination:  e1001037     Citation Subset:  IM    
Institute for Genome Sciences & Policy, Duke University, Durham, North Carolina, United States of America.
MeSH Terms
Base Sequence
Computational Biology / methods*
Computer Simulation
Drosophila melanogaster / genetics
Evolution, Molecular*
Gene Expression Regulation
Markov Chains*
Molecular Sequence Data
ROC Curve
Regulatory Elements, Transcriptional / genetics*
Sequence Alignment / methods*
Sequence Analysis, DNA

From MEDLINE®/PubMed®, a database of the U.S. National Library of Medicine
