| Tree-based position weight matrix approach to model transcription factor binding site profiles. | |
| | |
MedLine Citation:
|
PMID: 21912677 Owner: NLM Status: MEDLINE |
Abstract/OtherAbstract:
|
Most position weight matrix (PWM) based bioinformatics methods developed to predict transcription factor binding sites (TFBSs) assume that each nucleotide in the sequence motif contributes independently to the interaction between protein and DNA sequence, which usually produces a high rate of false positive predictions. The increasing availability of TF enrichment profiles from recent ChIP-Seq methodology facilitates the investigation of dependency structure and the accurate prediction of TFBSs. We develop a novel tree-based PWM (TPWM) approach to accurately model the interaction between a TF and its binding site. The whole tree-structured PWM can be considered a mixture of different conditional PWMs. We propose a discriminative approach, called TPD (TPWM-based Discriminative Approach), to construct the TPWM from ChIP-Seq data, starting from a pre-existing PWM. To achieve the maximum discriminative power between the positive and negative datasets, the cutoff value is determined based on the Matthews correlation coefficient (MCC). The resulting TPWMs are evaluated for accuracy on extensive synthetic datasets. We then apply the TPD approach to several real ChIP-Seq datasets to refine the current TFBS models stored in the TRANSFAC database. Experiments on both simulated and real ChIP-Seq data show that the proposed method, starting from an existing PWM, consistently outperforms existing tools in detecting TFBSs. The improved accuracy results from modelling the complete dependency structure of the motifs and from a better true positive rate. These findings could lead to a better understanding of the mechanisms of TF-DNA interactions. |
| | |
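The two ingredients the abstract names explicitly, scoring a site with a position-independent PWM and choosing a score cutoff by the Matthews correlation coefficient, can be sketched as below. This is a minimal illustration and not the authors' TPD implementation: the function names (`pwm_score`, `mcc`, `best_cutoff`), the representation of a PWM as a list of per-position log-odds dictionaries, and all numeric values are assumptions made for the example. The tree construction itself is not shown, since the abstract does not describe it in detail.

```python
import math

def pwm_score(pwm, seq):
    """Score a sequence under the independence assumption.

    `pwm` is assumed to be a list with one dict per motif position,
    mapping each base to a log-odds weight (hypothetical layout).
    The score is the sum of the per-position weights.
    """
    if len(seq) != len(pwm):
        raise ValueError("sequence length must match PWM width")
    return sum(pwm[i][base] for i, base in enumerate(seq))

def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0 when any marginal count is zero.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom

def best_cutoff(pos_scores, neg_scores):
    """Return (cutoff, mcc) maximizing MCC; a site is called if score >= cutoff.

    Candidate cutoffs are the observed scores themselves (an assumption
    of this sketch; the first maximizing cutoff is kept on ties).
    """
    best = (None, -1.0)
    for c in sorted(set(pos_scores) | set(neg_scores)):
        tp = sum(s >= c for s in pos_scores)
        fn = len(pos_scores) - tp
        fp = sum(s >= c for s in neg_scores)
        tn = len(neg_scores) - fp
        m = mcc(tp, fp, tn, fn)
        if m > best[1]:
            best = (c, m)
    return best

# Toy two-position PWM with made-up log-odds values.
toy_pwm = [
    {"A": 1.0, "C": -1.0, "G": -1.0, "T": -1.0},
    {"A": -1.0, "C": 0.5, "G": -1.0, "T": -1.0},
]
positives = [pwm_score(toy_pwm, s) for s in ["AC", "AC", "AG"]]
negatives = [pwm_score(toy_pwm, s) for s in ["TT", "GA", "CT"]]
cutoff, score = best_cutoff(positives, negatives)
```

In this toy setup the positive and negative scores do not overlap, so some cutoff separates them perfectly; on real ChIP-Seq derived data the score distributions overlap and the MCC-optimal cutoff trades false positives against false negatives.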
Authors:
|
Yingtao Bi; Hyunsoo Kim; Ravi Gupta; Ramana V Davuluri |
Publication Detail:
|
Type: Journal Article; Research Support, N.I.H., Extramural; Research Support, Non-U.S. Gov't Date: 2011-09-02 |
Journal Detail:
|
Title: PloS one Volume: 6 ISSN: 1932-6203 ISO Abbreviation: PLoS ONE Publication Date: 2011 |
Date Detail:
|
Created Date: 2011-09-13 Completed Date: 2011-12-29 Revised Date: 2013-03-07 |
Medline Journal Info:
|
Nlm Unique ID: 101285081 Medline TA: PLoS One Country: United States |
Other Details:
|
Languages: eng Pagination: e24210 Citation Subset: IM |
Affiliation:
|
Molecular and Cellular Oncogenesis Program, Center for Systems and Computational Biology, The Wistar Institute, Philadelphia, Pennsylvania, United States of America. |
| MeSH Terms | |
Descriptor/Qualifier:
|
Base Sequence; Binding Sites; Chromatin Immunoprecipitation; Computational Biology / methods*; Nucleotide Motifs / genetics; Position-Specific Scoring Matrices*; Transcription Factors / metabolism* |
| Grant Support | |
ID/Acronym/Agency:
|
P30 CA010815/CA/NCI NIH HHS; R01HG003362/HG/NHGRI NIH HHS |
| Chemical | |
Reg. No./Substance:
|
0/Transcription Factors |
From MEDLINE®/PubMed®, a database of the U.S. National Library of Medicine