Modeling RNA degradation for RNA-Seq with applications.
MedLine Citation:
PMID:  22353193     Owner:  NLM     Status:  MEDLINE    
RNA-Seq is widely used in biological and biomedical studies. Methods for estimating transcript abundance from RNA-Seq data have been studied intensively, and many of them assume that short reads are uniformly distributed along the transcripts. However, short reads are found to be nonuniformly distributed along the transcripts, which can greatly reduce the accuracy of methods based on the uniform assumption. Several methods have been developed to adjust for the biases induced by this nonuniformity, using the empirical distribution of short reads within transcripts. As an alternative, we found that RNA degradation plays a major role in shaping the nonuniform distribution of short reads, and we therefore developed a new approach that quantifies this nonuniform distribution by precisely modeling RNA degradation. Our model of RNA degradation fits RNA-Seq data quite well, and based on this model we further developed a new statistical method to estimate the transcript expression level, as well as the RNA degradation rate, for individual genes and their isoforms. We showed that our method can improve the accuracy of transcript isoform expression estimation. The estimated RNA degradation rates of individual transcripts are consistent across samples and/or experiments/platforms. In addition, the RNA degradation rate from our model is independent of RNA length, consistent with previous studies on RNA decay rate.
Lin Wan; Xiting Yan; Ting Chen; Fengzhu Sun
Publication Detail:
Type:  Journal Article; Research Support, N.I.H., Extramural; Research Support, Non-U.S. Gov't     Date:  2012-02-21
Journal Detail:
Title:  Biostatistics (Oxford, England)     Volume:  13     ISSN:  1468-4357     ISO Abbreviation:  Biostatistics     Publication Date:  2012 Sep 
Date Detail:
Created Date:  2012-09-13     Completed Date:  2013-02-21     Revised Date:  2013-09-03    
Medline Journal Info:
Nlm Unique ID:  100897327     Medline TA:  Biostatistics     Country:  England    
Other Details:
Languages:  eng     Pagination:  734-47     Citation Subset:  IM    
Molecular and Computational Biology Program, University of Southern California, Los Angeles, CA 90089, USA.
MeSH Terms
Computer Simulation
Data Interpretation, Statistical*
Models, Genetic*
RNA / genetics*,  metabolism
Sequence Analysis, RNA / methods*
Transcription, Genetic / genetics*

From MEDLINE®/PubMed®, a database of the U.S. National Library of Medicine
