The Sequence Structures of Human MicroRNA Molecules and Their Implications.
MedLine Citation:
PMID:  23349828     Owner:  NLM     Status:  In-Data-Review    
The number of nucleotides in a cloned, short genomic sequence has become an important criterion for annotating such a sequence as a miRNA molecule. While the majority of human mature miRNA sequences consist of 22 nucleotides, discrepancies exist in the characteristic lengths of miRNA sequences. Systematic studies of this length distribution, and of the biological factors that are related to or may affect this length, are also lacking. In this paper, we aim to fill this gap by investigating the sequence structure of human miRNA molecules using statistical tools. We demonstrate that the traditional discrete probability distributions do not model the length distribution of human mature miRNAs well, and we obtain a statistical distribution model with a decent fit. We observe that the four nucleotide bases in a miRNA sequence are not randomly distributed, implying that structural patterns such as dinucleotides (or trinucleotides and higher-order patterns) may exist. Furthermore, we study the relationship of this length distribution to multiple important factors such as evolutionary conservation, tumorigenesis, the length of precursor loop structures, and the number of predicted targets. The association between the miRNA sequence length and the distribution of target site counts in the corresponding predicted genes is also presented. This study yields several novel findings worthy of further investigation: (1) rapid evolution introduces variation into the miRNA sequence length distribution; (2) miRNAs with extreme sequence lengths are unlikely to be cancer-related; and (3) the miRNA sequence length is positively correlated with the precursor length and the number of predicted target genes.
Zhide Fang; Ruofei Du; Andrea Edwards; Erik K Flemington; Kun Zhang
Publication Detail:
Type:  Journal Article     Date:  2013-01-18
Journal Detail:
Title:  PloS one     Volume:  8     ISSN:  1932-6203     ISO Abbreviation:  PLoS ONE     Publication Date:  2013  
Date Detail:
Created Date:  2013-01-25     Completed Date:  -     Revised Date:  -    
Medline Journal Info:
Nlm Unique ID:  101285081     Medline TA:  PLoS One     Country:  United States    
Other Details:
Languages:  eng     Pagination:  e54215     Citation Subset:  IM    
Biostatistics Program, School of Public Health, Louisiana State University Health Sciences Center, New Orleans, Louisiana, United States of America.
From MEDLINE®/PubMed®, a database of the U.S. National Library of Medicine
