MaSC: Mappability-Sensitive Cross-Correlation for Estimating Mean Fragment Length of Single-End Short Read Sequencing Data.
MedLine Citation:
PMID:  23300135     Owner:  NLM     Status:  Publisher    
MOTIVATION: Reliable estimation of the mean fragment length for next-generation short-read sequencing data is an important step in NGS analysis pipelines, most notably because of its impact on the accuracy of the enriched regions identified by peak-calling algorithms. Although many peak-calling algorithms include a fragment-length estimation subroutine, the problem has not been adequately solved, as demonstrated by the variability of the estimates returned by different algorithms.
RESULTS: In this paper, we investigate the use of strand cross-correlation to estimate the mean fragment length of single-end data and show that traditional estimation approaches have mixed reliability. We observe that the mappability of different parts of the genome can introduce an artificial bias into cross-correlation computations, resulting in incorrect fragment-length estimates. We propose a new approach, called Mappability-Sensitive Cross-Correlation (MaSC), which removes this bias and allows for accurate and reliable fragment-length estimation. We analyze the computational complexity of this approach and evaluate its performance on a test suite of NGS datasets, demonstrating its superiority to traditional cross-correlation analysis.
AVAILABILITY: An open-source Perl implementation of our approach is available at CONTACT:
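As a rough illustration of the idea in the abstract, the Python sketch below compares a naive strand cross-correlation with a mappability-masked variant. It is not the authors' Perl implementation. The function names, the boolean mappability arrays, and the rule of keeping only position pairs that are mappable on both strands are assumptions made for illustration, since the abstract does not give the exact formula.

```python
import numpy as np


def _pearson(f, r):
    """Pearson correlation of two equal-length vectors (0.0 if undefined)."""
    if f.size < 2:
        return 0.0
    f_std, r_std = f.std(), r.std()
    if f_std == 0 or r_std == 0:
        return 0.0
    return float(((f - f.mean()) * (r - r.mean())).mean() / (f_std * r_std))


def naive_cross_correlation(fwd, rev, max_shift):
    """Correlate forward-strand counts at i with reverse-strand counts at i + d.

    fwd, rev: 1-D arrays of read-start counts per position (hypothetical input).
    Returns one correlation value for each shift d = 0..max_shift.
    """
    n = len(fwd)
    return np.array(
        [_pearson(fwd[: n - d], rev[d:]) for d in range(max_shift + 1)]
    )


def masked_cross_correlation(fwd, rev, map_fwd, map_rev, max_shift):
    """Same as above, but restricted to position pairs mappable on both strands.

    map_fwd, map_rev: boolean arrays (assumed representation of mappability).
    """
    n = len(fwd)
    out = np.zeros(max_shift + 1)
    for d in range(max_shift + 1):
        # Keep pair (i, i + d) only if both ends could have received reads.
        mask = map_fwd[: n - d] & map_rev[d:]
        out[d] = _pearson(fwd[: n - d][mask], rev[d:][mask])
    return out


def best_shift(cross_correlation):
    """Shift with the highest correlation, used here as the length estimate."""
    return int(np.argmax(cross_correlation))
```

In this sketch the shift that maximizes the correlation is taken as the estimate. Whether the read length must be added to obtain the fragment length depends on how read positions are recorded, which the abstract does not specify.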
Parameswaran Ramachandran; Gareth A Palidwor; Christopher J Porter; Theodore J Perkins
Publication Detail:
Type:  JOURNAL ARTICLE     Date:  2013-1-7
Journal Detail:
Title:  Bioinformatics (Oxford, England)     Volume:  -     ISSN:  1367-4811     ISO Abbreviation:  Bioinformatics     Publication Date:  2013 Jan 
Date Detail:
Created Date:  2013-1-9     Completed Date:  -     Revised Date:  -    
Medline Journal Info:
Nlm Unique ID:  9808944     Medline TA:  Bioinformatics     Country:  -    
Other Details:
Languages:  ENG     Pagination:  -     Citation Subset:  -    
Ottawa Hospital Research Institute, Regenerative Medicine Program, K1H 8L6, Ottawa, Canada.
From MEDLINE®/PubMed®, a database of the U.S. National Library of Medicine