MaSC: Mappability-Sensitive Cross-Correlation for Estimating Mean Fragment Length of Single-End Short Read Sequencing Data.
MedLine Citation:
PMID:  23300135     Owner:  NLM     Status:  Publisher    
Abstract/OtherAbstract:
MOTIVATION: Reliable estimation of the mean fragment length for next-generation short-read sequencing data is an important step in NGS analysis pipelines, most notably because of its impact on the accuracy of the enriched regions identified by peak-calling algorithms. Although many peak-calling algorithms include a fragment-length estimation subroutine, the problem has not been adequately solved, as demonstrated by the variability of the estimates returned by different algorithms. RESULTS: In this paper, we investigate the use of strand cross-correlation to estimate mean fragment length of single-end data and show that traditional estimation approaches have mixed reliability. We observe that the mappability of different parts of the genome can introduce an artificial bias into cross-correlation computations, resulting in incorrect fragment-length estimates. We propose a new approach, called Mappability-Sensitive Cross-Correlation (MaSC), which removes this bias and allows for accurate and reliable fragment-length estimation. We analyze the computational complexity of this approach, and evaluate its performance on a test suite of NGS datasets, demonstrating its superiority to traditional cross-correlation analysis. AVAILABILITY: An open-source Perl implementation of our approach is available at http://www.perkinslab.ca/Software.html. CONTACT: tperkins@ohri.ca.
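Illustrative sketch (not the authors' Perl implementation): the idea in the abstract can be pictured as correlating forward-strand read positions with reverse-strand read positions at a range of shifts, and taking the shift with the highest correlation as the fragment-length estimate. The Python below is a minimal version under stated assumptions. The function names, the 0/1 read-position arrays, and the single boolean `mappable` mask are all hypothetical simplifications; the published method may define mappability and the shift-to-length conversion differently. Passing a mask restricts the correlation to position pairs where both ends are mappable, which is the general kind of correction the abstract describes.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if undefined)."""
    n = len(xs)
    if n == 0:
        return 0.0
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def cross_correlation(fwd, rev, shift, mappable=None):
    """Correlate fwd[i] with rev[i + shift].

    If a mask is given, only pairs where both i and i + shift are
    mappable are used (assumption: one mask serves both strands).
    """
    pairs = [
        (fwd[i], rev[i + shift])
        for i in range(len(fwd) - shift)
        if mappable is None or (mappable[i] and mappable[i + shift])
    ]
    return pearson([p[0] for p in pairs], [p[1] for p in pairs])


def estimate_fragment_length(fwd, rev, max_shift, mappable=None):
    """Return the shift in 0..max_shift with the highest correlation."""
    scores = [
        cross_correlation(fwd, rev, d, mappable) for d in range(max_shift + 1)
    ]
    return max(range(max_shift + 1), key=lambda d: scores[d])


# Toy data (made up for illustration): six fragments whose reverse-strand
# read lies exactly 10 positions downstream of the forward-strand read.
genome_len = 200
true_shift = 10
fwd = [0] * genome_len
rev = [0] * genome_len
for start in [5, 23, 48, 90, 131, 170]:
    fwd[start] = 1
    rev[start + true_shift] = 1

mappable = [True] * genome_len  # fully mappable toy genome
estimate = estimate_fragment_length(fwd, rev, 30, mappable)
print(estimate)
```

On this toy input the correlation peaks at the shift used to build the data. On real data the peak is noisier, and unmappable regions can create a spurious peak, which is the bias the abstract says MaSC is designed to remove.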
Authors:
Parameswaran Ramachandran; Gareth A Palidwor; Christopher J Porter; Theodore J Perkins
Publication Detail:
Type:  JOURNAL ARTICLE     Date:  2013-1-7
Journal Detail:
Title:  Bioinformatics (Oxford, England)     Volume:  -     ISSN:  1367-4811     ISO Abbreviation:  Bioinformatics     Publication Date:  2013 Jan 
Date Detail:
Created Date:  2013-1-9     Completed Date:  -     Revised Date:  -    
Medline Journal Info:
Nlm Unique ID:  9808944     Medline TA:  Bioinformatics     Country:  -    
Other Details:
Languages:  ENG     Pagination:  -     Citation Subset:  -    
Affiliation:
Ottawa Hospital Research Institute, Regenerative Medicine Program, K1H 8L6, Ottawa, Canada.
From MEDLINE®/PubMed®, a database of the U.S. National Library of Medicine