| CLEVER: Clique-Enumerating Variant Finder. | |
| | |
MedLine Citation:
|
PMID: 23060616 Owner: NLM Status: Publisher |
Abstract/OtherAbstract:
|
MOTIVATION: Next-generation sequencing techniques have facilitated large-scale analysis of human genetic variation. Despite the advances in sequencing speed, the computational discovery of structural variants is not yet standard. It is likely that many variants have remained undiscovered in most sequenced individuals. RESULTS: Here we present a novel internal-segment-size-based approach, which organizes all reads, including concordant ones, into a read alignment graph in which max-cliques represent maximal contradiction-free groups of alignments. A novel algorithm then enumerates all max-cliques and statistically evaluates them for their potential to reflect insertions or deletions. For the first time in the literature, we compare a large range of state-of-the-art approaches using simulated Illumina reads from a fully annotated genome and present relevant performance statistics. We achieve superior performance, in particular for indels of length 20-100 nt. This size range has previously been identified as a remaining major challenge in structural variation discovery, in particular for insert-size-based approaches. In this size range we outperform even split-read aligners. We also achieve competitive results on biological data, where our method is the only one to make a substantial number of correct predictions, which, additionally, are disjoint from those made by split-read aligners. AVAILABILITY: CLEVER is open source (GPL) and available from http://clever-sv.googlecode.com. CONTACT: tobias.marschall@tu-dortmund.de. |
| | |
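As a rough illustration of the idea in the abstract, the sketch below builds a small graph in which nodes are read alignments and edges join alignments that do not contradict each other, then enumerates its maximal cliques. It is not CLEVER's algorithm: the abstract describes a novel enumeration method and a statistical evaluation that are not reproduced here, and the generic Bron-Kerbosch procedure with pivoting is used in their place. The alignment tuples, the `compatible` rule, and the `max_diff` threshold are all hypothetical and chosen only for this toy example.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical toy alignment: (start, end, insert_size). Not CLEVER's data model.
Alignment = Tuple[int, int, int]


def compatible(a: Alignment, b: Alignment, max_diff: int = 30) -> bool:
    """Assumed compatibility rule: intervals overlap and insert sizes are similar."""
    overlap = a[0] <= b[1] and b[0] <= a[1]
    return overlap and abs(a[2] - b[2]) <= max_diff


def build_graph(alignments: List[Alignment], max_diff: int = 30) -> Dict[int, Set[int]]:
    """Adjacency sets over alignment indices; an edge means 'no contradiction'."""
    adj: Dict[int, Set[int]] = {i: set() for i in range(len(alignments))}
    for i in range(len(alignments)):
        for j in range(i + 1, len(alignments)):
            if compatible(alignments[i], alignments[j], max_diff):
                adj[i].add(j)
                adj[j].add(i)
    return adj


def maximal_cliques(adj: Dict[int, Set[int]]) -> List[List[int]]:
    """Generic Bron-Kerbosch with pivoting (a stand-in, not the paper's algorithm)."""
    cliques: List[List[int]] = []

    def expand(r: Set[int], p: Set[int], x: Set[int]) -> None:
        if not p and not x:
            cliques.append(sorted(r))  # r cannot be extended: it is maximal
            return
        # Pivot on the vertex with the most neighbours among the candidates.
        pivot = max(p | x, key=lambda v: len(adj[v] & p))
        for v in list(p - adj[pivot]):
            expand(r | {v}, p & adj[v], x & adj[v])
            p.remove(v)
            x.add(v)

    expand(set(), set(adj), set())
    return sorted(cliques)


# Toy input: two groups with different insert sizes plus one isolated alignment.
alignments = [
    (100, 400, 300),
    (150, 450, 310),
    (200, 500, 420),
    (220, 520, 430),
    (900, 1200, 300),
]
groups = maximal_cliques(build_graph(alignments))
print(groups)
```

In this toy input, alignments 0 and 1 overlap alignments 2 and 3 positionally but differ in insert size by more than the assumed threshold, so they end up in separate maximal cliques. Each clique is a maximal contradiction-free group under the assumed rule.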
Authors:
|
Tobias Marschall; Ivan Costa; Stefan Canzar; Markus Bauer; Gunnar W Klau; Alexander Schliep; Alexander Schönhuth |
Publication Detail:
|
Type: JOURNAL ARTICLE Date: 2012-10-11 |
Journal Detail:
|
Title: Bioinformatics (Oxford, England) Volume: - ISSN: 1367-4811 ISO Abbreviation: Bioinformatics Publication Date: 2012 Oct |
Date Detail:
|
Created Date: 2012-10-12 Completed Date: - Revised Date: - |
Medline Journal Info:
|
Nlm Unique ID: 9808944 Medline TA: Bioinformatics Country: - |
Other Details:
|
Languages: ENG Pagination: - Citation Subset: - |
Affiliation:
|
Centrum Wiskunde & Informatica, Amsterdam, the Netherlands; Federal University of Pernambuco, Recife, Brazil; Illumina, Cambridge, UK; Rutgers, The State University of New Jersey, Piscataway, NJ, USA. |
From MEDLINE®/PubMed®, a database of the U.S. National Library of Medicine