Document Detail


A High-performance Computing Toolset for Relatedness and Principal Component Analysis of SNP Data.
MedLine Citation:
PMID:  23060615     Owner:  NLM     Status:  Publisher    
Abstract/OtherAbstract:
SUMMARY: Genome-wide association studies (GWAS) are widely used to investigate the genetic basis of diseases and traits, but they pose many computational challenges. We developed gdsfmt and SNPRelate (R packages for multi-core symmetric multiprocessing computer architectures) to accelerate two key computations on SNP data: principal component analysis (PCA) and relatedness analysis using identity-by-descent (IBD) measures. The kernels of our algorithms are written in C/C++ and highly optimized. Benchmarks show that the uniprocessor implementations of PCA and IBD are ~8 to 50 times faster than the implementations provided in the popular EIGENSTRAT (v3.0) and PLINK (v1.07) programs, respectively, and that the speedup reaches ~30- to 300-fold when eight cores are used. SNPRelate can analyze tens of thousands of samples with millions of SNPs. For example, our package was used to perform PCA on 55,324 subjects from the "Gene-Environment Association Studies" (GENEVA) consortium studies. AVAILABILITY: gdsfmt and SNPRelate are available from R CRAN (http://cran.r-project.org), including a vignette. A tutorial can be found at https://www.genevastudy.org/Accomplishments/software. CONTACT: Xiuwen Zheng (zhengx@u.washington.edu).
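The PCA computation named in the abstract can be illustrated with a minimal sketch. It is not SNPRelate's implementation (whose kernels are written in C/C++) and does not use its API. It assumes genotypes coded as 0/1/2 allele counts, a commonly used allele-frequency normalization, and plain power iteration for the leading component. All function names and the toy data are hypothetical.

```python
import math

def standardize(genotypes):
    """Center and scale each SNP column of a samples x SNPs matrix of 0/1/2 counts.

    Assumed normalization: (g - 2p) / sqrt(2p(1 - p)), with p the allele frequency.
    """
    n = len(genotypes)
    columns = []
    for snp in zip(*genotypes):
        p = sum(snp) / (2.0 * n)          # estimated allele frequency
        if p <= 0.0 or p >= 1.0:
            continue                      # monomorphic SNP: no variance, skip it
        scale = math.sqrt(2.0 * p * (1.0 - p))
        columns.append([(g - 2.0 * p) / scale for g in snp])
    return [list(row) for row in zip(*columns)]   # back to samples x SNPs

def relationship_matrix(z):
    """Sample-by-sample covariance Z Z^T / M over the M retained SNPs."""
    m = len(z[0])
    return [[sum(a * b for a, b in zip(ri, rj)) / m for rj in z] for ri in z]

def top_component(matrix, iterations=100):
    """Leading eigenvector by power iteration (illustrative, not optimized)."""
    n = len(matrix)
    v = [float(i + 1) for i in range(n)]  # deterministic, non-constant start
    for _ in range(iterations):
        w = [sum(row[k] * v[k] for k in range(n)) for row in matrix]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

# Toy data: samples 0-2 and samples 3-5 come from two made-up groups;
# the last SNP is monomorphic and is dropped by standardize().
genotypes = [
    [0, 0, 1, 0, 0],
    [0, 1, 0, 0, 0],
    [1, 0, 0, 0, 0],
    [2, 2, 1, 2, 0],
    [2, 1, 2, 2, 0],
    [1, 2, 2, 2, 0],
]
pc1 = top_component(relationship_matrix(standardize(genotypes)))
print(pc1)  # the two groups receive loadings of opposite sign
```

The sample-by-sample matrix has one entry per pair of samples, each summed over all SNPs, so its cost grows quickly with cohort size; this is the kind of computation the abstract reports accelerating with optimized, multi-core kernels.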
Authors:
Xiuwen Zheng; David Levine; Jess Shen; Stephanie M Gogarten; Cathy Laurie; Bruce S Weir
Publication Detail:
Type:  JOURNAL ARTICLE     Date:  2012-10-11
Journal Detail:
Title:  Bioinformatics (Oxford, England)     Volume:  -     ISSN:  1367-4811     ISO Abbreviation:  Bioinformatics     Publication Date:  2012 Oct 
Date Detail:
Created Date:  2012-10-12     Completed Date:  -     Revised Date:  -    
Medline Journal Info:
Nlm Unique ID:  9808944     Medline TA:  Bioinformatics     Country:  -    
Other Details:
Languages:  ENG     Pagination:  -     Citation Subset:  -    
Affiliation:
Department of Biostatistics, University of Washington, Box 357232, Seattle, Washington 98195-7232, USA.
From MEDLINE®/PubMed®, a database of the U.S. National Library of Medicine

