A high-performance computing toolset for relatedness and principal component analysis of SNP data.
MedLine Citation:
PMID:  23060615     Owner:  NLM     Status:  MEDLINE    
Genome-wide association studies are widely used to investigate the genetic basis of diseases and traits, but they pose many computational challenges. We developed gdsfmt and SNPRelate (R packages for multi-core symmetric multiprocessing computer architectures) to accelerate two key computations on SNP data: principal component analysis (PCA) and relatedness analysis using identity-by-descent measures. The kernels of our algorithms are written in C/C++ and highly optimized. In benchmarks, the uniprocessor implementations of PCA and identity-by-descent estimation are approximately 8 to 50 times faster than the implementations provided in the popular EIGENSTRAT (v3.0) and PLINK (v1.07) programs, respectively, and the speed-up reaches 30- to 300-fold when eight cores are used. SNPRelate can analyse tens of thousands of samples with millions of SNPs. For example, our package was used to perform PCA on 55 324 subjects from the 'Gene-Environment Association Studies' consortium studies.
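As a rough illustration of the PCA computation the abstract refers to, the following minimal sketch runs a PCA of samples on a small genotype matrix. It is not the SNPRelate implementation: it uses Python with NumPy instead of the package's R and C/C++ code, assumes complete genotypes coded as 0/1/2 allele counts, and uses the per-SNP standardization by sqrt(p(1-p)) commonly described for EIGENSTRAT-style analyses. The function name `snp_pca` and the simulated data are hypothetical.

```python
import numpy as np


def snp_pca(genotypes, n_components=2):
    """PCA of samples from a (samples x SNPs) matrix of 0/1/2 allele counts.

    Illustrative only: assumes no missing genotypes.
    """
    g = np.asarray(genotypes, dtype=float)
    # Estimated allele frequency per SNP: mean allele count divided by 2.
    p = g.mean(axis=0) / 2.0
    # Drop monomorphic SNPs, whose scaling factor would be zero.
    keep = (p > 0.0) & (p < 1.0)
    g, p = g[:, keep], p[keep]
    # Centre each SNP at its mean (2p) and scale by sqrt(p(1-p)).
    z = (g - 2.0 * p) / np.sqrt(p * (1.0 - p))
    # Sample-by-sample covariance matrix, averaged over SNPs.
    cov = z @ z.T / z.shape[1]
    # eigh handles symmetric matrices and returns ascending eigenvalues,
    # so pick the largest n_components in descending order.
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1][:n_components]
    return eigvals[order], eigvecs[:, order]


# Hypothetical toy data: two groups of 20 samples whose allele
# frequencies differ at 500 simulated SNPs.
rng = np.random.default_rng(0)
freq_a = rng.uniform(0.1, 0.9, size=500)
freq_b = np.clip(freq_a + rng.normal(0.0, 0.2, size=500), 0.05, 0.95)
geno = np.vstack([
    rng.binomial(2, freq_a, size=(20, 500)),
    rng.binomial(2, freq_b, size=(20, 500)),
])

eigvals, pcs = snp_pca(geno, n_components=2)
print(pcs[:3])  # PC coordinates of the first three samples
```

The sample-by-sample matrix costs memory and time that grow with the square of the number of samples, which is one reason optimized, multi-core kernels matter at the scale the abstract describes.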
Xiuwen Zheng; David Levine; Jess Shen; Stephanie M Gogarten; Cathy Laurie; Bruce S Weir
Publication Detail:
Type:  Journal Article; Research Support, N.I.H., Extramural     Date:  2012-10-11
Journal Detail:
Title:  Bioinformatics (Oxford, England)     Volume:  28     ISSN:  1367-4811     ISO Abbreviation:  Bioinformatics     Publication Date:  2012 Dec 
Date Detail:
Created Date:  2012-12-10     Completed Date:  2013-07-29     Revised Date:  2013-12-04    
Medline Journal Info:
Nlm Unique ID:  9808944     Medline TA:  Bioinformatics     Country:  England    
Other Details:
Languages:  eng     Pagination:  3326-8     Citation Subset:  IM    
MeSH Terms
Genome-Wide Association Study*
Polymorphism, Single Nucleotide*
Principal Component Analysis*
From MEDLINE®/PubMed®, a database of the U.S. National Library of Medicine
