ALCHEMY: a reliable method for automated SNP genotype calling for small batch sizes and highly homozygous populations.
MedLine Citation:
PMID:  20926420     Owner:  NLM     Status:  MEDLINE    
Abstract/OtherAbstract:
MOTIVATION: The development of new high-throughput genotyping products requires a significant investment in testing and training samples to evaluate and optimize the product before it can be used reliably on new samples. One reason for this is that current methods for automated calling of genotypes are based on clustering approaches, which require a large number of samples to be analyzed simultaneously or an extensive training dataset to seed clusters. In systems where inbred samples are of primary interest, current clustering approaches perform poorly because they cannot clearly identify a heterozygote cluster.
RESULTS: As part of the development of two custom single nucleotide polymorphism genotyping products for Oryza sativa (domestic rice), we have developed a new genotype calling algorithm called 'ALCHEMY' based on statistical modeling of the raw intensity data rather than model-free clustering. A novel feature of the model is the ability to estimate and incorporate inbreeding information on a per-sample basis, allowing accurate genotyping of both inbred and heterozygous samples even when analyzed simultaneously. Since clustering is not used explicitly, ALCHEMY performs well on small sample sizes, with accuracy exceeding 99% with as few as 18 samples.
AVAILABILITY: ALCHEMY is available for both commercial and academic use free of charge and distributed under the GNU General Public License at http://alchemy.sourceforge.net/
CONTACT: mhw6@cornell.edu
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Authors:
Mark H Wright; Chih-Wei Tung; Keyan Zhao; Andy Reynolds; Susan R McCouch; Carlos D Bustamante
Publication Detail:
Type:  Evaluation Studies; Journal Article; Research Support, U.S. Gov't, Non-P.H.S.     Date:  2010-10-05
Journal Detail:
Title:  Bioinformatics (Oxford, England)     Volume:  26     ISSN:  1367-4811     ISO Abbreviation:  Bioinformatics     Publication Date:  2010 Dec 
Date Detail:
Created Date:  2010-11-18     Completed Date:  2011-03-22     Revised Date:  2013-07-03    
Medline Journal Info:
Nlm Unique ID:  9808944     Medline TA:  Bioinformatics     Country:  England    
Other Details:
Languages:  eng     Pagination:  2952-60     Citation Subset:  IM    
Affiliation:
Department of Biological Statistics and Computational Biology, 102 Weill Hall, Cornell University, Ithaca, NY 14853, USA. mhw6@cornell.edu
MeSH Terms
Descriptor/Qualifier:
Algorithms*
Cluster Analysis
Genotype
Homozygote
Models, Statistical
Oligonucleotide Array Sequence Analysis / methods*
Oryza sativa / genetics
Polymorphism, Single Nucleotide*
Software

From MEDLINE®/PubMed®, a database of the U.S. National Library of Medicine