Gene index analysis of the human genome estimates approximately 120,000 genes.
MedLine Citation:
PMID:  10835646     Owner:  NLM     Status:  MEDLINE    
Abstract/OtherAbstract:
Although sequencing of the human genome will soon be completed, gene identification and annotation remains a challenge. Early estimates suggested that there might be 60,000-100,000 (ref. 1) human genes, but recent analyses of the available data from EST sequencing projects have estimated as few as 45,000 (ref. 2) or as many as 140,000 (ref. 3) distinct genes. The Chromosome 22 Sequencing Consortium estimated a minimum of 45,000 genes based on their annotation of the complete chromosome, although their data suggests there may be additional genes. The nearly 2,000,000 human ESTs in dbEST provide an important resource for gene identification and genome annotation, but these single-pass sequences must be carefully analysed to remove contaminating sequences, including those from genomic DNA, spurious transcription, and vector and bacterial sequences. We have developed a highly refined and rigorously tested protocol for cleaning, clustering and assembling EST sequences to produce high-fidelity consensus sequences for the represented genes (F.L. et al., manuscript submitted) and used this to create the TIGR Gene Indices, databases of expressed genes for human, mouse, rat and other species (http://www.tigr.org/tdb/tgi.html). Using highly refined and tested algorithms for EST analysis, we have arrived at two independent estimates indicating the human genome contains approximately 120,000 genes.
Authors:
F Liang; I Holt; G Pertea; S Karamycheva; S L Salzberg; J Quackenbush
Publication Detail:
Type:  Journal Article; Research Support, U.S. Gov't, Non-P.H.S.    
Journal Detail:
Title:  Nature genetics     Volume:  25     ISSN:  1061-4036     ISO Abbreviation:  Nat. Genet.     Publication Date:  2000 Jun 
Date Detail:
Created Date:  2000-06-29     Completed Date:  2000-06-29     Revised Date:  2006-11-15    
Medline Journal Info:
Nlm Unique ID:  9216904     Medline TA:  Nat Genet     Country:  UNITED STATES    
Other Details:
Languages:  eng     Pagination:  239-40     Citation Subset:  IM    
Affiliation:
The Institute for Genomic Research, Rockville, Maryland, USA.
MeSH Terms
Descriptor/Qualifier:
Algorithms
Chromosomes, Human, Pair 22 / genetics
Computational Biology
Consensus Sequence / genetics
Databases, Factual
Expressed Sequence Tags*
Genes*
Genome, Human*
Humans
Internet
Physical Chromosome Mapping
Reproducibility of Results
Software
Comments/Corrections
Comment In:
Nat Genet. 2000 Jun;25(2):127-8   [PMID:  10835616 ]
Nat Genet. 2000 Jun;25(2):129-30   [PMID:  10835617 ]
Erratum In:
Nat Genet. 2000 Dec;26(4):501

From MEDLINE®/PubMed®, a database of the U.S. National Library of Medicine

