Analysis of the quality and utility of random shotgun sequencing at low redundancies.
MedLine Citation:
PMID:  9799794     Owner:  NLM     Status:  MEDLINE    
Abstract/OtherAbstract:
The currently favored approach for sequencing the human genome involves selecting representative large-insert clones (100-200 kb), randomly shearing this DNA to construct shotgun libraries, and then sequencing many different isolates from the library. This method, entitled directed random shotgun sequencing, requires highly redundant sequencing to obtain a complete and accurate finished consensus sequence. Recently it has been suggested that a rapidly generated lower redundancy sequence might be of use to the scientific community. Low-redundancy sequencing has been examined previously using simulated data sets. Here we utilize trace data from a number of projects submitted to GenBank to perform reconstruction experiments that mimic low-redundancy sequencing. These low-redundancy sequences have been examined for the completeness and quality of the consensus product, information content, and usefulness for interspecies comparisons. The data presented here suggest three different sequencing strategies, each with different utilities. (1) Nearly complete sequence data can be obtained by sequencing a random shotgun library at sixfold redundancy. This may therefore represent a good point to switch from a random to directed approach. (2) Sequencing can be performed with as little as twofold redundancy to find most of the information about exons, EST hits, and putative exon similarity matches. (3) To obtain contiguity of coding regions, sequencing at three- to fourfold redundancy would be appropriate. From these results, we suggest that a useful intermediate product for genome sequencing might be obtained by three- to fourfold redundancy. Such a product would allow a large amount of biologically useful data to be extracted while postponing the majority of work involved in producing a high quality consensus sequence.
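As a rough illustration of why completeness rises steeply between twofold and sixfold redundancy, the sketch below uses the standard idealized Poisson (Lander-Waterman style) model, in which the expected fraction of bases covered by at least one read at redundancy c is 1 - e^(-c). This is an assumption added here for illustration only: the study itself used reconstruction experiments on real trace data, not this model, and the function name `expected_fraction_covered` is hypothetical.

```python
import math


def expected_fraction_covered(redundancy: float) -> float:
    """Expected fraction of bases covered by at least one read.

    Assumes reads are placed uniformly at random (idealized Poisson model),
    ignoring cloning bias, repeats, and read-quality effects.
    """
    if redundancy < 0:
        raise ValueError("redundancy must be non-negative")
    return 1.0 - math.exp(-redundancy)


if __name__ == "__main__":
    # Redundancy levels discussed in the abstract.
    for c in (2, 3, 4, 6):
        print(f"{c}-fold: {expected_fraction_covered(c):.4f}")
```

Under this idealization, roughly 86% of bases are expected to be covered at twofold redundancy and more than 99% at sixfold, which is broadly in line with the abstract's description of sixfold data as nearly complete. Real libraries deviate from the model, so these figures are only a guide.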
Authors:
J Bouck; W Miller; J H Gorrell; D Muzny; R A Gibbs
Publication Detail:
Type:  Comparative Study; Journal Article; Research Support, U.S. Gov't, P.H.S.    
Journal Detail:
Title:  Genome research     Volume:  8     ISSN:  1088-9051     ISO Abbreviation:  Genome Res.     Publication Date:  1998 Oct 
Date Detail:
Created Date:  1998-12-04     Completed Date:  1998-12-04     Revised Date:  2009-11-18    
Medline Journal Info:
Nlm Unique ID:  9518021     Medline TA:  Genome Res     Country:  UNITED STATES    
Other Details:
Languages:  eng     Pagination:  1074-84     Citation Subset:  IM    
Affiliation:
Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas 77030 USA. jbouck@bcm.tmc.edu
MeSH Terms
Descriptor/Qualifier:
Animals
Contig Mapping
Expressed Sequence Tags
Gene Library*
Genome, Human
Humans
Mice
Quality Control
Retrospective Studies
Sequence Analysis, DNA / methods*, standards
Grant Support
ID/Acronym/Agency:
HG01459/HG/NHGRI NIH HHS; LM05110/LM/NLM NIH HHS
From MEDLINE®/PubMed®, a database of the U.S. National Library of Medicine

