| Analysis of the quality and utility of random shotgun sequencing at low redundancies. | |
| | |
MedLine Citation:
|
PMID: 9799794; Owner: NLM; Status: MEDLINE |
Abstract/OtherAbstract:
|
The currently favored approach for sequencing the human genome involves selecting representative large-insert clones (100-200 kb), randomly shearing this DNA to construct shotgun libraries, and then sequencing many different isolates from the library. This method, entitled directed random shotgun sequencing, requires highly redundant sequencing to obtain a complete and accurate finished consensus sequence. Recently it has been suggested that a rapidly generated lower redundancy sequence might be of use to the scientific community. Low-redundancy sequencing has been examined previously using simulated data sets. Here we utilize trace data from a number of projects submitted to GenBank to perform reconstruction experiments that mimic low-redundancy sequencing. These low-redundancy sequences have been examined for the completeness and quality of the consensus product, information content, and usefulness for interspecies comparisons. The data presented here suggest three different sequencing strategies, each with different utilities. (1) Nearly complete sequence data can be obtained by sequencing a random shotgun library at sixfold redundancy. This may therefore represent a good point to switch from a random to directed approach. (2) Sequencing can be performed with as little as twofold redundancy to find most of the information about exons, EST hits, and putative exon similarity matches. (3) To obtain contiguity of coding regions, sequencing at three- to fourfold redundancy would be appropriate. From these results, we suggest that a useful intermediate product for genome sequencing might be obtained by three- to fourfold redundancy. Such a product would allow a large amount of biologically useful data to be extracted while postponing the majority of work involved in producing a high quality consensus sequence. |
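As a rough illustration of why the abstract's twofold, three- to fourfold, and sixfold redundancies behave so differently, the sketch below uses the idealized Poisson coverage model usually associated with Lander and Waterman, in which the expected fraction of bases covered at least once at redundancy c is 1 - exp(-c). This is not the reconstruction procedure described in the abstract, which worked from real trace data; the function name and the assumption of uniformly random, error-free reads are illustrative only.

```python
import math


def expected_fraction_covered(redundancy: float) -> float:
    """Idealized Poisson estimate of the fraction of bases covered at least once.

    Assumes reads land uniformly at random and ignores read errors,
    cloning bias, and repeats, so real projects will typically do worse.
    """
    if redundancy < 0:
        raise ValueError("redundancy must be non-negative")
    return 1.0 - math.exp(-redundancy)


if __name__ == "__main__":
    # Redundancy levels discussed in the abstract.
    for c in (2, 3, 4, 6):
        frac = expected_fraction_covered(c)
        print(f"{c}x redundancy: about {frac:.1%} of bases covered at least once")
```

Under this simplified model, coverage climbs steeply at low redundancy and flattens as redundancy grows, which is consistent in spirit with the abstract's suggestion that sixfold redundancy is a reasonable point to switch from random to directed sequencing.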
| | |
Authors:
|
J Bouck; W Miller; J H Gorrell; D Muzny; R A Gibbs |
Publication Detail:
|
Type: Comparative Study; Journal Article; Research Support, U.S. Gov't, P.H.S. |
Journal Detail:
|
Title: Genome Research; Volume: 8; ISSN: 1088-9051; ISO Abbreviation: Genome Res.; Publication Date: 1998 Oct |
Date Detail:
|
Created Date: 1998-12-04; Completed Date: 1998-12-04; Revised Date: 2009-11-18 |
Medline Journal Info:
|
NLM Unique ID: 9518021; Medline TA: Genome Res; Country: UNITED STATES |
Other Details:
|
Languages: eng; Pagination: 1074-84; Citation Subset: IM |
Affiliation:
|
Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas 77030 USA. jbouck@bcm.tmc.edu |
| MeSH Terms | |
Descriptor/Qualifier:
|
Animals; Contig Mapping; Expressed Sequence Tags; Gene Library*; Genome, Human; Humans; Mice; Quality Control; Retrospective Studies; Sequence Analysis, DNA / methods*, standards |
| Grant Support | |
ID/Acronym/Agency:
|
HG01459/HG/NHGRI NIH HHS; LM05110/LM/NLM NIH HHS |
From MEDLINE®/PubMed®, a database of the U.S. National Library of Medicine