Document Detail

Viral genome analysis and knowledge management.
MedLine Citation:
PMID:  23192551     Owner:  NLM     Status:  In-Data-Review    
One of the challenges of genetic data analysis is to combine information from sources that are distributed around the world and accessible through a wide array of different methods and interfaces. The HIV database and the databases that followed in its footsteps, the hepatitis C virus (HCV) and hemorrhagic fever virus (HFV) databases, have made it their mission to make different data types easily available to their users. This involves a large amount of behind-the-scenes processing, including quality control and analysis of the sequences and their annotation. Gene and protein sequences are distilled from the sequences that are stored in GenBank; to this end, both submitter annotation and script-generated sequences are used. Alignments of both nucleotide and amino acid sequences are generated, manually curated, distilled into an alignment model, and regenerated in an iterative cycle that results in ever better new alignments. Annotation of epidemiological and clinical information is parsed, checked, and added to the database. User interfaces are updated, and new interfaces are added based upon user requests. Vital to the databases' success, the staff are themselves heavy users of the system, which enables them to fix bugs and find opportunities for improvement. In this chapter we describe some of the infrastructure that keeps these heavily used analysis platforms alive and vital after nearly 25 years of use. The database/analysis platforms described in this chapter can be accessed at
Carla Kuiken; Hyejin Yoon; Werner Abfalterer; Brian Gaschen; Chienchi Lo; Bette Korber
Publication Detail:
Type:  Journal Article    
Journal Detail:
Title:  Methods in molecular biology (Clifton, N.J.)     Volume:  939     ISSN:  1940-6029     ISO Abbreviation:  Methods Mol. Biol.     Publication Date:  2013  
Date Detail:
Created Date:  2012-11-29     Completed Date:  -     Revised Date:  -    
Medline Journal Info:
Nlm Unique ID:  9214969     Medline TA:  Methods Mol Biol     Country:  United States    
Other Details:
Languages:  eng     Pagination:  253-61     Citation Subset:  IM    
Los Alamos National Laboratory, Theoretical Biology and Biophysics (MS K710), Los Alamos, NM, USA
From MEDLINE®/PubMed®, a database of the U.S. National Library of Medicine
