Comparison of the NCI open database with seven large chemical structural databases.
MedLine Citation:
PMID:  11410049     Owner:  NLM     Status:  MEDLINE    
Abstract/OtherAbstract:
Eight large chemical databases have been analyzed and compared to each other. Central to this comparison is the open National Cancer Institute (NCI) database, consisting of approximately 250 000 structures. The other databases analyzed are the Available Chemicals Directory ("ACD," from MDL, release 1.99, 3D-version); the ChemACX ("ACX," from CamSoft, Version 4.5); the Maybridge Catalog and the Asinex database (both as distributed by CamSoft as part of ChemInfo 4.5); the Sigma-Aldrich Catalog (CD-ROM, 1999 Version); the World Drug Index ("WDI," Derwent, version 1999.03); and the organic part of the Cambridge Crystallographic Database ("CSD," from Cambridge Crystallographic Data Center, 1999 Version 5.18). The database properties analyzed are internal duplication rates; compounds unique to each database; cumulative occurrence of compounds in an increasing number of databases; overlap of identical compounds between two databases; similarity overlap; diversity; and others. The crystallographic database CSD and the WDI show somewhat less overlap with the other databases than those databases show with each other. In particular, the collections of commercial compounds and compilations of vendor catalogs overlap substantially with one another. Still, no database is entirely a subset of any other, and each appears to have its own niche and thus its own "raison d'être". The NCI database has by far the highest number of compounds that are unique to it. Approximately 200 000 of the NCI structures were not found in any of the other analyzed databases.
Authors:
J H Voigt; B Bienfait; S Wang; M C Nicklaus
Publication Detail:
Type:  Comparative Study; Journal Article    
Journal Detail:
Title:  Journal of chemical information and computer sciences     Volume:  41     ISSN:  0095-2338     ISO Abbreviation:  J Chem Inf Comput Sci     Publication Date:    2001 May-Jun
Date Detail:
Created Date:  2001-06-18     Completed Date:  2001-07-26     Revised Date:  2007-11-15    
Medline Journal Info:
Nlm Unique ID:  7505012     Medline TA:  J Chem Inf Comput Sci     Country:  United States    
Other Details:
Languages:  eng     Pagination:  702-12     Citation Subset:  IM    
Affiliation:
Laboratory of Medicinal Chemistry, Center for Cancer Research, National Cancer Institute, National Institutes of Health, NCI at Frederick, 376 Boyles Street, Frederick, Maryland 21702, USA.
MeSH Terms
Descriptor/Qualifier:
Cluster Analysis
Databases, Factual*
National Institutes of Health (U.S.)
Quantitative Structure-Activity Relationship
Terminology as Topic
United States

From MEDLINE®/PubMed®, a database of the U.S. National Library of Medicine

