Visualization of Molecular Fingerprints.
MedLine Citation:
PMID:  21696145     Owner:  NLM     Status:  Publisher    
Abstract/OtherAbstract:
A visualization plot of a dataset of molecular data is a useful tool for gaining insight into a set of molecules. In chemoinformatics, most visualization plots are of molecular descriptors, and the statistical model most often used to produce a visualization is principal component analysis (PCA). This paper takes PCA, together with four other statistical models (NeuroScale, GTM, LTM and LTM-LIN), and evaluates their ability to produce clustering in visualizations not of molecular descriptors, but of molecular fingerprints. Two different tasks are addressed: understanding structural information (particularly combinatorial libraries), and relating structure to activity. The quality of the visualizations is compared both subjectively (by visual inspection), and objectively (with global distance comparisons and local k-nearest-neighbor predictors). On the datasets used to evaluate clustering by structure, LTM is found to perform significantly better than the other models. In particular, the clusters in LTM visualization space are consistent with the relationships between the core scaffolds that define the combinatorial sublibraries. On the datasets used to evaluate clustering by activity, LTM again gives the best performance, but by a smaller margin. The results of this paper demonstrate the value of using both a nonlinear projection map, and a Bernoulli noise model, for modeling binary data.
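Illustrative note (not part of the original abstract): the "local k-nearest-neighbor predictors" used as an objective quality measure can be sketched as a leave-one-out classifier run on the 2D visualization coordinates. The sketch below is a minimal example under stated assumptions, not the authors' implementation. The function name knn_loo_accuracy, the toy coordinates, and the sublibrary labels "A" and "B" are all hypothetical, and Euclidean distance in the plot is assumed as the neighbor metric.

```python
from collections import Counter
from math import dist


def knn_loo_accuracy(points, labels, k=3):
    """Leave-one-out k-NN accuracy for points in a visualization space.

    A high score suggests that points sharing a label (e.g. a sublibrary
    or an activity class) sit close together in the projection.
    """
    correct = 0
    for i, p in enumerate(points):
        # Rank every other point by Euclidean distance to point i.
        others = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: dist(p, points[j]),
        )
        # Majority vote among the k nearest neighbours.
        votes = Counter(labels[j] for j in others[:k])
        predicted = votes.most_common(1)[0][0]
        correct += predicted == labels[i]
    return correct / len(points)


# Hypothetical 2D projections: two tight, well-separated groups.
separated = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1),
             (5.0, 5.0), (5.1, 5.2), (5.2, 5.1)]
separated_labels = ["A", "A", "A", "B", "B", "B"]

# Hypothetical 2D projections: labels alternate along a line (no clustering).
interleaved = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0),
               (3.0, 0.0), (4.0, 0.0), (5.0, 0.0)]
interleaved_labels = ["A", "B", "A", "B", "A", "B"]

print(knn_loo_accuracy(separated, separated_labels, k=3))
print(knn_loo_accuracy(interleaved, interleaved_labels, k=1))
```

Comparing this score across projections of the same dataset (for example PCA versus a nonlinear map) gives one objective way to rank them. Which value of k, which labels, and which distance the paper used are not stated in the abstract.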
Authors:
Ian Nabney; J R Owen; Jose Luis Medina-Franco; Fabian Lopez-Vallejo
Publication Detail:
Type:  JOURNAL ARTICLE     Date:  2011-6-22
Journal Detail:
Title:  Journal of chemical information and modeling     Volume:  -     ISSN:  1549-960X     ISO Abbreviation:  -     Publication Date:  2011 Jun 
Date Detail:
Created Date:  2011-6-23     Completed Date:  -     Revised Date:  -    
Medline Journal Info:
Nlm Unique ID:  101230060     Medline TA:  J Chem Inf Model     Country:  -    
Other Details:
Languages:  ENG     Pagination:  -     Citation Subset:  -    

From MEDLINE®/PubMed®, a database of the U.S. National Library of Medicine

