Visualization of Molecular Fingerprints.
MedLine Citation:
PMID:  21696145     Owner:  NLM     Status:  Publisher    
A visualization plot of a molecular dataset is a useful tool for gaining insight into a set of molecules. In chemoinformatics, most visualization plots are of molecular descriptors, and the statistical model most often used to produce them is principal component analysis (PCA). This paper takes PCA, together with four other statistical models (NeuroScale, GTM, LTM and LTM-LIN), and evaluates their ability to produce clustering in visualizations of molecular fingerprints rather than molecular descriptors. Two tasks are addressed: understanding structural information (particularly in combinatorial libraries) and relating structure to activity. The quality of the visualizations is compared both subjectively (by visual inspection) and objectively (with global distance comparisons and local k-nearest-neighbor predictors). On the datasets used to evaluate clustering by structure, LTM is found to perform significantly better than the other models. In particular, the clusters in LTM visualization space are consistent with the relationships between the core scaffolds that define the combinatorial sublibraries. On the datasets used to evaluate clustering by activity, LTM again gives the best performance, but by a smaller margin. The results demonstrate the value of using both a nonlinear projection map and a Bernoulli noise model for modeling binary data.
Ian Nabney; J R Owen; Jose Luis Medina-Franco; Fabian Lopez-Vallejo
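As a rough illustration of the kind of evaluation the abstract describes, the sketch below projects binary fingerprints to two dimensions with PCA (the baseline model named above) and scores the projection with a leave-one-out k-nearest-neighbor predictor. It is not the authors' code: the toy fingerprints are synthetic, and the function names, the two hypothetical "scaffold" classes, the bit probabilities and the choice of k = 3 are all assumptions made for illustration. The other models in the paper (NeuroScale, GTM, LTM, LTM-LIN) are not implemented here.

```python
import numpy as np


def pca_project(X, n_components=2):
    """Project the rows of X onto the top principal components via SVD."""
    Xc = X - X.mean(axis=0)
    # Rows of Vt are principal directions, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T


def knn_loo_accuracy(Z, labels, k=3):
    """Leave-one-out k-nearest-neighbor accuracy in the projected space."""
    d = np.linalg.norm(Z[:, None, :] - Z[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)  # a point may not vote for itself
    nearest = np.argsort(d, axis=1)[:, :k]
    votes = labels[nearest]
    predicted = np.array([np.bincount(v).argmax() for v in votes])
    return float((predicted == labels).mean())


def make_toy_fingerprints(n_per_class=30, n_bits=64, seed=0):
    """Synthetic binary fingerprints (assumption: not the paper's data)."""
    rng = np.random.default_rng(seed)
    # Two hypothetical scaffolds, each switching on its own block of bits.
    p = np.full((2, n_bits), 0.05)
    p[0, :16] = 0.9
    p[1, 16:32] = 0.9
    labels = np.repeat([0, 1], n_per_class)
    X = (rng.random((2 * n_per_class, n_bits)) < p[labels]).astype(float)
    return X, labels


X, labels = make_toy_fingerprints()
Z = pca_project(X)                  # 2-D visualization coordinates
acc = knn_loo_accuracy(Z, labels)   # local quality of the projection
print(f"leave-one-out 3-NN accuracy in PCA space: {acc:.2f}")
```

A higher k-nearest-neighbor accuracy in the two-dimensional space suggests that points from the same class stay close together after projection, which is the local criterion the abstract mentions. Note that PCA implicitly treats the bits as continuous values, whereas the abstract argues for a Bernoulli noise model for binary data.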
Publication Detail:
Type:  JOURNAL ARTICLE     Date:  2011-6-22
Journal Detail:
Title:  Journal of chemical information and modeling     Volume:  -     ISSN:  1549-960X     ISO Abbreviation:  -     Publication Date:  2011 Jun 
Date Detail:
Created Date:  2011-6-23     Completed Date:  -     Revised Date:  -    
Medline Journal Info:
Nlm Unique ID:  101230060     Medline TA:  J Chem Inf Model     Country:  -    
Other Details:
Languages:  ENG     Pagination:  -     Citation Subset:  -    
From MEDLINE®/PubMed®, a database of the U.S. National Library of Medicine
