| Structure-based virtual screening for drug discovery: a problem-centric review. | |
| | |
MedLine Citation:
|
PMID: 22281989 Owner: NLM Status: MEDLINE |
Abstract/OtherAbstract:
|
Structure-based virtual screening (SBVS) has been widely applied in early-stage drug discovery. From a problem-centric perspective, we reviewed the recent advances and applications in SBVS with a special focus on docking-based virtual screening. We emphasized the researchers' practical efforts in real projects by understanding the ligand-target binding interactions as a premise. We also highlighted the recent progress in developing target-biased scoring functions by optimizing current generic scoring functions toward certain target classes, as well as in developing novel ones by means of machine learning techniques. |
| | |
Authors:
|
Tiejun Cheng; Qingliang Li; Zhigang Zhou; Yanli Wang; Stephen H Bryant |
Publication Detail:
|
Type: Journal Article; Research Support, N.I.H., Intramural; Review Date: 2012-01-27 |
Journal Detail:
|
Title: The AAPS journal Volume: 14 ISSN: 1550-7416 ISO Abbreviation: AAPS J Publication Date: 2012 Mar |
Date Detail:
|
Created Date: 2012-02-21 Completed Date: 2012-06-18 Revised Date: 2013-04-18 |
Medline Journal Info:
|
Nlm Unique ID: 101223209 Medline TA: AAPS J Country: United States |
Other Details:
|
Languages: eng Pagination: 133-41 Citation Subset: IM |
Affiliation:
|
National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland 20894, USA. |
| MeSH Terms | |
Descriptor/Qualifier:
|
Computer-Aided Design* Drug Design* Drug Discovery / methods* Drug Industry / methods Humans Ligands Protein Binding Structure-Activity Relationship |
| Chemical | |
Reg. No./Substance:
|
0/Ligands |
| Full Text | |
|
Journal Information Journal ID (nlm-ta): AAPS J ISSN: 1550-7416 Publisher: Springer US, Boston |
Article Information © The Author(s) 2012 Received: 2011-08-17 Accepted: 2012-01-04 Electronic publication date: 2012-01-27 pmc-release publication date: 2012-01-27 collection publication date: 2012-03 Volume: 14 Issue: 1 First Page: 133 Last Page: 141 ID: 3282008 PubMed Id: 22281989 Publisher Id: 9322 DOI: 10.1208/s12248-012-9322-0 |
| Structure-Based Virtual Screening for Drug Discovery: a Problem-Centric Review | |
| Tiejun Cheng (Aff1) | |
| Qingliang Li (Aff1) | |
| Zhigang Zhou (Aff1) | |
| Yanli Wang (Aff1) |
Address: +1-301-4357811 ywang@ncbi.nlm.nih.gov |
| Stephen H. Bryant (Aff1) |
Address: +1-301-4357792 bryant@ncbi.nlm.nih.gov |
| Aff1: National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 8600 Rockville Pike, Bethesda, Maryland 20894 USA |
|
| Communicated by footnote: Guest Editor: Xiang-Qun Xie |
|
The discovery of innovative leads that can interact with specific targets is of central importance to early-stage drug discovery. This is conventionally achieved by wet-lab high-throughput screening (HTS), an established technology adopted by the pharmaceutical industry. However, the high cost and low hit rate associated with HTS have stimulated the development of computational alternatives and the broad application of cheaper and faster in silico screening (1,2). The completion of the Human Genome Project has revealed a wealth of attractive druggable targets (3). Meanwhile, structural biology advances in X-ray crystallography and nuclear magnetic resonance spectroscopy have further opened the door to structure-based virtual screening (SBVS) by offering in-depth structural details of these targets as well as their interactions with ligands (4,5).
A growing number of success stories have been reported for SBVS (4,6). Among SBVS methods, docking-based virtual screening (DBVS) is arguably the most widely applied in practice (7). Here, we reviewed the recent advances and applications in SBVS from a problem-centric perspective with a focus on DBVS. We covered practical aspects at three stages: enriching the screening library before docking; considering target flexibility, metal ions, water molecules, and other key ligand–target interactions and environmental factors during docking; and improving pose/compound selection after docking. We emphasized the importance to a successful project of profound knowledge of the targets and/or their interactions with ligands. We also highlighted the recent progress in developing target-biased scoring functions and the trend toward applying machine learning techniques to build scoring functions. As the area of DBVS is frequently reviewed, we confined our survey to primary publications within a 5-year time frame, starting from 2007.
The basic inputs of a typical DBVS workflow are a target structure, either experimentally solved or computationally modeled, and a compound library of small molecules available via purchase or synthesis (Fig. 1). Often, both the target and the compound library require preparation, such as assigning proper tautomeric, stereoisomeric, and protonation states (8,9). Each compound in the library is virtually docked into the target binding site through a docking program, which computationally models the ligand–target interaction to achieve an optimal complementarity of steric and physicochemical properties. A mathematical algorithm (referred to as a “scoring function”) is then used to evaluate the fitness between the docked compound and the target. This is often followed by a post-processing step, in which compounds are ranked and selected on the basis of calculated binding scores and/or other criteria; usually only a small group of top-ranked compounds is chosen as candidates for later experimental assays. During the past decades, a large number of docking programs have been developed (10–18). Among the most popular ones are AutoDock, DOCK, FlexX, Glide, Gold, Surflex, ICM, LigandFit, and eHiTS, to name only a few (Table I).
Substantial progress in DBVS requires a deep knowledge of the nature of the designated target system and/or the ligand–target binding mechanism (6). It thus seems more appropriate in many applications to view DBVS from a problem-centric rather than a method-centric perspective (19). In this work, we provided a review focusing on the knowledge-based practices and efforts that researchers have adopted throughout the DBVS workflow (Fig. 1). General advances in the ligand conformational sampling algorithms of docking programs have been extensively reviewed elsewhere (7,20–24) and were thus not covered here.
It is well accepted that the content and quality of a compound library have pivotal effects on the success of a DBVS project (25). Table II gives a non-exhaustive list of public and commercial chemical databases that are commonly screened in practice. These databases often contain a vast number of small-molecule compounds, from several tens of thousands to several millions. Despite the increasing power of modern computers, blindly docking all library compounds often wastes time and computing resources. Moreover, it imposes a great burden on later compound selection. Therefore, it is generally wise to remove undesirable compounds and select only relevant ones from a library before the cost-intensive docking step. A common strategy is to apply fast physicochemical filters inspired by the rule of five (26) or a ligand-based similarity search seeded with known active ligands (27).
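A rule-of-five style prefilter can be sketched as follows. This is a minimal illustration, not a method taken from any cited study: the `Compound` record, the precomputed descriptor values, and the tolerance of one violation are all assumptions made for the example. In practice the descriptors would come from a cheminformatics toolkit.

```python
from dataclasses import dataclass


@dataclass
class Compound:
    """Hypothetical record holding precomputed descriptors for one compound."""
    name: str
    mol_weight: float   # molecular weight in Da
    logp: float         # calculated octanol-water partition coefficient
    h_donors: int       # number of hydrogen bond donors
    h_acceptors: int    # number of hydrogen bond acceptors


def ro5_violations(c: Compound) -> int:
    """Count how many of the four rule-of-five criteria a compound violates."""
    return sum([
        c.mol_weight > 500,
        c.logp > 5,
        c.h_donors > 5,
        c.h_acceptors > 10,
    ])


def prefilter(library, max_violations=1):
    """Keep compounds with at most `max_violations` violations (assumed cutoff)."""
    return [c for c in library if ro5_violations(c) <= max_violations]


# Illustrative values only; these are not real compounds.
library = [
    Compound("cpd_a", 320.4, 2.1, 2, 5),
    Compound("cpd_b", 812.9, 6.3, 7, 14),
    Compound("cpd_c", 505.0, 4.8, 1, 6),
]
kept = [c.name for c in prefilter(library)]
print(kept)
```

Because the filter only reads four numbers per compound, it can run over millions of entries long before any docking is attempted.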
A more targeted and efficient approach might be to design a focused library for specific targets. For example, Gozalbes et al. have enriched a kinase-targeted compound library using kinase-specific filters, which were derived from systematic docking and scoring of 123 diverse ligands against three kinases with known crystal structures (28). For each kinase, the filter is constructed in two steps. First, the highest score given by a certain scoring function among all docking poses of a known ligand is used as the score for this ligand. Second, the lowest score among all known ligands is selected as the threshold for the current scoring function. The thresholds from six scoring functions are combined to form the final filter. This method was validated by testing 60 compounds, which were split evenly into two groups: those that passed all the thresholds and those that did not. An overall 6.7-fold higher hit rate was obtained for the first group. Likewise, Sage et al. (29) have introduced the GA-focused descriptor active space (GAFDAS) method to design a focused chemical space for G-protein coupled receptors by selecting target-specific descriptors through a genetic algorithm. Though their method was validated in the context of ligand-based virtual screening, it could also be applied in SBVS to design an enriched library.
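The two-step filter construction described above can be sketched in a few lines. The data layout, function names, and score values are hypothetical, and the sketch assumes that a higher score means a better pose, which is what taking the highest pose score per ligand implies.

```python
def ligand_score(pose_scores):
    """Step 1: a ligand's score is the best (highest) score over its poses."""
    return max(pose_scores)


def build_filter(known_ligand_results):
    """Step 2: per scoring function, the threshold is the lowest ligand score
    among the known ligands.

    known_ligand_results: {scoring_function: {ligand: [pose scores]}}
    """
    return {
        fn: min(ligand_score(poses) for poses in ligands.values())
        for fn, ligands in known_ligand_results.items()
    }


def passes_filter(compound_results, thresholds):
    """A compound passes only if it meets the threshold of every scoring function.

    compound_results: {scoring_function: [pose scores]}
    """
    return all(
        ligand_score(compound_results[fn]) >= threshold
        for fn, threshold in thresholds.items()
    )


# Made-up scores for two hypothetical scoring functions and two known ligands.
known = {
    "fn_a": {"lig1": [5.0, 7.2], "lig2": [6.1, 6.5]},
    "fn_b": {"lig1": [40.0, 55.0], "lig2": [61.0, 48.0]},
}
thresholds = build_filter(known)
candidate_ok = {"fn_a": [6.9, 4.0], "fn_b": [57.0]}
candidate_bad = {"fn_a": [7.5], "fn_b": [50.0]}
print(thresholds, passes_filter(candidate_ok, thresholds),
      passes_filter(candidate_bad, thresholds))
```

Requiring every threshold to be met makes the filter conservative: a compound is rejected as soon as one scoring function rates it below the weakest known ligand.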
Structural details from observed ligand–target complexes are useful for deriving pharmacophoric filters, which may be used to enrich a library with compounds that satisfy specific geometric and/or physicochemical constraints. For instance, Kireev et al. (30) have applied the Discovery Studio software to construct a pharmacophore model including a hydrogen bond donor (HBD), a hydrogen bond acceptor (HBA), and an amine cation involved in an ionic bond with the Asp355 residue, all of which are observed in the crystal structure of the L3MBTL1 protein in complex with the H4K20me2 ligand. With these pharmacophoric constraints, the original 5,888,263 compounds were dramatically reduced to 20,078 compounds, which were subsequently subjected to docking analysis. Similarly, Lee et al. have constructed two pharmacophore models for vascular endothelial growth factor receptor 2 (VEGFR2) using a crystal complex structure and validated them with 15 known VEGFR2 inhibitors (31). In their study, a set of 59,600 compounds was narrowed down to 16,000 and 19,100 compounds using the two pharmacophore models as queries, respectively. In the absence of an experimental target structure, a homology model can also be informative for analyzing the key ligand–target interactions. For example, in an attempt to discover novel inhibitors of protein arginine methyltransferase 1 (PRMT1), Heinke et al. have defined a structure-based pharmacophore model based on a homology model of PRMT1 in complex with S-adenosylhomocysteine (32). The 6,232 compounds that matched the pharmacophoric features (one HBD, one HBA, and two hydrophobic/aromatic constraints) were selected from the initial 189,000 compounds for the subsequent docking study.
Molecular targets are dynamic in their physiological environment, and this flexibility is often crucial for various biological functions. The target binding pocket often adapts upon ligand binding through conformational changes ranging from a small side-chain flip to a large loop shift. Nevertheless, experimentally solved target structures or ligand–target complex structures are essentially static snapshots. Though previous works have shown that proper consideration of target flexibility can improve DBVS results (33), it still represents one of the greatest challenges for current docking programs (34) and has become an active topic in recent DBVS studies (35–39).
Ensemble docking, which takes advantage of multiple target conformers, has emerged as a partial solution to account for target flexibility in docking. The MultiCopyMD method developed by Okamoto et al. can generate a target ensemble through molecular dynamics (MD) with multiple ligands in the target binding site simultaneously (40). Applying this target ensemble in their SBVS for novel inhibitors of death-associated protein kinase (DAPK), they discovered a highly potent (IC50 = 69 nM) and selective inhibitor for DAPK1. To select appropriate target conformers, Rueda et al. have suggested a simple recipe: choose the target conformers co-crystallized with the largest ligands (41). When combined in an ensemble, these conformers gave higher selectivity and better results than randomly picked ones. Using cyclin-dependent kinase 2 (CDK2) as a test example, Sperandio et al. have demonstrated normal mode analysis as an effective tool to select relevant target conformations with diverse binding sites (42). In ensemble docking, an individual docking run is generally required for each target conformation, which is computationally inefficient. To address this issue, Bottegoni et al. have proposed a 4D docking approach that accounts for target conformational ensembles quickly and accurately in a single docking simulation (43). This is achieved by merging 3D grids from optimally superimposed multiple target conformers into a single 4D object.
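The bookkeeping behind conventional ensemble docking, with one docking run per conformer, can be sketched as below. Keeping the best score per compound across conformers is one common aggregation choice and is an assumption here; the studies cited above may combine conformers differently. The sketch also assumes energy-like scores where lower is better, and all names and values are made up.

```python
def ensemble_best(scores_by_conformer):
    """Return, per compound, the best (lowest) score and the conformer that gave it.

    scores_by_conformer: {conformer: {compound: docking score}}
    Assumes energy-like scores, so lower is better.
    """
    best = {}
    for conformer, scores in scores_by_conformer.items():
        for compound, score in scores.items():
            if compound not in best or score < best[compound][0]:
                best[compound] = (score, conformer)
    return best


# One hypothetical docking run per target conformer.
runs = {
    "conf_1": {"cpd_a": -7.1, "cpd_b": -5.0},
    "conf_2": {"cpd_a": -6.4, "cpd_b": -8.3},
}
best = ensemble_best(runs)
print(best)
```

The cost grows linearly with the number of conformers because every compound is docked once per conformer, which is the inefficiency that the 4D approach is meant to avoid.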
Some targets, such as metalloproteins, contain transition metal ions in their binding sites. The binding of ligands to these targets can be substantially distinct from that of other target types, since such metal ions often coordinate ligand polar atoms, which may help to place and orient the ligand correctly in the binding site. However, it is nontrivial to take metal ions into account accurately in current docking/scoring algorithms. Neglecting them inevitably leads to underestimation of the metal–ligand interaction or even to incorrectly docked ligands. Therefore, increasing attention is being paid to metal ions in recent DBVS studies.
Röhrig et al. have studied iron in heme proteins and demonstrated its importance for DBVS (44). Two docking runs were performed in parallel using a test set of 50 heme-containing complexes with iron–ligand contact. With standard docking in EADock, a success rate of only 28% was achieved, clearly indicating that the role of iron–ligand interactions was underestimated. They then introduced Morse-like metal binding potentials into EADock, which were fitted to reproduce density functional theory calculations. As a result, the success rate more than doubled, to 62%. To evaluate the reliability of the chosen docking protocol for screening potent cytochrome P450 aromatase inhibitors (AIs), Caporuscio et al. investigated a set of known imidazole and triazole AIs and found that the Glide docking program failed to predict a correct binding mode in all cases where the azole nitrogen coordinates the heme iron (45). This observation inspired them to set up a metal constraint in Glide, which requires that a ligand atom lie within a certain region of the binding site in order to interact with specific target functionalities. Their structure-based design efforts eventually resulted in several novel AIs with IC50 values ranging from 21.7 μM to 9.4 nM.
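To illustrate what a Morse-type term contributes, the sketch below evaluates the textbook Morse potential, shifted so that the energy is zero at infinite separation. The exact functional form and fitted parameters used in EADock are not given in this review, so both the form and the parameter values here are assumptions chosen purely for illustration.

```python
import math


def morse(r, d_e, a, r_e):
    """Textbook Morse potential, shifted so that V -> 0 as r -> infinity.

    r   : metal-ligand atom distance
    d_e : well depth (energy at the minimum is -d_e)
    a   : controls the width of the well
    r_e : equilibrium distance (position of the minimum)
    """
    return d_e * (1.0 - math.exp(-a * (r - r_e))) ** 2 - d_e


# Placeholder parameters, not fitted values from any publication.
well_depth, width, r_eq = 10.0, 1.5, 2.0
for r in (1.5, 2.0, 2.5, 4.0):
    print(f"r = {r:.1f}  V = {morse(r, well_depth, width, r_eq):8.3f}")
```

Unlike a plain Lennard-Jones or electrostatic term, the potential has a well of tunable depth and width at the coordination distance, which rewards poses that place a donor atom close to the metal.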
Missing parameters for zinc ions are another common barrier to docking many metalloenzymes, including histone deacetylases (HDACs). In search of novel HDAC inhibitors, Park et al. derived potential parameters for zinc ions following a standard procedure (46), in which geometry optimization of a simplified structural model was conducted for the active-site zinc ion cluster in complex with a hydroxamate-based inhibitor at the B3LYP/6-31G** level of theory. With these zinc parameters, they discovered six novel HDAC inhibitors with IC50 values ranging from 1 to 100 μM.
It is recognized that active-site water molecules play an important role in ligand–target binding (47). Such water molecules can contribute significantly, both enthalpically and entropically, to ligand–target binding. The best-known role of water molecules is to mediate the ligand–target interaction by forming hydrogen bonds at the interface between the ligand and the target. On the other hand, the presence (or absence) and the location of water molecules may vary largely among ligands (48). Despite their critical role, accounting for water molecules accurately in docking is a long-standing challenge. Several very recent studies directly targeted this issue.
Abel and coworkers have developed an approach called WaterMap (49) to account for the contribution to the binding free energy of water molecules displaced by the ligand. It first identifies “hydration sites” in the active site by clustering the trajectories from an MD simulation of a solvated target with explicit water molecules. Inhomogeneous solvation theory is then applied to compute the thermodynamic properties of these active-site solvent molecules, including enthalpic and entropic changes. A displaced-solvent functional is then derived to estimate the relative binding free energies of a series of congeneric ligands from the computed thermodynamic properties of the active-site water molecules that each ligand displaces. This feature has made WaterMap particularly suitable for (and thus also limited to) lead optimization, where it provides insightful guidance to medicinal chemistry. More recently, WaterMap has been augmented by an additional term attributable to the occupation of dry regions of the target active site by ligand atoms (50).
Lie et al. have proposed an approach that attaches water molecules to the ligand during docking (51). In their method, ligand polar atoms are solvated with the maximum number of water molecules, which are then retained or displaced depending on their energy contributions during the docking simulation. The novelty of their method is that each water molecule is treated as a flexible on/off part of the ligand, instead of being a static part of the target. In this manner, water molecules are sampled with the same flexibility as the ligand itself. Their method showed considerable improvement when evaluated on 12 structurally diverse complexes in which several water molecules bridge the ligand and the target.
Rossato et al. have introduced a directional approach, AcquaAlta, to consider the solvation of ligand–target complexes (52). Through an extensive analysis of the Cambridge Structural Database, they derived geometric criteria defining the interactions of water molecules with ligand and target. They also evaluated the propensity of ligand hydration through ab initio calculations. AcquaAlta has been validated with 20 crystal structures and reproduced 76% of the experimentally observed positions of water molecules.
Understanding the interactions essential for ligand–target binding is critical to the success of lead discovery and optimization. For example, in a recent attempt to identify novel inhibitors of trihydroxynaphthalene reductase (3HNR) (53), the authors first overlaid the known 3HNR inhibitors and then constructed a pharmacophore model that consists of several key interaction points within the active site: H-bonds with Ser149, Tyr163, Met200, and Tyr201 and π-stacking with Tyr208. In accordance with these interactions, the docking experiment was set up to consider only docking solutions that predicted π-stacking with Tyr208 and an optional H-bond with Ser149. The most potent hit compound they found exhibited a Ki of 5.3 ± 0.3 μM against 3HNR.
As revealed by the crystal structures of kinases in complex with ATP-competing inhibitors, such inhibitors typically form at least one hydrogen bond with backbone amide or carbonyl groups in the hinge region. Therefore, introducing relevant hinge-region constraints for the molecules docked into the ATP sites of kinases should improve the chance of finding active compounds. This has been practiced by Ravindranathan et al. in hit discovery for fibroblast growth factor receptor 1 (FGFR1) (54). Among the 23 purchasable compounds suggested by a virtual screening experiment against 2.2 million compounds, two were identified to inhibit FGFR1 kinase with medium potency (IC50 = 23 and 50 μM, respectively).
For certain target or ligand systems, specifically designed methods may be more efficient. For example, Lang et al. have recently optimized DOCK 6 for docking small molecules to RNA targets (55) and obtained a success rate of 70% for ligands with fewer than seven rotatable bonds at the 2-Å heavy-atom root-mean-squared deviation threshold. BALLDock/SLICK, developed by Kerzmann, is a ligand-specific docking approach for carbohydrate or carbohydrate-like compounds, which are often problematic for standard docking programs (56).
Because current scoring functions perform poorly in estimating binding affinity and hence in ranking docked ligands, it is recognized that compound selection based on calculated scores alone is not sufficient and that visual inspection is often necessary. However, manually inspecting thousands of docking poses is impractical. Therefore, considerable effort has been devoted to automating this procedure based on the indications gained from ligand–target interactions (57).
Molecular interaction fingerprints (IFPs), which are simple bit strings that encode 3D information about ligand–target interactions into a 1D binary vector, have been extended by Marcou and Rognan as a post-docking filter to prioritize the most relevant poses of low molecular weight fragments (58). In their study, IFPs were evaluated with four popular docking tools (FlexX, Glide, Gold, and Surflex) for extracting the scaffolds of true CDK2 inhibitors. They observed that scoring by the Tanimoto similarity of IFPs to a given reference was statistically superior to conventional scoring functions in placing the low molecular weight fragment in the CDK2 binding site.
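Ranking poses by the Tanimoto similarity of their fingerprints to a reference can be sketched as follows. The 8-bit strings are toy data and the meaning assigned to each bit is left unspecified, so everything below is illustrative rather than the encoding used in the cited study. Real IFPs are much longer, with bits tied to specific residues and interaction types.

```python
def tanimoto(fp_a: str, fp_b: str) -> float:
    """Tanimoto similarity of two equal-length bit strings ('0'/'1' characters)."""
    if len(fp_a) != len(fp_b):
        raise ValueError("fingerprints must have the same length")
    both = sum(1 for a, b in zip(fp_a, fp_b) if a == "1" and b == "1")
    either = sum(1 for a, b in zip(fp_a, fp_b) if a == "1" or b == "1")
    return both / either if either else 0.0


def rank_poses(reference: str, poses: dict) -> list:
    """Order pose names by decreasing similarity to the reference fingerprint."""
    return sorted(poses, key=lambda name: tanimoto(reference, poses[name]),
                  reverse=True)


# Toy fingerprints: one bit per hypothetical residue/interaction-type pair.
reference = "11010010"
poses = {
    "pose_1": "11010000",
    "pose_2": "00101101",
    "pose_3": "11010010",
}
ranking = rank_poses(reference, poses)
print(ranking)
```

The ranking depends only on which interactions a pose reproduces, not on its docking score, which is why it can serve as an independent post-docking filter.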
Bouvier et al. have proposed the Automatic analysis of Poses using Self-Organizing Map (AuPosSOM) method for pose ranking, which carefully analyzes the interatomic contacts between the docked ligand and the target (59). The method rests on the assumption that active compounds must make specific contacts with their target to display activity, and it is also intended to overcome the inefficiency of traditional clustering of docking poses. They have demonstrated that it is possible to differentiate active compounds from inactive ones using only the mean protein contact footprints calculated from the multiple conformations given by docking software.
Protein-specific structural filtration has been introduced by Novikov et al. to improve the performance of DBVS (60). The filter was defined by a set of crucial ligand–target interactions that are structurally conserved in the available ligand-bound target structures. Applying this method to a set of ten diverse protein targets improved the enrichment factor substantially, by several-fold to several hundred-fold. The authors demonstrated that the structural filtration had effectively repaired the deficiencies of scoring functions, resulting in a considerably lower false positive rate.
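For readers unfamiliar with the metric, the sketch below computes an enrichment factor using its usual definition: the hit rate among the selected top-ranked compounds divided by the hit rate in the whole library. The ranked list is made up, and the cited study may have used a different selection cutoff.

```python
def enrichment_factor(ranked_is_active, n_top):
    """Enrichment factor for the first `n_top` entries of a ranked list.

    ranked_is_active: list of booleans ordered from best- to worst-ranked
                      compound; True marks a known active.
    """
    n_total = len(ranked_is_active)
    total_actives = sum(ranked_is_active)
    if n_top <= 0 or n_top > n_total or total_actives == 0:
        raise ValueError("need 0 < n_top <= library size and at least one active")
    top_hit_rate = sum(ranked_is_active[:n_top]) / n_top
    overall_hit_rate = total_actives / n_total
    return top_hit_rate / overall_hit_rate


# Toy library of 20 compounds with 4 actives; the top 2 ranked are both active.
ranked = [True, True, False, False, True] + [False] * 14 + [True]
ef_top2 = enrichment_factor(ranked, n_top=2)
print(ef_top2)
```

A value of 1 means the ranking does no better than random selection, so a filter that removes false positives from the top of the list raises the enrichment factor directly.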
Wei et al. have demonstrated that binding energy landscape analysis could help to discriminate true hits from high-scoring decoys in virtual screening (61). In their work, two parameters (i.e., the energy gap and the number of local binding wells in the landscape) were used to account for kinetic accessibility. With a linear combination of the two parameters, they obtained, in a five-fold cross-validation, areas under the receiver operating characteristic curve (AUC) of 0.878 for neuraminidase and 0.776 for cyclooxygenase 2 (COX2). In an independent test using the directory of useful decoys (DUD) set, combining these two parameters with docking scores raised the enrichment ratio to 200–300% of that obtained with the scoring function alone.
The scoring function is at the heart of molecular docking: it helps a docking program to explore the binding space of a ligand efficiently. It is also responsible for evaluating the binding affinity once the correct binding pose is identified. Therefore, the predictive power of scoring functions has a significant impact on the productivity of DBVS.
A multitude of scoring functions have been reported in the past decades (10–15,62–71) (Table III), and new ones are still emerging. Current scoring functions, as reviewed in other works (23,72), can be roughly classified into three types: (a) Force field-based scoring functions employ a classical force field to compute the noncovalent ligand–target interactions, such as van der Waals and electrostatic energies. They are often augmented by a GB/SA or PB/SA term in order to account for solvation effects. (b) Empirical scoring functions calculate the overall binding free energy from several energetic terms, including hydrogen bond and hydrophobic interactions. The weighting factors of all terms are calibrated on a set of known complexes with experimentally determined structures and binding affinities. (c) Knowledge-based scoring functions compute the ligand–target interactions as a sum of distance-dependent statistical potentials between the ligand and the target. Notably, deriving such potentials needs only the structural information of ligand–target complexes, which is accumulating rapidly due to structural biology advances.
The performance of various scoring functions has been investigated in several comparative studies (73–77) with respect to their ability to reproduce known binding poses, predict binding affinity, and rank-order a compound library. State-of-the-art scoring functions are at different levels of accuracy, and it is clear that no single scoring function consistently outperforms the others in all cases. Previous comparative studies conclude that today’s scoring functions are often capable of identifying the correct binding pose of a ligand, while binding affinity prediction with high accuracy is still out of reach (73). Therefore, considerable efforts have been made to improve the performance of current scoring functions. Common strategies include adding factors to account for solvation and entropic effects (71), deriving more accurate energy terms by high-level quantum calculations (78), and consensus scoring by combining multiple scoring functions (79,80). In this review, we highlighted the recent progress in developing target-biased scoring functions as well as those that employ machine learning techniques.
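Consensus scoring can take several forms; the sketch below shows rank averaging, one common variant. It is an assumption that this resembles the schemes in references 79 and 80, which the review does not describe in detail. The sketch also assumes that higher scores are better for every scoring function, and all names and values are invented.

```python
def ranks(scores):
    """Map each compound to its rank (1 = best) under one scoring function.

    scores: {compound: score}; higher is assumed to be better.
    """
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {compound: position + 1 for position, compound in enumerate(ordered)}


def consensus_order(score_tables):
    """Order compounds by their mean rank across all scoring functions.

    score_tables: {scoring_function: {compound: score}}
    """
    per_function = [ranks(table) for table in score_tables.values()]
    compounds = per_function[0].keys()
    mean_rank = {
        c: sum(r[c] for r in per_function) / len(per_function) for c in compounds
    }
    return sorted(mean_rank, key=mean_rank.get)


# Three hypothetical scoring functions on different scales.
score_tables = {
    "fn_a": {"cpd_a": 9.1, "cpd_b": 7.0, "cpd_c": 8.2},
    "fn_b": {"cpd_a": 50.0, "cpd_b": 42.0, "cpd_c": 61.0},
    "fn_c": {"cpd_a": 3.3, "cpd_b": 3.9, "cpd_c": 3.5},
}
order = consensus_order(score_tables)
print(order)
```

Working with ranks rather than raw scores sidesteps the fact that different scoring functions report values on incompatible scales.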
Most of today’s scoring functions are generic models derived from large-scale experimental data on ligand–target complexes and are presumed to be applicable to all sorts of target classes. However, previous comparative studies have revealed that a universally accurate scoring function is still out of reach. A practical remedy might be to develop target-biased alternatives for specific targets or tasks (81).
Probably the most straightforward way to obtain a target-biased scoring function is to re-calibrate an existing all-purpose scoring function directly on certain target classes. For example, DrugScore-RNA (82) adopts the same framework as DrugScore (69) but is derived from 670 crystal structures of nucleic acid–ligand and nucleic acid–protein complexes. A similar idea has been implemented in the kinase family-specific potential of mean force (kinase-PMF) (68), a kinase-targeted scoring function adjusted from the original PMF04 (67).
Tweaking the parameters of original scoring functions toward specific targets is also a prevalent strategy for deriving target-biased scoring functions. For example, Teramoto and Fukunishi have applied a supervised scoring model to tailor the FlexX scoring function (F-score), which outperformed its former version on three of the five tested targets (83). The TOP approach suggested by Seifert (84) has employed an iterative tabu search to optimize the scoring function in ProPose and the original Böhm scoring function against three targets: CDK2, estrogen receptor, and COX2. By adding negative data on ligands that are known not to bind a particular target, Pham and Jain have tuned the scoring function in Surflex-Dock and observed substantially enhanced screening enrichment for HIV protease and poly(ADP-ribose) polymerase (85). An augmented Flo+ scoring function has been developed by Catana and Stouten using N-way partial least squares (PLS) (86), which significantly improved the correlation between observed and calculated pKi values from R² = 0.5 to 0.8 on a relatively diverse set of ligand–target complexes spanning seven protein families. It would therefore be attractive if scoring functions offered extendable or customizable features.
The above-mentioned target-biased scoring functions typically require re-parameterization or special treatment of established scoring functions. Too often, however, existing scoring functions are available to end-users only as black boxes, so their parameters cannot readily be adjusted by an optimization algorithm. Several approaches have been proposed to address this issue. One of the earliest examples is MultiScore, which employs the raw scores from eight scoring functions to characterize the observed pKi (87) and has been found to work better for matrix metalloproteinases. The underlying idea differs slightly from that of consensus scoring (79,80) in that it assumes uneven contributions from individual scoring functions. In a similar way, the AutoShim method has incorporated the original Flo+ score as well as additional target-specific pharmacophore points (shims) as descriptors in PLS analysis (88). More recently, Cheng et al. have proposed a knowledge-guided strategy (KGS) based on the similarity principle, aiming to improve the accuracy of binding affinity prediction by current scoring functions (89). The KGS strategy computes the binding affinity of a query ligand–target complex from the known binding affinity of an appropriate reference complex, which is required to share a pattern of key ligand–target interactions similar to that of the query complex. The KGS strategy has been validated with both observed and docked ligand–target complex structures. Moreover, it can in principle work in concert with any scoring method, and its application is not limited to specific classes of ligand–target complexes.
Machine learning techniques are powerful tools for constructing and optimizing predictive models. In recent years, there has been increasing interest in developing novel scoring functions by means of machine learning (90). A notable feature is that they take the commonly observed ligand–target binding interactions into account implicitly, which obviates the need to model error-prone contributions, including solvation and entropic effects, explicitly. Moreover, machine learning techniques such as neural networks (NN), support vector machines (SVM), and random forest (RF) are able to account for the nonlinear dependence among the various interactions involved in ligand–target binding. As a result, despite having a less concrete physicochemical basis, they have often demonstrated superior or at least comparable performance to classic scoring functions in binding affinity estimation.
The NNScore scoring function developed by Durrant and McCammon is based on NN (91), a technique that attempts to computationally simulate the microscopic organization of the human brain. The input layer consists of 194 neurodes that are related to ligand–target interactions. Kinnings et al. (92) have applied SVM to train a new scoring function for identifying inhibitors of Mycobacterium tuberculosis InhA, using as descriptors the individual energy terms obtained directly from the built-in scoring function of eHiTS. Amini et al. have introduced support vector inductive logic programming as a general approach to develop system-specific scoring functions (93). The descriptors they used are the distances from each fragment’s central ligand atom to target atoms. In the development of the PHOENIX scoring function, Tang et al. have adopted an indirect approach (94). They first independently modeled the enthalpy change (ΔH) and the entropy term (TΔS) by fitting relevant descriptors to experimentally measured calorimetric data through PLS and then calculated the binding free energy (ΔG) according to the thermodynamic cycle.
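The review does not spell out how the two fitted quantities are combined. Assuming the standard Gibbs relation at constant temperature T, the final step of such an indirect scheme would amount to:

```latex
\Delta G = \Delta H - T\Delta S
```

Here ΔH and TΔS are the two separately modeled terms, so errors in either one propagate directly into the predicted ΔG.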
Similar to the idea of using the occurrence counts of ligand–target atom pairs as geometric descriptors to generate a scoring function (95), Li et al. (96) developed a target-specific scoring method, SVM-SP, by using SVM. SVM-SP employs 135 atom pair potentials as descriptors, which are derived in the same way as in traditional knowledge-based scoring functions. The effectiveness of SVM-SP is supported by the discovery of three novel micromolar hits against epidermal growth factor receptor. The recently released RF-score by Ballester and Mitchell (97) was built with RF, with each descriptor defined as the count of a particular ligand–target atom pair within a certain distance range. Despite this relatively coarse definition of ligand–target atom pairs, which considers only atomic number and disregards distance dependence, RF-score strikingly outperformed all 16 state-of-the-art scoring functions in a recent benchmark (73).
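The occurrence-count descriptors mentioned above can be sketched in a few lines. The example below is a minimal illustration, not the RF-score implementation: the `Atom` type, the function name, the toy coordinates, and the 5 Å cutoff are all assumptions made for the example. It counts, for each (ligand element, target element) pair, how many atom pairs lie within the cutoff.

```python
from collections import Counter
from math import dist
from typing import Dict, List, Tuple

# Hypothetical atom representation: (element symbol, xyz coordinates in angstroms).
Atom = Tuple[str, Tuple[float, float, float]]


def atom_pair_counts(ligand: List[Atom], target: List[Atom],
                     cutoff: float) -> Dict[Tuple[str, str], int]:
    """Count ligand-target atom pairs closer than `cutoff`, keyed by element pair."""
    counts: Counter = Counter()
    for lig_element, lig_xyz in ligand:
        for tgt_element, tgt_xyz in target:
            # Only the elements and a yes/no distance test are used;
            # the actual distance within the cutoff is ignored.
            if dist(lig_xyz, tgt_xyz) < cutoff:
                counts[(lig_element, tgt_element)] += 1
    return dict(counts)


# Toy coordinates, made up for illustration.
ligand = [("C", (0.0, 0.0, 0.0)), ("O", (1.5, 0.0, 0.0))]
target = [("N", (3.0, 0.0, 0.0)), ("C", (0.0, 4.0, 0.0)), ("O", (20.0, 0.0, 0.0))]

features = atom_pair_counts(ligand, target, cutoff=5.0)
print(features)
```

A vector of such counts, laid out in a fixed element-pair order, is the kind of input a random forest or SVM regressor could then be trained on against measured binding affinities.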
SBVS has become routine in both pharmaceutical companies and academic groups for early-stage drug discovery. In this work, we reviewed recent advances and applications in DBVS from a problem-centric perspective, with an emphasis on how researchers integrate available knowledge in real practice. Enriching a screening library for a specific target before docking can improve both computational efficiency and hit rate. Also, effective consideration of key ligand–target interactions and other environmental factors during docking, such as target flexibility, metal ions, and water molecules, can enhance DBVS performance. In addition, post-docking processing techniques that automate the selection of appropriate poses/compounds not only greatly reduce the human intervention needed to inspect docking outputs but also improve the final outcome. Developing target-biased scoring functions represents a trend toward tuning current all-purpose functions for specific target classes. Recent scoring function development has also seen increasing use of machine learning techniques, which are intrinsically nonlinear and can implicitly account for some particularly challenging contributions to ligand–target binding such as solvation and entropic effects.
Despite the advances listed here, current improvements in DBVS largely serve only as patches or temporary remedies to existing methods; they often rely on expert knowledge and thus may have limited applicability in real practice. A universally accurate and reliable solution remains out of reach for the near future. Innovations that address fundamental challenges such as target flexibility and water molecules are urgently needed and highly encouraged.
Notes
Tiejun Cheng, Qingliang Li and Zhigang Zhou contributed equally to this work.
We thank the Intramural Research Program of the National Institutes of Health (NIH), National Library of Medicine (NLM) for funding support.
Open Access This article is distributed under the terms of the Creative Commons Attribution License which permits any use, distribution and reproduction in any medium, provided the original author(s) and source are credited.
REFERENCES
| 1. | Ripphausen P, Nisius B, Peltason L, Bajorath J. Quo vadis, virtual screening? A comprehensive survey of prospective applications. J Med Chem. 2010;53(24):8461–8467. doi:10.1021/jm101020z. PMID: 20929257. |
| 2. | Clark DE. What has virtual screening ever done for drug discovery? Expert Opin Drug Discov. 2008;3:841–851. doi:10.1517/17460441.3.8.841. |
| 3. | Hopkins AL, Groom CR. The druggable genome. Nat Rev Drug Discov. 2002;1(9):727–730. doi:10.1038/nrd892. PMID: 12209152. |
| 4. | Villoutreix BO, Eudes R, Miteva MA. Structure-based virtual ligand screening: recent success stories. Comb Chem High Throughput Screen. 2009;12(10):1000–1016. doi:10.2174/138620709789824682. PMID: 20025565. |
| 5. | Ghosh S, Nie A, An J, Huang Z. Structure-based virtual screening of chemical libraries for drug discovery. Curr Opin Chem Biol. 2006;10(3):194–202. doi:10.1016/j.cbpa.2006.04.002. PMID: 16675286. |
| 6. | Seifert MHJ, Lang M. Essential factors for successful virtual screening. Mini Rev Med Chem. 2007;8:63–72. doi:10.2174/138955708783331540. PMID: 18220986. |
| 7. | Tuccinardi T. Docking-based virtual screening: recent developments. Comb Chem High Throughput Screen. 2009;12(3):303–314. doi:10.2174/138620709787581666. PMID: 19275536. |
| 8. | Rapp CS, Schonbrun C, Jacobson MP, Kalyanaraman C, Huang N. Automated site preparation in physics-based rescoring of receptor ligand complexes. Proteins: Struct, Funct, Bioinf. 2009;77(1):52–61. doi:10.1002/prot.22415. |
| 9. | Brink T, Exner T. pKa based protonation states and microspecies for protein–ligand docking. J Comput Aided Mol Des. 2010;24(11):935–942. doi:10.1007/s10822-010-9385-x. PMID: 20882397. |
| 10. | Morris GM, Goodsell DS, Halliday RS, Huey R, Hart WE, Belew RK, et al. Automated docking using a Lamarckian genetic algorithm and an empirical binding free energy function. J Comput Chem. 1998;19(14):1639–1662. doi:10.1002/(SICI)1096-987X(19981115)19:14<1639::AID-JCC10>3.0.CO;2-B. |
| 11. | Ewing TJ, Makino S, Skillman AG, Kuntz ID. DOCK 4.0: search strategies for automated molecular docking of flexible molecule databases. J Comput Aided Mol Des. 2001;15:411–428. doi:10.1023/A:1011115820450. PMID: 11394736. |
| 12. | Rarey M, Kramer B, Lengauer T, Klebe G. A fast flexible docking method using an incremental construction algorithm. J Mol Biol. 1996;261(3):470–489. doi:10.1006/jmbi.1996.0477. PMID: 8780787. |
| 13. | Friesner RA, Banks JL, Murphy RB, Halgren TA, Klicic JJ, Mainz DT, et al. Glide: a new approach for rapid, accurate docking and scoring. 1. Method and assessment of docking accuracy. J Med Chem. 2004;47:1739–1749. doi:10.1021/jm0306430. PMID: 15027865. |
| 14. | Jones G, Willett P, Glen RC, Leach AR, Taylor R. Development and validation of a genetic algorithm for flexible docking. J Mol Biol. 1997;267(3):727–748. doi:10.1006/jmbi.1996.0897. PMID: 9126849. |
| 15. | Jain AN. Surflex: fully automatic flexible molecular docking using a molecular similarity-based search engine. J Med Chem. 2003;46(4):499–511. doi:10.1021/jm020406h. PMID: 12570372. |
| 16. | Abagyan R, Totrov M, Kuznetsov D. ICM: a new method for protein modeling and design: applications to docking and structure prediction from the distorted native conformation. J Comput Chem. 1994;15:488–506. doi:10.1002/jcc.540150503. |
| 17. | Venkatachalam CM, Jiang X, Oldfield T, Waldman M. LigandFit: a novel method for the shape-directed rapid docking of ligands to protein active sites. J Mol Graph Model. 2003;21:289–307. doi:10.1016/S1093-3263(02)00164-X. PMID: 12479928. |
| 18. | Zsoldos Z, Szabo I, Szabo Z, Peter Johnson A. Software tools for structure based rational drug design. J Mol Struct (THEOCHEM). 2003;666–667:659–665. doi:10.1016/j.theochem.2003.08.105. |
| 19. | Schneider G. Virtual screening: an endless staircase? Nat Rev Drug Discov. 2010;9:273–276. doi:10.1038/nrd3139. PMID: 20357802. |
| 20. | Pujadas G, Vaque M, Ardevol A, Blade C, Salvado MJ, Blay M, et al. Protein–ligand docking: a review of recent advances and future perspectives. Curr Pharmaceut Anal. 2008;4:1–19. doi:10.2174/157341208783497597. |
| 21. | Meng X-Y, Zhang H-X, Mezei M, Cui M. Molecular docking: a powerful approach for structure-based drug discovery. Curr Comput-Aided Drug Des. 2011;7:146–157. PMID: 21534921. |
| 22. | Yuriev E, Agostino M, Ramsland PA. Challenges and advances in computational docking: 2009 in review. J Mol Recognit. 2010;24(2):149–164. doi:10.1002/jmr.1077. PMID: 21360606. |
| 23. | Moitessier N, Englebienne P, Lee D, Lawandi J, Corbeil CR. Towards the development of universal, fast and highly accurate docking/scoring methods: a long way to go. Br J Pharmacol. 2008;153(S1):S7–S26. doi:10.1038/sj.bjp.0707515. PMID: 18037925. |
| 24. | Dias R, Azevedo Jr WF. Molecular docking algorithms. Curr Drug Targets. 2008;9:1040–1047. doi:10.2174/138945008786949432. PMID: 19128213. |
| 25. | Cummings MD, Maxwell AC, DesJarlais RL. Processing of small molecule databases for automated docking. Med Chem. 2007;3:107–113. doi:10.2174/157340607779317481. PMID: 17266630. |
| 26. | Lipinski CA, Lombardo F, Dominy BW, Feeney PJ. Experimental and computational approaches to estimate solubility and permeability in drug discovery and development settings. Adv Drug Deliv Rev. 1997;23(1–3):3–25. doi:10.1016/S0169-409X(96)00423-1. |
| 27. | Perez-Pineiro R, Burgos A, Jones DC, Andrew LC, Rodriguez H, Suarez M, et al. Development of a novel virtual screening cascade protocol to identify potential trypanothione reductase inhibitors. J Med Chem. 2009;52(6):1670–1680. doi:10.1021/jm801306g. PMID: 19296695. |
| 28. | Gozalbes R, Simon L, Froloff N, Sartori E, Monteils C, Baudelle R. Development and experimental validation of a docking strategy for the generation of kinase-targeted libraries. J Med Chem. 2008;51(11):3124–3132. doi:10.1021/jm701367r. PMID: 18479119. |
| 29. | Sage C, Wang R, Jones G. G-protein coupled receptors virtual screening using genetic algorithm focused chemical space. J Chem Inf Model. 2011. doi:10.1021/ci200043z. |
| 30. | Kireev D, Wigle TJ, Norris-Drouin J, Herold JM, Janzen WP, Frye SV. Identification of non-peptide malignant brain tumor (MBT) repeat antagonists by virtual screening of commercially available compounds. J Med Chem. 2010;53(21):7625–7631. doi:10.1021/jm1007374. PMID: 20931980. |
| 31. | Lee K, Jeong K-W, Lee Y, Song JY, Kim MS, Lee GS, et al. Pharmacophore modeling and virtual screening studies for new VEGFR-2 kinase inhibitors. Eur J Med Chem. 2010;45(11):5420–5427. doi:10.1016/j.ejmech.2010.09.002. PMID: 20869793. |
| 32. | Heinke R, Spannhoff A, Meier R, Trojer P, Bauer I, Jung M, et al. Virtual screening and biological characterization of novel histone arginine methyltransferase PRMT1 inhibitors. Chem Med Chem. 2009;4(1):69–77. PMID: 19085993. |
| 33. | Rueda M, Bottegoni G, Abagyan R. Consistent improvement of cross-docking results using binding site ensembles generated with elastic network normal modes. J Chem Inf Model. 2009;49(3):716–725. doi:10.1021/ci8003732. PMID: 19434904. |
| 34. | Cavasotto CN, Singh N. Docking and high throughput docking: successes and the challenge of protein flexibility. Curr Comput-Aided Drug Des. 2008;4:221–234. doi:10.2174/157340908785747474. |
| 35. | Cozzini P, Kellogg GE, Spyrakis F, Abraham DJ, Costantino G, Emerson A, et al. Target flexibility: an emerging consideration in drug discovery and design. J Med Chem. 2008;51(20):6237–6255. doi:10.1021/jm800562d. PMID: 18785728. |
| 36. | Durrant JD, McCammon JA. Computer-aided drug-discovery techniques that account for receptor flexibility. Curr Opin Pharmacol. 2010;10(6):770–774. doi:10.1016/j.coph.2010.09.001. PMID: 20888294. |
| 37. | Sotriffer CA. Accounting for induced-fit effects in docking: what is possible and what is not? Curr Top Med Chem. 2011;11:179–191. PMID: 20939789. |
| 38. | Lill MA. Efficient incorporation of protein flexibility and dynamics into molecular docking simulations. Biochemistry. 2011;50(28):6157–6169. doi:10.1021/bi2004558. PMID: 21678954. |
| 39. | Lin J-H. Accommodating protein flexibility for structure-based drug design. Curr Top Med Chem. 2011;11:171–178. PMID: 20939792. |
| 40. | Okamoto M, Takayama K, Shimizu T, Ishida K, Takahashi O, Furuya T. Identification of death-associated protein kinases inhibitors using structure-based virtual screening. J Med Chem. 2009;52(22):7323–7327. doi:10.1021/jm901191q. PMID: 19877644. |
| 41. | Rueda M, Bottegoni G, Abagyan R. Recipes for the selection of experimental protein conformations for virtual screening. J Chem Inf Model. 2009;50(1):186–193. doi:10.1021/ci9003943. PMID: 20000587. |
| 42. | Sperandio O, Mouawad L, Pinto E, Villoutreix B, Perahia D, Miteva M. How to choose relevant multiple receptor conformations for virtual screening: a test case of Cdk2 and normal mode analysis. Eur Biophys J. 2010;39(9):1365–1372. doi:10.1007/s00249-010-0592-0. PMID: 20237920. |
| 43. | Bottegoni G, Kufareva I, Totrov M, Abagyan R. Four-dimensional docking: a fast and accurate account of discrete receptor flexibility in ligand docking. J Med Chem. 2008;52(2):397–406. doi:10.1021/jm8009958. PMID: 19090659. |
| 44. | Röhrig UF, Grosdidier A, Zoete V, Michielin O. Docking to heme proteins. J Comput Chem. 2009;30(14):2305–2315. PMID: 19288474. |
| 45. | Caporuscio F, Rastelli G, Imbriano C, Rio A. Structure-based design of potent aromatase inhibitors by high-throughput docking. J Med Chem. 2011;54:4006–4017. doi:10.1021/jm2000689. PMID: 21604760. |
| 46. | Park H, Kim S, Kim YE, Lim S-J. A structure-based virtual screening approach toward the discovery of histone deacetylase inhibitors: identification of promising zinc-chelating groups. Chem Med Chem. 2010;5(4):591–597. PMID: 20157916. |
| 47. | Thilagavathi R, Mancera RL. Ligand–protein cross-docking with water molecules. J Chem Inf Model. 2010;50(3):415–421. doi:10.1021/ci900345h. PMID: 20158272. |
| 48. | Santos R, Hritz J, Oostenbrink C. Role of water in molecular docking simulations of cytochrome P450 2D6. J Chem Inf Model. 2009;50(1):146–154. doi:10.1021/ci900293e. PMID: 19899781. |
| 49. | Abel R, Young T, Farid R, Berne BJ, Friesner RA. Role of the active-site solvent in the thermodynamics of factor Xa ligand binding. J Am Chem Soc. 2008;130(9):2817–2831. doi:10.1021/ja0771033. PMID: 18266362. |
| 50. | Wang L, Berne BJ, Friesner RA. Ligand binding to protein-binding pockets with wet and dry regions. Proc Natl Acad Sci. 2011;108(4):1326–1330. doi:10.1073/pnas.1016793108. PMID: 21205906. |
| 51. | Lie MA, Thomsen R, Pedersen CNS, Schiøtt B, Christensen MH. Molecular docking with ligand attached water molecules. J Chem Inf Model. 2011;51(4):909–917. doi:10.1021/ci100510m. PMID: 21452852. |
| 52. | Rossato G, Ernst B, Vedani A, Smieško M. AcquaAlta: a directional approach to the solvation of ligand–protein complexes. J Chem Inf Model. 2011. doi:10.1021/ci200150p. |
| 53. | Brunskole Švegelj M, Turk S, Brus B, Lanišnik Rižner T, Stojan J, Gobec S. Novel inhibitors of trihydroxynaphthalene reductase with antifungal activity identified by ligand-based and structure-based virtual screening. J Chem Inf Model. 2011;51(7):1716–1724. doi:10.1021/ci2001499. PMID: 21667970. |
| 54. | Ravindranathan KP, Mandiyan V, Ekkati AR, Bae JH, Schlessinger J, Jorgensen WL. Discovery of novel fibroblast growth factor receptor 1 kinase inhibitors by structure-based virtual screening. J Med Chem. 2010;53(4):1662–1672. doi:10.1021/jm901386e. PMID: 20121196. |
| 55. | Lang PT, Brozell SR, Mukherjee S, Pettersen EF, Meng EC, Thomas V, et al. DOCK 6: combining techniques to model RNA–small molecule complexes. RNA. 2009;15(6):1–12. doi:10.1261/rna.1563609. PMID: 19029306. |
| 56. | Kerzmann A, Fuhrmann J, Kohlbacher O, Neumann D. BALLDock/SLICK: a new method for protein–carbohydrate docking. J Chem Inf Model. 2008;48(8):1616–1625. doi:10.1021/ci800103u. PMID: 18646839. |
| 57. | Waszkowycz B. Towards improving compound selection in structure-based virtual screening. Drug Discov Today. 2008;13(5–6):219–226. doi:10.1016/j.drudis.2007.12.002. PMID: 18342797. |
| 58. | Marcou G, Rognan D. Optimizing fragment and scaffold docking by use of molecular interaction fingerprints. J Chem Inf Model. 2007;47(1):195–207. doi:10.1021/ci600342e. PMID: 17238265. |
| 59. | Bouvier G, Evrard-Todeschi N, Girault J-P, Bertho G. Automatic clustering of docking poses in virtual screening process using self-organizing map. Bioinformatics. 2010;26(1):53–60. doi:10.1093/bioinformatics/btp623. PMID: 19910307. |
| 60. | Novikov F, Stroylov V, Stroganov O, Chilov G. Improving performance of docking-based virtual screening by structural filtration. J Mol Model. 2010;16(7):1223–1230. doi:10.1007/s00894-009-0633-8. PMID: 20041273. |
| 61. | Wei D, Zheng H, Su N, Deng M, Lai L. Binding energy landscape analysis helps to discriminate true hits from high-scoring decoys in virtual screening. J Chem Inf Model. 2010;50(10):1855–1864. doi:10.1021/ci900463u. PMID: 20968314. |
| 62. | Böhm H-J. Prediction of binding constants of protein ligands: a fast method for the prioritization of hits obtained from de novo design or 3D database search programs. J Comput Aided Mol Des. 1998;12(4):309. doi:10.1023/A:1007999920146. PMID: 9777490. |
| 63. | Wang R, Lai L, Wang S. Further development and validation of empirical scoring functions for structure-based binding affinity prediction. J Comput Aided Mol Des. 2002;16(1):11–26. doi:10.1023/A:1016357811882. PMID: 12197663. |
| 64. | Gehlhaar DK, Verkhivker GM, Rejto PA, Sherman CJ, Fogel DR, Fogel LJ, et al. Molecular recognition of the inhibitor AG-1343 by HIV-1 protease: conformationally flexible docking by evolutionary programming. Chem Biol. 1995;2(5):317–324. doi:10.1016/1074-5521(95)90050-0. PMID: 9383433. |
| 65. | Eldridge MD, Murray CW, Auton TR, Paolini GV, Mee RP. Empirical scoring functions: I. The development of a fast empirical scoring function to estimate the binding affinity of ligands in receptor complexes. J Comput Aided Mol Des. 1997;11(5):425–445. doi:10.1023/A:1007996124545. PMID: 9385547. |
| 66. | McMartin C, Bohacek RS. QXP: powerful, rapid computer algorithms for structure-based drug design. J Comput Aided Mol Des. 1997;11(4):333–344. doi:10.1023/A:1007907728892. PMID: 9334900. |
| 67. | Muegge I. PMF scoring revisited. J Med Chem. 2005;49(20):5895–5902. doi:10.1021/jm050038s. PMID: 17004705. |
| 68. | Xue M, Zheng M, Xiong B, Li Y, Jiang H, Shen J. Knowledge-based scoring functions in drug design. 1. Developing a target-specific method for kinase−ligand interactions. J Chem Inf Model. 2010;50(8):1378–1386. doi:10.1021/ci100182c. PMID: 20681607. |
| 69. | Gohlke H, Hendlich M, Klebe G. Knowledge-based scoring function to predict protein–ligand interactions. J Mol Biol. 2000;295(2):337–356. doi:10.1006/jmbi.1999.3371. PMID: 10623530. |
| 70. | Mooij WTM, Verdonk ML. General and targeted statistical potentials for protein–ligand interactions. Proteins: Struct, Funct, Bioinf. 2005;61(2):272–287. doi:10.1002/prot.20588. |
| 71. | Huang S-Y, Zou X. Inclusion of solvation and entropy in the knowledge-based scoring function for protein–ligand interactions. J Chem Inf Model. 2010;50(2):262–273. doi:10.1021/ci9002987. PMID: 20088605. |
| 72. | Huang S-Y, Grinter SZ, Zou X. Scoring functions and their evaluation methods for protein-ligand docking: recent advances and future directions. Phys Chem Chem Phys. 2010;12(40):12899–12908. doi:10.1039/c0cp00151a. PMID: 20730182. |
| 73. | Cheng T, Li X, Li Y, Liu Z, Wang R. Comparative assessment of scoring functions on a diverse test set. J Chem Inf Model. 2009;49(4):1079–1093. doi:10.1021/ci9000053. PMID: 19358517. |
| 74. | Warren GL, Andrews CW, Capelli A-M, Clarke B, LaLonde J, Lambert MH, et al. A critical assessment of docking programs and scoring functions. J Med Chem. 2006;49(20):5912–5931. doi:10.1021/jm050362n. PMID: 17004707. |
| 75. | Ferrara P, Gohlke H, Price DJ, Klebe G, Brooks CL. Assessing scoring functions for protein–ligand interactions. J Med Chem. 2004;47(12):3032–3047. doi:10.1021/jm030489h. PMID: 15163185. |
| 76. | Wang R, Lu Y, Fang X, Wang S. An extensive test of 14 scoring functions using the PDBbind refined set of 800 protein–ligand complexes. J Chem Inf Comput Sci. 2004;44(6):2114–2125. doi:10.1021/ci049733j. PMID: 15554682. |
| 77. | Wang R, Lu Y, Wang S. Comparative evaluation of 11 scoring functions for molecular docking. J Med Chem. 2003;46(12):2287–2303. doi:10.1021/jm0203783. PMID: 12773034. |
| 78. | Raub S, Steffen A, Kämper A, Marian CM. AIScore: chemically diverse empirical scoring function employing quantum chemical binding energies of hydrogen-bonded complexes. J Chem Inf Model. 2008;48(7):1492–1510. doi:10.1021/ci7004669. PMID: 18597446. |
| 79. | Charifson PS, Corkery JJ, Murcko MA, Walters WP. Consensus scoring: a method for obtaining improved hit rates from docking databases of three-dimensional structures into proteins. J Med Chem. 1999;42(25):5100–5109. doi:10.1021/jm990352k. PMID: 10602695. |
| 80. | Wang R, Wang S. How does consensus scoring work for virtual library screening? An idealized computer experiment. J Chem Inf Comput Sci. 2001;41(5):1422–1426. doi:10.1021/ci010025x. PMID: 11604043. |
| 81. | Seifert MHJ. Targeted scoring functions for virtual screening. Drug Discov Today. 2009;14(11–12):562–569. doi:10.1016/j.drudis.2009.03.013. PMID: 19508918. |
| 82. | Pfeffer P, Gohlke H. DrugScoreRNA: knowledge-based scoring function to predict RNA–ligand interactions. J Chem Inf Model. 2007;47(5):1868–1876. doi:10.1021/ci700134p. PMID: 17705464. |
| 83. | Teramoto R, Fukunishi H. Supervised scoring models with docked ligand conformations for structure-based virtual screening. J Chem Inf Model. 2007;47(5):1858–1867. doi:10.1021/ci700116z. PMID: 17685604. |
| 84. | Seifert MHJ. Optimizing the signal-to-noise ratio of scoring functions for protein–ligand docking. J Chem Inf Model. 2008;48(3):602–612. doi:10.1021/ci700345n. PMID: 18293951. |
| 85. | Pham T, Jain A. Customizing scoring functions for docking. J Comput Aided Mol Des. 2008;22(5):269–286. doi:10.1007/s10822-008-9174-y. PMID: 18273558. |
| 86. | Catana C, Stouten PFW. Novel, customizable scoring functions, parameterized using N-PLS, for structure-based drug discovery. J Chem Inf Model. 2007;47(1):85–91. doi:10.1021/ci600357t. PMID: 17238252. |
| 87. | Terp GE, Johansen BN, Christensen IT, Jørgensen FS. A new concept for multidimensional selection of ligand conformations (MultiSelect) and multidimensional scoring (MultiScore) of protein–ligand binding affinities. J Med Chem. 2001;44(14):2333–2343. doi:10.1021/jm001090l. PMID: 11428927. |
| 88. | Martin EJ, Sullivan DC. AutoShim: empirically corrected scoring functions for quantitative docking with a crystal structure and IC50 training data. J Chem Inf Model. 2008;48(4):861–872. doi:10.1021/ci7004548. PMID: 18380449. |
| 89. | Cheng T, Liu Z, Wang R. A knowledge-guided strategy for improving the accuracy of scoring functions in binding affinity prediction. BMC Bioinf. 2010;11(1):193. doi:10.1186/1471-2105-11-193. |
| 90. | Hecht D, Fogel GB. Computational intelligence methods for docking scores. Curr Comput-Aided Drug Des. 2009;5:56–68. doi:10.2174/157340909787580863. |
| 91. | Durrant JD, McCammon JA. NNScore: a neural-network-based scoring function for the characterization of protein–ligand complexes. J Chem Inf Model. 2010;50(10):1865–1871. doi:10.1021/ci100244v. PMID: 20845954. |
| 92. | Kinnings SL, Liu N, Tonge PJ, Jackson RM, Xie L, Bourne PE. A machine learning-based method to improve docking scoring functions and its application to drug repurposing. J Chem Inf Model. 2011;51(2):408–419. doi:10.1021/ci100369f. PMID: 21291174. |
| 93. | Amini A, Shrimpton PJ, Muggleton SH, Sternberg MJE. A general approach for developing system-specific functions to score protein–ligand docked complexes using support vector inductive logic programming. Proteins: Struct, Funct, Bioinf. 2007;69(4):823–831. doi:10.1002/prot.21782. |
| 94. | Tang YT, Marshall GR. PHOENIX: a scoring function for affinity prediction derived using high-resolution crystal structures and calorimetry measurements. J Chem Inf Model. 2011;51(2):214–228. doi:10.1021/ci100257s. PMID: 21214225. |
| 95. | Deng W, Breneman C, Embrechts MJ. Predicting protein–ligand binding affinities using novel geometrical descriptors and machine-learning methods. J Chem Inf Comput Sci. 2004;44(2):699–703. doi:10.1021/ci034246+. PMID: 15032552. |
| 96. | Li L, Khanna M, Jo I, Wang F, Ashpole NM, Hudmon A, et al. Target-specific support vector machine scoring in structure-based virtual screening: computational validation, in vitro testing in kinases, and effects on lung cancer cell proliferation. J Chem Inf Model. 2011;51(4):755–759. doi:10.1021/ci100490w. PMID: 21438548. |
| 97. | Ballester PJ, Mitchell JBO. A machine learning approach to predicting protein–ligand binding affinity with applications to molecular docking. Bioinformatics. 2010;26(9):1169–1175. doi:10.1093/bioinformatics/btq112. PMID: 20236947. |
Figures
Fig. 1. Typical workflow of docking-based virtual screening (DBVS)
Tables
Examples of Widely Used Docking Programs
| Program | Search strategy | Free for academia | Website |
|---|---|---|---|
| AutoDock (10) | GA/MC | Yes | http://autodock.scripps.edu |
| Dock (11) | IC | Yes | http://dock.compbio.ucsf.edu |
| FlexX (12) | IC | No | http://www.biosolveit.de/flexx |
| Glide (13) | Hybrid | No | http://www.schrodinger.com |
| Gold (14) | GA | No | http://www.ccdc.cam.ac.uk/products/life_sciences/gold |
| Surflex (15) | IC | No | http://www.tripos.com/index.php |
| ICM (16) | MC | No | http://www.molsoft.com/docking.html |
| LigandFit (17) | MC | No | http://accelrys.com/products/discovery-studio |
| eHiTS (18) | IC | No | http://www.simbiosys.ca/ehits/index.html |
GA, genetic algorithm; MC, Monte Carlo; IC, incremental construction
Commonly Screened Chemical Databases
| Database | Type | No. of compounds^a | Website |
|---|---|---|---|
| PubChem | Public | 30 million | http://pubchem.ncbi.nlm.nih.gov |
| ChEMBL | Public | 1 million | https://www.ebi.ac.uk/chembldb/index.php |
| NCI Set | Public | 140,000 | http://dtp.nci.nih.gov/index.html |
| ChemSpider | Public | 26 million | http://www.chemspider.com |
| CoCoCo | Public | 7 million | http://cococo.unimore.it/tiki-index.php |
| TCM | Public | 32,000 | http://tcm.cmu.edu.tw |
| ZINC | Public | 13 million | http://zinc.docking.org |
| ChemBridge | Commercial | 700,000 | http://www.chembridge.com |
| Specs | Commercial | 240,000 | http://www.specs.net |
| Asinex | Commercial | 550,000 | http://www.asinex.com |
| Enamine | Commercial | 1.7 million | http://www.enamine.net |
| Maybridge | Commercial | 56,000 | http://www.maybridge.com |
| WOMBAT | Commercial | 263,000 | http://www.sunsetmolecular.com |
| ChemDiv | Commercial | 1.5 million | http://www.chemdiv.com |
| ChemNavigator | Commercial | 55.3 million | http://www.chemnavigator.com |
| ACD | Commercial | 3,870,000 | http://accelrys.com/products/databases/sourcing/available-chemicals-directory.html |
| MDDR | Commercial | 150,000 | http://accelrys.com/products/databases/bioactivity/mddr.html |
^a Approximate numbers
Examples of Current Scoring Functions
| Type | Scoring function |
|---|---|
| Force field | AutoDock (10), DOCK (11), GoldScore (14), D-Score (11) |
| Empirical | LUDI (62), X-Score (63), PLP (64), ChemScore (65), FlexX/F-Score (12), GlideScore (13), Surflex (15), QXP/Flo+ (66) |
| Knowledge based | PMF04 (67), kinase-PMF (68), DrugScore (69), ASP (70), ITScore (71) |
Keywords: docking, machine learning, structure-based virtual screening, target-biased scoring function.

