Document Detail

A QSAR Study of Matrix Metalloproteinases Type 2 (MMP-2) Inhibitors with Cinnamoyl Pyrrolidine Derivatives.
MedLine Citation:
PMID:  22896815     Owner:  NLM     Status:  PubMed-not-MEDLINE    
Abstract/OtherAbstract:
A multivariate PLS-QSAR study with a data set of 31 cinnamoyl pyrrolidine derivatives described as type 2 matrix metalloproteinases (MMP-2) inhibitors is presented in this paper. The variable selection was performed with the Ordered Predictors Selection (OPS) algorithm. The PLS model presented six descriptors and three Latent Variables (LV) that cumulated 71.845% of variance. Leave-N-out (LNO) cross validation and y-randomization tests showed that the model presented robustness and no chance correlation, respectively. The descriptors indicated that MMP-2 inhibition depends mainly on the electronic properties of the compounds. The model obtained can be useful as a support tool in the design of new MMP-2 inhibitors.
Authors:
Eduardo Borges de Melo
Publication Detail:
Type:  Journal Article     Date:  2012-01-31
Journal Detail:
Title:  Scientia pharmaceutica     Volume:  80     ISSN:  2218-0532     ISO Abbreviation:  Sci Pharm     Publication Date:  2012 Jun 
Date Detail:
Created Date:  2012-08-16     Completed Date:  2012-10-02     Revised Date:  2013-05-30    
Medline Journal Info:
Nlm Unique ID:  0026251     Medline TA:  Sci Pharm     Country:  Austria    
Other Details:
Languages:  eng     Pagination:  265-81     Citation Subset:  -    
Affiliation:
Theoretical Medicinal and Environmental Chemistry Laboratory (LQMAT), Department of Pharmacy, Western Paraná State University (Unioeste), 2069 Universitária St, 8519110, Cascavel, PR, Brazil.

From MEDLINE®/PubMed®, a database of the U.S. National Library of Medicine

Full Text
Journal Information
Journal ID (nlm-ta): Sci Pharm
Journal ID (iso-abbrev): Sci Pharm
Journal ID (publisher-id): Scientia Pharmaceutica
ISSN: 0036-8709
ISSN: 2218-0532
Publisher: Österreichische Apotheker-Verlagsgesellschaft
Article Information
© de Melo; licensee Österreichische Apotheker-Verlagsgesellschaft m. b. H., Vienna, Austria.
Received: 27 December 2011
Accepted: 31 January 2012
Print publication date: June 2012
Collection publication date: 2012
Electronic publication date: 31 January 2012
Volume: 80 Issue: 2
First Page: 265 Last Page: 281
ID: 3383210
PubMed Id: 22896815
DOI: 10.3797/scipharm.1112-21
Publisher Id: scipharm-2012-80-265

A QSAR Study of Matrix Metalloproteinases Type 2 (MMP-2) Inhibitors with Cinnamoyl Pyrrolidine Derivatives
Eduardo Borges de Melo
Theoretical Medicinal and Environmental Chemistry Laboratory (LQMAT), Department of Pharmacy, Western Paraná State University (Unioeste), 2069 Universitária St, 8519110, Cascavel, PR, Brazil.
Correspondence: E-mail: eduardo.melo@unioeste.br

Introduction

The matrix metalloproteinases (MMPs) are a family of enzymes that are intimately involved in tissue remodeling. These zinc-containing endopeptidases consist of subsets of enzymes, and they are involved in the degradation of the extracellular matrix (ECM) that forms the connective material between cells and around tissues. In pathologic conditions an increase of MMP activity occurs, leading to tissue degradation [1].

Currently, about 27 MMPs are known. Their overexpression is associated with several diseases: cancer, cardiovascular diseases (including congestive heart failure), osteoarthritis, rheumatoid arthritis, chronic obstructive pulmonary disease, psoriasis, dermatitis, Alzheimer’s disease and periodontitis, among others [1, 2]. Thus, MMPs are an attractive target for drug design. However, despite the great amount of research, the tetracycline doxycycline (Fig. 1) is the only MMP inhibitor available in therapeutics. This long-acting antibiotic is also a weak inhibitor of collagenases (MMP-1, MMP-8 and MMP-13), and it is currently marketed for the clinical treatment of chronic periodontal disease [3–5].

Among the MMPs, MMP-2 and MMP-9 are named gelatinases. These enzymes are able to degrade a broad range of matrix substrates, including gelatin, type IV collagen of basal laminae, as well as other nonhelical collagen domains and proteins, such as fibronectin and laminin, that constitute cellular connective tissue and are strongly involved in both normal and pathological tissue remodeling [1, 6]. The overexpression of this subclass, especially MMP-2, is strongly correlated with an aggressive malignant phenotype and is associated with poor prognosis for several types of aggressive cancer, such as ovarian, lung, breast, bladder and gastric cancers [6–8]. Thus, MMP-2 has been studied as a target for anticancer drug design.

Quantitative structure-activity relationship (QSAR) describes how a given biological activity varies as a function of molecular descriptors derived from the chemical structure of a set of molecules. A model containing those calculated descriptors can be used to predict responses for new compounds, constituting an important tool to support the synthesis of new drugs [9, 10]. Thus, considering the continuous need for new anticancer drugs, a QSAR study based on 31 cinnamoyl pyrrolidine derivatives (Table 1) synthesized and assayed by Zhang et al. [8] was carried out. The compounds in the dataset were obtained through a hybridization approach combining the L-hydroxyproline scaffold (an MMP substrate), cinnamic acid (an inhibitor of the A5491 human lung gland cancer) and caffeic acid (an MMP-2 inhibitor) (Fig. 2). The aim was to obtain a mathematical model that could be used to predict the inhibitory potency of new cinnamoyl pyrrolidine derivatives against MMP-2.


Results and Discussion

The study was carried out using the QSAR Modeling software [11]. The variable selection with the Ordered Predictors Selection (OPS) algorithm [12–15] generated a model based on three Latent Variables (LV) that cumulate 71.845% of variance (LV1: 18.043%; LV2: 31.298%; LV3: 22.504%). These LV derive from six selected descriptors: SOFT (softness), EEig02r (eigenvalue 02 from edge adjacency matrix weighted by resonance integrals), αxx (the component of the overall polarizability along the x-axis), q10NBO (partial charge of atom #10 calculated through the Natural Bond Orbitals approach), q2NBO (partial charge of atom #2 calculated through the Natural Bond Orbitals approach) and SsssN(oth) (E-state index for amino group attached to functional groups that are neither aliphatic nor aromatic). The values of each descriptor are available in the Supporting Information, Table S1. The standardized regression coefficients are −0.549 for EEig02r, 0.545 for SOFT, 0.377 for αxx, 0.238 for q10NBO, 0.250 for q2NBO, and −0.314 for SsssN(oth). According to Wold [16], regression coefficients larger than about half the maximum regression coefficient value (in absolute terms) indicate that the descriptor is significant for the PLS-QSAR model. Thus, the reference value is 0.274. The coefficients of q2NBO and q10NBO are lower than this value, but their removal decreases the statistical quality of the model, so these descriptors can be considered important for the model. In addition, the maximum difference from the reference value is only 0.036 units, which is very low. Thus, both descriptors were maintained in the model.
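The significance rule quoted above can be checked with a few lines of arithmetic. The sketch below is only an illustration: it uses the standardized coefficients quoted in the text, assumes that the 0.250 coefficient belongs to q2NBO (the descriptor named in the descriptor list), and uses hypothetical function names. The unrounded threshold is 0.2745, which the text reports as 0.274.

```python
# Standardized regression coefficients of the auxiliary model, as quoted in the text.
# Assumption: the 0.250 coefficient is assigned to q2NBO.
coefficients = {
    "EEig02r": -0.549,
    "SOFT": 0.545,
    "axx": 0.377,
    "q10NBO": 0.238,
    "q2NBO": 0.250,
    "SsssN(oth)": -0.314,
}


def wold_threshold(coefs):
    """Return half of the largest absolute standardized coefficient."""
    return max(abs(c) for c in coefs.values()) / 2.0


def below_threshold(coefs):
    """Map each descriptor under the threshold to its distance from it."""
    limit = wold_threshold(coefs)
    return {name: limit - abs(c) for name, c in coefs.items() if abs(c) < limit}


threshold = wold_threshold(coefficients)
flagged = below_threshold(coefficients)

print(f"threshold = {threshold:.4f}")
for name, gap in flagged.items():
    print(f"{name}: {gap:.4f} below the threshold")
```

Only q10NBO and q2NBO fall below the threshold, which matches the two descriptors discussed in the text.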

Fig. 3 shows the plot of studentized residuals (σ) versus leverage, which was used to identify outliers. No compound presented residuals higher than 2.5σ. Only one compound presented leverage higher than the leverage cutoff line, but this can be considered acceptable [17]. Therefore, the model can be considered free of outliers, which preserves the maximum possible representation in terms of structure and range of inhibitory activity for the dataset under study.

The model (Eq. 1) explains 78.324% (R2=0.783) and predicts 61.844% (Q2LOO=0.618) of variance. The predicted values in the cross-validation step and the residuals are available in the Supporting Information, Table S2. The difference between the values of R2 and Q2LOO was 0.165 units. A difference between R2 and Q2LOO exceeding 0.2–0.3 is a clear indication that the model suffers from overfitting [18]. Thus, the difference found here may be considered acceptable. The F value (32.521) was higher than the corresponding tabulated value (p=3 and n-p-1=27) at the 95% confidence level (α=0.05). The value of PRESSval was smaller than SSy, another indicator of the statistical significance of the prediction [16].
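The reported statistics can be cross-checked against each other. The following minimal sketch assumes the usual textbook definitions, Q2 = 1 − PRESS/SSy and F = (R2/p)/((1 − R2)/(n − p − 1)), with p taken as the number of latent variables; these formulas are an assumption and are not confirmed as the exact ones implemented in the software used in the study. The function names are hypothetical.

```python
def q2_from_press(press, ss_y):
    """Cross-validated Q2 from PRESS and the total sum of squares of y."""
    return 1.0 - press / ss_y


def f_ratio(r2, n, p):
    """F statistic from R2 with p and n - p - 1 degrees of freedom."""
    return (r2 / p) / ((1.0 - r2) / (n - p - 1))


# Values reported for the auxiliary model (31 compounds, 3 latent variables).
n, p = 31, 3
r2 = 0.78324
press_val, ss_y = 3.621, 9.491

q2_loo = q2_from_press(press_val, ss_y)
f_value = f_ratio(r2, n, p)
gap = r2 - q2_loo

print(f"Q2LOO = {q2_loo:.3f}")
print(f"F({p},{n - p - 1}) = {f_value:.2f}")
print(f"R2 - Q2LOO = {gap:.3f}")
```

Under these assumptions the computed values agree with the reported Q2LOO of 0.618, the F value of about 32.52 and the R2 − Q2LOO difference of 0.165.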

Eq. 1.
$$\mathrm{pIC}_{50} = 0.394\,(\mathrm{SOFT}) - 2.198\,(\mathrm{EEig02r}) + 0.014\,(\alpha_{xx}) + 80.105\,(q10_{\mathrm{NBO}}) + 11.339\,(q2_{\mathrm{NBO}}) - 9.218\,(\mathrm{SsssN(oth)}) + 64.222$$
n = 31; R2 = 0.783; SEC = 0.276; F(3,27) = 32.521 (cF = 2.960); Q2LOO = 0.618; SEV = 0.342; PRESSval = 3.621 (SSy = 9.491).

The results obtained from y-randomization [19] analysis and LNO cross-validation [20] are available in Figs. 4 and 5. The y-randomization helps to verify whether the explained and predicted variances are due to chance correlation [19]. All randomized models have poor quality when compared to the original model, and the intercepts are within the acceptable values recommended in the literature, i.e., below 0.3 for R2 (Fig. 4A) and below 0.05 for Q2LOO (Fig. 4B). These results indicate that the variance explained by the model was not due to chance correlation.

LNO cross-validation (Fig. 5) employs smaller training sets than the LOO cross-validation, and it can be repeated several times because of the large number of combinations that arise when more than one compound is left out of the training set at a time. A QSAR model can be considered robust when the average values of Q2LNO are relatively high and close to Q2LOO [19]. The model obtained in this study has an average Q2LNO of 0.604, only 0.014 units lower than Q2LOO. The standard deviation for each “N” value (performed in hexaplicate) is small, with a maximum of 0.055 for Q2L4O.

Some studies show that only externally validated models may be considered realistic and applicable for drug design [21–24]. The real model (Eq. 2) was obtained after splitting the data into training (n=26) and test (n=5) sets. The standardized regression coefficients of each descriptor are −0.579 for EEig01x, 0.599 for SOFT, 0.362 for αxx, 0.149 for q10NBO, 0.322 for q2NBO, and −0.278 for SsssN(oth). The real model has statistical parameters similar to those of the auxiliary model (i.e., Eq. 1). Therefore, the two models can be considered equivalent, and the real model can be used in the external validation.

Eq. 2.
$$\mathrm{pIC}_{50} = 0.450\,(\mathrm{SOFT}) - 2.293\,(\mathrm{EEig01x}) + 0.013\,(\alpha_{xx}) + 61.930\,(q10_{\mathrm{NBO}}) + 14.508\,(q2_{\mathrm{NBO}}) - 8.637\,(\mathrm{SsssN(oth)}) + 55.156$$
n = 26; R2 = 0.809; SEC = 0.264; F(3,22) = 31.089 (cF = 3.049); Q2LOO = 0.626; SEV = 0.340; PRESSval = 3.000 (SSy = 8.026).

Results obtained for the external validation (Table 2) show that the model has high external prediction power, considering the proposed limits. R2pred, used as a measure of the model’s external predictive power, was higher than the adopted threshold (R2pred = 0.641 > 0.5), and the error associated with this parameter (SEP) may be considered low. The Golbraikh-Tropsha statistics [25, 26] help to confirm the prediction power of the model. Both values of k and k’ and the relation |R20 − R’20| are within acceptable ranges (0.85 ≤ x ≤ 1.15, where x = k or k’, and |R20 − R’20| < 0.3).

It can be observed that the obtained model has reasonable internal and external quality. However, it is always desirable to obtain a model that is able to relate the physicochemical properties represented by the selected molecular descriptors to the action mechanism of the system under study [27]. Zhang et al. [8] described the experimental structure-activity relationships of the data set, highlighting the importance of heteroatoms (especially the hydroxyl group) to form hydrogen bonds, of π electrons to facilitate interactions with hydrophobic regions of the receptor, and a slight decrease in inhibitory potency with the addition of methoxyl to R1 and R2. Furthermore, a docking study indicated that the ester carbonyl (atom #20) could bind to the zinc located in the active site, that the lateral chain represented in this paper by R3 binds to the S1’ cavity, and that the lateral chain attached to the nitrogen binds to the S1 cavity. A representation of the metalloproteinase active site [28, 29] is presented in Fig. 6.

SOFT, a quantum chemical descriptor, was calculated using the relation SOFT=1/GAP, where GAP is the difference between the energies (calculated at the B3LYP/6-311G(d,p) theory level) of the lowest unoccupied molecular orbital and the highest occupied molecular orbital (ELUMO−EHOMO). These molecular descriptors are known to be related to molecular reactivity. Generally, softer molecules are more reactive [26, 30]. Since the SOFT coefficient is positive, derivatives with higher values of this descriptor, which are expected to react more easily, tend to have higher pIC50. The histogram presented in Fig. 7 shows exactly this trend: considering the 16 most active compounds, only four (A2, A3, A7, and A0) have SOFT < 5. The most active compounds tend to present many heteroatoms (oxygen and chlorine) and π electrons in the substituent R3, in agreement with the experimental structure-activity relationships discussed by Zhang et al. [8], probably because these features facilitate the interaction with the enzyme via hydrogen bonds and hydrophobic interactions. Thus, similar to what was proposed by Liu et al. for a set of α-glucosidase inhibitors [30], the inhibitory activity would be expected to improve by introducing more heteroatoms and π electrons in the structure of new derivatives.
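The relation SOFT=1/GAP can be illustrated with a very small sketch. The orbital energies below are hypothetical placeholders and are not taken from the paper or its Supporting Information; the function name is also an assumption. SOFT is expressed in the reciprocal of whatever energy unit the quantum chemistry output uses.

```python
def softness(e_homo, e_lumo):
    """Softness as the reciprocal of the HOMO-LUMO gap (ELUMO - EHOMO)."""
    gap = e_lumo - e_homo
    if gap <= 0:
        raise ValueError("ELUMO must be higher than EHOMO")
    return 1.0 / gap


# Hypothetical frontier orbital energies (illustration only, not from the paper).
e_homo, e_lumo = -0.25, -0.05
soft = softness(e_homo, e_lumo)
print(f"GAP = {e_lumo - e_homo:.3f}, SOFT = {soft:.3f}")
```

A smaller gap gives a larger SOFT value, which is why compounds with narrower HOMO-LUMO gaps are described as softer and more reactive.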

EEig02r, which presents a negative coefficient, is an edge adjacency index, a topological descriptor derived from the edge adjacency matrix, also called bond matrix, which encodes the connectivity between graph edges [26, 31]. In this approach, as in many other graph theoretical representations of chemical structures, the vertices of the molecular graph represent atoms and the edges represent bonds. This class of descriptors can be weighted by several different atomic properties. The most interesting aspect of the presence of a resonance-weighted index in the model is that this weighting scheme makes the descriptor more sensitive to the presence of heteroatoms and multiple bonds in the molecule [26]. So, its selection by the OPS algorithm may be, again, related to the importance of heteroatoms and π electrons in the R3 substituent.

The αxx descriptor, calculated in Marvin 4.1.8 [32] through a method based on the empirical model proposed by Miller and Savchik [33], describes the ability of a molecule to be polarized along the X Cartesian axis. The sign of the coefficient is positive, indicating that an increase of the polarizability along this axis is favorable to the activity. In Fig. 8 it is possible to see that the x-axis always crosses the frontal region of the structures. The size of the R3 substituent causes a slight shift in the position of the axis, as can be seen in the compounds C0 (low potency) and C10 (high potency). This information can be related to the interpretation proposed for SOFT, since the presence of a greater number of heteroatoms and π electrons in R3 increases the polarizability along this Cartesian axis.

The q2NBO and q10NBO descriptors are atomic charges calculated using the Natural Bond Orbital (NBO) theory. The charges measure the extent of electronic density localization in a molecule. Negative qn values mean that there is excess electronic charge in the atom, while positive values mean that the atom is electron-deficient [26]. It is possible to observe that atom #2 undergoes a slight increase in electron density (see Supporting Information, Table S1) in subsets B and C, probably due to an electron donor effect resulting from the insertion of the methoxyl at positions R1 and R2. This effect was more pronounced in subset B (only R2 substituent) than in subset C (substituents at R1 and R2). Interestingly, the compounds of subset A are generally more potent than their counterparts in subsets B and C, which have, in general, higher electron densities in atom #2. It can be proposed, since the sign of its coefficient is positive, that the electron donor effect caused by the insertion of the methoxyl in the aromatic ring increases the electron density at this atom, hampering the interaction of this group with the S1 site of MMP-2. This same effect can be observed, in a less pronounced manner, in atom #10, the nitrogen of the pyrrolidine ring, since the descriptor q10NBO also has a positive coefficient.

SsssN(oth) is an atom type E-state (electrotopological state) index, and it also corresponds to the nitrogen of the pyrrolidine ring. The E-state formalism considers that each atom or bond has an intrinsic state, which is disturbed by every other atom or bond in the molecule. This state encodes information about the electronic distribution (as a variation caused by all other atoms) and topological aspects (greater or lesser accessibility of atoms and bonds to the external environment), and how such information can influence intermolecular interactions [26, 34]. Since this descriptor is also related to atom #10, this indicates that, although the most important point of structural variation for the activity is the R3 substituent, other parts of the molecule also influence the activity. The pyrrolidine nitrogen, for example, is close to the ester carbonyl side chain, the binding point with the zinc atom located in the active site of MMP-2. The negative coefficient indicates that a decrease of this descriptor is favorable to the activity. Within the dataset, the lowest SsssN(oth) values are in the A subset (Supporting Information, Table S1). This subset has no substituents in R1 and R2 (Table 1). Thus, it may indicate that these substitutions also affect the intrinsic state of the nitrogen, as well as the partial charge descriptor q10NBO, influencing the interactions that this part of the molecule can have with the binding site of MMP-2.

Interestingly, the three most important descriptors (EEig02r, SOFT and αxx), considering the standardized coefficients of the real model (Eq. 2), are related exactly to the R3 substituents, the main point of structural variation in the dataset and therefore the feature primarily responsible for the variation in inhibitory potency. This result reinforces the importance of hydrogen bonds and hydrophobic interactions with the S1' binding site of MMP-2, and demonstrates how the manipulation of this characteristic in structurally related compounds can be useful in the design of new cinnamoyl pyrrolidine derivatives able to inhibit MMP-2.


Conclusion

The model obtained using OPS, an algorithm for variable selection, showed statistically significant internal and external prediction power. In addition, the LNO cross-validation shows that the model is robust, and the y-randomization test shows that the model does not present chance correlation. The selected descriptors suggest that the presence of heteroatoms, especially, and π electrons in the R3 substituent can be important for the binding of compounds to the S1’ region of the binding site of MMP-2, but the handling of electronic distribution in the side chain attached to the pyrrolidine nitrogen, which binds to the S1 site, can also be exploited for the design of new active derivatives. The manipulation of these features can assist in obtaining new lead compounds that can be useful for developing new drugs for the chemotherapy of aggressive cancers.


Experimental
Molecular Modeling

Three-dimensional structures were built using HyperChem 7 [35] from the structure ZINC40405643, obtained from the ZINC Database (http://zinc.docking.org) [36]. Calculations with the MM+ force field were carried out using the same software. The most stable conformations obtained were further optimized with the AM1 semi-empirical quantum mechanical method, followed by the Hartree-Fock level (HF/6-31G(d)) and the Density Functional Theory (DFT) level (B3LYP/6-311G(d,p)) using Gaussian 09 [37]. DFT/B3LYP was chosen as the method for obtaining the geometries and electronic properties because it leads to quite satisfactory results in analyses with such aims [9, 10].

Molecular descriptors

The SMILES strings [38] of each compound were used to obtain E-state indices in the Parameter Client [39]. The optimized geometries were used to obtain, in the Dragon 3.0 Web Version [31], the following classes of descriptors: constitutional descriptors, functional groups counts, charge descriptors, molecular properties, walk and path counts, information indices, edge adjacency indices, topological charge indices, topological descriptors, connectivity indices, 2D autocorrelations, Burden eigenvalues, and eigenvalue-based indices. The optimized geometries were also used to obtain the electronic descriptors in GaussView 5 [40]. Partial charges of the basic structure were calculated by means of two approaches: Mulliken Charges and Natural Bond Orbitals [41]. The molecular polarizability (α) and the respective vectorial components (αxx, αyy and αzz) were obtained in Marvin 4.1.8 [32]. After removal of missing, invariant, and quasi-invariant descriptors calculated in Dragon 3.0, a total of 439 molecular descriptors were available for use.

Mathematical method

Partial least squares (PLS), a classical chemometric method, was employed to explore the quantitative relationships between the descriptors of the training set and MMP-2 inhibition. In this calibration method, LV are obtained by including the dependent variable (in this case, pIC50) in the analysis in such a way that the covariance between the projection of the samples in the new axis system (also orthogonal) and the dependent variable is maximized [42, 43]. For this, descriptors should be preprocessed using the autoscaling scheme (columnwise mean-centered and scaled to unit variance). Thus, they can be compared to each other on the same scale.
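Autoscaling, as described above, can be sketched in plain Python. This is only an illustration under stated assumptions: the sample standard deviation (n − 1 denominator) is assumed, the function name is hypothetical, and the small matrix is made-up data rather than descriptor values from the study.

```python
import math


def autoscale(matrix):
    """Mean-center each column and scale it to unit variance (sample std, n - 1)."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    means = [sum(row[j] for row in matrix) / n_rows for j in range(n_cols)]
    stds = []
    for j in range(n_cols):
        ss = sum((row[j] - means[j]) ** 2 for row in matrix)
        std = math.sqrt(ss / (n_rows - 1))
        if std == 0.0:
            # Invariant descriptors carry no information and cannot be scaled.
            raise ValueError(f"column {j} is invariant")
        stds.append(std)
    scaled = [
        [(row[j] - means[j]) / stds[j] for j in range(n_cols)] for row in matrix
    ]
    return scaled, means, stds


# Made-up descriptor matrix: 3 compounds (rows) x 2 descriptors (columns).
X = [[1.0, 10.0], [2.0, 20.0], [3.0, 60.0]]
scaled, means, stds = autoscale(X)
for row in scaled:
    print([round(v, 3) for v in row])
```

After this step every column has zero mean and unit variance, so descriptors measured on very different numerical scales contribute comparably to the PLS model.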

Variable selection

Variable selection in a QSAR study is a way to identify reduced subsets of descriptors that in fact reproduce the observed values of a biological activity, i.e. those that are the most useful to obtain a more accurate prediction model. A good variable selection method helps to find the subset that yields an optimal mathematical equation for the prediction of the activity under study and, therefore, simple, robust, and more easily interpretable models [44, 45]. In this study, a two-step procedure was employed: (i) the 439 original descriptors were reduced to 81 by eliminating those whose absolute value of Pearson’s correlation coefficient (|r|) with pIC50 was lower than 0.3; and (ii) the ordered predictor selection (OPS) algorithm [12–15] was used to select the most important descriptors. OPS builds PLS models by rearranging the columns of the matrix in such a way that the most important descriptors, classified according to an informative vector (available options: correlation vector, regression vector and an element-wise product of both), are placed in the first columns. Then, successive PLS regressions are performed with an increasing number of descriptors to find the best model. In this work, the three informative vectors were used. The best models were ranked in descending order of statistical quality according to their coefficient of determination of leave-one-out cross validation (Q2LOO) or standard error of cross validation (SEV) values. OPS is implemented in QSAR Modeling [11], a free JAVA-based software developed by the Theoretical and Applied Chemometrics Laboratory’s research group (http://lqta.iqm.unicamp.br).
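The first step of the procedure, and the column ordering that OPS starts from, can be sketched as follows. This is not the OPS implementation from QSAR Modeling: it only applies the |r| ≥ 0.3 prefilter and then sorts the surviving descriptors by |r|, which corresponds to using the correlation vector as the informative vector. The successive PLS regressions are omitted, and the descriptor names and values are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def prefilter_and_rank(descriptors, y, cutoff=0.3):
    """Drop descriptors with |r| < cutoff and sort the rest by decreasing |r|."""
    scores = {name: abs(pearson_r(vals, y)) for name, vals in descriptors.items()}
    kept = [name for name, score in scores.items() if score >= cutoff]
    return sorted(kept, key=lambda name: scores[name], reverse=True)


# Hypothetical data: 5 compounds, 3 descriptors (not taken from the paper).
pic50 = [6.3, 6.8, 7.1, 7.6, 8.2]
descriptors = {
    "d_a": [1.0, 2.0, 3.0, 4.0, 5.0],  # strongly correlated with pIC50
    "d_b": [5.0, 1.0, 4.0, 2.0, 3.0],  # weakly correlated, just above the cutoff
    "d_c": [3.0, 1.0, 5.0, 1.0, 3.0],  # nearly uncorrelated, removed
}
ordered = prefilter_and_rank(descriptors, pic50)
print(ordered)
```

In a full OPS run, PLS models would then be built on the first k columns of this ordered list for increasing k, and the models compared by Q2LOO or SEV.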

Model validation

Several statistical tools (see Supporting Information) are suggested in the literature for validation of QSAR models. For the internal quality, the adopted parameters were the coefficient of multiple determination of calibration (R2), standard error of calibration (SEC), F-ratio test with a 95% confidence level (F, α=0.05), Q2LOO, SEV and predictive residual sum of squares of validation (PRESSval) [18]. The adopted limits are R2 > 0.6 and Q2LOO > 0.5. SEC and SEV values should be as low as possible. PRESSval should be lower than the sum of squares of the response values (SSy) [19]. The F-test value should be higher than the tabulated F value (F(p, n−p−1), where n is the number of compounds and p is the number of LV), and the higher the difference between them, the more statistically significant the model [46].

The robustness of the model was examined through leave-N-out (LNO) cross validation, with N=1 to 7. This test was repeated three times for each “N” value. All rows from the data matrix and respective y values were randomized in each step of LNO process. It is expected that the average value of each Q2LNO would be close to Q2LOO (coefficient of multiple determination of leave-one-out cross validation) with standard deviations close to zero [21]. The possibility of chance correlation was tested using y-randomization test, where only the y vector (pIC50) was scrambled 10 times. The approach suggested by Eriksson et al. [20], based on the |r| between the original vector y and the randomized vectors y, was used to quantify chance correlation. In this approach, two regression lines are built using these correlation coefficients (x-axis) and the R2 and Q2LOO values (y-axis). The intercepts of the equations obtained in the linear regression should be lower than 0.3 for R2 and 0.05 for Q2LOO.
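The intercept criterion of the y-randomization test can be sketched with an ordinary least-squares line. The numbers below are hypothetical and constructed for illustration only; they are not the values behind Fig. 4. Only three randomized models are shown for brevity, whereas the study scrambled y ten times. The point at |r| = 1 stands for the original (unscrambled) model, and the function names are assumptions.

```python
def linear_fit(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def y_randomization_check(abs_r, r2_values, q2_values, r2_limit=0.3, q2_limit=0.05):
    """Return both intercepts and whether they are below the adopted limits."""
    r2_intercept, _ = linear_fit(abs_r, r2_values)
    q2_intercept, _ = linear_fit(abs_r, q2_values)
    passed = r2_intercept < r2_limit and q2_intercept < q2_limit
    return r2_intercept, q2_intercept, passed


# Hypothetical results: |r| between original and scrambled y, then R2 and Q2LOO.
abs_r = [0.1, 0.2, 0.3, 1.0]  # last entry: original model
r2_values = [0.1683, 0.2366, 0.3049, 0.783]
q2_values = [-0.2082, -0.1164, -0.0246, 0.618]

r2_int, q2_int, ok = y_randomization_check(abs_r, r2_values, q2_values)
print(f"R2 intercept = {r2_int:.3f}, Q2 intercept = {q2_int:.3f}, passed = {ok}")
```

Intercepts well below the limits mean that models built on scrambled activities extrapolate to low R2 and Q2LOO when their y vector is uncorrelated with the original one, which argues against chance correlation.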

Once the model was internally validated, the data set was split into a training set (n=26) and a test set (n=5), generating the real model [18]. The test set was selected manually, in such a way that the entire range of pIC50 (6.25 to 8.208, 1.958 logarithmic units) and the structural variations of the data set were well represented. A dendrogram obtained for the complete data set by Hierarchical Cluster Analysis (HCA) [47] (Supporting Information, Fig. S1) helped to confirm that the selected compounds are suitable as a test set. Thus, a structurally representative test set could be formed by the compounds B2 (pIC50=6.553), C4 (pIC50=6.696), C5 (pIC50=6.952), C9 (pIC50=7.542), and A0 (pIC50=7.951). The HCA was performed in Pirouette 4 [48].

The coefficient of multiple determination of prediction (R2pred) and the standard error of external prediction (SEP) were used as measures of the predictive power of the QSAR model. The recommended limit is R2pred > 0.5 [49], and SEP values should be as low as possible. However, this is not enough to guarantee that the model is really predictive. It is also recommended to check: (i) the slopes k or k’ of the linear regression lines between the observed activity (yi) and the predicted activity in the external validation (ŷei), where the slopes should be 0.85 ≤ x ≤ 1.15 (x = k or k’); and (ii) the absolute value of the difference between the coefficients of multiple determination, R20 and R’20, which should be smaller than 0.3 [26, 27].
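The external validation checks listed above can be sketched as follows. Several definitions exist in the literature for the regression-through-origin statistics, so the formulas here are one common convention and should be read as an assumption, not as the exact ones used in the study: k and k’ are slopes of lines through the origin, R20 and R’20 are the corresponding determination coefficients, and R2pred is computed relative to the training set mean. All numbers are hypothetical and the function name is an assumption.

```python
def external_validation(y_obs, y_pred, y_train_mean):
    """Compute R2pred and regression-through-origin statistics for a test set."""
    n = len(y_obs)
    mean_obs, mean_pred = sum(y_obs) / n, sum(y_pred) / n
    sum_xy = sum(o * p for o, p in zip(y_obs, y_pred))

    # Slopes of the regression lines forced through the origin.
    k = sum_xy / sum(p * p for p in y_pred)        # observed vs predicted
    k_prime = sum_xy / sum(o * o for o in y_obs)   # predicted vs observed

    r2_0 = 1.0 - sum((o - k * p) ** 2 for o, p in zip(y_obs, y_pred)) / sum(
        (o - mean_obs) ** 2 for o in y_obs
    )
    r2_0_prime = 1.0 - sum(
        (p - k_prime * o) ** 2 for o, p in zip(y_obs, y_pred)
    ) / sum((p - mean_pred) ** 2 for p in y_pred)

    r2_pred = 1.0 - sum((o - p) ** 2 for o, p in zip(y_obs, y_pred)) / sum(
        (o - y_train_mean) ** 2 for o in y_obs
    )
    accepted = (
        r2_pred > 0.5
        and 0.85 <= k <= 1.15
        and 0.85 <= k_prime <= 1.15
        and abs(r2_0 - r2_0_prime) < 0.3
    )
    return {"r2_pred": r2_pred, "k": k, "k_prime": k_prime,
            "r2_0": r2_0, "r2_0_prime": r2_0_prime, "accepted": accepted}


# Hypothetical test set of five compounds (not the values of Table 2).
y_obs = [6.5, 6.8, 7.0, 7.5, 8.0]
y_pred = [6.7, 6.7, 7.2, 7.3, 7.8]
stats = external_validation(y_obs, y_pred, y_train_mean=7.0)
for name, value in stats.items():
    print(name, value)
```

With only five test compounds, as in the study, these statistics are sensitive to each individual prediction, which is one reason to check all of the criteria and not R2pred alone.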


Notes

This article is available from: http://dx.doi.org/10.3797/scipharm.1112-21

The MCT/CNPq/Fundação Araucária (www.fundacaoaraucaria.org.br) is acknowledged for financial support (under Protocol 2010/7354).

Supporting Information

Values of selected descriptors for each compound are available in Table S1. The results of leave-one-out cross-validation are available in Table S2. The dendrogram used to aid in the selection of test set is available in Figure S1. Statistics parameters and adopted limits for the evaluation of the quality of the QSAR model are also available as supporting information. These documents are available in the online version (Format: PDF, Size: < 0.1 MB): http://dx.doi.org/10.3797/scipharm.1112-21.

Author’s Statement
Competing Interests

The author declares no conflict of interest.


References
[1]. Kontogiorgis CA, Papaioannou P, Hadjipavlou-Litina DJ. Matrix metalloproteinase inhibitors: a review on pharmacophore mapping and (Q)SARs results. Curr Med Chem. 2005;12:339–355. http://www.ncbi.nlm.nih.gov/pubmed/15723623. PMID: 15723623
[2]. Pirard B. Insight into the structural determinants for selective inhibition of matrix metalloproteinases. Drug Discov Today. 2007;12:640–646. http://dx.doi.org/10.1016/j.drudis.2007.06.003. PMID: 17706545
[3]. Tu G, Xu W, Huang H, Li S. Progress in the development of matrix metalloproteinase inhibitors. Curr Med Chem. 2008;15:1388–1395. http://dx.doi.org/10.2174/092986708784567680. PMID: 18537616
[4]. Griffin MO, Ceballos G, Villarreal FJ. Tetracycline compounds with non-antimicrobial organ protective properties: Possible mechanisms of action. Pharmacol Res. 2011;63:102–107. http://dx.doi.org/10.1016/j.phrs.2010.10.004. PMID: 20951211
[5]. Patrick GL. An Introduction to Medicinal Chemistry. 4th ed. Oxford: Oxford University Press; 2009. p. 752.
[6]. Al-Quntar AA, Baum O, Reich R, Srebnik M. Recently synthesized class of vinylphosphonates as potent matrix metalloproteinase (MMP-2) inhibitors. Arch Pharm. 2004;337:76–80. http://dx.doi.org/10.1002/ardp.200300828.
[7]. Li X, Li J. Recent advances in the development of MMPIs and APNIs based on the pyrrolidine platforms. Mini Rev Med Chem. 2010;10:794–805. http://dx.doi.org/10.2174/138955710791608334. PMID: 20482497
[8]. Zhang L, Zhang J, Fang H, Wang Q, Xu W. Design, synthesis and preliminary evaluation of new cinnamoyl pyrrolidine derivatives as potent gelatinase inhibitors. Bioorg Med Chem. 2006;14:8286–8294. http://dx.doi.org/10.1016/j.bmc.2006.09.015. PMID: 17008101
[9]. Ribeiro FAL, Ferreira MMC. QSPR models of boiling point, octanol–water partition coefficient and retention time index of polycyclic aromatic hydrocarbons. J Mol Struct Theochem. 2003;663:109–126. http://dx.doi.org/10.1016/j.theochem.2003.08.107.
[10]. Molfetta FA, Bruni AT, Rosseli FP, Silva ABF. A partial least squares and principal component regression study of quinone compounds with trypanocidal activity. Struct Chem. 2007;18:49–57. http://dx.doi.org/10.1007/s11224-006-9120-3.
[11]. QSAR Modeling, version 2.0. Theoretical and Applied Chemometrics Laboratory, State University of Campinas, Brazil. http://lqta.iqm.unicamp.br.
[12]. Teófilo RF, Martins JP, Ferreira MMC. Sorting variables by using informative vectors as a strategy for feature selection in multivariate regression. J Chemometrics. 2009;23:32–48. http://dx.doi.org/10.1002/cem.1192.
[13]. Hernández N, Kiralj R, Ferreira MMC, Talavera I. Critical comparative analysis, validation and interpretation of SVM and PLS regression models in a QSAR study on HIV-1 protease inhibitors. Chemometr Intell Lab. 98:65–77. http://dx.doi.org/10.1016/j.chemolab.2009.04.012.
[14]. Melo EB. Multivariate SAR/QSAR of 3-aryl-4-hydroxyquinolin-2(1H)-one derivatives as type I fatty acid synthase (FAS) inhibitors. Eur J Med Chem. 2010;45:5817–5826. http://dx.doi.org/10.1016/j.ejmech.2010.09.044. PMID: 20965618
[15]. Melo EB. A new quantitative structure–property relationship model to predict bioconcentration factors of polychlorinated biphenyls (PCBs) in fishes using E-state index and topological descriptors. Ecotoxicol Environ Saf. 2012;75:213–222. http://dx.doi.org/10.1016/j.ecoenv.2011.08.026. PMID: 21959189
[16]. van de Waterbeemd H. PLS for multivariate linear modeling. In: Chemometric Methods in Molecular Design. Weinheim: Wiley-VCH; 1998. p. 195–218.
[17]. Gramatica P. Principles of QSAR models validation: internal and external. QSAR Comb Chem. 2007;26:694–701. http://dx.doi.org/10.1002/qsar.200610151.
[18]. Kiralj R, Ferreira MMC. Basic validation procedures for regression models in QSAR and QSPR studies: theory and application. J Braz Chem Soc. 2009;20:770–787. http://dx.doi.org/10.1590/S0103-50532009000400021.
[19]. Eriksson L, Jaworska J, Worth AP, Cronin MTD, McDowell RM, Gramatica P. Methods for reliability and uncertainty assessment and for applicability evaluations of classification and regression-based QSARs. Environ Health Perspect. 2003;111:1361–1375. http://dx.doi.org/10.1289/ehp.5758. PMID: 12896860
[20]. Melagraki G, Afantitis A, Sarimveis H, Koutentis PA, Markopolus J, Igglessi-Markopoulou O. Optimization of biaryl piperidine and 4-amino-2-biarylurea MCH1 receptor antagonists using QSAR modeling, classification techniques and virtual screening. J Comput Aided Mol Des. 2007;21:251–267. http://dx.doi.org/10.1007/s10822-007-9112-4. PMID: 17377847
[21]. van de Waterbeemd H. Statistical validation of QSAR results. In: Chemometric Methods in Molecular Design. Weinheim: Wiley-VCH; 1998. p. 309–318.
[22]. Golbraikh A, Tropsha A. Beware of q2! J Mol Graph Model. 2002;20:269–276. http://dx.doi.org/10.1016/S1093-3263(01)00123-1. PMID: 11858635
[23]. Aptula AO, Jeliazkova NG, Schultz TW, Cronin MTD. The better predictive model: high q2 for the training set or low root mean square error of prediction for the test set? QSAR Comb Chem. 2005;24:385–396. http://dx.doi.org/10.1002/qsar.200430909.
[24]. Consonni V, Ballabio D, Todeschini R. Evaluation of model predictive ability by external validation techniques. J Chemometrics. 2010;24:194–201. http://dx.doi.org/10.1002/cem.1290.
[25]. Golbraikh A, Shen M, Xiao Z, Xiao Y, Lee K, Tropsha A. Rational selection of training and test set for the development of validated QSAR models. QSAR Comb Chem. 2003;17:241–253. http://dx.doi.org/10.1023/A:1025386326946.
[26]. Todeschini R,Consonni V. Molecular Descriptors for Chemoinformatics2th ed1 alphabetical listing. WeinheimWiley-VCHYear: 2009967
[27]. Organization for Economic Co-Operation and Development (OECD)Guidance document on the validation of (quantitative) structure-activity relationship [(Q)SAR] models http://www.oecd.org/dataoecd/33/37/37849783.pdf.
[28]. Cheng M,De B,Almstead NG,Pikul S,Dowty ME,Dietsch CR,Dunaway CM,Gu F,Hsieh LC,Janusz MJ,Taiwo YO,Natchus MG,Hudlicky T,Mandel M. Design, synthesis, and biological evaluation of matrix metalloproteinase inhibitors derived from a modified proline scaffoldJ Med ChemYear: 19994254265436 http://dx.doi.org/10.1021/jm9904699. 10639284
[29]. Discovery Studio Visualizer, version 2.5.5.9350. Accelrys Software Inc, www.accelrys.com
[30]. Liu Y,Ke Z,Cui J,Chen W,Ma L,Wang B. Synthesis, inhibitory activities, and QSAR study of xanthone derivatives as α-glucosidase inhibitorsBioorg Med ChemYear: 20081671857192 http://dx.doi.org/10.1016/j.bmc.2008.06.043. 18632275
[31]. Dragon, version web 3.0. Talete srl, www.talete.mi.it
[32]. Marvin, version 4.1.8. ChemAxon Inc. www.chemaxon.com/marvin
[33]. Miller KJ,Savchik JA. A new empirical method to calculate average molecular polarizabilitiesJ Am Chem SocYear: 197910172067213 http://dx.doi.org/10.1021/ja00518a014.
[34]. Devillers J,Balaban ATTopological Indices and Related Descriptors in QSAR and QSPRLondonGordon and BreachYear: 1999491562
[35]. Hyperchem, version 7.1. Hyper Co. www.hyper.com
[36]. Irwin JJ,Shoichet BK. ZINC - a free database of commercially available compounds for virtual screeningJ Chem Inf ModelYear: 200545177182 http://dx.doi.org/10.1021/ci049714+. 15667143
[37]. Gaussian, version 09. Gaussian Inc, www.gaussian.com
[38]. Weininger D. SMILES, a chemical language and information system. 1. Introduction to methodology and encoding rulesJ Chem Inf Comput SciYear: 1988283148 http://dx.doi.org/10.1021/ci00057a005.
[39]. Parameter Client. Virtual Computational Chemistry Laboratory. www.vcclab.org/lab/pclient
[40]. Gauss View Gauss View, version 05. Gaussian Inc, www.gaussian.com
[41]. Young DC. Computational chemistry: a practical guide for applying techniques to real-world problemsNew YorkWiley-InterscienceYear: 2001369
[42]. Wold S,Sjöström M,Eriksson L. PLS-regression: a basic tool of chemometricsChemometr Intell LabYear: 200158109130 http://dx.doi.org/10.1016/S0169-7439(01)00155-1.
[43]. Roy PP,Roy K. On Some Aspects of Variable Selection for Partial Least Squares Regression ModelsQSAR Comb SciYear: 200827302313 http://dx.doi.org/10.1002/qsar.200710043.
[44]. Ferreira MMC,Montanari CA,Gaudio AC. Variable selection in QSARQuím NovaYear: 200225439448 http://dx.doi.org/10.1590/S0100-40422002000300017.
[45]. González MP,Terán C,Saíz-Urra L,Teijeira M. Variable selection methods in QSAR: an overviewCurr Top Med ChemYear: 2008816061627 http://dx.doi.org/10.2174/156802608786786552. 19075770
[46]. Gaudio AC,Zandonade E. Proposition, validation and analysis of QSAR modelsQuím NovaYear: 200124658671 http://dx.doi.org/10.1590/S0100-40422001000500013.
[47]. Beebe KR,Pell RJ,Seasholtz MB. Chemometrics: a practical guideWileyNew YorkYear: 1998360
[48]. Pirouette, version 4. Infometrix Inc. www.infometrix.com
[49]. Roy PP,Leonard JT,Roy K. Exploring the impact of size of training sets for the development of predictive QSAR modelsChemometr Intell LabYear: 2008903142 http://dx.doi.org/10.1016/j.chemolab.2007.07.004.

Article Categories:
  • Research Article

Keywords: Matrix metalloproteinases, MMP2, Gelatinases, Cancer, QSAR, OPS.
