Volume 33 Issue 4
Sep.  2020

Abdulla Al Mamun, Zheng Mei, Ling Qiu, Xue-hai Ju. Theoretical Investigation on QSAR of (2-Methyl-3-biphenylyl) methanol Analogs as PD-L1 Inhibitor[J]. Chinese Journal of Chemical Physics , 2020, 33(4): 459-467. doi: 10.1063/1674-0068/cjcp1909168

Theoretical Investigation on QSAR of (2-Methyl-3-biphenylyl) methanol Analogs as PD-L1 Inhibitor

doi: 10.1063/1674-0068/cjcp1909168
  • Corresponding author: Ling Qiu. E-mail:lingqiu@jsinm.org; Xue-hai Ju. E-mail:xhju@njust.edu.cn
  • Received Date: 2019-09-24
  • Accepted Date: 2019-10-31
  • Publish Date: 2020-08-27
  • Abstract: Cancer is one of the most serious threats to human life. Blocking the programmed cell death protein 1 (PD-1)/programmed death ligand-1 (PD-L1) pathway is one of the major innovations of the last few years, but only a small number of inhibitors are able to block it. (2-Methyl-3-biphenylyl) methanol derivatives are among them. Here, a quantitative structure-activity relationship (QSAR) was established for twenty (2-methyl-3-biphenylyl) methanol derivatives as PD-L1 inhibitors. Density functional theory at the B3LYP/6-31+G(d, p) level was employed to study the chemical structures and properties of the chosen compounds. The highest occupied molecular orbital energy $E_{\rm{HOMO}}$, lowest unoccupied molecular orbital energy $E_{\rm{LUMO}}$, total energy $E_{\rm{T}}$, dipole moment DM, absolute hardness $\eta$, absolute electronegativity $\chi$, softness $S$, electrophilicity $\omega$, energy gap $\Delta E$, etc., were calculated. Principal component analysis (PCA), multiple linear regression (MLR), and multiple non-linear regression (MNLR) were carried out to establish the QSAR. The proposed quantitative models and the interpretation of the results are based on statistical analysis. The MLR and MNLR models gave coefficients of determination $R^2$ of 0.661 and 0.758, respectively. Leave-one-out cross-validation, the $r^2_{\rm{m}}$ metric, the $r^2_{\rm{m}}$ test, and Golbraikh and Tropsha's criteria were applied to validate the MLR and MNLR models, indicating that both models are statistically significant and stable with respect to data variation in the external validation. The results show that the MNLR model predicts the bioactivity more accurately than the MLR model and may be helpful for evaluating the biological activity of PD-L1 inhibitors.
  • [1] D. M. Pardoll, Nat. Rev. Cancer 12, 252 (2012). doi:  10.1038/nrc3239
    [2] G. Katarzyna, T. Marcin, M. Damian, K. Magdalena, H. Aleksandra, B. Urszula, P. Marcin, B. Roberto, D. Alexander, and H. A. Tad, Molecules 24, 2071 (2019). doi:  10.3390/molecules24112071
    [3] G. Katarzyna, Z. M. Krzysztof, G. Przemyslaw, M. Katarzyna, M. Bogdan, T. Ricarda, S. Lukasz, D. Alexander, D. Grzegorz, and H. A. Tad, J. Med. Chem. 60, 5857 (2017). doi:  10.1021/acs.jmedchem.7b00293
    [4] D. Huang, W. Wen, X. Liu, Y. Li, and J. Z. H. Zhang, RSC Adv. 26, 14944 (2019).
    [5] Z. T. M. Tryfon, K. Markella, G. Yongzhi, K. Dobroslawa, Z. Krzysztof, D. Grzegorz, H. A. Tad, and D. Alexander, Expert Opin. Ther. Pat. 26, 973 (2016). doi:  10.1080/13543776.2016.1206527
    [6] E. Perry, J. J. Mills, B. Zhao, F. Wang, Q. Sun, P. P. Christov, J. C. Tarr, T. A. Rietz, E. T. Olejniczak, T. Lee, and S. Fesik, Med. Chem. Lett. 29, 786 (2019). doi:  10.1016/j.bmcl.2019.01.028
    [7] L. S. Chupak and X. Zheng, WIPO, Wo2015034820A1 (2015).
    [8] L. S. Chupak, D. Min, S. W. Martin, Z. Xiaofan, P. Hewawasam, T. P. Connolly, X. Ningning, Y. K. Sun, J. Zhu, and D. R. Langley, WIPO, Wo2015160641 (2015).
    [9] A. F. A. Magid, ACS Med. Chem. Lett. 6, 489 (2015). doi:  10.1021/acsmedchemlett.5b00148
    [10] D. Niranjan, G. Simran, S. Deepali, J. Aakash, and O. P. Agrawal, J. Drug Deliv. Ther. 9, 645 (2019).
    [11] R. K. Kunal, D. Supratik, and N. Rudra, Understanding the Basics of QSAR for Applications in Pharmaceutical Sciences and Risk Assessment, Boston: Academic Press, (2015).
    [12] P. Buchwald and B. Nicholas, Drug Future 27, 577 (2002). doi:  10.1358/dof.2002.027.06.856934
    [13] A. S. Planche, ACS Omega 4, 3122 (2019). doi:  10.1021/acsomega.8b03693
    [14] A. S. Kulkarni, A. J. Kasabe, M. S. Bhatia, and V. L. Gaikwad, AAPS PharmSciTech 20, 268 (2019). doi:  10.1208/s12249-019-1480-2
    [15] N. Yorulmaz, O. Oltulu, and E. Eroǧlu, J. Mol. Struct. 1163, 270 (2018). doi:  10.1016/j.molstruc.2018.02.107
    [16] A. A. Buglak, A. V. Zherdev, H. T. Lei, and B. B. Dzantiev, PloS One 14, e0214879 (2019). doi:  10.1371/journal.pone.0214879
    [17] G. Susithra, S. Ramalingam, S. Periandy, and R. Aarthi, Egypt. J. Basic Appl. Sci. 5, 313 (2018). doi:  10.1016/j.ejbas.2018.05.011
    [18] E. Elhallaoui, E. Elasri, F. Ouazzani, A. Mechaqrane, and T. Lakhlifi, Int. J. Mol. Sci. 4, 249 (2003). doi:  10.3390/i4050249
    [19] N. F. Holguín, J. Frau, and D. G. Mitnik, IntechOpen, (2019).
    [20] P. M. Khan and K. Roy, Expert Opin. Drug Dis. 13, 1075 (2018). doi:  10.1080/17460441.2018.1542428
    [21] N. Hernández, R. Kiralj, M. M. C. Ferreira, and I. Talavera, Chemometr. Intell. Lab. Syst. 98, 65 (2009). doi:  10.1016/j.chemolab.2009.04.012
    [22] M. J. Frisch, G. W. Trucks, H. B. Schlegel, G. E. Scuseria, M. A. Robb, J. R. Cheeseman, G. Scalmani, V. Barone, B. Mennucci, G. A. Petersson, H. Nakatsuji, M. Caricato, X. Li, H. P. Hratchian, A. F. Izmaylov, J. Bloino, G. Zheng, J. L. Sonnenberg, M. Hada, M. Ehara, K. Toyota, R. Fukuda, J. Hasegawa, M. Ishida, T. Nakajima, Y. Honda, O. Kitao, H. Nakai, T. Vreven, J. A. Montgomery, J. E. Peralta, F. Ogliaro, M. Bearpark, J. J. Heyd, E. Brothers, K. N. Kudin, V.N. Staroverov, T. Keith, R. Kobayashi, J. Normand, K. Raghavachari, A. Rendell, J. C. Burant, S. S. Iyengar, J. Tomasi, M. Cossi, N. Rega, J. M. Millam, M. Klene, J. E. Knox, J. B. Cross, V. Bakken, C. Adamo, J. Jaramillo, R. Gomperts, R. E. Stratmann, O. Yazyev, A. J. Austin, R. Cammi, C. Pomelli, J. W. Ochterski, J. M. Martin, K. Morokuma, V. G. Zakrzewski, G. A.Voth, P. Salvador, J. J. Dannenberg, S. Dapprich, A. D. Daniels, O. Farkas, J. B. Foresman, J. V. Ortiz, J. Cioslowski, and D. J. Fox, Gaussian 09, Revision A.02, Wallingford CT: Gaussian Inc., (2009).
    [23] A. Gupta, V. Kumar, and P. Aparoy, Curr. Top. Med. Chem. 18, 1075 (2018). doi:  10.2174/1568026618666180719164149
    [24] M. Karelson, V. S. Lobanov, and A. R. Katritzky, Chem. Rev. 96, 1027 (1996). doi:  10.1021/cr950202r
    [25] T. Lu and F. Chen, J. Comput. Chem. 33, 580 (2012). doi:  10.1002/jcc.22885
    [26] M. Larif, A. Adad, R. Hmammouchi, A. I. Taghki, A. Soulaymani, A. Elmidaoui, M. Bouachrine, and T. Lakhlifi, Arab. J. Chem. 10, S946 (2017). doi:  10.1016/j.arabjc.2012.12.033
    [27] XLSTAT Company, XLSTAT Software, http://www.xlstat.com, (2013).
    [28] D. Soro, L. Ekou, and M. G. R. Koné, Int. Res. J. Pure Appl. Chem. 16, 1 (2018).
    [29] A. Golbraikh and A. Tropsha, J. Mol. Graph. Model. 20, 269 (2002). doi:  10.1016/S1093-3263(01)00123-1
    [30] P. P. Roy and K. Roy, QSAR Comb. Sci. 27, 302 (2008). doi:  10.1002/qsar.200710043
    [31] J. T. Ristovski, N. Janković, V. Borčić, S. Jain, Z. Bugarčić, and M. Mikov, J. Pharmaceut. Biomed. 155, 42 (2018). doi:  10.1016/j.jpba.2018.03.038
    [32] M. Ghamali, S. Chtita, R. Hmamouchi, A. Adad, M. Bouachrine, and T. Lakhlifi, J. Taibah Univ. Sci. 10, 534 (2016). doi:  10.1016/j.jtusci.2015.09.006
    [33] R. M. O'brien, Qual. Quant. 41, 673 (2007). doi:  10.1007/s11135-006-9018-6
    [34] K. Roy, I. Mitra, S. Kar, K. Ojha, R. N. Das, and H. Kabir, J. Chem. Inf. Model. 52, 396 (2012). doi:  10.1021/ci200520g
    [35] A. Cherkasov, E. N. Muratov, D. Fourches, A. Varnek, I. I. Baskin, M. Cronin, J. Dearden, P. Gramatica, Y. C. Martin, R. Todeschini, V. Consonni, V. E. Kuz'min, R. Cramer, R. Benigni, C. Yang, J. Rathman, L. Terfloth, J. Gasteiger, A. Richard, and A. Tropsha, J. Med. Chem. 57, 4977 (2014). doi:  10.1021/jm4004285
    [36] A. K. Debnath, A. K. Ghose, and V. N. Viswanadhan, Combinatorial Library Design and Evaluation, New York: Marcel Dekker Inc., 73 (2001).
    [37] K. Roy, P. Chakraborty, I. Mitra, P. K. Ojha, S. Kar, and R. N. Das, J. Comput. Chem. 34, 1071 (2013). doi:  10.1002/jcc.23231
    [38] A. Gajewicz, Environ. Sci. Nano 5, 408 (2018). doi:  10.1039/C7EN00774D
    [39] T. I. Netzeva, A. P. Worth, T. Aldenberg, and R. Benigni, Altern. Lab. Anim. 33, 155 (2005). doi:  10.1177/026119290503300209
    [40] L. Eriksson, J. Jaworska, A. P. Worth, M. T. D. Cronin, R. M. Mcdowel, and P. Gramatica, Environ. Health Perspect. 111, 1361 (2003). doi:  10.1289/ehp.5758
    [41] J. Jaworska, N. N. Jeliazkova, and T. Aldenberg, Altern. Lab. Anim. 33, 445 (2005). doi:  10.1177/026119290503300508
  • Programmed death ligand-1 (PD-L1, also known as CD274 or B7-H1) is an important cancer biomarker that has attracted increasing attention and is considered a suitable target for immunotherapy [1, 2]. However, antibody-based immunotherapies have several drawbacks, such as high cost, short half-life, and damage to the immune system. Small-molecule inhibitors of PD-1/PD-L1 (PD-1: programmed cell death protein 1) could overcome these drawbacks, but their development is still challenging because of incomplete structural information [3]. Several classes of PD-1/PD-L1 inhibitors have been reported, such as peptides, macrocycles, and peptidomimetics [4-6]. Using a homogeneous time-resolved fluorescence (HTRF) binding assay, Chupak et al. showed that nonpeptide compounds are also suitable for inhibiting PD-1/PD-L1 [7]. However, no structural data at the quantum-chemical level have been reported to support their activity. The (2-methyl-3-biphenylyl) methanol (MBPM) derivatives showed bioactivity as inhibitors of the PD-1/PD-L1 protein/protein and protein/ligand interactions. The development of small-molecule PD-1/PD-L1 inhibitors therefore plays a vital role in improving the immune response of patients suffering from cancer and chronic infectious diseases [8, 9].

    The QSAR methodology is widely used in medicinal chemistry to evaluate the relationship between molecular descriptors and biological activities [10]. The core concept of QSAR is that structurally similar molecules have similar properties, whereas structural differences among molecules give rise to differences in activity. In practice, however, it is difficult to compare physicochemical properties and biological activities directly [11]. The main advantages of the QSAR technique are that it can (i) identify the basic core structure responsible for the targeted response and (ii) estimate the bioactivities of untested molecules without screening, which supports the planning of virtual libraries [12]. QSAR can be used to identify the primary pharmacophore or the active molecules with anti-cancer activity [13]. QSAR plays a vital role in drug design and is an integral part of pharmaceutical analysis [14-17]. Both 3D-QSAR and classical QSAR are active areas of research in drug design. The basis of every QSAR method is the "description" of the molecular structure, and a large number of descriptors have been established for QSAR studies [18]. Recently, many researchers have used descriptors calculated with density functional theory (DFT) rather than with the semi-empirical AM1 or PM3 methods, in order to obtain more accurate results and more reliable QSAR models [19]. The development of PD-L1 inhibitors is currently a hot topic. A QSAR model for PD-L1 inhibitors is essential for the further selection or design of new inhibitors, and it is cost-effective and time-saving.

    In this work, we developed two novel models using DFT-derived descriptors to evaluate the relationship between the structures of MBPM derivatives and their anti-cancer activity toward PD-L1. Principal component analysis (PCA), multiple linear regression (MLR), and multiple non-linear regression (MNLR) were applied to establish the QSAR models, and leave-one-out cross-validation (LOO-CV), the $r^2_{\rm{m}}$ metric, the $r^2_{\rm{m}}$ test, and Golbraikh and Tropsha's criteria were applied to validate them and to evaluate the predicted anti-cancer activity of the PD-L1 inhibitors.

  • The experimental activities of the MBPM small molecules (FIG. 1) have been reported in recent publications, from which twenty molecules were taken. All IC$_{50}$ values were transformed into pIC$_{50}$ ($-\log$IC$_{50}$, IC$_{50}$ in units of μmol/L), as listed in Table Ⅰ, and were used as the dependent variable in the subsequent QSAR analyses [3, 5, 7]. For QSAR modeling, the data set was divided into two sets: a test set consisting of the randomly selected compounds 6, 8, 11, 14, and 18, and a training set consisting of the remaining fifteen compounds [20, 21]. Moreover, leave-one-out (LOO) cross-validation was applied for the internal validation of the training set.
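
    The transformation to pIC$_{50}$ can be sketched as follows. This is a minimal illustration that follows the convention stated in the text (IC$_{50}$ expressed in μmol/L before taking the logarithm); the function name `pic50_from_ic50_umol` and the sample values are hypothetical and are not data from Table Ⅰ.

```python
import math

def pic50_from_ic50_umol(ic50_umol_per_l: float) -> float:
    """Return pIC50 = -log10(IC50), with IC50 given in umol/L (convention assumed from the text)."""
    if ic50_umol_per_l <= 0:
        raise ValueError("IC50 must be positive")
    return -math.log10(ic50_umol_per_l)

# Hypothetical IC50 values in umol/L, for illustration only
for ic50 in (0.005, 0.05, 0.5):
    print(ic50, round(pic50_from_ic50_umol(ic50), 3))
```

    A ten-fold decrease in IC$_{50}$ raises pIC$_{50}$ by one unit, so more potent compounds have larger pIC$_{50}$ values.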

    Figure 1.  Derivatives of (2-methyl-3-biphenylyl) methanol (MBPM).

    Table Ⅰ.  Twenty MBPM derivatives (FIG. 1) and their binding affinities toward PD-L1.

  • Many kinds of descriptors are used to build QSAR models, describing properties such as electronic properties, lipophilicity, and charge distribution. Various quantum chemical parameters were used in this work to build a robust and predictive QSAR. First, the molecular structures were generated using GaussView 5.0.9. The geometries of the twenty MBPM molecules were then optimized at the B3LYP/6-31+G(d, p) level using the Gaussian 09 package [22]. The following molecular descriptors were extracted from the quantum chemical results: the highest occupied molecular orbital energy $E_{\rm{HOMO}}$ (in eV), the lowest unoccupied molecular orbital energy $E_{\rm{LUMO}}$ (in eV), total energy $E_{\rm{T}}$ (in eV), dipole moment (DM in Debye), absolute hardness $\eta$ (in eV), absolute electronegativity $\chi$ (in eV), softness $S$, electrophilicity $\omega$ (in eV), energy gap $\Delta E$ (in eV), the most positive net atomic charge $Q_{\rm{MAX}}$, and the most negative net atomic charge $Q_{\rm{MIN}}$. Mulliken charges were used for $Q_{\rm{MAX}}$ and $Q_{\rm{MIN}}$, since many QSAR models use these charges [15-17]. $E_{\rm{HOMO}}$ and $E_{\rm{LUMO}}$ are directly related to the ionization potential and the electron affinity, respectively. The HOMO and LUMO energies also underlie the absolute electronegativity $\chi$ and the electrophilicity $\omega$. A low energy gap $\Delta E$ implies higher chemical reactivity. Hardness and softness are closely related to polarizability, while a larger dipole moment indicates higher polarity. The partial charges (the most positive $Q_{\rm{MAX}}$ and the most negative $Q_{\rm{MIN}}$ net atomic charges) can describe intermolecular interactions. The quantities $\eta$, $\chi$, $\omega$, and $S$ are determined by Eqs.(1)-(4) [23, 24].
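
    The frontier-orbital descriptors can be sketched in code as follows. The expressions used here are the common finite-difference (Koopmans-type) definitions, $\eta=(E_{\rm{LUMO}}-E_{\rm{HOMO}})/2$, $\chi=-(E_{\rm{HOMO}}+E_{\rm{LUMO}})/2$, $S=1/\eta$, $\omega=\chi^2/(2\eta)$, and $\Delta E=E_{\rm{LUMO}}-E_{\rm{HOMO}}$. They are an assumption, since the equations themselves are not reproduced in this text; in particular, some authors define the softness as $1/(2\eta)$. The function name and the orbital energies in the example are hypothetical.

```python
def reactivity_descriptors(e_homo: float, e_lumo: float) -> dict:
    """Global reactivity descriptors (eV) from HOMO/LUMO energies (eV).

    Assumed definitions: eta = gap/2, chi = -(E_HOMO + E_LUMO)/2,
    S = 1/eta (some authors use 1/(2*eta)), omega = chi**2 / (2*eta).
    """
    gap = e_lumo - e_homo
    if gap <= 0:
        raise ValueError("E_LUMO must be higher than E_HOMO")
    eta = gap / 2.0
    chi = -(e_homo + e_lumo) / 2.0
    return {
        "gap": gap,
        "eta": eta,
        "chi": chi,
        "softness": 1.0 / eta,
        "omega": chi ** 2 / (2.0 * eta),
    }

# Hypothetical orbital energies in eV, for illustration only
print(reactivity_descriptors(e_homo=-6.0, e_lumo=-1.0))
```

    Because $\eta$, $S$, and $\Delta E$ are all functions of the same energy difference under these definitions, they carry largely redundant information, which is why a redundancy check on the descriptor set is useful before regression.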

    The Multiwfn software was used to calculate the surface positive average value $S_{\rm{PAV}}$ (in eV) and the surface negative average value $S_{\rm{NAV}}$ (in eV) [25]. Table Ⅱ lists the molecular structure descriptors.

    Table Ⅱ.  Molecular descriptors obtained at DFT-B3LYP/6-31+G(d, p) level.

  • In this study, quantum chemical descriptors were used to identify the relationship between chemical structure and biological activity. The quantitative descriptors of the twenty compounds were analyzed by PCA and the other statistical methods with the XLSTAT software, version 2013 [26, 27].

    PCA is a useful statistical technique for extracting the largest possible amount of the information encoded in the structures and for visualizing the distribution of the compounds [28]. It summarizes the statistical information in the data shown in Table Ⅲ. First, MLR with backward elimination was employed to develop the structure-activity relationship (SAR) by minimizing the deviation between actual and predicted values. The descriptors selected in this way were also chosen as input variables for the MNLR. Testing the stability, predictive capacity, and generalization ability of the models is an essential part of QSAR evaluation. The predictive ability of a QSAR model is validated in two ways, i.e., by internal and external validation. Internal validation mainly relies on cross-validation.

    Table Ⅲ.  Correlation matrix [Pearson (n)] between various obtained descriptors.

    The LOO cross-validated coefficient $R^2_{\rm{CV}}$ was used to evaluate the internal predictive capability of the models. A value of $R^2_{\rm{CV}}$$>$0.5 usually indicates a robust QSAR model, but a high $R^2_{\rm{CV}}$ alone is not sufficient to assess the predictive potential of a QSAR model [29]. External validation was therefore performed to determine how well the QSAR models generalize to new compounds and thus their true predictive efficiency. Furthermore, the $r^2_{\rm{m}}$ metrics established by Roy and Roy were used to evaluate the closeness between the observed and predicted activities [30].

  • PCA was performed to show the correlation between the descriptors [31]. The correlation matrix of the 13 descriptors is shown in Table Ⅲ, which indicates how strongly the variables are related. In total, 13 descriptors encoding the twenty molecules were submitted to PCA. The first four principal axes obtained by PCA are sufficient to describe the information provided by the data matrix. The percentages of variance are 46.14%, 22.43%, 14.82%, and 5.96% for the axes associated with $E_{\rm{HOMO}}$, $E_{\rm{LUMO}}$, DM, and $Q_{\rm{MAX}}$, respectively, giving a total of 89.35%. In general, co-linearity ($r$$>$0.5) was observed among the variables and between the variables and pIC$_{50}$ [32]. A high correlation was found between the softness and the energy gap ($r$=0.999). A cut-off value of $R$$\geq$0.9 was used to eliminate redundancy in the data matrix.
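
    The redundancy check with a correlation cut-off can be sketched as follows. This is a minimal illustration that flags descriptor pairs whose Pearson coefficient reaches the cut-off of 0.9 mentioned in the text; applying the cut-off to the absolute value of $r$ is an assumption, and the descriptor values are hypothetical rather than taken from Table Ⅱ.

```python
import math
from itertools import combinations

def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def redundant_pairs(table, cutoff=0.9):
    """Return (name_a, name_b, r) for every descriptor pair with |r| >= cutoff."""
    pairs = []
    for a, b in combinations(table, 2):
        r = pearson_r(table[a], table[b])
        if abs(r) >= cutoff:
            pairs.append((a, b, r))
    return pairs

# Hypothetical descriptor columns for five compounds, for illustration only
gap = [4.8, 5.0, 5.1, 4.6, 4.9]
descriptors = {
    "gap": gap,
    "eta": [g / 2.0 for g in gap],  # hardness is half the gap, so it is redundant
    "DM": [2.1, 5.3, 3.0, 4.4, 1.2],
}
for name_a, name_b, r in redundant_pairs(descriptors):
    print(name_a, name_b, round(r, 3))
```

    Only one member of each flagged pair needs to be kept as a candidate variable for the regression.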

  • The MLR model was established by backward elimination to predict the biological activity. The most effective linear model, Eq.(5), employs the highest occupied molecular orbital energy $E_{\rm{HOMO}}$, the lowest unoccupied molecular orbital energy $E_{\rm{LUMO}}$, the dipole moment (DM), and the most positive net atomic charge ($Q_{\rm{MAX}}$) as molecular descriptors.

    The statistics of Eq.(5) are $N_{\rm{training}}$=15, $R$=0.813, $R^2$=0.661, $R^2_{\rm{CV}}$=0.601, $\bar r^2_{\rm{m(LOO)}}$=0.574, $\Delta r^2_{\rm{m(LOO)}}$=0.062, RMSE$_{\rm{train}}$=0.518, $F$=4.886, $P$$<$0.01, $N_{\rm{test}}$=5, $R^2_{\rm{pred}}$=0.691, $\bar r^2_{\rm{m(test)}}$=0.529, $\Delta r^2_{\rm{m(test)}}$=0.108, and RMSE$_{\rm{test}}$=0.492.

    Here, $N$ is the number of compounds in the training or test set, $R^2$ is the coefficient of determination, RMSE is the root mean square error, $F$ is the Fisher statistic, and $P$ is the significance level. A lower RMSE and a higher $R^2$ indicate that the model is reliable. The MLR model was cross-validated by the LOO method, which gave an $R^2_{\rm{CV}}$ value of 0.601. The metric values $\bar r_{\rm{m}}^2$ and $\Delta r_{\rm{m}}^2$ show that the MLR QSAR model is acceptable. Here, $\bar r_{\rm{m}}^2$ is the average of $r_{\rm{m}}^2$ and $r'^2_{\rm{m}}$, and $\Delta r_{\rm{m}}^2$ is the absolute difference between $r_{\rm{m}}^2$ and $r'^2_{\rm{m}}$. The training-set parameters $\bar r^2_{\rm{m(LOO)}}$ and $\Delta r^2_{\rm{m(LOO)}}$ were used for internal validation, while $\bar r^2_{\rm{m(test)}}$ and $\Delta r^2_{\rm{m(test)}}$ were used for external validation.
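
    The relation between $R^2$ and the Fisher statistic can be checked with a short calculation. The sketch below uses the standard overall $F$ statistic of a linear model with an intercept, $F=(R^2/p)/[(1-R^2)/(n-p-1)]$, and an RMSE with $n$ in the denominator; whether the original software used exactly these conventions is an assumption, and the function names are hypothetical.

```python
import math

def rmse(observed, predicted):
    """Root mean square error, using the number of compounds n as denominator (assumed convention)."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)

def f_statistic(r2, n, p):
    """Overall F statistic of a linear model with intercept, n compounds and p descriptors."""
    return (r2 / p) / ((1.0 - r2) / (n - p - 1))

# Values reported for the MLR model: R^2 = 0.661, N_training = 15, four descriptors
print(round(f_statistic(0.661, 15, 4), 3))
```

    With the rounded $R^2$ of 0.661 this gives approximately 4.87, close to the reported $F$=4.886; the small difference is consistent with rounding of $R^2$.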

    The activities of the test set were also predicted with the MLR model. The value of $R^2_{\rm{pred}}$ is 0.691, which confirms that the proposed MLR model has good predictive capability. RMSE$_{\rm{train}}$ and RMSE$_{\rm{test}}$ were 0.518 and 0.492, respectively. Table Ⅳ lists the variance inflation factors (VIF) calculated from the multiple correlations among the four descriptors. The VIF is defined as 1/(1$-$$R^2$), where $R$ is the multiple correlation coefficient obtained when one descriptor is regressed against all other descriptors in the proposed model [33]. A VIF value in the range 1.0 < VIF < 5.0 indicates that the model is appropriate. According to Table Ⅳ, the inter-correlation of the descriptors employed in the proposed model is very low.

    Table Ⅳ.  Variance inflation factors (VIF) of the descriptors in the QSAR model.
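
    The VIF definition above can be sketched in code as follows. This is a minimal illustration, assuming NumPy is available: each descriptor is regressed on the remaining descriptors (with an intercept) and VIF = 1/(1$-$$R^2$) is computed from that fit. The function name and the random descriptor matrix are hypothetical and do not reproduce Table Ⅳ.

```python
import numpy as np

def variance_inflation_factors(X):
    """VIF of each column of X, from regressing it on all other columns plus an intercept."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    vifs = []
    for j in range(p):
        y = X[:, j]
        others = np.delete(X, j, axis=1)
        A = np.column_stack([np.ones(n), others])
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ coef
        r2 = 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
        vifs.append(1.0 / (1.0 - r2))
    return np.array(vifs)

# Hypothetical descriptor matrix: 15 compounds x 4 descriptors, for illustration only
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(15, 4))
print(np.round(variance_inflation_factors(X_demo), 3))
```

    A VIF of 1 means that a descriptor is uncorrelated with the others; values that grow well beyond the range quoted above point to multicollinearity.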

    The MBPM derivatives listed in Table Ⅰ contain two backbone rings and one aromatic benzene/pyridine ring. To investigate the effect of the substituents at positions R1, R2, R3, and R4 of the aromatic benzene/pyridine ring on the biological activity, the four descriptors in the regression model, all of which contribute significantly to the model, were examined. $E_{\rm{HOMO}}$ is related to the ionization potential and reflects the electron-releasing ability of a compound. It has a positive sign in Eq.(5), which indicates that compounds with a higher $E_{\rm{HOMO}}$, i.e., a stronger electron-releasing ability, tend to be more active. The energy of the LUMO is directly related to the electron affinity and characterizes the susceptibility of a molecule to attack by nucleophiles. It has a negative sign in the model, which indicates that stronger electrophilicity favors activity. The dipole moment (DM) is an electronic descriptor that reflects the polarity of a compound and correlates with its solubility. DM has a positive sign in the model, which suggests that the activity increases with the polarity of the MBPM analogs. The most positive ($Q_{\rm{MAX}}$) and the most negative ($Q_{\rm{MIN}}$) net atomic charges are other common charge-based descriptors. Atomic charges are also used to explain the polarity of molecules.

    The predicted activities of the selected compounds vary, and some are close to the experimental values in the MLR correlation between predicted and actual values. Predicted values lying close to the regression line indicate significant activity against PD-L1. Compound 17 shows a high peak in FIG. 2: the pyridine group together with the -OCH$_3$ moiety at the R2 position is electron-donating, while the side chain with a secondary amine at R3 might be important for the activity against PD-L1. Compound 1 displays high activity, which might be due to the -OCH$_3$ groups at the R2 and R3 positions of the benzene ring, while the side chain at R4 slightly decreases the polarity of the compound. In compound 20, the methoxy groups at the R2 and R3 positions of the benzene ring exert a significant electron-donating effect, and a moderate peak appears in FIG. 2. Compound 15, with a morpholine moiety at the R4 position and methoxy groups at R2 and R3, shows slightly higher activity. Compound 2, which bears electron-withdrawing groups (bromine at R1 and a carboxylic group on the side chain at R4), shows slightly lower activity. In general, compounds without electron-donating groups are less active against PD-L1. The relationship between the predicted and observed activities is displayed in FIG. 2 and FIG. 3. The descriptors selected by MLR in Eq.(5) were then used as input parameters for the MNLR.

    Figure 2.  Graphical representation of active molecule observations vs. standardized residuals by MLR (training set).

    Figure 3.  Correlations between the experimental pIC$_{50}$ and predicted pIC$_{50}$ of the training and test sets analyzed by MLR.

  • Multiple non-linear regression (MNLR) was additionally used to improve the QSAR model. A new MNLR model was built using the same training set and the descriptors selected for MLR. The resulting MNLR equation is Eq.(6).

    The statistics of Eq.(6) are $N_{\rm{training}}$=15, $R$=0.870, $R^2$=0.758, $R^2_{\rm{CV}}$=0.710, $\bar r^2_{\rm{m(LOO)}}$=0.692, $\Delta r^2_{\rm{m(LOO)}}$=0.168, RMSE$_{\rm{train}}$=0.357, $N_{\rm{test}}$=5, $R^2_{\rm{pred}}$=0.601, $\bar r^2_{\rm{m(test)}}$=0.512, $\Delta r^2_{\rm{m(test)}}$=0.031, and RMSE$_{\rm{test}}$=0.559.

    The MNLR model (Eq.(6)) was cross-validated by the LOO method, which gave an $R^2_{\rm{CV}}$ value of 0.710. A valid QSAR model should have an $R^2_{\rm{CV}}$ value larger than 0.5 [34]. Here, the metric values $\bar r^2_{\rm{m}}$ and $\Delta r^2_{\rm{m}}$ show that the QSAR model is acceptable and significant.

    The robustness and predictive ability of the model were further supported by the $R^2_{\rm{pred}}$ value (0.601) of the test set. According to the internal, external, and metric values, the MNLR model performed better than the MLR model. FIG. 4 and FIG. 5 show the correlation between the predicted and observed activities and the residual plot, respectively. The data are evenly distributed around the regression line.

    Figure 4.  Correlations between the experimental pIC$_{50}$ and predicted pIC$_{50}$ of training and test sets analyzed by MNLR.

    Figure 5.  Graphical representation of active molecule observations vs. standardized residuals by MNLR (training set).

  • Validation metrics are criteria for evaluating the quality of the QSAR models obtained. Leave-one-out cross-validation (LOO-CV) and the $r_{\rm{m}}^2$ metric were employed as internal validation protocols.

    For LOO-CV, the training set is modified by removing one compound at a time. The QSAR model is then rebuilt from the remaining molecules of the training set, keeping the descriptor combination originally selected for the MLR and MNLR models. The activity of the removed compound is calculated from the resulting QSAR equation. This cycle is repeated until every compound of the training set has been removed once and its activity predicted. The activities predicted in this way for all training set compounds are used to calculate several internal validation parameters. Finally, the predicted residual sum of squares (PRESS) and the cross-validated $r_{\rm{CV}}^2$ were used to judge the predictive ability of the model [35].
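
    The LOO procedure described above can be sketched as follows for a linear model. This is a minimal illustration, assuming NumPy is available and an ordinary least-squares fit with an intercept; the cross-validated coefficient is computed as 1 $-$ PRESS/SS$_{\rm{tot}}$ with SS$_{\rm{tot}}$ taken about the mean of the full training set, which is one common convention. The function name and the synthetic data are hypothetical.

```python
import numpy as np

def loo_cross_validation(X, y):
    """Leave-one-out cross-validation of an OLS model; returns (r2_cv, press)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(y)
    A = np.column_stack([np.ones(n), X])  # design matrix with intercept
    predictions = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i  # remove compound i
        coef, *_ = np.linalg.lstsq(A[keep], y[keep], rcond=None)
        predictions[i] = A[i] @ coef  # predict the removed compound
    press = float(np.sum((y - predictions) ** 2))
    ss_tot = float(np.sum((y - y.mean()) ** 2))
    return 1.0 - press / ss_tot, press

# Synthetic data: 15 compounds, 4 descriptors, for illustration only
rng = np.random.default_rng(1)
X_demo = rng.normal(size=(15, 4))
y_demo = 6.0 + X_demo @ np.array([0.8, -0.5, 0.3, -0.2]) + rng.normal(scale=0.3, size=15)
r2_cv, press = loo_cross_validation(X_demo, y_demo)
print(round(r2_cv, 3), round(press, 3))
```

    The same loop applies to a non-linear model if the least-squares fit inside the loop is replaced by the corresponding non-linear fit.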

    The $r_{\rm{m}}^2$ metric is used as an internal validation to evaluate the predictive ability of the model. A scaled data set is needed to obtain the $r_{\rm{m}}^2$ metrics [36]. The parameters $\bar r_{\rm{m(LOO)}}^2$ and $\Delta r_{\rm{m(LOO)}}^2$ are used for the internal validation of the training set: $\bar r_{\rm{m(LOO)}}^2$ is the average of $r_{\rm{m}}^2$ and $r'^2_{\rm{m}}$ and is required to be $>$0.5, while $\Delta r_{\rm{m(LOO)}}^2$ is the absolute difference between $r_{\rm{m}}^2$ and $r'^2_{\rm{m}}$ and is required to be $<$0.2. In this work, $\bar r_{\rm{m(LOO)}}^2$=0.574 and $\Delta r_{\rm{m(LOO)}}^2$=0.062 were obtained for the MLR model, and $\bar r_{\rm{m(LOO)}}^2$=0.692 and $\Delta r_{\rm{m(LOO)}}^2$=0.168 for the MNLR model, which indicates that both models have acceptable predictive ability. A QSAR model with high predictive ability yields predictions close to the actual pIC$_{50}$ values [37]. According to Golbraikh and Tropsha's criteria, a model is considered acceptable if all of the following conditions are satisfied [29]. (i) $r^2_{\rm{CV}}$$>$0.5. The MLR model gives $r^2_{\rm{CV}}$=0.601 and the MNLR model gives $r^2_{\rm{CV}}$=0.710. (ii) $R^2_{\rm{pred}}$$>$0.6. The MLR model gives $R^2_{\rm{pred}}$=0.691 and the MNLR model gives $R^2_{\rm{pred}}$=0.601. (iii) $(r^2-r_0^2)/r^2<0.1$ and $0.85\leq k\leq 1.15$, or $(r^2-r'^2_0)/r^2<0.1$ and $0.85\leq k'\leq 1.15$. For the MLR model, $(r^2-r_0^2)/r^2$=0.053, $(r^2-r'^2_0)/r^2$=0.001, $k$=0.991, and $k'$=1.003. For the MNLR model, $(r^2-r_0^2)/r^2$=0.06, $(r^2-r'^2_0)/r^2$=0.001, $k$=1.017, and $k'$=0.979. (iv) $|r_0^2-r'^2_0|<0.3$. The MLR model gives 0.016 and the MNLR model gives 0.050.

    The $r_{\rm{m}}^2$ test is also used for external validation. For an acceptable prediction, $\Delta r_{\rm{m(test)}}^2$ is required to be $<$0.2 and $\bar r_{\rm{m(test)}}^2$ is required to be $>$0.5. Table Ⅴ compares the internal and external validation of the two models. The MLR model gives $\Delta r_{\rm{m(test)}}^2$=0.108 and $\bar r_{\rm{m(test)}}^2$=0.529, and the MNLR model gives $\Delta r_{\rm{m(test)}}^2$=0.031 and $\bar r_{\rm{m(test)}}^2$=0.512, indicating that both QSAR models are acceptable.
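
    The four acceptance conditions can be checked mechanically. The sketch below only encodes the thresholds listed above and applies them to the statistics reported for the two models, taking $R^2_{\rm{pred}}$=0.601 for MNLR as listed with the MNLR model statistics; it does not recompute any statistic from raw data, and the function and argument names are hypothetical.

```python
def golbraikh_tropsha_checks(q2, r2_pred, rel_diff, rel_diff_prime, k, k_prime, abs_diff):
    """Evaluate the four acceptance conditions from already computed statistics."""
    slope_ok = 0.85 <= k <= 1.15
    slope_prime_ok = 0.85 <= k_prime <= 1.15
    return {
        "(i) cross-validated r2 > 0.5": q2 > 0.5,
        "(ii) R2_pred > 0.6": r2_pred > 0.6,
        "(iii) fit through origin": (rel_diff < 0.1 and slope_ok)
        or (rel_diff_prime < 0.1 and slope_prime_ok),
        "(iv) difference < 0.3": abs_diff < 0.3,
    }

# Statistics as reported in the text for the two models
mlr = golbraikh_tropsha_checks(0.601, 0.691, 0.053, 0.001, 0.991, 1.003, 0.016)
mnlr = golbraikh_tropsha_checks(0.710, 0.601, 0.06, 0.001, 1.017, 0.979, 0.050)
for name, checks in (("MLR", mlr), ("MNLR", mnlr)):
    print(name, all(checks.values()), checks)
```

    Both models pass all four conditions with the reported values, although the MNLR value of $R^2_{\rm{pred}}$ lies only just above the threshold of 0.6.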
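
    The $r_{\rm{m}}^2$ metrics can be sketched as follows. This illustration assumes the commonly quoted form $r_{\rm{m}}^2=r^2(1-\sqrt{|r^2-r_0^2|})$, where $r^2$ is the squared correlation between observed and predicted activities and $r_0^2$ is obtained from a regression forced through the origin; $r'^2_{\rm{m}}$ is obtained with the axes interchanged. Conventions for which variable goes on which axis differ between implementations, so the assignment below is an assumption, and the function names and activity values are hypothetical.

```python
import math

def _r2_through_origin(y, x):
    """r0^2 for the regression of y on x forced through the origin (y = k * x)."""
    k = sum(a * b for a, b in zip(y, x)) / sum(b * b for b in x)
    mean_y = sum(y) / len(y)
    ss_res = sum((a - k * b) ** 2 for a, b in zip(y, x))
    ss_tot = sum((a - mean_y) ** 2 for a in y)
    return 1.0 - ss_res / ss_tot

def rm2_metrics(observed, predicted):
    """Return r2, rm2, rm2_prime, their average and absolute difference, and the acceptance flag."""
    n = len(observed)
    mo, mp = sum(observed) / n, sum(predicted) / n
    sxy = sum((o - mo) * (p - mp) for o, p in zip(observed, predicted))
    sxx = sum((o - mo) ** 2 for o in observed)
    syy = sum((p - mp) ** 2 for p in predicted)
    r2 = sxy ** 2 / (sxx * syy)
    r0 = _r2_through_origin(observed, predicted)        # observed on the y axis (assumed)
    r0_prime = _r2_through_origin(predicted, observed)  # axes interchanged
    rm2 = r2 * (1.0 - math.sqrt(abs(r2 - r0)))
    rm2_prime = r2 * (1.0 - math.sqrt(abs(r2 - r0_prime)))
    average = (rm2 + rm2_prime) / 2.0
    delta = abs(rm2 - rm2_prime)
    return {"r2": r2, "rm2": rm2, "rm2_prime": rm2_prime, "average": average,
            "delta": delta, "acceptable": average > 0.5 and delta < 0.2}

# Hypothetical observed and predicted pIC50 values, for illustration only
obs = [5.1, 6.0, 6.8, 7.5, 8.2]
pred = [5.4, 5.9, 6.5, 7.8, 8.0]
print(rm2_metrics(obs, pred))
```

    The same function can be applied to LOO predictions of the training set for the internal metrics and to test-set predictions for the external ones.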

    Table Ⅴ.  Comparison of model performance between MLR and MNLR.

  • To visualize the applicability domain on Williams plot, the standardized residuals vs. leverage is displayed in FIG. 6 [38]. The distance between a compound from the centroid of $X$ is shown by leverage. The leverage of the compound in the original variable area is outlined as $h_i$ [39].

    Figure 6.  Williams plot for the presented MLR model. The solid red lines mark the $\pm$3 units of standardized residuals used for outlier detection, and the dashed red line marks the warning leverage.

    The descriptor vector of compound $i$ is denoted $x_i$, and the descriptor matrix $X$ is built from the descriptor values of the training set. The warning leverage ($h^*$) is given by Eq.(8):
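    Assuming the standard definitions used for Williams plots, the leverage and the warning leverage are commonly written as shown below. The $p+1$ form of the warning leverage is an assumption; some authors use $3p/n$ instead.

$$h_i = x_i^{\rm{T}} \left( X^{\rm{T}} X \right)^{-1} x_i$$

$$h^* = \frac{3(p+1)}{n}$$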

    Here $n$ is the number of training compounds and $p$ is the number of predictor variables. A compound with $h_i$$>$$h^*$ strongly influences the regression performance; it may not appear as an outlier because its standardized residual can be small, and it is therefore excluded [40]. The $\pm$3 limit on standardized residuals is generally used as the cut-off for an acceptable model, since for normally distributed data the range within $\pm$3 standard deviations of the mean covers about 99% of the data [41].

    The applicability domain was therefore characterized by combining the leverage and standardized residual values. FIG. 6 shows that neither the training set nor the test set contains an outlier with a standardized residual beyond $\pm$3. In addition, all the molecules have leverages lower than the warning value $h^*$=1.00.
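    The leverage check described above can be sketched in a few lines of dependency-free Python. This is only an illustration under stated assumptions: the leverage is taken as $h_i$=$x_i^{\rm{T}}(X^{\rm{T}}X)^{-1}x_i$, the warning leverage as 3($p$+1)/$n$, and whether $X$ carries a column of ones for the intercept is left to the caller. All function names and the toy matrix are hypothetical.

```python
def solve(a, b):
    """Solve a z = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]  # augmented copy of a
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        if abs(m[col][col]) < 1e-12:
            raise ValueError("X^T X is singular; descriptors are collinear")
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [vr - f * vc for vr, vc in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def leverages(x_train, x_query):
    """h_i = x_i^T (X^T X)^-1 x_i, with X built from the training set."""
    p = len(x_train[0])
    xtx = [[sum(row[j] * row[k] for row in x_train) for k in range(p)]
           for j in range(p)]
    return [sum(xi * zi for xi, zi in zip(x, solve(xtx, x))) for x in x_query]


def warning_leverage(n_train, n_descriptors):
    """Assumed convention: h* = 3 (p + 1) / n."""
    return 3.0 * (n_descriptors + 1) / n_train


def outside_domain(std_residual, h, h_star, cutoff=3.0):
    """True if a compound lies outside the Williams-plot limits."""
    return abs(std_residual) > cutoff or h > h_star


# Toy descriptor matrix, for illustration only (not data from this study).
x_train = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
print(leverages(x_train, x_train))
print(warning_leverage(n_train=15, n_descriptors=4))
```

    With four descriptors, a training set of 15 compounds would give $h^*$=1.00 under this convention, which is consistent with the warning value quoted above; the training-set size of 15 is an assumption used here only for illustration.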

    $E_{\rm{LUMO}}$ and $Q_{\rm{MAX}}$ have negative coefficients in the MLR model, so they contribute negatively to pIC$_{50}$, whereas the positive coefficients of $E_{\rm{HOMO}}$ and DM indicate that larger values of these variables give a larger pIC$_{50}$. In other words, increasing $E_{\rm{LUMO}}$ and $Q_{\rm{MAX}}$ decreases pIC$_{50}$, while increasing $E_{\rm{HOMO}}$ and DM increases it. The coefficient of $Q_{\rm{MAX}}$ in the MNLR model supports the same interpretation. The MNLR model likewise shows a negative coefficient for $E_{\rm{LUMO}}$ and a positive coefficient for DM. These signs are reasonable because a larger DM implies larger polarity, while a more negative $E_{\rm{LUMO}}$ contributes to molecular activation. Based on these analyses, the MNLR model of Eq.(6) gives the preferred relationship between pIC$_{50}$ and the molecular descriptors.

    The two models obtained here are used to predict pIC$_{50}$ values. Table Ⅵ lists the observed pIC$_{50}$ values and those calculated with the two models. The MNLR model has the better predictive ability.

    Table Ⅵ.  Observed and calculated pIC$_{50}$ by different methods.

  • QSAR models of twenty MBPM derivatives have been established using quantum chemical descriptors to predict the binding affinity constants (pIC$_{50}$) towards programmed death ligand-1 (PD-L1). The two constructed models show good stability and high predictive power, as evaluated by internal and external validation. The four selected descriptors, $E_{\rm{HOMO}}$, $E_{\rm{LUMO}}$, DM, and $Q_{\rm{MAX}}$, describe the electronic and charge properties that are responsible for the activity of the MBPM derivatives. They could be used together with other descriptors to develop explicit predictive QSAR models. Both MLR and MNLR pass various validation tests, namely LOO-CV, the $r^2_{\rm{m}}$ metric, the $r^2_{\rm{m}}$ test, and Golbraikh and Tropsha's criteria, which supports the reliability of the models. We found that the MNLR model is more accurate than the MLR model. The QSAR models developed in this study may be useful for understanding the action mechanism of MBPM derivatives towards PD-L1. Finally, the accuracy and predictability of the proposed models were analyzed by comparing major statistical parameters such as $R$ and $R^2$ using diverse statistical tools and descriptors. The MNLR model could be helpful for the further discovery of new PD-L1 inhibitors.

  • This work was supported by the Natural Science Foundation of Jiangsu Province (BK20181128), 333 Project of Jiangsu Province (BRA2016518), and Jiangsu Provincial Medical Youth Talent (QNRC2016626).
