1QSAR & Cheminformatics Laboratory, Department of Chemistry, Bareilly College, Bareilly, India.
Shalini Singh*
Rakesh Kumar, Shalini Singh, (2024). In silico modeling of 5-Chloro-2-thiophenyl-1,2,3 triazolymethyldihydro quinolines inhibitors as Mycobacterium Tuberculosis target. Pharmacy and Drug Development. 3(2). DOI: 10.58489/2836-2322/033
© 2024 Shalini Singh, this is an open-access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
5-chloro-2-thiophenyl-1,2,3-triazolymethyldihydroquinolines, MOPAC, PRECLAV, inhibitors.
An in silico study of Mycobacterium tuberculosis inhibitors based on a series of 5-chloro-2-thiophenyl-1,2,3-triazolymethyldihydroquinolines is reported. A large set of PRECLAV descriptors has been used to obtain four-parametric models. The quantitative structure-activity relationship (QSAR) study was carried out on a pool of 15 compounds. The quality of prediction is high (Se = 0.0583, r2 = 0.9607, F = 97.7573, Q = 0.7686). A heuristic algorithm selects the best multiple linear regression (MLR) equation, which shows a very good correlation between the observed and the calculated values of activity.
Influenza, caused by the influenza virus, and tuberculosis, caused by Mycobacterium tuberculosis, together pose one of the greatest threats to human health throughout the world.[1] The swine flu pandemic of 2009-10 claimed the lives of 1800 people, the majority of them in Mexico. The flu pandemic of 1918, which was caused by the same strain (H1N1) of the virus, killed tens of millions of people, about 3% of the world's population at that time. The enormous human suffering caused by the recent novel coronavirus is well known to all of us; its worldwide toll has reached about 4 million. In the 2009 South African cohort, a large number of patients who died from the H1N1 epidemic also had active TB.[2] Similarly, the effects of pandemic influenza were initially felt in patients with chronic respiratory conditions, including those with active tuberculosis.[3]
The global health community has set ambitious targets for the post-2015 End TB Strategy: a 25% reduction in incidence and a 75% reduction in mortality between 2015 and 2025, and, by 2035, a 95% reduction in mortality and a 90% reduction in incidence [4]. According to a news item published in the daily The Times of India on 16 February 2020, the Health Minister of India, Dr. Harsh Vardhan, stated that the country has the even more ambitious target of eliminating the disease by 2025 and that Mission Indradhanush is already operational for this purpose.
A more recent study found that people with TB and influenza co-infection face an increased risk of death compared with people hospitalized with TB mono-infection. This may be caused by a poor immune response to both Mycobacterium tuberculosis and influenza. Influenza can promote active tuberculosis in TB-infected individuals, and persons with tuberculosis may be more vulnerable to influenza infections [5]. The increasing severity of these two diseases has created difficulties in their treatment, and the data point to an urgent need for the discovery and development of new drugs that prevent the development of both diseases.
A series of 5-chloro-2-thiophenyl-1,2,3-triazolylmethyldihydroquinolines was prepared and assayed as dual inhibitors of Mycobacterium tuberculosis and influenza virus [6]. In our QSAR study we used 15 derivatives of this series as the calibration set. The goals of our QSAR study are the identification of the molecular features (significant molecular fragments included) having the largest influence on biochemical activity and the estimation of the activity of some not yet synthesized molecules in the prediction set. The dependent property in our QSAR study is 'activity'. The activity values are derived from the MIC (μg/mL), i.e. A = log(3120000/C), where C is the MIC. The chemical structures and the observed (experimental) activities of the molecules in the calibration set are presented in Figure 1 and Table 1. The data were taken from the literature [6].
MOPAC created the output files for each molecule. Based on this output, the PRECLAV [9][7] software calculated, for each molecule, more than 1000 whole-molecule descriptors specific to this program.
The PRECLAV program computes multilinear QSARs of type (1).
$A = C_0 + \sum_{i=1}^{k} C_i D_i$ (1)
where
A is (the value of) the activity; C0 is the free term (intercept); Ci are coefficients (weighting factors); Di are (the values of) the significant descriptors; k is the number of descriptors
The square of the Pearson linear correlation r2 of observed/computed values, the Fisher function F, the standard error of estimation SEE, and the quality function Q (5)(7) are criteria for the quality of prediction for the molecules in the calibration set.
$F = \frac{r^2}{1 - r^2} \cdot \frac{N - p}{p}$ (2)
$SEE = \left[ \frac{\sum \Delta^2}{N - 1} \right]^{1/2}$ (3)
$Q = r^2 \cdot \left( 1 - \frac{p}{N} \right)$ (4)
where: p is the number of descriptors; N is the number of molecules in the calibration set; Δ is the difference between the observed and the computed values of activity
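As a worked check of formulas (2)-(4), the short Python sketch below evaluates them with the values reported in this article (r2 = 0.9607, three descriptors, 15 calibration molecules) and with the residuals listed in Table 1. The function names are illustrative and are not part of PRECLAV; small differences from the reported F arise because r2 is used here in rounded form.

```python
import math

# Residuals (observed - estimated activity) of the 15 calibration
# molecules, copied from Table 1 of this article.
residuals = [-0.013, -0.039, 0.027, -0.018, -0.024, 0.023, 0.079, 0.047,
             0.025, -0.083, 0.115, -0.043, 0.024, -0.110, -0.012]


def fisher_f(r2, n, p):
    """Formula (2): F = r2 / (1 - r2) * (N - p) / p."""
    return r2 / (1.0 - r2) * (n - p) / p


def standard_error(deltas):
    """Formula (3): SEE = sqrt(sum(delta^2) / (N - 1))."""
    return math.sqrt(sum(d * d for d in deltas) / (len(deltas) - 1))


def quality_q(r2, n, p):
    """Formula (4): Q = r2 * (1 - p / N)."""
    return r2 * (1.0 - p / n)


N = len(residuals)   # 15 molecules in the calibration set
P = 3                # number of descriptors in the reported model
R2 = 0.9607          # reported squared correlation (rounded)

f_value = fisher_f(R2, N, P)
see_value = standard_error(residuals)
q_value = quality_q(R2, N, P)
print(f"F = {f_value:.2f}, SEE = {see_value:.4f}, Q = {q_value:.4f}")
```

Because Q penalizes r2 by the factor (1 - p/N), adding a descriptor improves Q only if the gain in r2 outweighs the loss of one degree of freedom.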
The descriptors included in the best (by Q function) QSAR are named 'predictors'.
The relative utility of predictors is computed by formula (5).
$U = \frac{R^2 - r^2}{1 - r^2}$ (5)
where
R2 is the square of the Pearson correlation between the observed values and the values computed using p predictors;
r2 is the square of the Pearson correlation between the observed values and the values calculated using p-1 predictors, i.e. the QSAR equation without the analyzed predictor.
After computation of U (5) for each predictor, the values of U are normalized by the highest of them (the highest value for U becomes 1000). The predictors with high enough value of U (U > 500) can be considered 'with high relative utility'.
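The utility calculation and its normalization can be sketched in a few lines of Python. The correlation values below are hypothetical and serve only to illustrate formula (5) and the scaling to 1000; they are not taken from the PRECLAV output, and the descriptor names d1-d3 are placeholders.

```python
def relative_utility(r2_full, r2_without):
    """Formula (5): U = (R2 - r2) / (1 - r2)."""
    return (r2_full - r2_without) / (1.0 - r2_without)


def normalize_utilities(raw):
    """Scale the utilities so that the largest one becomes 1000."""
    top = max(raw.values())
    return {name: round(1000.0 * u / top) for name, u in raw.items()}


# Hypothetical numbers: R2 of the full model and r2 of the model
# refitted without each predictor in turn.
R2_FULL = 0.96
r2_without = {"d1": 0.60, "d2": 0.70, "d3": 0.94}

raw = {name: relative_utility(R2_FULL, r2) for name, r2 in r2_without.items()}
scaled = normalize_utilities(raw)

# Predictors above the U > 500 threshold have 'high relative utility'.
high_utility = [name for name, u in scaled.items() if u > 500]
print(scaled, high_utility)
```

A predictor whose removal barely lowers the correlation (here d3) receives a small U, whereas one whose removal degrades the model strongly receives a value near 1000.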
PRECLAV (5)(7) calculates square of cross-validated correlation r2CV using LHO (Leave Half Out) method. However, this usual method is applied after ordering of molecules in calibration set according to the observed values of activity. Therefore, the cross-validated function r2CV is a measure of homogeneity of calibration set from the point of view of predictors' set, i.e. from the point of view of structure-activity relationship. A low value (< 0.4) of r2CV means 'the QSAR for molecules having high values of activity and the QSAR for molecules having low values of activity include the same descriptors, but very different weighting factors'. Actually, the computation of r2CV is a very drastic 'internal validation test'.
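The exact PRECLAV implementation of the LHO procedure is not given here, so the following Python sketch is only one plausible reading of it: the molecules are ordered by observed activity, the set is split into a low-activity half and a high-activity half, each half is predicted by an MLR model fitted on the other half, and r2CV is taken as the squared Pearson correlation between observed and cross-predicted values. The splitting rule, the definition of r2CV and the synthetic data are all assumptions.

```python
import numpy as np


def fit_mlr(X, y):
    """Least-squares fit of y = c0 + X @ c; returns [c0, c1, ..., ck]."""
    design = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef


def predict_mlr(coef, X):
    """Apply a fitted MLR equation to the descriptor rows in X."""
    return coef[0] + X @ coef[1:]


def r2_leave_half_out(X, y):
    """Assumed LHO scheme: sort by activity, then predict each half
    with the model fitted on the other half."""
    order = np.argsort(y)
    X, y = X[order], y[order]
    half = len(y) // 2
    low, high = np.arange(half), np.arange(half, len(y))
    y_cv = np.empty_like(y, dtype=float)
    y_cv[high] = predict_mlr(fit_mlr(X[low], y[low]), X[high])
    y_cv[low] = predict_mlr(fit_mlr(X[high], y[high]), X[low])
    return np.corrcoef(y, y_cv)[0, 1] ** 2


# Synthetic, perfectly homogeneous data: one linear relationship
# holds for the low-activity and the high-activity molecules alike.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(16, 3))
y_demo = 1.0 + X_demo @ np.array([0.5, -1.0, 2.0])
r2_cv = r2_leave_half_out(X_demo, y_demo)
print(round(r2_cv, 4))
```

With this splitting, a model that holds only for one end of the activity range extrapolates poorly to the other end, which is why the test is so demanding.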
After computing the Acalc values of the activity for the prediction set molecules, the program computes the average value Acalcm and the standard deviation s of the estimated values. The program considers 'high values' the values fulfilling criterion (6) and 'low values' the values fulfilling criterion (7).
Acalc > Acalcm + 0.5 • s (6)
Acalc < Acalcm – 0.5 • s (7)
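Criteria (6) and (7) can be illustrated with the predicted activities of the not yet synthesized molecules 16-24 listed in Table 1. The sketch below uses the sample standard deviation (divisor N-1); which divisor PRECLAV uses is not stated, so this is an assumption, although for these data the population form gives the same classification.

```python
import statistics

# Predicted activity A of the prediction set molecules 16-24 (Table 1).
predicted = {16: 4.363, 17: 4.896, 18: 4.896, 19: 4.716, 20: 5.238,
             21: 4.384, 22: 4.332, 23: 4.173, 24: 4.540}

values = list(predicted.values())
a_mean = statistics.mean(values)   # Acalcm
s = statistics.stdev(values)       # sample standard deviation (assumed)

# Criterion (6): 'high' values; criterion (7): 'low' values.
high = sorted(c for c, a in predicted.items() if a > a_mean + 0.5 * s)
low = sorted(c for c, a in predicted.items() if a < a_mean - 0.5 * s)
print("high:", high, "low:", low)
```

Molecules falling between the two limits are classified neither as high nor as low.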
Applicability domain and detection of outliers
A QSAR model can be used for selecting new compounds only if its applicability domain is defined. The need to define the applicability domain of a model is also reflected in the OECD guidelines for QSAR model validation (10-11). Only predictions for compounds that fall within the specified domain may be considered reliable [12]. The extent of extrapolation is one simple approach to defining the applicability domain. It is based on the calculation of the hat diagonal (leverage) hi for each chemical whose activity is predicted with the QSAR model:
$h_i = x_i^T (X^T X)^{-1} x_i$ (9)
In equation 9, xi is the descriptor row vector of the query molecule and X is the n × k matrix containing the k descriptor values for each one of the n training molecules. A hat diagonal (leverage) value greater than the warning limit 3(k + 1)/n is considered large. To visualize the applicability domain of the developed QSAR model, a Williams plot was used, in which |RStudent| is plotted against the leverage values (hi). This plot allows immediate and simple graphical detection of both response outliers and structurally influential compounds in a model. It must be noted that compounds with a high leverage value and good fitting in the developed model can stabilize the model. On the other hand, compounds with bad fitting in the developed model may be outliers. Thus, the combination of leverage and |RStudent| can be used to assign the applicability domain.
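The leverage calculation of equation 9 can be sketched with NumPy as follows. The descriptor matrix is random and purely hypothetical (20 training molecules, 3 descriptors), and X holds the descriptor values only, as defined in the text; a variant that adds an intercept column would be equally defensible.

```python
import numpy as np


def leverages(X_train, X_query):
    """h_i = x_i^T (X^T X)^-1 x_i for every row x_i of X_query."""
    xtx_inv = np.linalg.inv(X_train.T @ X_train)
    return np.einsum("ij,jk,ik->i", X_query, xtx_inv, X_query)


def warning_leverage(n, k):
    """Warning limit 3(k + 1)/n for n training molecules, k descriptors."""
    return 3.0 * (k + 1) / n


# Hypothetical descriptor matrix: n = 20 molecules, k = 3 descriptors.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))

h = leverages(X, X)                  # leverages of the training molecules
h_star = warning_leverage(*X.shape)  # 3 * (3 + 1) / 20
influential = np.flatnonzero(h > h_star)
print(h.round(3), h_star, influential)
```

For the training molecules the leverages sum to k, so their mean is k/n; the warning limit therefore flags molecules whose leverage is several times the average.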
The statistical computations were conducted using the specific formulas and procedures of the PRECLAV program algorithm. Using only the “significant” descriptors, PRECLAV computes ten thousand multilinear QSPR equations of type (1). The quality of the obtained equations is reflected by the value of the Q function and also by the values of some usual statistical functions. During the PRECLAV MLR analysis, we observed that the equation with the highest value of the Q function is a 3-parametric model and that this model also has the highest predictive power. It is as follows:
Dependent property: A
Intercept = 3.0435
Statistical outliers: 0
C1 = -9.4544, D1 = asr (Average net charge of C atoms) (U = 1000)
C2 = -0.0183, D2 = psa (Molecular orbital maximum bonding contribution) (U = 882)
C3 = 0.0025, D3 = kac (Gravitation index (all atoms)) (U = 918)
The quality of the correlation is described by the statistical indices:
Se = 0.0583, r2 = 0.9607, F = 97.7573, Q = 0.7686
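Written out as code, the reported equation is a plain weighted sum of the three predictors plus the intercept. In the Python sketch below the coefficients are those listed above, while the descriptor values passed to the function are hypothetical, since the individual asr, psa and kac values of the molecules are not tabulated here.

```python
# Intercept and coefficients of the reported 3-parametric model.
INTERCEPT = 3.0435
COEFFICIENTS = {"asr": -9.4544, "psa": -0.0183, "kac": 0.0025}


def estimate_activity(descriptors):
    """A = C0 + C1*asr + C2*psa + C3*kac, i.e. equation (1) with k = 3."""
    return INTERCEPT + sum(
        COEFFICIENTS[name] * descriptors[name] for name in COEFFICIENTS
    )


# Hypothetical descriptor values, for illustration only.
example = {"asr": -0.10, "psa": 50.0, "kac": 1000.0}
a_est = estimate_activity(example)
print(round(a_est, 4))
```

The signs of the coefficients can be read directly from the function: more negative asr or smaller psa values raise the estimate, as do larger kac values.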
Intercorrelations of predictors:
asr psa 0.0041
asr kac 0.1195
psa kac 0.0198
The minimum descriptor/activity correlation is computed for kac (r2 = 0.0025). The minimum intercorrelation is found between the descriptors asr and psa (r2 = 0.0041) and the maximum for the asr/kac pair (r2 = 0.1195). Thus, no collinearity between the predictors was found.
asr 1000
psa 882
kac 918
The asr descriptor has the highest relative utility (U = 1000) of the three descriptors in the QSAR model and therefore has the greatest influence on the anti-TB activity.
The negative coefficient of the asr descriptor means that a lower value of this descriptor increases the activity of the molecules, as seen in the case of compound nos. 1, 10, 1. The negative coefficient of the psa descriptor (U = 882) likewise indicates that lower values of this descriptor are conducive to the anti-TB activity of the molecules.
The positive coefficient of the kac descriptor (U = 918) indicates that higher values of this descriptor favor the anti-TB activity.
|RStudent| (the cross-validated leave-one-out standardized residual) is one of the best single diagnostics for capturing large residuals. This diagnostic confirms that no compounds are outliers in the calibration set.
In this study, the molecules of the analyzed database include 20 virtual fragments, but only five of them are considered significant. The percentages, by weight, of these molecular fragments are correlated (directly or inversely) with the values of inhibitory activity: CH2 (comp. no. 1, r = -0.5281), CH3 (comp. no. 1, r = -0.4981), C2 (comp. no. 1, r = 0.4861), C5H2N S (comp. no. 1, r = 0.4856) and C2HN3 (comp. no. 1, r = 0.4847). The presence of these fragments is favorable to activity.
The significant molecular fragments are
CH2 included in tb1 molecule r = 0.5281
CH3 included in tb1 molecule r = 0.4981
C2 included in tb1 molecule r = 0.4861
C5H2N included in tb1 molecule r = 0.4856
C2HN3 included in tb1 molecule r = 0.4847
All significant fragments are present in one molecule only. All fragments are favorable to activity.
In the external validation test, the validation set includes molecules 4, 7, 10 and 11 (marked with a star in Table 1).
To confirm our findings, we compared the estimated values of the activities with the experimental (observed) ones (Table 1). This is further demonstrated in Figure 2: the linear relationship between the observed and estimated activities in the scatter plot indicates that the linearity assumption is appropriate. The estimated activities are very close to the experimental activities.
TABLE 1: Experimental in vitro antimycobacterial activity (MIC, μg/mL) of the dihydroquinoline-1,2,3-triazole derivatives, observed activity A (A = log(3120000/MIC)), estimated activity, residual, standardized residual, RStudent and hat diagonal of the calibration set molecules 1-15, with the predicted value of A for the not yet synthesized molecules 16-24. Validation set molecules are marked with *.

| Cases | MIC (μg/mL) | Obs. | Est. | Residual | Standardized Residual | RStudent | Hat Diagonal |
|---|---|---|---|---|---|---|---|
| 1 | 3.12 | 5 | 5.013 | -0.013 | -0.3112 | -0.2981 | 0.6081 |
| 2 | 25 | 4.096 | 4.135 | -0.039 | -0.638 | -0.6199 | 0.1534 |
| 3 | 12.5 | 4.397 | 4.37 | 0.027 | 0.4553 | 0.4382 | 0.1588 |
| 4* | 25 | 4.096 | 4.114 | -0.018 | -0.2827 | -0.2705 | 0.1012 |
| 5 | 25 | 4.096 | 4.12 | -0.024 | -0.395 | -0.3793 | 0.1728 |
| 6 | 25 | 4.096 | 4.073 | 0.023 | 0.3786 | 0.3633 | 0.1662 |
| 7* | 12.5 | 4.397 | 4.318 | 0.079 | 1.264 | 1.3036 | 0.0984 |
| 8 | 25 | 4.096 | 4.049 | 0.047 | 0.876 | 0.866 | 0.3226 |
| 9 | 12.5 | 4.397 | 4.372 | 0.025 | 0.426 | 0.4096 | 0.2211 |
| 10* | 6.25 | 4.698 | 4.781 | -0.083 | -1.7202 | -1.9183 | 0.4678 |
| 11* | 6.25 | 4.698 | 4.583 | 0.115 | 2.07 | 2.5261 | 0.2838 |
| 12 | 25 | 4.096 | 4.139 | -0.043 | -0.7719 | -0.7567 | 0.272 |
| 13 | 25 | 4.096 | 4.072 | 0.024 | 0.4305 | 0.414 | 0.2946 |
| 14 | 25 | 4.096 | 4.206 | -0.11 | -1.9876 | -2.3673 | 0.2966 |
| 15 | 25 | 4.096 | 4.108 | -0.012 | -0.2345 | -0.2242 | 0.3826 |

| Comp. no. | Predicted A | Hat Diagonal |
|---|---|---|
| 16 | 4.363 | 0.1946 |
| 17 | 4.896 | 0.3926 |
| 18 | 4.896 | 0.4083 |
| 19 | 4.716 | 0.3304 |
| 20 | 5.238 | 0.8853 |
| 21 | 4.384 | 0.2041 |
| 22 | 4.332 | 0.1665 |
| 23 | 4.173 | 0.1782 |
| 24 | 4.54 | 0.2166 |
Fig. 1
Fig. 2
Table 2: Drug-like descriptors of the predicted compounds.
Table 3: Training set and test set (test set in bold: 4, 7, 10, 11).
Fig. 2 Graphs of observed vs. estimated activity in the calibration set and validation set.
Fig. 3 Normal probability plot of residuals.
Fig. 4 |RStudent| vs. hat diagonal.
We have developed a computer representation of the pharmacophore model, which also includes information on the available space at important substituent positions. Figure 5 shows the pharmacophore model for the most active compound (compound no. 1), generated with Brood. The model displays seven pharmacophore elements (three hydrogen bond donors and four hydrogen bond acceptors), which are used to describe the interaction between the ligands and the target receptor from the ligand point of view.
Fig. 5- chloro-2-thiophenyl-1,2,3-triazolymethyldihydroquinolines
As discussed earlier, we used the |RStudent| of the inhibitory activity calculated by the obtained model and the hat diagonal (leverage) to assign the applicability domain (AD). Leverage values were calculated for both the calibration set and the prediction set compounds (Table 1). The applicability domain of the developed model is shown in the Williams plot (Figure 4). Influential compounds are points with a leverage value higher than the warning leverage limit. As can be seen in the Williams plot, all molecules in the calibration set lie within the applicability domain of the developed model: none of them has a hat diagonal (leverage) value higher than the warning leverage limit (0.666), and none has an |RStudent| (cross-validated LOO standardized residual) above the threshold limit of 2.
In Table 2, the predicted compounds show good drug-likeness values (Dragon software). The Dragon consensus drug-like score (DLS_cons) accounts for the results provided by all the implemented drug-like scores (DLS_01 to DLS_07) and is calculated as their mean.
While the term ‘drug-like’ is used for compounds resembling existing drugs, the term ‘lead-like’ is used for compounds possessing the structural and physico-chemical profile of a quality lead. Lead-like scores are filters used to select those compounds qualified to be a lead in drug discovery. Compared to drugs, leads have, on average, lower molecular complexity (lower molecular weight, fewer rings and rotatable bonds) and lower polarizability, are less hydrophobic (their logP is 0.5 – 1.0 units less than that of drugs), and have lower drug-like scores [M.M.Hann et al., J. Chem. Inf. Comput. Sci. 2001, 41, 856-864]. Therefore, in general, the physico-chemical property values used as a measure of lead-likeness should be smaller than those traditionally used for drug-likeness.
Probability Plot of Residuals
If the residuals are normally distributed, the data points of the normal probability plot will fall along a straight line through the origin with a slope of 1.0 [13]. Major deviations from this ideal picture reflect departures from normality. Stragglers at either end of the normal probability plot indicate outliers, curvature at both ends of the plot indicates long or short distributional tails, convex or concave curvature indicates a lack of symmetry, and gaps, plateaus or segmentation in the normal probability plot may require a closer examination of the data or model. Of course, use of this graphic tool with very small sample sizes is not recommended. If the residuals are not normally distributed, then the t-tests on the regression coefficients, the F-tests, and any interval estimates are not valid. This is a critical assumption to check. Figure 3 shows a plot in good agreement with the ideal behavior described above.
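A numerical companion to the normal probability plot is the correlation between the ordered residuals and the corresponding standard normal quantiles; values close to 1 indicate points lying near a straight line. The Python sketch below applies this to the residuals of Table 1. The plotting positions (i - 0.5)/n are one common convention and are an assumption here, as is the use of this correlation as a summary; neither is part of the analysis reported in this article.

```python
import math
from statistics import NormalDist

# Residuals of the 15 calibration molecules, copied from Table 1.
residuals = [-0.013, -0.039, 0.027, -0.018, -0.024, 0.023, 0.079, 0.047,
             0.025, -0.083, 0.115, -0.043, 0.024, -0.110, -0.012]


def normal_plot_points(values):
    """Return (theoretical normal quantiles, ordered values)."""
    n = len(values)
    ordered = sorted(values)
    # Plotting positions (i - 0.5)/n for i = 1..n (assumed convention).
    quantiles = [NormalDist().inv_cdf((i + 0.5) / n) for i in range(n)]
    return quantiles, ordered


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


quantiles, ordered = normal_plot_points(residuals)
plot_correlation = pearson_r(quantiles, ordered)
print(round(plot_correlation, 3))
```

Plotting `ordered` against `quantiles` reproduces the kind of graph described above; with only 15 points the coefficient should be read as a rough indication, in line with the caution about small sample sizes.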
Statistically significant linear QSAR models were obtained for the representation, modeling and prediction of TB inhibitory activity. The average net charge of the carbon atoms plays a significant role in the activity, and the CH2, CH3, C2 and C2HN3 fragments are favorable to the activity. The predicted compounds show not only high predicted activity but also very good drug-like character, which is an interesting result obtained before synthesis.
Thus, attempts have been made to design and develop novel TB inhibitors on a rational basis so as to reduce trial and error and to predict the biological activity before synthesis.
This article is dedicated to the memory of the late Prof. Padmakar V. Khadikar (1936–2012).