Data-driven score tuning for ChooseLD: A structure-based drug design algorithm with empirical scoring and evaluation of ligand–protein docking predictability

Akihiro Masuda; Daichi Sadato; Mitsuo Iwadate

doi:10.2142/biophysico.bppb-v21.0021

Abstract

Computerized molecular docking methodologies are pivotal in in-silico screening, a crucial facet of modern drug design. ChooseLD, a docking simulation software, combines structure- and ligand-based drug design methods with empirical scoring. Despite advancements in computerized molecular docking methodologies, there remains a gap in optimizing the predictive capabilities of docking simulation software. Accordingly, using the docking scores output by ChooseLD, we evaluated its performance in predicting the bioactivity of G-protein coupled receptor (GPCR) and kinase bioactivity, specifically focusing on Ki and IC₅₀ values. We evaluated the accuracy of our algorithm through a comparative analysis using force-field-based predictions from AutoDock Vina. Our findings suggested that the modified ChooseLD could accurately predict the bioactivity, especially in scenarios with a substantial number of known ligands. These findings highlight the importance of selecting algorithms based on the characteristics of the prediction targets. Furthermore, addressing partial model fitting with database knowledge was demonstrated to be effective in overcoming this challenge. Overall, these findings contribute to the refinement and optimization of methodologies in computer-aided drug design, ultimately advancing the efficiency and reliability of in-silico screening processes.

Significance

The paper presents an evaluation of the accuracy of molecular docking, a crucial concern in in-silico screening. The fine-tuning of ChooseLD using registered data yielded enhanced accuracy, particularly in predicting Ki and IC₅₀ values for G-protein coupled receptors and kinases in cases with numerous collected ligands, compared to that of AutoDock Vina, a force-field-based approach. This report provides insights into a suggested docking simulation approach, discovered through a comparative analysis of predictive data from ChooseLD and Vina.

Introduction

In-silico screening, facilitated by computer-aided molecular docking methods, serves as an important technique in modern drug design. This approach is classified into ligand-based drug design (LBDD) and structure-based drug design (SBDD) [1]. LBDD involves the construction of drug structures based on known chemical information [2], whereas SBDD predicts ligand–protein interactions using three-dimensional structures [3]. Typically, these two methods are combined. ChooseLD [4,5, Supplementary Text S1, Table S1] was designed to evaluate the interaction between proteins and candidate compounds using SBDD methodologies. The analytical design of this approach is summarized in the Supplementary Text S1. In SBDD, three methods are established: force field-, empirical-, and knowledge-based designs [1]. Among these methods, ChooseLD predominantly employs empirical scoring, characterized by knowledge-based scoring. The processing speed of the empirical-based method is fast owing to its utilization of a simple empirical term; however, its estimation performance depends on the quality of the knowledge database.

Contemporary databases, such as ChEMBL [6], play a vital role in drug discovery research by providing mass-cleaned datasets. Docking simulation methodologies and the databases are progressing, although the predictive capabilities of molecular docking are not fully complete and the contributing factors are not fully understood. This study aimed to enhance the predictive capabilities of ChooseLD by tuning its prediction parameters using ChEMBL-provided data. Subsequently, the obtained calculated scores were compared to those of AutoDock Vina (version 1.1.2) [7] (subsequently denoted as “Vina”), leveraging a force field-based scoring methodology as the main algorithm in SBDD. This comparative analysis revealed that the learning model established using modified ChooseLD was efficacious in predicting bioactivity data associated with proteins with numerous experimentally analyzed ligands in homologous protein structures.

Materials and Methods

Data Preparation

The Assay IDs registered as curated by an expert and with a confidence score of 9 were obtained from the ChEMBL31 database on the ChEMBL website [8]. Based on the obtained Assay IDs, compound structures in the Simplified Molecular Input Line Entry System (SMILES) format with experimental Ki and IC₅₀ values were extracted. Simultaneously, the amino-acid sequences of target proteins corresponding to each Assay ID were extracted without any additional conditions. Each protein could have an individual Assay ID, and multiple IC₅₀ and Ki values can be obtained for the same compound. In the present study, all extracted data were used to treat each “protein-assay-affinity value” as a single dataset. The obtained SMILES data were converted into MOL files with 3D structures using RDKit [9], employing the Universal Force Field (UFF) [10] as the force field. Because ChEMBL only provides information on the amino acid sequences of target proteins, the 3D structures could not be obtained directly. Therefore, homologous protein structures corresponding to the constructed data were identified through BLAST searches [11] based on the Protein Data Bank (PDB) [12] (as of August 12, 2022). The protein–ligand complex structure with the highest BLAST score was selected as the base template for homology modeling. Subsequently, the predicted ligand structure and/or location were fitted to the active site of the base protein template structure employing the protein–ligand 3D location information of secondary or subsequent BLAST outputs (subsequently denoted as “PDB ligands”). During structural fitting, the least-squares fitting was implemented using CEAlign from PyMOL [13], and estimated ligand models in contact with proteins outside the active site or penetrating the template structure were removed. Subsequently, we conducted a 3D homology modeling of the protein structure with ligands in FAMS [14] guided by the local alignment results between the highest-scoring protein sequence from protein BLAST and amino acid sequences from ChEMBL. FAMS is a homology modeling method characterized by iterative database searches and simulated annealing. Modeling with FAMS ensures that PDB ligands do not overlap spatially with the modeled protein.

In-silico Screening Based on Empirical Scores

The ChooseLD score, total score for docking evaluation, contains three outputs: BaseScore, Overlap, and Hc_index, expressed as Equation (1).

T o t a l S c o r e = B a s e S c o r e ( k 1 ) + O v e r l a p ( k 2 ) + k 3 × H c _ i n d e x

(1)

First, the pharmaceutical candidate compound is moved to the docking position by performing singular value decomposition (SVD) using a randomly selected set of coordinates from the common MACCS fingerprints of PDB ligands and pharmaceutical candidate compounds. A single docking score is calculated using the coordinates. BaseScore exhibits higher values when pharmaceutical candidate compounds are situated in proximity to PDB ligands, as illustrated in the score calculation schema model depicted in Figure 1. Conversely, overlap represents a score that decreases when pharmaceutical candidate compounds are embedded within the protein model structure and increases when they are positioned proximal to the surface (Figure 1). The definitions for the inside and the surface of the proteins were determined based on the Chemistry Data Booklet. Details are described in the Supplementary Text S1.

Figure 1 Overview of ChooseLD score calculation. (A) Schema of BaseScore calculation. Scores were obtained through a two-step process: (1) Compilation of a dataset comprising protein-ligands with high homology. (2) Selection of drug candidate compounds and their fitting to these ligands based on matching the Tanimoto index based on the open source 166-bit version MACCS fingerprints using RDKit. The BaseScore is calculated based on the similarity and density of the fingerprints used in the fitting. (B) Schematic illustration of the overlap score model: the positional relationship between each atom of the fitted pharmaceutical candidate compound and the target protein. The definitions for the inside and the surface of the protein were determined based on the method described in the Supplementary Text S1. (C) Conceptual diagram of Hc_index calculation method: the hydrophobic interaction was calculated based on AlogP and the distance “r” (measured in Å) between each atom of the protein and each atom of the pharmaceutical candidate compound. This interaction exhibited exponential decay with increasing distance between the atoms. (D) Formula used for calculating Hc_index.

The Hc_index is calculated based on the AlogP value, which indicates the hydrophobicity strength of atoms within the protein, as determined using RDKit. This estimation considers the effect of the surrounding space of hydrophobic atoms (Figure 1). Equation (2), determined based on a previous study [5], can be used to calculate the hydrophobic effect of a single atom within a pharmaceutical compound. AlogP, instead of logP, was used to perform calculations for each atom of the target protein and candidate compounds. In Equation (2), “r” represents the distance between the i-th atom of the ligand molecule and the j-th atom of the receptor molecule, respectively. Equation (2) is shown as follows:

H c i = − ∑ { j ∈ U | r ( i , j ) < r t h } A l o g P i × e − 0.1213 × r ( i , j ) 2 × A l o g P j

(2)

where U and r(i,j) represent the neighbor of the i-th atom of the ligand molecule and the distance between the i-th atom of ligand molecule and the j-th atom of the receptor molecule, respectively. In the present study, “rth” was set to 10 Å. Extending this calculation to the entire molecule, the Hc_index was obtained as a cumulative sum of the hydrophobicity indices for all atoms, as expressed in Equation (3), and was numerically quantified to assess the overall hydrophobic interaction strength of the molecule. Thus, Hc_index is expressed as follows:

H c _ i n d e x = ∑ i ∈ a t o m s o f l i g a n d H c i

(3)

ChooseLD has three parameters: k1 to k3. In this case, we used values that were tuned in a previous study (Supplementary Text S1) for the entire ChEMBL28 dataset, specifically 1.0 for k1, 15.0 for k2, and 100.0 for k3. Since 90% of the ChEMBL31 dataset is composed of data from the ChEMBL28 dataset, tuning parameters were assumed to be similar and used without further modification. A more detailed explanation can be found in the Supplementary Text S1. Additionally, ChooseLD is designed to operate with an execution time of approximately 1 min, including data preparation such as retrieving PDB ligands, making it almost the same as Vina in terms of execution time.

Machine Learning of Score and Bioactivity Data

To convert mass bioactivity data of drug candidate compounds into a machine-learning-compatible format, we converted the Ki and IC₅₀ values from nanomolar units to integer values on a logarithmic scale with a base of 10 (subsequently denoted as “experimental values”). These experimental values are standardized to a uniform scale, such as pIC₅₀ and pKi, and are represented as shown in Equations (4) and (5). The range of pIC₅₀ and pKi values was 6–14, but values close to 0 are desirable for training our machine learning model. In the present study, we introduced qIC₅₀ and qKi Equations (6) and (7) instead of typical pIC₅₀ and pKi values. The difference between typical pIC₅₀/pKi values and our qIC₅₀/qKi values are that the signs are reversed, and the units are expressed in nM instead of M.

p I C 50 = − L o g 10 ( I C 50 )

(4)

p K i = − L o g 10 K i

(5)

q I C 50 = L o g 10 ( I C 50 ) + 9 = − p I C 50 + 9

(6)

q K i = L o g 10 K i + 9

(7)

In the following texts, qIC₅₀^exptl/qKi^exptl and qIC₅₀^calc/qKi^calc represent the experimental and predicted values, respectively. Three scores were used as training factors, namely BaseScore, Overlap, and Hc_index, obtained from ChooseLD along with the logarithm of the bioactivity data. The predictive model was constructed using scikit-learn (version 0.23.2) [15], specifically employing its Support Vector Classifier (SVC) algorithm within the Support Vector Machine (SVM) framework, with a five-fold cross-validation-supervised learning approach. To determine the optimal hyperparameters for the model, we applied the grid search method using the GridSearchCV object. After identifying the best parameters, the final model was retrained using the entire dataset. The regularization parameter C was explored within the range [0.1, 1, 10, 100], the kernel coefficient γ within [1, 0.1, 0.01, 0.001], and the types of kernels considered were [“Linear Kernel,” “Radial Basis Function Kernel,” and “Sigmoid Kernel”]. Gamma parameter is not used in linear kernel models as it is not an included option. The SVM models in ChooseLD are calculated using the “Linear Kernel,” “RBF Kernel,” and “Sigmoid Kernel” as shown in Equations (8), (9), and (10).

L i n e a r _ K ( x , i ) = B a s e S c o r e ( x ) × B a s e S c o r e ( i ) + O v e r l a p ( x ) × O v e r l a p ( i ) + H c _ i n d e x ( x ) × H c _ i n d e x ( i )

(8)

R B F _ K ( x , i ) = e − γ [ ( B a s e S c o r e ( x ) − B a s e S c o r e ( i ) ) 2 + ( O v e r l a p ( x ) − O v e r l a p ( i ) ) 2 + ( H c _ i n d e x ( x ) − H c _ i n d e x ( i ) ) 2 ]

(9)

S i g m o i d _ K ( x , i ) = t a n h [ γ ( B a s e S c o r e ( x ) × B a S e s c o r e ( i ) + O v e r l a p ( x ) × O v e r l a p ( i ) + H c _ i n d e x ( x ) × H c _ i n d e x ( i ) ) + c ]

(10)

Furthermore, a similar learning model was constructed using the calculated scores of Vina, and the results of both models were compared. As Vina was reported to yield better prediction compared to Autodock 4 [16], we used the output of Vina, which includes binding affinity energies encompassing physical interactions such as electrostatic interactions [7]. As Vina score cannot be output separately by type but only combined, the results were used in this study as training data. The Vina score calculation involved a 50×50×50 grid (--size x, y, z) centered on the centroid coordinates of the PDB ligand (--center x, y, z), as obtained from CEAlign. The large grid size is used to explore the entire protein because it is not known where the pharmaceutical candidate compound will bind. Compounds whose counterparts are G-protein coupled receptors (GPCRs) or kinases possessing available qKi^exptl values and qIC₅₀^exptl assay data from ChEMBL were selected for learning and binding prediction. The biological names of the GPCRs and kinases used in the analysis are listed in Supplementary Tables S2 and S3. Molregno, compound ID within ChEMBL, are present in large quantities; therefore, a dedicated website (http://www.bio.chuo-u.ac.jp/iwadate/DrugSearch/) has been set up to list them.

Predicted Data Analysis and Statistics

Only GPCRs and kinases were included in this study to circumvent the challenges of correctly identifying the activity pocket and to mitigate potential inaccuracies in predictions associated with enzymes featuring cofactors and/or associated with small molecules. The accuracy rate of each method was determined by dividing the number of matched data with another model by the total output number. To assess the differences in the results of ChooseLD and Vina prediction, the root mean square deviation (RMSD) of the maximum common substructure between the PDB ligand and that of the predicted structure from each tool were calculated. Statistical comparison of the two grouped RMSD values was performed using a two-sided Mann-Whitney U-test. As the PDB ligands selected for each PDB ligand have different maximum common substructures using RDKit, we calculated the distances between the center-of-mass coordinates of each PDB ligand and those of the predicted structures from each tool as a common indicator and performed similar tests. Cliff’s delta (d) was also calculated as an effect size for statistical criterion owing to the large sample size. The tests focused on the differences in protein-PDB ligand distances between the predicted ChooseLD and Vina results or prediction success and failure groups. The distance data were grouped under the following conditions: the prediction tools grouping (ChooseLD and Vina) and Vina prediction accuracy (success or fail). Statistical significance was set at p<0.05. Clliff’s delta was interpreted as follows: negligible (|d|<0.147), small (|d|<0.33), medium (|d|<0.474), and large (|d|>0.474) according to a previous study [17], and values in the small, medium, and large categories were considered statistically significant. Subsequently, we extracted the qIC₅₀^calc and qKi^calc values for GPCR and kinase data corresponding to the cases where the predictions of Vina matched the experimental values. The cases wherein the difference between the predicted and experimental values by ChooseLD exceeded 1 were defined as “failure.” These instances were counted and calculated as a percentage. The number and distribution of ligands collected per target protein were analyzed.

Results

The prediction accuracy of each tool is presented in Figure 2. Both Vina and ChooseLD exhibited better accuracy than random prediction. Notably, ChooseLD demonstrated considerably higher prediction accuracy for kinase qKi^exptl values. For Vina, a marked difference with random prediction in qIC₅₀^exptl estimation was observed, and the gap in prediction accuracy with ChooseLD appeared narrow.

Figure 2 Comparative analysis of the prediction accuracy between ChooseLD and Vina. (A) Percentage of correct responses for random prediction, ChooseLD, and Vina for each target protein and activity type. N represents the number of data. (B) Frequency distribution of ChooseLD predictions for each experimental value (–3 to +4). (C) Frequency distribution of Vina predictions for each experimental value (–3 to +4).

We conducted a comparative assessment of the two qKi values for GPCR which had the largest cases to analyze the correlation of predicted experimental values with qKi^exptl. The comparison between ChooseLD and Vina results is summarized in Table 1 and Table 2, respectively. Both Vina and ChooseLD results exhibited inaccuracies in predicted values below –1 and above 5. A notable difference was observed in the output range between the two software packages, and ChooseLD could predict across a broader range. Comparing concordance rates, Vina achieved more correct predictions for qKi value of 0 (ChooseLD: 2,982, Vina: 3,373), while ChooseLD outperformed Vina in predicting values other than 0. The analysis of the quantitative distribution of ChooseLD predictions revealed peaks in qKi^exptl range of 0–2 (blue, green, and yellow lines in Figure 2). Vina predictions exhibited a higher frequency of 0 or 1 values (Figure 2). A similar result was observed for qIC₅₀ values. The final scoring equations for ChooseLD and Vina, as a result of the training, are provided in Supplementary Tables S4 and S5. As the RBF kernel was used for all learning models, all subsequent data presented reflect the results obtained using the RBF kernel.

Table 1 Distribution of combinations of qKi^exptl and ChooseLD-predicted values for GPCRs

Prediction	qKi^exptl
Prediction	–3	–2	–1	0	1	2	3	4	>4
–1	0	0	0	2	0	1	0	0	0
0	0	4	93	2982	2453	2261	1265	180	30
1	0	2	63	2196	2623	2207	1249	242	30
2	1	4	71	2009	2181	2585	1417	267	36
3	0	0	10	185	193	264	289	81	12
4	0	0	0	9	13	8	20	16	1

Rows represent values predicted by ChooseLD, whereas columns represent qKi^exptl. Areas where the qKi^exptl and predicted values match are highlighted in gray. The table provides a summary of the results of multiple affinity prediction results for the qKi^exptl values of GPCRs. GPCR, G protein-coupled receptor; Ki: inhibition constant.

Table 2 Distribution of combinations of qKi^exptl and Vina-predicted values for GPCRs

Prediction	qKi^exptl
Prediction	–3	–2	–1	0	1	2	3	4	4<
–1	0	0	0	0	0	0	0	0	0
0	1	4	91	3373	2998	2903	1486	213	42
1	0	5	74	2129	2317	2399	1452	360	39
2	0	1	71	1874	2124	2004	1283	204	26
3	0	0	1	7	24	20	19	9	1
4	0	0	0	0	0	0	0	0	0

Rows represent predicted values by Vina, whereas columns represent qKi^exptl. Areas where the qKi^exptl and predicted values match are highlighted in gray. Dark gray-colored values indicate instances where the predicted output was present in ChooseLD but did not return a predicted value in Vina. The table provides a summary of the results of multiple affinity prediction results for qKi^exptl values of GPCRs. GPCR, G protein-coupled receptor; Ki: inhibition constant.

Figure 3 display instances where ChooseLD predicted ligand-GPCR binding more accurately than Vina. For example, ChooseLD accurately predicted a score of 0, whereas Vina inaccurately predicted a score of 3 in cases where qKi^exptl value was 0. In these cases, the structures predicted by ChooseLD exhibited spatial overlap with the PDB ligands, whereas those predicted by Vina were distant from the PDB ligands. Figure 3 depict cases where Vina could accurately predict the structure. In these cases, the structures predicted by ChooseLD did not overlap, and predicted models were located in PDB ligands that were not spatially well-aligned where the experimental value was 0, whereas Vina accurately predicted a score of 0 (incorrectly predicted as 3 by ChooseLD). In contrast, the structures predicted by Vina were spatially closer to the PDB ligands.

Figure 3 Predicted docking structures near binding sites by ChooseLD and Vina. (A) Representative image of models correctly predicted using ChooseLD. The qKi^exptl value was 0; and ChooseLD correctly predicted 0, whereas Vina incorrectly predicted 3. (B) Zoomed-out view of (A). Registered data ID (protein name): target, CHEMBL264 (Histamine H3 receptor); assay ID, 82800; molregno ID (molecular name), 21534 (immepip); PDB ID: 4U16. (C) Representative image of Vina correctly predicted models. The qKi^exptl value was 0, and Vina correctly predicted 0, whereas ChooseLD incorrectly predicted 3. (D) Zoomed-out view of (C). Registered data ID (protein name): target, CHEMBL270 (Mu opioid receptor); assay ID, 436682; molregno ID (molecular name), 381517 (Dmt-Pro-Tmp-Phe-NH2); PDB ID: 7UL4. Green, Protein Data Bank (PDB) ligands; yellow, ChooseLD predicted structure; Red, Vina predicted structure. Molregno ID is the compound ID within ChEMBL.

To analyze causal factors associated with docking model difference, the RMSD of the maximum common substructure using RDKit between the PDB ligand and that of the predicted structure from each tool were calculated and statistically compared (Table 3 and Table 4). We also included the results obtained from analyzing the distances between the center-of-mass coordinates (Table 5 and Table 6). ChooseLD output was significantly closer to the PDB ligands than the predicted results of Vina for the experimental value 0 data, which was the same experimental value used in the analysis resulting in Figure 3 (Table 3 and Table 5). Similar results were observed for the other experimental values, suggesting that ChooseLD tended to predict the structure approaching PDB ligands at close distance (Supplementary Tables S6–S15). Furthermore, differences were also observed in the distance between Vina output-PDB ligands in cases where Vina made successful predictions or not. Distances were significantly smaller in cases where Vina made successful predictions that corresponded to the structure accurately predicted to bind the PDB ligand, particularly when considering qIC₅₀^calc values (Table 4 and Table 6 and Supplementary Tables S16–S26). “Success” and “Failure” refer to whether Vina successfully predicted the digit of binding-affinity value. Note that this does not indicate whether the docking position was a “Success” or “Failure.”

Table 3 Comparison of RMSDs of the maximum common substructure of PDB ligands and Vina- or ChooseLD-predicted models for data with an experimental value of 0

GPCR
Activity type	Number of data	Vina median (Å)	Vina range (Å)	ChooseLD median (Å)	ChooseLD range (Å)	p-value	d
qKi^calc	7329	5.311	0.888–30.792	4.269	0.350–17.275	0.001>	0.337
qIC₅₀ ^calc	3050	5.962	1.275–33.201	5.095	0.819–21.676	0.001>	0.228
Kinase
Activity type	Number of data	Vina median (Å)	Vina range (Å)	ChooseLD median (Å)	ChooseLD range (Å)	p-value	d
qKi^calc	160	4.89	1.685–28.989	2.906	0.991–14.807	0.001>	0.650
qIC₅₀ ^calc	1473	5.619	1.816–31.017	3.296	0.666–19.272	0.001>	0.574

Rows represent predicted GPCRs and kinases at qKi^calc and qIC₅₀^calc values, respectively. Columns represent activity type, the number of the data, median distance with range from each tool output, p-value calculate using Mann-Whitney’s U-test, and Cliff’s delta (d). RMSD, root mean square deviation; PDB: Protein Data Bank; GPCR, G protein-coupled receptor; Ki: inhibition constant; IC₅₀: half-maximal inhibitory concentration.

Table 4 Comparison of RMSDs of the maximum common substructure of Vina-outputs-PDB ligands between the cases of successful and unsuccessful predictions by Vina for data with an experimental value of 0

GPCR
Activity type	N_True ^a	N_Fail ^b	N_True median (Å)	N_True range (Å)	N_Fail median (Å)	N_Fail range (Å)	p-value	d
qKi^calc	3336	3993	5.311	1.458–24.568	5.311	0.888–30.792	0.170	–0.019
qIC₅₀ ^calc	711	2339	5.61	1.275–23.408	6.051	1.370–33.201	0.001>	–0.155
Kinase
Activity type	N_True ^a	N_Fail ^b	N_True median (Å)	N_True range (Å)	N_Fail median (Å)	N_Fail range (Å)	p-value	d
qKi^calc	0	160	N.A.	N.A.	4.89	1.685–28.989	N.A.	N.A.
qIC₅₀ ^calc	2	1471	4.663	4.487–4.839	5.624	1.816–31.017	0.377	–0.362

Rows represent predicted GPCRs and kinases at qKi^calc and qIC₅₀^calc values, respectively. Columns represent activity type, the number of the data, median distance with range from each case, p-value calculated using Mann-Whitney’s U-test, and Cliff’s delta (d). ^a: total number of successfully predicted compounds; ^b: total number of unsuccessfully predicted compounds. Kinase_qKi^calc values were unavailable because the corresponding data did not exist (shown as N.A., not available). RMSD, root mean square deviation; PDB: Protein Data Bank; GPCR, G protein-coupled receptor; Ki: inhibition constant; IC₅₀: half-maximal inhibitory concentration.

Table 5 Comparison of distances between the center-of-mass coordinates of PDB ligands and Vina- or ChooseLD-predicted models for data with an experimental value of 0.

GPCR
Activity type	Number of data	Vina median (Å)	Vina range (Å)	ChooseLD median (Å)	ChooseLD range (Å)	p-value	d
qKi^calc	7383	2.741	0.041–28.374	1.412	0.031–14.744	0.001>	0.480
qIC₅₀ ^calc	3080	3.374	0.164–31.694	1.698	0.040–15.834	0.001>	0.439
Kinase
Activity type	Number of data	Vina median (Å)	Vina range (Å)	ChooseLD median (Å)	ChooseLD range (Å)	p-value	d
qKi^calc	160	4.593	0.186–25.513	0.606	0.071–9.998	0.001>	0.843
qIC₅₀ ^calc	1473	3.608	0.079–28.909	1.037	0.033–13.507	0.001>	0.546

Table 6 Comparison of distances between the center-of-mass coordinates of Vina outputs-PDB ligands between the cases of successful and unsuccessful predictions by Vina for data with an experimental value of 0

GPCR
Activity type	N_True ^a	N_Fail ^b	N_True median (Å)	N_True range (Å)	N_Fail median (Å)	N_Fail range (Å)	p-value	d
qKi^calc	3373	4010	2.452	0.057–25.725	2.966	0.041–28.374	0.001>	–0.162
qIC₅₀ ^calc	713	2367	2.490	0.194–23.707	3.604	0.164–31.694	0.001>	–0.291
Kinase
Activity type	N_True ^a	N_Fail ^b	N_True median (Å)	N_True range (Å)	N_Fail median (Å)	N_Fail range (Å)	p-value	d
qKi^calc	0	160	N.A.	N.A.	4.593	0.186–25.513	N.A.	N.A.
qIC₅₀ ^calc	2	1471	1.513	1.469–1.558	3.618	0.079–28.909	0.054	–0.789

Rows represent predicted GPCRs and kinases at qKi^calc and qIC₅₀^calc values, respectively. Columns represent activity type, the number of the data, median distance with range from each case, p-value calculated using Mann-Whitney’s U-test, and Cliff’s delta (d). ^a: total number of successfully predicted compounds; ^b: total number of unsuccessfully predicted compounds. Kinase_qKi^calc values were unavailable because the corresponding data did not exist (shown as N.A., not available). PDB: Protein Data Bank; GPCR, G protein-coupled receptor; Ki: inhibition constant; IC₅₀: half-maximal inhibitory concentration.

To identify the factors that negatively affected the predictions of ChooseLD, we analyzed the characteristics of data in cases where Vina made accurate predictions. Kinase prediction exhibited a markedly lower failure rate than that of GPCR, and qIC₅₀^calc values demonstrated a lower incorrect rate than qKi^calc values (Table 7).

Table 7 Failure rate of ChooseLD

	qKi^calc	qIC₅₀ ^calc
GPCR	6.15	4.51
Kinase	1.83	1.06

Failure cases were identified as instances where Vina predicted accurately and the difference between predicted and experimental values of ChooseLD exceeded 1 for qKi and qIC₅₀ values in both GPCRs and kinases. All values are presented as percentages (%).

To assess this difference, the number and distribution of ligands collected per target protein were analyzed (Figure 4). The ligand count for kinases reached up to approximately 30 PDB ligands, whereas that of GPCRs was up to 21 ligands. This observation suggests that the number of ligands in homologous proteins of GPCRs tends to be smaller than that of kinases.

Figure 4 Distribution of collected ligand numbers for ChooseLD scoring. Frequency distribution of the number of PDB ligands in homologous proteins identified using a BLAST search. The frequency of PDB ligands derived from qIC₅₀^exptl data is displayed on the left vertical axis, whereas that derived from qKi^exptl values is on the right vertical axis. The horizontal axis denotes the number of ligands used in ChooseLD, which is concentrated at 30, as the maximum value of ligand compounds was set at 30 when detecting homologous proteins with ligands in PDB from BLAST search results.

Discussion

In this study, we established the predictive model of the logarithm of the bioactivity data (qKi^exptl and qIC₅₀^exptl values) for GPCRs and kinases using a learning model based on ChooseLD, which incorporates BaseScore, Overlap, and Hc_index, as well as another learning model that utilizes the Vina score. Notably, this model exhibited more effective learning than the model using random scores. Differences in predictive accuracy were observed between GPCRs and kinases, specifically the each qKi^calc and qIC₅₀^calc values. Concordantly, analysis of the correlation between the predicted and experimental values revealed peaks at positions where ChooseLD predictions aligned with the experimental values (Figure 2). This observation suggests that the three ChooseLD scores provided highly accurate and precise predictions. However, the peak positions did not coincide when the experimental values were below –1 and exceeded 2 (Figure 2), indicating increased prediction difficulty for experimental values outside the range of 0–2. This trend was more pronounced for Vina. In virtual screening, it is crucial to preferentially select compounds that have or do not have activity. However, the current method does not provide a model that could distinguish between the presence and absence of activity. In this study, we classified the experimental values based on their orders of magnitude for training and treated all data as “active.” However, we suspect that treating data with values greater than 4 digits as “inactive” and training the model to differentiate between “active” and “inactive” compounds would lead to a more virtual screening-oriented learning model.

In instances where Vina predictions were accurate, and ChooseLD encountered failure, cases were identified where multiple ligands exhibited spatial overlap in a disordered manner (Figure 3). The elevated failure rate in GPCR is attributed to insufficient ligand information for accurate empirical method prediction (Figure 4). The Vina prediction process involves grid exploration from the initial placement [7], independent of the reduction in the number of ligands. This stands in contrast to the structure prediction by ChooseLD, which relies on more ligand information. Consequently, Vina yielded relatively accurate results in these instances. Based on these insights, three major factors were considered to be associated with the failure rate of ChooseLD: the spatial ordering of PDB ligands, the adequacy of the number of PDB ligands, and the nature of the prediction target (qKi^exptl or qIC₅₀^exptl values).

In this study, the spatial orderliness of PDB ligands was predicted based on the homology of amino acid sequences using BLAST. Despite high homology between proteins, variations in microstructure and the binding site environment among individual proteins could lead to mispredictions. Therefore, improving molecular positioning through targeted repositioning is pivotal for accurate structure prediction. To address this challenge, we propose transitioning to a selective fitting method that references a single ligand with high accuracy instead of multiple ligands. Subsequently, docking can be performed using the optimal conformation obtained from force-field-based docking software, such as Vina and Rosetta [18]. This modification is speculated to be particularly effective for ChooseLD, with careful tuning, in cases where the empirical scoring method fails (Figure 3). In contrast, the modified ChooseLD established in this study exhibited sufficient performance for cases with well-aligned PDB ligands (Figure 3) and similar instances.

The number of collected PDB ligands is a factor required for accurate prediction. Our results suggest that a small number of collected ligands passively decreased structural predictability, as evidenced by the lower failure rate for kinases when a larger number of ligands was obtained than that for GPCRs (Table 7 and Figure 4). On the condition that the number of PDB ligands is shortened and/or the size of the drug candidate compounds is remarkably smaller than the pocket size, the exploration space could restrict unintentionally obtained inadequate prediction results. To address this limitation, broadening the range of obtaining PDB ligands from BLAST search results may be preferable. In cases where obtaining sufficient ligands is challenging, force field-based docking tools exemplified by Vina could represent a more favorable approach.

In conclusion, understanding the differences in the characteristics of qKi^exptl and qIC₅₀^exptl values can provide valuable insights for result interpretation. Our results suggest that ChooseLD exhibited a higher failure rate in predicting qKi^exptl than qIC₅₀^exptl values. qKi^exptl values indicate the binding strength between an enzyme and an inhibitor, whereas qIC₅₀^exptl represents the inhibition potential (as a measure of drug amount) obtained from biochemical assays, reflecting experimentally different properties [19]. In this study, ChooseLD demonstrated better predictive performance for qIC₅₀^calc than for qKi^calc (Table 7), suggesting that the empirical scoring based on PDB ligand collection may partially contain information about the enzyme-catalyzed reaction, which is the target of inhibition. A plausible factor contributing to these observations is the strength of physical interactions. In the present process, predictions were solely based on the structure without considering other factors associated with protein–ligand binding (or atom-atom bonds). However, the molecular interaction is affected by the molecular competition of ligands or substrate saturation in the physiological micro-environment, representing a potential limitation in the present prediction paradigm [20,21]. Therefore, we speculated that incorporating physical interactions that are currently not primarily considered by ChooseLD could potentially enhance the accuracy of qKi^calc, bringing them closer to that of qIC₅₀^calc.

This study focused on GPCRs and kinases for high-throughput execution without extending our method to other types of proteins owing to complexities associated with the spatial placement of ligands. While overcoming the above limitations in handling large-volume data with a high-throughput approach remains difficult, we have identified some potential improvements. In addition, through the analysis of the differences between ChooseLD and Vina outputs, we clarified the potential advantage of machine learning in docking simulation. Further comparison analysis and evaluation of machine-learning-based methods, including our tools, are required to better understand the advantages offered by each learning method. This approach is our next objective for predicting ligand affinity for proteins in future studies. The method established in this study may have broader applicability with further tunings, extending beyond GPCRs and kinases to other types of proteins. Moreover, addressing the limitations revealed in this study regarding the ligand data selection process could pave the way for high-throughput execution.

Conclusion

ChooseLD, which employs empirical scoring, is recommended when a sufficient number of spatially ordered PDB ligands are available. Software leveraging force-field-based screening, such as Vina, holds an advantage in predicting proteins with limited available ligand information or ligands exhibiting poor spatial orderliness. The use of the learning model of ChooseLD that leverages empirical scoring is expected to be beneficial for targets extending beyond GPCRs and kinases, provided there is sufficient database information on proteins and ligands with proper selection.

Conflict of Interest

All authors declare that they have no conflict of interest.

Author Contributions

M.I. conceived the study. A.M. designed and conducted the computational study, analyzed the data. D.S and M.I supervised the study. A.M. and D.S. wrote the manuscript. All authors approved the final version of the manuscript.

Data Availability

The data supporting the findings of this study are available within the article. More detailed data are available on the data repository website (http://www.bio.chuo-u.ac.jp/iwadate/DrugSearch/).

Acknowledgments

This research was supported by a grant from the Chuo University Basic Research Fund. We would like to thank Editage (www.editage.com) for English language editing.

References

[1] Ferreira, L. G., Dos Santos, R. N., Oliva, G., Andricopulo, A. D. Molecular docking and structure-based drug design strategies. Molecules 20, 13384–13421 (2015). https://doi.org/10.3390/molecules200713384
[2] Acharya, C., Coop, A., Polli, J. E., Mackerell, A. D. Recent advances in ligand-based drug design: Relevance and utility of the conformationally sampled pharmacophore approach. Curr. Comput. Aided Drug Des. 7, 10–22 (2011). https://doi.org/10.2174/157340911793743547
[3] Kalyaanamoorthy, S., Chen, Y. P. Structure-based drug design to augment hit discovery. Drug Discov. Today 16, 831–839 (2011). https://doi.org/10.1016/j.drudis.2011.07.006
[4] Takaya, D., Takeda-Shitaka, M., Terashi, G., Kanou, K., Iwadate, M., Umeyama, H. Bioinformatics based Ligand-Docking and in-silico screening. Chem. Pharm. Bull. (Tokyo) 56, 742–744 (2008). https://doi.org/10.1248/cpb.56.742
[5] Takaaki, I., Hideaki, U., Mitsuo, I. In-Silico screening based on the protein ligand interaction of experimental data. Journal of the Institute of Science and Engineering (Chuo University). 25, 83–91 (2020).
[6] Mendez, D., Gaulton, A., Bento, A. P., Chambers, J., Veij de, M., Félix, E., et al. ChEMBL: Towards direct deposition of bioassay data. Nucleic Acids Res. 47, D930–D940 (2019). https://doi.org/10.1093/nar/gky1075
[7] Trott, O., Olson, A. J. AutoDock Vina: Improving the speed and accuracy of docking with a new scoring function, efficient optimization, and multithreading. J. Comput. Chem. 31, 455–461 (2010). https://doi.org/10.1002/jcc.21334
[8] Davies, M., Nowotka, M., Papadatos, G., Dedman, N., Gaulton, A., Atkinson, F., et al. ChEMBL web services: Streamlining access to drug discovery data and utilities. Nucleic Acids Res. 43, W612–W620 (2015). https://doi.org/10.1093/nar/gkv352
[9] RDKit: Open-Source Cheminformatics. https://www.rdkit.org/
[10] Rappe, A. K., Casewit, C. J., Colwell, K. S., Goddard, W. A. III., Skiff, W. M. UFF, a full periodic table force field for molecular mechanics and molecular dynamics simulations. J. Am. Chem. Soc. 114, 10024–10035 (1992). https://doi.org/10.1021/ja00051a040
[11] Altschul, S. F., Gish, W., Miller, W., Myers, E. W., Lipman, D. J. Basic local alignment search tool. J. Mol. Biol. 215, 403–410 (1990). https://doi.org/10.1016/S0022-2836(05)80360-2
[12] Berman, H. M., Westbrook, J., Feng, Z., Gilliland, G., Bhat, T. N., Weissig, H., et al. The protein data bank. Nucleic Acids Res. 28, 235–242 (2000). https://doi.org/10.1093/nar/28.1.235
[13] DeLano, W. L. The PyMOL molecular graphics system. DeLano Scientific, Palo Alto, CA, USA. (2002). https://www.pymol.org/
[14] Takeda-Shitaka, M., Terashi, G., Chiba, C., Takaya, D., Umeyama, H. FAMS complex: A fully automated homology modeling system for protein complex structures. Med. Chem. 2, 191–201 (2006). https://doi.org/10.2174/157340606776056160
[15] Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Bertrand, T., Grisel, O., et al. Scikit-learn: Machine learning in Python. J. Mach. Learn. Res. 12, 2825–2830 (2011).
[16] Chang, M. W., Ayeni, C., Breuer, S., Torbett, B. E. Virtual screening for HIV protease inhibitors: a comparison of AutoDock 4 and Vina. PLoS One. 5, 8 e11955 (2010). https://doi.org/10.1371/journal.pone.0011955
[17] Romano, J., Kromrey, J. D., Coraggio, J., Skowronek, J. Appropriate statistics for ordinal level data: Should we really be using t-test and Cohen’s d for evaluating group differences on the NSSE and other surveys? Annual Meeting of the Florida Association of Institutional Research. 2006.
[18] Meiler, J., Baker, D. ROSETTALIGAND: Protein-small molecule docking with full side-chain flexibility. Proteins 65, 538–548 (2006). https://doi.org/10.1002/prot.21086
[19] Cer, R. Z., Mudunuri, U., Stephens, R., Lebeda, F. J. IC50-to-Ki: A web-based tool for converting IC50 to Ki values for inhibitors of enzyme activity and ligand binding. Nucleic Acids Res. 37, W441–W445 (2009). https://doi.org/10.1093/nar/gkp253
[20] Martinez-Hackert, E., Sundan, A., Holien, T. Receptor binding competition: A paradigm for regulating TGF-β family action. Cytokine Growth Factor Rev. 57, 39–54 (2021). https://doi.org/10.1016/j.cytogfr.2020.09.003
[21] Kaushik, S., Chang, C. A. Molecular mechanics study of flow and surface influence in ligand-protein association. Front. Mol. Biosci. 8, 659687 (2021). https://doi.org/10.3389/fmolb.2021.659687

Corresponding author

Register with J-STAGE for free!