2023 Volume 71 Issue 5 Pages 360-367
Computational screening is one of the fundamental techniques in drug discovery. Each compound in a chemical database is bound to the target protein in virtual, and candidate compounds are selected from the binding scores. In this work, we carried out combinational computation of docking simulation to generate binding poses and molecular mechanics calculation to estimate binding scores. The coronavirus infectious disease has spread worldwide, and effective chemotherapy is strongly required. The viral 3-chymotrypsin-like (3CL) protease is a good target of low molecular-weight inhibitors. Hence, computational screening was performed to search for inhibitory compounds acting on the 3CL protease. As a preliminary assessment of the performance of this approach, we used 51 compounds for which inhibitory activity had already been confirmed. Docking simulations and molecular mechanics calculations were performed to evaluate binding scores. The preliminary evaluation suggested that our approach successfully selected the inhibitory compounds identified by the experiments. The same approach was applied to 8820 compounds in a database consisting of approved and investigational chemicals. Hence, docking simulations, molecular mechanics calculations, and re-evaluation of binding scores including solvation effects were performed, and the top 200 poses were selected as candidates for experimental assays. Consequently, 25 compounds were chosen for in vitro measurement of the enzymatic inhibitory activity. From the enzymatic assay, 5 compounds were identified to have inhibitory activities against the 3CL protease. The present work demonstrated the feasibility of a combination of docking simulation and molecular mechanics calculation for practical use in computational virtual screening.
Theoretical calculation has become indispensable for drug discovery due to the progress of computer technology. Virtual screening is one of the powerful techniques for supporting the development of drugs.1,2) Chemical screening is the initial step to identify candidates for lead compounds. High throughput screening with a chemical library is one of the most popular approaches for finding hit compounds in experiments.3) On the other hand, computational screening is advantageous in terms of cost and efficiency. No tangible compounds are required for a search using computational screening. A considerably large database can be handled as a chemical library, and acceptable drug candidates or unique chemical scaffolds are sometimes found efficiently.4)
Docking simulation is frequently used for virtual screening. In docking simulation, each compound in a chemical database is bound to the target macromolecule. The docking score is an index to evaluate the affinity between a compound and its target molecule. Therefore, the docking score is usually used as a clue to select candidates from a chemical database. Many docking simulation programs have been released,5–7) and those programs have pros and cons in terms of accuracy in the predicted binding pose and the calculated binding score. The prediction accuracies of several docking programs were compared in some studies.8) Those studies suggested that the docking pose in rank first is only sometimes the correct solution. Hence, several binding poses are usually generated for reliable prediction. Choosing an adequate binding pose among the generated ones is a critical process in prediction. Therefore, re-evaluation of the binding score has often been attempted to enhance prediction accuracy. Re-evaluation by the docking program by altering the force-field parameters is one approach.9) Re-evaluation by molecular dynamics simulation or machine learning is another approach.10,11)
The coronavirus infectious disease, caused by severe acute respiratory syndrome coronavirus 2 (SARS-Cov-2), quickly spread worldwide, and the pandemic caused depression in socioeconomic activities around the world. Despite the rapid development of novel RNA vaccines, re-expansion has frequently been reported due to the emergence of mutated viruses and other factors. Chemotherapy with multiple drugs acting on different targets is effective for treating infectious diseases caused by mutant viruses. Therefore, there is an urgent need to develop several kinds of inhibitors for SARS-Cov-2. Remdesivir has been approved as an antiviral drug for SARS-Cov-2.12) The target of remdesivir is a viral polymerase. The viral polymerase is also a target of favipiravir13) and molnupiravir,14) which have been developed as oral administrative drugs for SARS-Cov-2. On the other hand, S-217622, an orally administered drug, inhibits the enzymatic activity of 3-chymotrypsin-like (3CL) protease.15)
It has been reported that a compound with an alpha-ketoamide structure exhibits inhibitory activity against viral 3CL protease with an IC50 value of 0.67 µM and is bound to the active center of the enzyme.16) Another peptidomimetic compound with a benzothiazole ketone moiety can strongly block SARS-Cov-2 replication.17) The compound displays a fine in vitro profile for absorption, distribution, metabolism and excretion (ADME) and sound in vivo pharmacokinetics. Bicyclo-octenes fused to N-substituted succinimides are unique scaffolds for non-covalent 3CL protease inhibitors.18) Many of the existing protease inhibitors have been examined for their inhibitory activity in viral proliferation.19) Boceprevir inhibits the 3CL protease with IC50 value of 4.1 µM and a 50% effective concentration (EC50) value of 1.9 µM. Hence, 3CL inhibitors can block viral growth in the measurement with cell culture. The human immunodeficiency virus type 1 (HIV-1) protease inhibitors, nelfinavir and lopinavir, show growth inhibition of SARS-Cov-2.20) However, no marked inhibitory activity against the 3CL protease was observed in experiments for HIV-1 protease inhibitors.19)
In this work, we performed computer screening to find inhibitory agents for the 3CL protease of SARS-Cov-2 using a chemical database consisting of approved and investigational compounds. In the first step, the binding poses of compounds at the active site of the 3CL protease were predicted by docking simulations. The docking simulations were performed by AutoDock Vina.21) This software has been widely used for predicting the binding of low molecular-weight compounds to their target proteins.22) However, a recent study in which the accuracies of docking programs were compared showed that the reliability of the binding score of AutoDock Vina needs to be improved.8) Therefore, we re-evaluated the docking score with molecular mechanics (MM) calculation in the second step. MM calculation was applied to all of the generated docking poses. The binding pose was optimized by the energy-minimizing method, and the binding score was predicted by the atom pair-wise force field.23) MM calculation is useful for analyzing intermolecular interactions such as antibody recognition of antigens24,25) and the amyloid peptide addiction to a membrane surface.26) MM calculation is also useful for analysis of compound–protein interactions27) and is utilized for the molecular design of candidate inhibitors.28)
The compounds selected by the computer screening were examined in experiments for their potency to block the enzymatic activity. We purchased 25 compounds that were expected to exert inhibitory activity against the 3CL protease. The inhibitory potency was monitored for every compound at a constant concentration. Five compounds showing inhibitory activity were selected. Then, the 50% inhibitory concentrations of the five selected compounds was measured. The optimized binding structures of the five compounds were closely examined. The information obtained in this study will be helpful for modifying chemical structures to enhance the potency of compounds.
Computational chemical screening was carried out by using a combination of docking simulation and MM calculation. AutoDock Vina21) was employed for the docking simulation to generate binding poses of compounds to the target protein. AutoDock Vina is one of the most frequently resorted programs in chemical screening because of its computational accuracy and applicability to a large-scale dataset. The calculation model for the 3CL protease was built from the crystal structure, PDB#: 6y2g.29) A search box of 56 × 50 × 50 Å was set at the cavity near Cys145 and His41 of the 3CL protease. Seven residues; Leu21, His41, Met49, Asn142, Cys145, Glu166, and Gln189, were assigned to be flexible residues. Five poses were generated for each compound. A computation under a heavy search condition was executed for every compound with an option of exhaustiveness = 20.
The binding pose was optimized by MM calculation with a program named Orientation.23) The atom pair-wise force field potentials can evaluate the binding affinity between proteins, small molecules, or lipid molecules. A parallel computation effectively processes all of the compounds in the database in complex with the 3CL protease.24) Initially, a geometry optimization was executed by the steepest descent method, in which energy-minimizing calculation was repeated for 50000 cycles without setting the positional constraint for any atom. Then, the binding affinity was predicted from the interaction energies relevant to the compound. The intra-molecular interaction was described by the Amber ff14SB force field,30) and the inter-molecular interaction was by a force field with the Morse-type potential function.23) The binding score was given by the sum of energies evaluated by the force field. The input parameter file for MM calculation was generated by a utility program, AmberTools 20.31)
Computational Screening with a Test DatasetAn experimental study reported the identification of the 3CL protease inhibitors.19) Hence, the performance of virtual chemical screening was evaluated with the compounds used in that study. Since AutoDock Vina cannot handle compounds containing a boric acid skeleton, 51 of the 55 compounds tested in that study were used to evaluate the computational screening performance. (1) The file formats of the 51 compounds were converted into pdbqt. The atom coordinates of the 3CL protease were also prepared with the pdbqt format, in which rigid and flexible residues are separated into two files. (2) Simulation of docking to the 3CL protease was performed for every compound, and five binding poses were generated by prediction. (3) The binding structure was optimized by MM calculation, setting every predicted binding pose from the docking simulation as a starting model. (4) The binding structure with the best binding score was selected for the individual compounds. (5) The binding scores of Orientation were compared with the reported inhibitory activities of the test compounds.
Procedure for Chemical ScreeningA chemical database consisting of approved drugs and chemicals in clinical investigations was searched for candidate compounds for inhibitors of the 3CL protease. The screening was carried out stepwise using docking simulation and MM calculation. (1) A library containing 8820 compounds was downloaded from the Drugbank database. The file format of each chemical was converted into pdbqt, an input form of AutoDock Vina. (2) Docking simulations were performed for each compound with the 3CL protease of SARS-Cov-2 set as a target. Five binding poses were generated for each compound. (3) Based on the score of the docking simulation, 20000 binding poses were extracted for MM calculation. (4) Geometry optimization was performed by MM calculation setting the predicted binding pose from AutoDock Vina set as a starting model. (5) Based on the binding score, 2000 binding structures were selected to re-evaluate binding affinity with considering the contact with the target domain. The binding score is scaled by a factor depending on the area ratio in contact with solvent waters and the target protein. The proportion of solvent-contacting area around the compound is calculated in the optimized binding structure. The binding score is divided by (1+ Const * proportion) to lessen the estimated binding affinity. Const is set to 1.0, and the proportion ranges from 0.0 to 1.0. When the compound is completely surrounded by solvent, the proportion is 1.0 and the binding score is reduced to half. (6) From the binding score, the top 200 structures were extracted from the 2000 binding structures. The same compounds with different binding poses were occasionally included in the 200 extracted chemicals. (7) All of the 200 chemicals were visually checked to narrow down the candidates. Finally, 25 compounds were selected for purchase and assay for compound potency.
Protein Expression and PurificationA recombinant 3CL protease was expressed as a double-tag form in which glutathione S-transferase (GST) was conjugated at the N-terminal side and a 6 × His-tag was fused at the C-terminal side with a pGEX-6p-2 plasmid vector. Since the recognition sequence by the 3CL protease, AVLQ|SGFR, was inserted at the beginning of the 3CL protease, the N-terminal GST-tag was cleaved by auto-processing. Since the amino sequence, VTFQ|GP, was coded at the end of the 3CL protease, the C-terminal 6 × His-tag was cleaved by the human rhinovirus 3C (HRV-3C) protease. An Escherichia coli competent cell, Rosetta 2 (DE3) pLysS, was transformed with the expression vector. Pre-incubation was performed with 10 mL Luria–Bertani (LB) medium at 37 °C. Then, the cell culture was scaled up to 1 L medium. At an OD600 value of 0.6, 0.2 mM isopropyl β-D-thiogalactpyranoside (IPTG) was added to the medium for induction. Then, the medium was incubated at 18 °C overnight, and the cultured cells were pelleted down by 30-min centrifugation at 5000 × g. The cell pellet was re-suspended in 40 mL of buffer solution (50 mM Tris–HCl pH 7.5, 10 mM imidazole, 1 mM phenylmethylsulfonyl fluoride: PMSF). The cell membrane was disrupted by an ultrasonic homogenizer. The lysate was centrifuged at 14000 × g for 30 min. The remaining debris was removed from the supernatant by passing the supernatant through a 0.45 µm micro-filter.
The 6 × His-tag-conjugated 3CL protease was purified by metal-affinity chromatography using a solution of 50 mM Tris–HCl pH 7.5 with 10 mM imidazole as a starting buffer and a solution consisting of 50 mM Tris–HCl pH 7.5 with 500 mM imidazole as an elution buffer. The collected fraction was dialyzed against 1 L of buffer solution (50 mM Tris–HCl pH 7.5, 50 mM NaCl) overnight at 4 °C to remove imidazole. The 6 × His-tag was cleaved by HRV-3C protease with incubation of the mixture of the sample and HRV-3C protease at 4 °C for 24 h. The cleaved 6 × His-tag, HRV-3C protease, and uncleaved conjugated protein were removed by Ni-nitrilotriacetic acid (NTA) agarose resin using a solution of 50 mM Tris–HCl pH 7.5 with 10 mM imidazole as a starting buffer and a solution of 50 mM Tris–HCl pH 7.5 with 500 mM imidazole as an elution buffer. The target protein was collected from the flow-through fractions. The 3CL protease was further purified by anion exchange chromatography. The starting buffer was 20 mM potassium phosphate pH 8.0 with 25 mM NaCl and the elution buffer was 20 mM potassium phosphate pH 8.0 with 1000 mM NaCl. The protein was finally purified by gel filtration with a running buffer of 20 mM N-(2-hydroxyethyl)piperazine-N’-2-ethanesulfonic acid (HEPES) at pH 7.5, 200 mM NaCl, and 2 mM mercaptoethanol.
Measurement of Inhibitory ActivityAn inhibition assay was performed by the fluorescence resonance energy transfer (FRET) method. A fluorogenic peptide, Dabcyl-KTSAVLQ-SGFRKME-Edans, was utilized as a substrate for hydrolysis by the 3CL protease. The fluorescence intensity was monitored with a microplate reader. The wavelengths were 360 nm for excitation and 485 nm for emission with a bandwidth of 20 nm. The fluorogenic peptide was hydrolyzed by the 3CL protease in a reaction buffer of 50 mM Tris–HCl pH 7.3 with 1 mM ethylenediaminetetraacetic acid (EDTA). In each well, 50 µL reaction buffer was mixed with 0.1 nM 3CL protease, 0.5 µM fluorogenic substrate, and a compound. Then, the mixture was incubated at 30 °C for the enzymatic reaction. The fluorescence was monitored every 1 min for 50 min.
First, the inhibitory activity was measured for every compound at the concentration of 50 µM. Second, potent compounds were selected from the first measurements. The inhibitory activity was measured again for the selected compounds at concentrations of two-fold serial dilution from 64 µM. The IC50 value was obtained by plotting the inhibition rates against the different inhibitor concentrations.
A previous study provided results of inhibitory assays against the SARS-Cov-2 3CL protease for a collection of protease inhibitors of bioactive compounds.19) The assays in that study were performed by FRET enzymatic measurement and viral replication suppression in cell culture. We evaluated the performance of our computational screening methodology by making use of the assay results reported in the study. More than 50 inhibitors were tested in that study,19) and several compounds that exhibit inhibitory activity against the 3CL protease were identified (Supplementary Table S1).
Docking simulations were carried out by AutoDock Vina for the compounds tested in the above-mentioned study, targeting the 3CL protease. All of the binding scores are ascendingly ordered (Fig. 1(a)). The vertical axis in the graph is the binding score. The lower the score value is, the more strongly the compound is bound to the 3CL protease. The binding scores obtained by AutoDock Vina range from −9.8 to −5.6. Compound 36 showed the lowest score of −9.8, and compound 48 showed the highest score of −5.6. The solid black columns in Fig. 1(a) represent the top 3 compounds in inhibitory potency experimentally measured in the above-mentioned study; i.e., compounds 28, 29, and 43. Compound 28 is boceprevir, an approved HCV NS3-4A protease inhibitor (IC50 = 4.13 µM for SARS-Cov-2 3CL protease). Compound 29 is narlaprevir, an HCV NS3-4A protease inhibitor (IC50 = 5.73 µM). Compound 43 is a calpain inhibitor, MG-132 (IC50 = 3.90 µM). The binding scores of these 3 compounds are −9.1 for compound 28, −7.8 for compound 29, and −6.3 for compound 43. The binding score of compound 28 is below −9, and the rank is the 4th in the 51 chemicals. Compound 29 is ranked at a median position; 22nd, and compound 43 is ranked 46th. Therefore, the binding scores obtained by AutoDock Vina are poorly correlated with the experimentally measured inhibitory concentrations.
(a) Binding scores obtained by AutoDock Vina. Three back columns represent the best three compounds in inhibitory activity in experimental measurements. (b) Binding scores obtained by molecular mechanics calculation. (c) Scaled binding score obtained by dividing the value in (b) by the number of heavy atoms for each compound.
All of the binding poses generated by AutoDock Vina were optimized by MM calculations. The binding scores were also obtained by MM calculations with the optimized binding poses. The MM binding scores are ascendingly ordered (Fig. 1(b)). The range of MM binding scores is from −36 to −13. Like AutoDock Vina, the binding affinity is predicted to be stronger for a lower MM binding score. The lowest score value is −36.1 for compound 28, and the second lowest is −35.811 for compound 29. The binding scores of these two compounds are outstanding compared to the others. These two compounds were shown to be active in the experimental inhibitory assays. The binding score of compound 43 is −27.978, and it is ranked 11th. Therefore, the binding score obtained by MM calculation can predict the inhibitory activity at a satisfactory level.
The ligand efficiency is sometimes helpful for selecting candidate compounds. Hence, the MM binding scores were divided by the number of heavy atoms for the respective compounds (Fig. 1(c)). The divided binding scores range from −1.1 to −0.4. None of the three active compounds, 28, 29, and 43, show a binding value lower than −1. Accordingly, the divided MM binding score hardly selects the active compounds adequately.
Computational Search for 3CL InhibitorsDocking simulation using AutoDock Vina was performed for 8820 compounds in the Drugbank database with the SARS-Cov-2 3CL protease as the target. Since five binding poses were generated for each compound, 44100 poses were obtained in the prediction. The binding scores of all of the generated poses are shown in a histogram (Fig. 2(a)). The binding scores obtained by AutoDock Vina range from −11.5 to −1.0. The distribution of binding scores has a single peak at −7.1 in frequency and shows asymmetric populations around the peak. About 1200 binding poses are beyond the binding score of −9.0.
The binding poses for which the binding score was lower than the red line were passed for the subsequent structure optimization. (b) Distribution of the frequency of binding scores obtained by MM calculation. The optimized binding structures for which the binding score was lower than the red line were passed for the subsequent score re-evaluation. (c) Distribution of the frequency of binding scores re-evaluated by MM calculation. The binding structures for which the binding score was lower than the red line were visually inspected to select candidate compounds for the experimental assay.
MM calculations were carried out for the top 20000 binding poses generated by the docking simulation. The binding structure was optimized for every pose. The binding scores obtained by the MM calculations show a symmetric population (Fig. 2(b)). The peak appears at −21.5, and the distribution is considerably broad. The frequency of the binding scores exhibits a long tail at lower values.
The binding score was re-evaluated for the top 2000 optimized binding structures with the influence of water molecules taken into account (Fig. 2(c)). The binding scores shown in Fig. 2(c) correspond to the scores in Fig. 1(b) in the calculations for the test dataset. The binding scores range from −41 to −22, and the shape of the histogram is quite asymmetric. The top 200 compounds were focused on, and their chemical structures and binding poses were visually inspected in detail. It is necessary to examine whether a compound well fits the deep hollow sites at the binding pocket. Some compounds show high hydrophobicity in spite of searching for the chemical database with approved and investigational compounds. Accordingly, highly hydrophobic compounds were avoided for selection. Finally, we selected 25 compounds for purchase. The 25 candidate compounds are listed in Supplementary Table S2.
Enzymatic Inhibitory Assay of Candidate CompoundsIn vitro measurements for the enzymatic inhibitory activity were carried out for the 25 compounds (Fig. 3). A FRET assay was performed to evaluate the inhibitory activity of the compounds. A fluorescent probe and its quencher with a peptide linker connected them were used in the FRET assay. The peptide linker has an amino acid sequence recognized and cleaved by the 3CL protease. Therefore, less fluorescence is observed for a compound inhibiting the protease enzymatic activity more strongly. FRET measurement was performed twice for all 25 compounds at a constant concentration of 50 µM.
The assay was performed by the FRET technique with a constant compound concentration of 50 µM. The five compounds denoted by the solid black columns are selected for subsequent determination of IC50 values.
Based on the results of the assay, we focused on several compounds for which the average activity rate of the two measurements was less than 30% or for which either of the measurements gave an enzymatic activity lower than 10%. Compounds 8, 9, and 12 were selected from the former condition and compounds 5 and 14 were selected from the latter condition. Compound 2 shows a large negative value (Fig. 3). The negative value is attributed to the chemical property of compound 2 in absorption and emission of fluorescence. Compound 2 was therefore excluded from the selection.
The 5 selected compounds were further evaluated for their inhibitory activity with different concentrations in a two-fold dilution series. The final concentration of each compound was changed from 1 to 64 µM. A dimethyl sulfoxide (DMSO) solution without a compound was set as a blank. FRET measurement was performed twice, and the enzymatic activity was plotted against the compound concentration (Fig. 4). The IC50 was obtained for all of the 5 compounds (Table 1). Compounds 5, 12, and 14 showed good inhibitory activities. Compound 5 exhibits a markedly low IC50 value of 1.1 µM. Some compounds show light-emission or -absorbance properties, which occasionally disturb the FRET measurements. Hence, it is not safe to judge whether a compound blocks the enzymatic activity or not only from the measurement with a single concentration. The dependency of the inhibition rate on the compound concentration makes an evaluation of inhibitory potency robust. The measurements with different concentrations indicated that compounds 2 and 24 had no inhibitory activity, although they showed noticeable FRET signals in Fig. 3.
The inhibitory assay was carried out in duplicate by the FRET technique with a dilute series of compound concentrations.
Compound | Name | Molecular weight | Log P * | IC50 (µM) |
---|---|---|---|---|
5 | Endovion | 495.2 | 4.2 | 1.1 |
8 | Sufugolix | 667.7 | 6.1 | 19.1 |
9 | Bevirimat | 584.8 | 9.1 | 13.1 |
12 | Boceprevir | 519.7 | 3.1 | 7.8 |
14 | Adomeglivant | 555.6 | 8.1 | 8.3 |
* The value was computed by XLog P3 3.0.
The binding poses predicted by the calculations are depicted for the 5 identified compounds with schematic illustrations of the compound–protein interaction (Supplementary Fig. S1). Compound 5 makes hydrogen bonds with His164 and Gln189 at the center of the binding pocket (Supplementary Fig. S1(a)). Compound 5 also has hydrogen bonds with Leu141, Gly143, and His163 at the S1 pocket. In addition, π–π interaction is observed at the S2 pocket, in which an aromatic ring of the compound makes a contact with the side chain of His41. These bond networks significantly stabilize the compound accommodation.
For compound 8 (Supplementary Fig. S1(b)), the methoxy-urea moiety positioned at the S4 pocket is stabilized by hydrogen bonds with Gln189, Thr190, and Gln192. Hydrophobic interactions are observed at the S1 and S1′ pockets, in which the thieno-pyrimidine skeleton fits well in the space connecting these two pockets. The compound also has hydrogen bonds with Gly143 and Ser144. The side chain of His41 and the phenyl ring of the compound are aligned in parallel at the S2 pocket, generating the π–π interaction.
The binding of compound 9 is stabilized by the formation of hydrogen bonds with Thr190 and Gln192 at the S4 pocket and Gly143 at the S1′ pocket (Supplementary Fig. S1(c)). The shape of compound 9 is Chat of the enzymatic active site, making the cholesterol skeleton contact with the space from the S1′ to S4 pockets and oriented to the direction perpendicular to the His41 side chain imidazole ring.
The binding of compound 12 is stabilized by a hydrogen bond network with Glu166 and Gln189 in the S4 pocket and with His41, Gly143, Ser144, and His163 at the S1 and S1′ pockets (Supplementary Fig. S1(d)). The S2 pocket houses the dimethyl-azabicyclo-hexane moiety.
Compound 14 has hydrogen bonds with Glu166 and Gln192 at the S4 pocket and His41 at the S1′ pocket (Supplementary Fig. S1(e)). In addition, a CF3 group is hooked at the S2 pocket. A large hydrophobic moiety, 4-tert-butyl-dimethyl-biphenyl group, is located at the S1 pocket.
Due to the global pandemic of SARS-Cov-2 infectious disease, many computational studies have been carried out using the virtual screening for 3CL protease inhibitors. A chemical database consisting of approved drugs has frequently been utilized because of the urgent requirement for effective chemotherapeutic agents. Ghahremanpour et al. identified 5 compounds from a library consisting of about 2000 approved oral drugs.32) The identified compounds and their IC50 values were manidipine (4.8 µM), boceprevir (5.4 µM), lercanidipine (16.2 µM), bedaquiline (18.7 µM), and efonidipine (38.5 µM). Boceprevir was also identified in the present work. The inhibitory activity reported for boceprevir is compatible with the results of our assay.19) Three compounds; manidipine, lercanidipine, and efonidipine, commonly have a nitro-benzyl moiety. Considering the cytotoxicity owing to the nitro-benzyl group, we did not select these compounds for candidates in our work. Despite the weak inhibitory activity of perampanel, a chemical optimization was performed through organic synthesis, enzymatic inhibitory assay, and X-ray crystallography.33)
Virtual screening using the Drugbank database was carried out in a study with docking simulation and subsequent binding energy evaluation by molecular dynamics (MD) simulation.34) In the study, eight candidate compounds in the Drugbank were selected for 3CL inhibitors. None of those eight compounds were selected in the present work. In another study in which computational screening was carried out by the combination of docking and subsequent MD simulation, four compounds were selected as inhibitory compounds.35) All of those four compounds bear phosphate groups, and none of those compounds were selected in our work. In another virtual screening with the Drugbank, two tyrosine kinase inhibitors, nilotinib, and bemcentinib, were selected as potent drug repositioning candidates against SARS-Cov-2 3CL protease.36) Those two compounds exhibited EC50 values of 2.6 and 1.1 µM in a cell culture assay. Neither of those kinase inhibitors was selected in our work. A computational screening was also performed to repurpose U.S. Food and Drug Administration (FDA)-approved drugs in order to identify potential inhibitors for the SARS-Cov-2 3CL protease.37)
Two independent research groups performed X-ray crystal analysis on the complex of boceprevir and SARS-Cov-2 3CL protease. The crystal structures reported by both groups, PDB#6zru38) and 7brp,39) are almost identical. The tert-butyl-urea moiety is positioned at the S4 pocket, forming hydrogen bonds with Gln189 and Glu166. The dimethyl-azabicyclo-hexane moiety is located in the S2 pocket. The tert-butyl group between the urea and azabicyclo-hexane faces the solvent. The carbonyl O atom of the amide group makes a hydrogen bond with His41. The carbamoyl group occupies the S1′ pocket, and the cyclo-butyl group is positioned near the S1 pocket. The predicted binding pose of boceprevir in our work is almost consistent with the crystal structures (Supplementary Fig. S2). The tert-butyl-urea and dimethyl-azabicyclo-hexane moieties occupy the S4 and S2 pockets, respectively. The tert-butyl group between these two moieties is exposed to the solvent. Only the positions of the carbamoyl and cyclo-butyl groups are altered from the crystal structures. The presence of water molecules causes this slight difference between the predicted pose and the crystal structure. A water molecule occupies the S1 pocket in the crystal structure. In contrast, no water molecules are present in the stage of docking simulation. Hence, the carbamoyl group is located at the S1 group and makes a firm hydrogen bond with His163. The generation of the hydrogen bond significantly lowers the binding score. The predicted binding mode was selected as the best one from the generated five binding poses in the MM calculation.
Virtual screening for inhibitors against the SARS-Cov-2 3CL protease was performed by a combination of docking simulation and MM calculation. The binding poses of a compound to the target protein were generated by the docking simulation. The binding scores were evaluated by subsequent MM calculation. Initially, the virtual screening was carried out with a test set of 51 compounds. The test screening demonstrated that inhibitory compounds could be successfully found from a chemical dataset by using the combination approach. Next, a search for 3CL inhibitors was made with a database consisting of approved drugs and agents used in clinical trials. The docking simulation generated 44100 predicted poses in total. The binding scores ranged from −11.5 to −1.0. The distribution of binding scores had a single peak at −7.1. MM calculations were carried out for the top 20000 binding poses generated by docking simulation. The binding structure was optimized for every pose. The binding scores obtained by the MM calculations peaked at −21.5, and the distribution was considerably broad. The binding score was re-evaluated for the top 2000 optimized binding structures with the influence of water molecules considered. The top 200 compounds were focused on, and 25 compounds were selected for experiments by the visual inspection of chemical structures. The enzymatic inhibitory activity was assayed for the 25 compounds by the FRET method. The FRET assay was performed twice for all 25 compounds at a constant concentration of 50 µM. Based on the results of the assay at a constant concentration, 5 compounds were selected for measuring IC50. The 5 selected candidates were evaluated for their inhibitory activity with a dilution series of compound concentrations. FRET measurement was performed twice, and IC50 values were obtained for all 5 compounds. Among the 5 compounds, endovion exhibited a markedly low IC50 value of 1.1 µM. Boceprevir showed the second lowest IC50 value of 7.8 µM. Boceprevir was already shown to inhibit 3CL enzymatic activity. The predicted binding structure of boceprevir was compatible with the reported crystal structures. Therefore, the computational screening approach is effective for finding active compounds from chemical databases.
This work used computational resources of OBCX system provided by Information Technology Center of the University of Tokyo through the HPCI System Research Project (Project ID: hp200148, hp200244). A part of the calculation was executed at Research Center for Computational Science, Okazaki, Japan (Project: 22-IMS-C001). This study was supported by a Grant for Scientific Research C from Japan Society for the Promotion of Science (21K07050).
The authors declare no conflict of interest.
This article contains supplementary materials. Supplementary materials contain a list of the test compounds utilized for the evaluation, a list of the compounds selected for the inhibitor candidates, the predicted binding structures of the identified compounds, and a comparison of the binding pose between the crystal structure and the predicted one.