Biophysics and Physicobiology
Online ISSN : 2189-4779
ISSN-L : 2189-4779
Note
Microsecond molecular dynamics suggest that a non-synonymous mutation, frequently observed in patients with mild symptoms in Tokyo, alters dynamics of the SARS-CoV-2 main protease
Daisuke KurodaKouhei Tsumoto
Author information
JOURNAL OPEN ACCESS FULL-TEXT HTML
Supplementary material

2021 Volume 18 Pages 215-222

Details
Abstract

Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), which causes the coronavirus disease 2019 (COVID-19), spread rapidly around the globe. The main protease encoded by SARS-CoV-2 is essential for processing of the polyproteins translated from the viral RNA genome, making this protein a potential drug target. A recently reported mutation in the protease, P108S, may be responsible for milder symptoms observed in COVID-19 patients in Tokyo. Starting from a crystal structure of the SARS-CoV-2 main protease in the dimeric form, we performed triplicate 5.0-μs molecular dynamics simulations of the wild-type and P108S mutant. Our computational results suggest a link between the mutation P108S and dynamics of the catalytic sites in the main protease: The catalytic dyad become considerably inaccessible to substrates in the P108S mutant. Our results also demonstrate the potential of molecular dynamics simulations to complement experimental techniques and other computational methods, such as protein design calculations, which predict effects of mutations based on static crystal structures. Further studies are certainly necessary to quantitively understand the relationships between the P108S mutation and physical properties of the main protease, but the results of our study will immediately inform development of new protease inhibitors.

Significance

Based on a crystal structure of the SARS-CoV-2 main protease, we performed triplicate 5.0-μs molecular dynamics simulations of the wild-type and mutant P108S protease. This mutation might cause milder symptom observed in COVID-19 patients in Tokyo. Our simulations suggest that the mutation makes substrates less accessible to the catalytic dyad of the protease through rigidification of a loop structure at the entrance of the catalytic sites, which may then lead to milder manifestation of COVID-19. Although further experimental characterization is necessary, this is the first report that links the mutation P108S to the dynamics of the dimeric SARS-CoV-2 main protease.

Introduction

Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is the causative agent of the global pandemic of coronavirus disease 2019 (COVID-19) [1]. SARS-CoV-2 is an RNA virus, and, despite a genetic proofreading mechanism [2] that limits genetic diversity [3], the persistence of the pandemic may allow for selection of rare but favorable mutations [4].

Since the first report of the virus [5], numerous papers have been published in peer-reviewed journals [6] that detail results of medical and biochemical studies [7,8] and structural studies [9,10]. A potential target for therapy is the SARS-CoV-2 main 3C-like protease, which is essential for processing of polyproteins translated from the viral genome [11]. Previous computational studies of the main protease focused on interactions between the protease and potential inhibitors [1218]. One of the most popular computational methods for analysis of interactions between drugs and proteins is molecular dynamics (MD) simulation, which is frequently employed to complement experimental measurements [1926]. For instance, Suarez and Diaz characterized dynamics of the main protease in the dimeric and monomeric forms with and without a peptide substrate, respectively, through 2.3-μs MD simulations, revealing instability of the unbound-state of the main protease [27]. Similarly, Grottesi et al. conducted 2.3-μs MD simulations of the main protease in the monomeric and dimeric forms without substrates, quantifying conformational dynamics of loop regions near the catalytic site [28]. These computational studies provided insight into dynamic characteristics of the main protease, which could not be obtained by visual inspection of static crystal structures.

The mortality rates due to COVID-19 appear to be lower in Japan than in Western countries (https://covid19.who.int). Recently, based on statistical analyses of viral genome sequences derived from COVID-19 patients hospitalized in Tokyo, Abe et al. hypothesized that the accumulation of mutations may have contributed to the decrease in clinical virulence [29]. Their phylogenetic analysis identified four non-synonymous mutations in the viral genome sequence inversely correlated with COVID-19 severity, and subsequent sequence comparison and structure-based prediction suggested that two mutations, P108S and P151L, in the main protease and the nucleocapsid protein, respectively, contribute to a milder clinical course of COVID-19. The authors further demonstrated in vitro that the P108S mutation reduced the enzymatic activity.

In this study, we employed triplicate 5.0-μs MD simulations to investigate effects of the non-synonymous mutation, P108S, on dynamics of the SARS-CoV-2 main protease. The main protease consists of three structural domains (Fig. 1). The catalytic dyad (His41 and Cys145) is formed by association of domains I (10–99) and II (100–182), whereas the domain III (198–303) is important for the self-dimerization [30]. The mutation P108S is located in domain II. In a previous study, Amamuddy et al. analyzed dynamics of the wild-type (WT) main protease and 50 natural variants, including P108S, in the dimeric form through a single 100-ns MD simulation of each mutant [31]. They used various metrics, such as root mean square deviations (RMSDs), residue fluctuations, and geometrical calculations, to suggest structural variations caused by mutations. Based on visual inspection, the authors argued that mutations located in a hinge region that connects rigid and flexible regions, such as the P108S mutation, control regulatory motions of the main protease. However, the impacts of the P108S mutation on dynamics and substrate recognition of the main protease were unclear.

Figure 1 

Structure of the SARS-CoV-2 main protease and the position of the P108S mutation (PDB ID: 7BRO) [32]. Protein structures were generated with UCSF Chimera [36].

In this study, our computational results demonstrated that the systems studied are stabilized only after 4.0 μs of simulation, after which different dynamics between the WT protease and the P108S mutant emerged. We quantified the difference in dynamics based on 5.0-μs MD trajectories and discuss possible effects of the mutation on the substrate binding. Although further studies are certainly necessary, our results suggest that the symptoms caused by COVID-19 may become milder due to emergent rigidification of a loop structure at the entrance of the catalytic site of the main protease, as is caused by the mutation at position 108.

Material and Methods

Preparation of initial structure for simulations

The initial structure for simulations of the main protease was retrieved from a previously determined crystal structure [PDB ID: 7BRO] [32]. Missing residues and the mutation P108S were generated from the WT structure with Modeller [33]. The protonation states of histidine residues were assigned using a structure-based prediction method, PROPKA [34], through the PDB2PQR website [35], with a crystal structure of the WT protease. Thus, the protonation states of histidine residues were the same in the WT and the mutant in this study. Protein structures were visualized with UCSF Chimera [36].

Molecular dynamics simulations

MD simulations of the WT main protease and the P108S mutant were performed using GROMACS 2018.6 [37] with the CHARMM36m force field [38]. The structures were solvated with TIP3P water [39] in a rectangular box such that the minimum distance to the edge of the box was 15 Å under periodic boundary conditions. The protein charge was neutralized with added Na+ or Cl, and additional ions were added to imitate a salt solution of concentration 0.15 M. Each system was energy-minimized by the steepest descent algorithm (5,000 steps) and was subsequently heated from 50 K to 298 K during 150 ps. The simulations were continued by 200 ps with NVT ensemble and further by 300 ps with NPT ensemble; during these steps, protein atoms were held fixed, whereas non-protein atoms freely moved. Further simulations were performed with NPT ensemble at 298 K for 5.0 μs without any restraints other than the LINCS algorithm to constrain bonds involving hydrogen atoms [40]. The time step was set to 2 fs throughout the simulations. A cutoff distance of 12 Å was used for Coulomb and van der Waals interactions. Long-range electrostatic interactions were evaluated by means of the particle mesh Ewald method [41]. A snapshot was saved every 100 ps. We repeated the same calculations three times with different initial velocities for 30-μs simulation time in total.

Analysis of MD trajectories

Trajectory analysis of each 5.0-μs simulation was performed based on the last 0.5 μs. RMSDs and distances between atoms were computed with the GROMACS package [37]. Cross-correlation analysis and principal component analysis (PCA) were performed for trajectories, in which a snapshot was saved every 500 ps, with the Prody Python library [42]. After removal of the rotational and translational motions, a positional covariance matrix of Cα-atoms of the whole protease was calculated and diagonalized to obtain the PCs. To remove random fluctuations at the terminal regions, we removed the first and last 10 residues, respectively, of each chain for PCA. PCs were visualized through Normal Mode Wizard in VMD [42,43]. Fraction of native contacts or Q-values were computed with the MDTraj Python library [44] using the equation below [45]:

  
QX=1N(i,j)S11+expβ(rijX-λrij0)

where the sum runs over the N pairs of native contacts (i,j), rij(X) is the distance between i and j in configuration X, r0ij is the distance between i and j in the native state, β is a smoothing parameter taken to be 5 Å–1, and the factor λ accounts for fluctuations when the contact is formed, taken to be 1.8 for the all-atom simulations [45].

Results

The P108S mutation suppresses plasticity of catalytic dyad

We began by calculating Cα-RMSDs of the protease to evaluate convergence of our simulations (Fig. 2A). The standard deviations of the Cα-RMSDs converged within 0.5 Å during the simulations. However, when the fraction of the native contacts was computed based on the crystal structure as the reference, we found that the Q-value decreased until about 4.0 μs (Fig. 2B). Although the previous work of mutant protease relied on a 100-ns simulation time [31], our Q-value calculations implied that μs-scale simulations were necessary to obtain insight into dynamics of the main protease. Therefore, we decided to use the trajectories from the last 0.5 μs of our 5.0-μs simulations for subsequent analyses. In addition, one of the mutant simulations (RUN-1) appeared to behave differently from the other two simulations (Fig. 2), suggesting that the simulation was trapped in a different state. Therefore, our discussion of the mutant analyses below is mainly based on the simulations of the other converged trajectories.

Figure 2 

Convergence of the MD trajectories. (A) RMSDs of Cα atoms in the main protease and (B) Q-values as function of simulation time for simulations with three different initial velocities for each protein.

To examine more specific dynamics, we computed Cα-RMSDs of each domain. Although the main protease is a dimeric protein consisting of chains A and B, for brevity we report here the results obtained for chain A unless otherwise noted. Results obtained from analysis of chain B are provided in the Supplementary Figures S1 and S2. Since we performed the simulations of each protein three times with different initial velocities, we observed some variations of the results even in the same system. There were more variations in Cα-RMSDs of domains I and II (Figs. 3A and 3B) than in Cα-RMSDs of domain III, which was quite stable (Fig. 3C). The catalytic residues, 41His and 145Cys, are located in domain I and in domain II, respectively. Therefore, we also computed Cα-RMSDs of the catalytic residues. When we superposed Cα-atoms of the domains I and II and computed Cα-RMSDs of the catalytic residues, we observed larger variation after 1.0 μs in the P108S catalytic residues than in those of the WT protease (Fig. 3D). Interestingly, the variation became small at around 4.0 μs, and two of the mutant simulations appeared to reach stable states. Therefore, it is tempting to speculate that the mutation P108S in the main protease, frequently observed in patients with mild symptoms in Tokyo, affects dynamics of the catalytic residues, thereby causing milder symptoms.

Figure 3 

RMSDs of Cα atoms in the chain A main protease (A) domain I (amino acids 10–99), (B) domain II (amino acids 100–182), (C) domain III (amino acids 198–303), and (D) catalytic residues (41His/145Cys) of WT and P108S mutant as function of simulation time.

To further quantify the difference in dynamics between the WT and P108S mutant, we computed the distances between center of gravity of the catalytic residues. The distance fluctuated along the simulation time with clear distinctions between the WT and the P108S mutant (Fig. 4). In the simulations of the WT protease, distributions of the distances between the catalytic dyad were wider in range (Fig. 4C), than observed in the simulations of the mutant (Fig. 4D). Although there are some overlaps of the distributions among three trajectories of the WT protease, there appeared to be positional variability of the catalytic residues. On the other hand, in two of the three mutant simulations, the distributions almost entirely overlapped. These observations suggest that the catalytic dyad need to show plasticity to recognize and catalyze substrates.

Figure 4 

Distance between the catalytic residues is less variable in the WT protease than in the P108S mutant. (A and B) Distance of the center of gravity of side chain atoms between the catalytic residues (41His and 145Cys) in A) WT and B) P108S mutant in the main protease (chain A) as a function of time. (C and D) Frequency at indicated distance for C) WT and D) P108S mutant. Vertical dotted line corresponds to average distance observed in the simulations of the WT.

Principal component analysis suggests rigidification of a loop structure at the entrance of the catalytic site

To further quantify the effect of the P108S mutation on the catalytic dyad, we performed PCA of the MD trajectories based on Cα atom of the protein ensembles. In the WT simulations, the first PC correspond to the motion of a loop structure (44–53) at the entrance of the catalytic site either in the chain A or the chain B (Fig. 5 and Supplementary Fig. S3). On the other hand, in the mutant simulations, the fluctuation of the corresponding loop region was much smaller, suggesting rigidification of the loop structure upon the mutation. Therefore, it is most likely that substrates could not access the catalytic dyad due to steric hindrance of the rigidified loop region. In addition, the first and second PCs of the mutant simulations tend to be the motions of domain III (198–303) (Fig. 5 and Supplementary Fig. S3).

Figure 5 

Principal component analysis of protein ensembles generated by MD simulations. (A) First and second principal components along the protein chain. Red, green, and blue lines correspond to the PCs of RUN-1, RUN-2, and RUN-3 of the simulations, respectively. (B) First PC visualized through Normal Mode Wizard in VMD [42,43].

To more directly examine correlations of motions between the P108S mutation and other residues, we also evaluated cross-correlation of residues in the MD trajectories. When residue-wise correlations were evaluated, fluctuations of the residue at the position 108 were not correlated well with fluctuations of the catalytic residues (correlations <0.5, Supplementary Fig. S4). However, the fluctuations of the residue at the position 108, which is in domain II, were correlated with many residues in the domains I and III and even in the other chain (Supplementary Fig. S4). In addition, motions of catalytic residues appeared to be correlated with each other, and with many other residues along the chains (Supplementary Figs. S5 and S6). These results support our hypothesis that the mutation at position 108 affects dynamics of the entire protein. Together with the fact that domain III contributes to the self-association of the two chains in the main protease [30] and the major PCs of the mutant simulations correspond to motions in domain III, lower residue correlations in domain III of the mutant relative to the WT protease may be one of the reasons that COVID-19 patients infected by virus with the mutation P108S have milder symptoms.

Discussion and Conclusion

Based on microsecond-scale MD simulations, we identified a possible link between a non-synonymous mutation, P108S, frequently observed in patients with mild COVID-19 symptoms in Tokyo, and dynamics of the catalytic site in the main protease. With only data from static crystal structures, effects of mutations on dynamics of protein structures are difficult to infer. Further, our finding that the catalytic dyad became less accessible upon the P108S mutation were visible only after simulations longer than 4 μs.

It is worth noting that Abe et al. [29] conducted hydrogen deuterium exchange mass spectroscopy (HDX-MS) to examine difference in dynamics between WT and the P108S mutant protease. They reported the structural perturbations of two regions, 128–141 and 161–176. On the other hand, the rigidified loop region we identified in this study is the region of 44–53. This difference can be rationalized by the fact that the HDX-MS was performed for the monomeric state of the protease, whereas our simulations were based on the dimeric state; dynamics will differ between the monomeric and dimeric states. In addition, the reported reduction of the catalytic activity was mainly due to the increase of Km, which became twice upon the mutation while Kcat decrease only a little [29]. Dominant effects of Km suggest that the P108S mutation reduced the enzymatic activity by interfering substrate binding rather than preventing release of the products, in agreement with our results that the loop region at the entrance of the catalytic site stiffened, making the catalytic dyad less accessible.

There have been many studies of drug binding to the main protease of SARS-CoV-2. Considering the differences in flexibility of the loop region at the entrance of the catalytic sites we observed between the WT and P108S mutant, the mechanism of drug binding might be different in the mutant from that of the WT main protease. Therefore, considering the prevalence of the mutation in patients, drug binding to the P108S mutant protease should be studied. It is also worth mentioning that our calculations dealt only with the P108S mutation, and we did not evaluate other mutations, which were predicted to be functionally neutral, found in patients in a previous study [46]. Together with P108S, those mutations may also affect physicochemical properties of the main protease. Further analyses could be done, for instance, by using physicochemical measurements to experimentally characterize the mutant and by using hybrid quantum mechanics/molecular mechanics simula­tions to understand catalytic mechanism of the mutant, as has been done with the WT protease [47].

Our results also suggest the importance of repeating MD simulations and of using μs-scale simulation times. Here, we repeated the same calculations three times with different initial velocities for each system, and the results of our simulations were highly variable even in the same system. Therefore, caution should be taken when conclusions are drawn from single trajectories of MD simulations, even when the results seem plausible. Further, Q-values in our simulations decreased until about 4.0 μs indicating that ns-scale simulations may not provide insight into protein dynamics. We also demonstrated the potential of MD simulations to complement experimental techniques as well as the other computational methods, such as protein design calculations [4851], where effects of mutations are predicted based on static crystal structures with side chain rotamer sampling. Further studies are certainly necessary to quantitively understand relationships between the mutation P108S and physical properties of the main protease, but our findings will immediately inform design of new protease inhibitors.

Acknowledgements

The supercomputing resources in this study were provided in part by the Human Genome Center at the Institute of Medical Science, The University of Tokyo, and by Research Center for Computational Science, Okazaki, Japan. This work was funded in part by the Japan Society for the Promotion of Science (grant numbers JP19H04202 to D.K. and JP16H02420, JP19H05766 and JP20H02531 to K.T.), by the Japan Agency for Medical Research and Development (grant numbers JP20wm0325002s to D.K. and JP18am0101094j, JP18dm0107064h, JP18mk0101081h, JP18fm0208030h, JP18fk0108073h, and JP18ak0101100h to K.T.), and JST CREST (JPMJCR20H8 to K.T).

Conflict of Interests

None declared.

Author Contributions

K.T. conceived the study. D.K. designed and conducted the computational study, analyzed the data, and wrote the manuscript. All the authors approved the final version of the manuscript.

References
 
© 2021 THE BIOPHYSICAL SOCIETY OF JAPAN
feedback
Top