2022 Volume 97 Issue 3 Pages 111-121
We have previously estimated the evolutionary rate (number of substitutions/site/million years) of mitochondrial cytochrome b gene (Cytb) sequences in rodents and moles to be about 0.11 at more recent divergence times of tens of thousands of years, and to decrease rapidly to about 0.03 at more distant divergence times. Because this time dependency is thought to be caused by the removal of mildly deleterious substitutions in later generations, we focused in this study on the abundance of nonsynonymous substitutions. We collected 23 haplogroups of Cytb with signals of late Quaternary population expansion events from rodents and moles and categorized them into three groups for comparison based on predicted expansion start time: 5,000–15,000 years ago (Group I), ca. 53,000 years ago (Group II) and 130,000–230,000 years ago (Group III). We counted the numbers of nonsynonymous and synonymous substitutions in all haplogroups. The rates of nonsynonymous substitutions were lowest in Groups II and III (0.08–0.22), whereas those in Group I varied markedly. We further classified Group I into two subgroups based on high (0.29–0.43) and low (0.09–0.20) nonsynonymous substitution rates, which were likely to be associated with the start of the expansion within 10,000 years and at around 15,000 years ago, respectively. The Group II and III networks had two- or three-step star-shaped structures and tended to exhibit frequent and less frequent nonsynonymous substitutions on exterior and interior branches, respectively. Based on temporal dynamics, nonsynonymous mitochondrial DNA (mtDNA) substitutions in small mammals accounted for at most 40% of all substitutions during the early evolutionary stage and then rapidly declined, dropping to approximately 15%. The results of this study provide a good explanation of the time-dependent trend in the mtDNA evolution rate predicted in previous work.
In animals, mitochondrial DNA (mtDNA) is used to assess demographic histories in recent evolutionary times, such as in the late Quaternary (e.g., the last 150,000 years). Using the correct evolutionary rate of a molecular marker is essential for elucidating the history of a population. The rate of evolution has been calibrated based on fossil evidence, and it is generally difficult to establish calibration points based on fossils in short time frames, although methods based on the analysis of ancient DNA have been reported (Lambert et al., 2002; Martínkovám et al., 2013; Tong et al., 2018). Calibration points can also be set by incorporating biogeographic events to estimate evolutionary rates (Soares et al., 2009), such as rapid population growth (Herman and Searle, 2011; Suzuki et al., 2015) and past migration between islands currently separated by deep sea (Hanazaki et al., 2017). The time dependency of the evolutionary rate has been noted: the mtDNA evolutionary rate is faster when the divergence time is recent and slower as it becomes more distant (Ho et al., 2005, 2011; Soares et al., 2009; Suzuki et al., 2015), as similarly reported for viral genomes (Duchêne et al., 2014; Aiewsakun and Katzourakis, 2016). In animal mtDNA, this trend is more pronounced for branching times within the last 100,000 years (Ho et al., 2005, 2011; Soares et al., 2009; Suzuki et al., 2015). However, the mechanism of the time dependency is not fully understood.
Until now, the possible involvement of several factors concerning the time dependency of the mtDNA evolutionary rate has been suggested, such as natural selection, ancestral polymorphism, calibration error, model designation mistakes and sequencing errors (García-Moreno, 2004; Ho et al., 2011). Although the time-dependent evolutionary rate is considered the result of certain combinations of these influences, the most plausible idea is that purifying selection acts on mildly deleterious substitutions, gradually removing them over a particular evolutionary time scale (Nachman et al., 1994; Rand and Kann, 1996; Rand, 2001; Ho et al., 2005, 2011; Stewart et al., 2008; Endicott et al., 2009; Loogväli et al., 2009; Peterson and Masel, 2009). The mtDNA evolutionary rate in plants is low compared to animals, largely because animals do not have efficient DNA replication repair enzymes to replicate mtDNA accurately (Wu et al., 2020). Hence, in animals, it is reasonable to think that a particular portion of newly introduced substitutions is mildly deleterious and would be subjected to removal during subsequent generations. In protein-coding mtDNA genes, nonsynonymous substitutions are the best candidates for weakly deleterious substitutions (Nachman et al., 1994; Rand and Kann, 1996; Rand, 2001; Ho et al., 2005, 2011; Stewart et al., 2008; Endicott et al., 2009; Loogväli et al., 2009; Peterson and Masel, 2009). In Mus musculus, the rate of mtDNA substitution has been compared among laboratory strains that have been established during the last hundred years, revealing a relatively fast rate and a higher proportion of nonsynonymous substitutions than in their wild relatives (Goios et al., 2007). It is therefore important to focus on the evolutionary dynamics of nonsynonymous substitutions to explore the factors responsible for the time dependency.
An effective way to assess the dynamics of nonsynonymous substitutions is to trace the temporal dynamics of nonsynonymous substitutions in different time frames; the validity of this approach has been shown in viral genome research (Duchêne et al., 2014). In this context, clusters of haplotypes (haplogroups) generated by rapid expansion events are worthy of attention. Haplogroups integrate substitutions that occurred over a period of time and are useful for understanding the status of substitutions within a given time frame. Comparing such clusters that arose at different times is helpful for understanding the dynamics. In our previous studies on wild rodents and moles (Suzuki et al., 2015; Hanazaki et al., 2017; Honda et al., 2019; Nakamoto et al., 2021), we revealed several Cytb haplogroups exhibiting a rapid expansion event signature. These haplogroups can be classified into three groups based on the estimated expansion start time, i.e., post-last glacial maximum (LGM) (< 15,000 years ago), early marine isotope stage (MIS) 3 (53,000 years ago) and post-penultimate glacial maximum (PGM) (130,000 years ago) events. In addition, the post-LGM is characterized by two outstanding events: the post-Younger Dryas (YD) (11,500 years ago) and the post-last glaciation event (15,000 years ago). These in turn allow us to calculate evolutionary rates, based on the modal (peak) values of a unimodal mismatch distribution, defined as τ, which is a statistical value proportional to the time since expansion. The resulting evolutionary rate is 0.11 substitutions/site/million years (myr) for divergence within the past 10,000 years, followed by a sharp decrease to approximately 0.03 substitutions/site/myr for divergence that occurred more than 130,000 years ago (Suzuki et al., 2015; Hanazaki et al., 2017; Honda et al., 2019; Suzuki, 2021). To trace the dynamics of nonsynonymous substitutions, it is also reasonable to address those lying on the younger and older branches of the haplogroup network.
Several haplogroups exhibiting rapid expansion signals have been detected in wood mice (Apodemus) and voles (Myodes) inhabiting the Japanese Archipelago (Suzuki et al., 2015; Hanazaki et al., 2017; Honda et al., 2019). It is also possible to obtain sequence data showing rapid expansion signals in other Apodemus species from Europe (Michaux et al., 2003). This demographic trend was confirmed in distantly related taxa, namely moles (Mogera) from Japan (Nakamoto et al., 2021). In addition, other rodent data are related to more recent rapid expansion events, namely those of the house mouse (Mus musculus), covering expansion events over the last several thousand years (Rajabi-Maham et al., 2008; Suzuki et al., 2013; Li et al., 2021). In the house mouse, an evolutionary rate of 0.4 substitutions/site/myr has been proposed to explain the mtDNA diversity of the current eastern European subspecies, M. m. domesticus, which evolved with the development of early agriculture in Europe (Rajabi-Maham et al., 2008). Furthermore, an evolutionary rate of 0.4 substitutions/site/myr has been proposed for the field vole (Microtus) by assigning the mtDNA diversity of the present species to post-YD warming (Herman and Searle, 2011). Such a high evolutionary rate is very different from the aforementioned rate of 0.11/site/myr (Suzuki et al., 2015; Hanazaki et al., 2017; Honda et al., 2019; Li et al., 2021), and it is necessary to settle this controversy.
Here, we compared Cytb haplogroup data of rodents (Apodemus, Microtus, Myodes and Mus) and moles (Mogera) with rapid population expansion signals, dividing them into three groups according to the assumed timing of the expansion events. We tracked changes in the proportion of nonsynonymous substitutions over time. In addition, we addressed substitutions occurring on younger and older branches of the haplogroup networks. From these analyses, we propose that the temporal dynamics of nonsynonymous substitutions reflect their nature as mildly deleterious substitutions, which facilitates our understanding of evolutionary trends in the time-dependent mtDNA substitution rate in animals.
We examined 23 haplogroups of the mitochondrial cytochrome b gene (Cytb, 948–1140 bp) including 13 from murines (A. argenteus, A. speciosus, A. sylvaticus and M. musculus) and voles (Microtus agrestis, Myodes smithii, My. rufocanus and My. rutilus) (haplogroups a–m, Table 1) and 10 from moles (Mogera imaizumii and Mo. wogura) (haplogroups 1–10, Table 2). Mismatch distribution analysis was performed for rodents (Supplementary Fig. S1) and moles (Supplementary Fig. S2) with Arlequin v3.5.1.2 (Excoffier and Lischer, 2010), providing modes of pairwise nucleotide differences (τ values). We categorized the 23 haplogroups into three groups for convenience, according to expansion start times: post-LGM (Group I), early MIS 3 (Group II) and post-PGM (Group III) (Suzuki et al., 2015; Hanazaki et al., 2017; Honda et al., 2019; Nakamoto et al., 2021; Suzuki, 2021), in which the range of the τ values for each group was as follows: Group I, τ < 4.1; Group II, 5 < τ < 6.3; and Group III, τ > 7.2. Haplogroup m of A. sylvaticus from Sicily, which had an estimated expansion start time of 230,000 years ago (Suzuki, 2021), was integrated into Group III.
Group | Species | Main geographic region (Haplogroup) | n | Length (bp) | π | τ (95% confidence interval)a | Tb (Years ago) | Number of substitutions | pNc | ||
---|---|---|---|---|---|---|---|---|---|---|---|
N | S | ||||||||||
Group I | |||||||||||
a. | Mus musculus musculus | Korea and Japan (MUS-1c)d | 63 | 1140 | 0.00092 | 1.17 (0.70–1.75)**,** | 4665 | 6 | 13 | 0.32 | |
b. | Mus musculus castaneus | East Asia (CAS-1)d | 51 | 1140 | 0.00171 | 1.94 (1.22–2.40)**,** | 7735 | 9 | 13 | 0.41 | |
c. | Apodemus argenteus | Hokkaido, Japane | 72 | 1140 | 0.00212 | 2.55 (2.05–3.03)**,** | 10167 | 18 | 45 | 0.29 | |
d. | Apodemus speciosus | Hokkaido, Japan (IIb-A)e | 30 | 1140 | 0.00257 | 3.11 (2.05–4.02)*,** | 12400 | 3 | 24 | 0.11 | |
e. | Myodes rutilus | Hokkaido, Japanf | 35 | 1140 | 0.00332 | 4.08 (2.06–5.02)*,** | 16268 | 4 | 28 | 0.13 | |
Group II | |||||||||||
f. | Mus musculus domesticus | Western Europe (DOM)d,g | 107 | 958 | 0.00482 | 5.01 (4.26–5.49)**,** | 55635 | 19 | 74 | 0.20 | |
g. | Myodes smithii | Western Honshu, Japanf | 13 | 1140 | 0.00467 | 5.82 (4.15–7.46)*,** | 54311 | 5 | 22 | 0.19 | |
h. | Myodes smithii | Central Honshu, Japanf | 33 | 1140 | 0.00508 | 5.9 (3.43–7.32)*,** | 55058 | 4 | 46 | 0.08 | |
i. | Myodes rufocanus | Hokkaido, Japanf | 59 | 1140 | 0.00499 | 6.28 (3.18–7.24)*,** | 58604 | 11 | 61 | 0.15 | |
Group III | |||||||||||
j. | Apodemus sylvaticus | Italy and Balkansh | 16 | 948 | 0.00672 | 7.23 (4.16–9.52)**,** | 136189 | 4 | 41 | 0.09 | |
k. | Microtus agrestisj | Northern Britaini | 81 | 1140 | 0.00657 | 7.35 (6.92–8.53)i*,** | 115132 | 23 | 125 | 0.16 | |
l. | Apodemus speciosus | Honshu, Japane | 23 | 1140 | 0.00638 | 7.84 (5.34–11.50)*,** | 122807 | 12 | 69 | 0.15 | |
m. | Apodemus sylvaticus | Sicily, Italyh | 11 | 948 | 0.0118 | 12.32 (8.51–14.14)–,* | 232068 | 5 | 38 | 0.12 |
Group | Species | Main geographic region (Haplogroup)a | n | Length (bp) | π | τ (95% confidence interval)b | Tc (Years ago) | Number of substitutions | pNd | ||
---|---|---|---|---|---|---|---|---|---|---|---|
N | S | ||||||||||
Group I | |||||||||||
1. | Mogera wogura | Tokai and Kansai Districts (Mwo-Ia-1) | 12 | 1140 | 0.00106 | 1.32 (0.20–2.28)–,** | 5263 | 1 | 4 | 0.20 | |
2. | Mogera wogura | Kyushu and westernmost Honshu (Mwo-IIIb) | 14 | 1140 | 0.00164 | 2.14 (1.19–3.25)–,** | 8533 | 3 | 6 | 0.33 | |
3. | Mogera wogura | Chugoku District (Mwo-IIa) | 48 | 1140 | 0.00221 | 2.55 (1.25–3.55)**,** | 10167 | 4 | 19 | 0.17 | |
4. | Mogera imaizumii | Northern Honshu (Mim-Ia-1) | 28 | 1140 | 0.00197 | 2.69 (1.38–3.81)–,** | 10726 | 1 | 10 | 0.09 | |
5. | Mogera robusta | Korea and Primorye, Russia (Mwo-IV) | 14 | 1140 | 0.00262 | 2.89 (1.49–3.85)*,** | 11523 | 5 | 11 | 0.31 | |
6. | Mogera wogura | Kyushu (Mwo-IIIa-1) | 12 | 1140 | 0.00241 | 3.08 (1.01–4.68)*,** | 12281 | 6 | 8 | 0.43 | |
7. | Mogera wogura | Kyushu and nearby islands (Mwo-IIIa) | 15 | 1140 | 0.00324 | 3.32 (1.69–5.80)**,** | 13238 | 9 | 15 | 0.38 | |
8. | Mogera imaizumii | Northern Honshu (Mim-Ia) | 36 | 1140 | 0.00267 | 3.51 (1.74−4.96)–,* | 13995 | 3 | 19 | 0.14 | |
Group II | |||||||||||
9. | Mogera wogura | Tokai and Kansai Districts (Mwo-Ia) | 20 | 1140 | 0.00295 | 5.48 (2.81–8.59)–,* | 51138 | 4 | 15 | 0.21 | |
Group III | |||||||||||
10. | Mogera wogura | Kyushu and westernmost Honshu (Mwo-III) | 47 | 1140 | 0.00646 | 8.61 (5.67–10.51)–,** | 130218 | 14 | 50 | 0.22 |
We constructed median-joining (MJ) networks for the rodent haplogroups (Fig. 1) and obtained networks from a previous study on moles (Fig. 2 in Nakamoto et al., 2021). There were differences in patterns among the haplogroup categories, ranging from simple (e.g., haplogroup MUS-1c in Fig. 1A) to complicated, particularly those of Group III. We performed mismatch distribution analysis for rodents (Supplementary Fig. S1) and moles (Supplementary Fig. S2), showing modes of τ values, which were grouped into three groups (Groups I, II and III). We counted the numbers of nonsynonymous and synonymous substitutions by plotting them on the MJ networks. The proportions of nonsynonymous substitution values (pN) were plotted against τ (Fig. 3) and the expansion start time (Supplementary Fig. S3) on scatterplots. In the rodent plot (Fig. 3A, Table 1), pN values for older expansion events (Groups II and III) were relatively low (0.08–0.20), whereas those for more recent events (Group I) varied from 0.11 to 0.41, demonstrating that pN tended to decrease with τ. Similarly, the proportions in moles were relatively low in Groups II (0.21) and III (0.22) and varied from 0.09 to 0.43 in Group I (Fig. 3B, Table 2). Overall, the proportions in the two taxonomic groups tended to decrease from approximately 0.4 to 0.15, with a decrease in evolutionary rate from younger to older divergences.
Median-joining networks constructed from Cytb sequences (~1140 bp). The networks of 13 haplogroups (see Table 1 for details) represent three rodent groups: the house mouse Mus musculus (A, B, F), voles of the genera Myodes (E, G, H, I) and Microtus (K), and wood mice of the genus Apodemus (C, D, J, L, M). The sizes of the circles correspond to the number of sequences featuring the same haplotype. Bars on branches indicate the number of substitutions between haplotypes. Substitutions predicted to be redundant for calculating the nonsynonymous substitution rate are marked with asterisks. Thick red bars indicate the positions (in bp) of nonsynonymous substitutions. In the network of Mus musculus castaneus (B), the subcluster branching out from the main cluster (CAS-1a; Suzuki et al., 2013) is shaded. In the network of M. m. domesticus haplogroups (HG; Fornuskova et al., 2014) (F), HG-E and HG-F, for which evidence of expansion was observed, are shaded.
Median-joining networks constructed from Cytb sequences (1140 bp). The networks represent the 10 mole haplogroups analyzed in this study (see Table 2 for details). See also Nakamoto et al. (2021) for details of the clustering patterns and codes of the clusters analyzed. Substitutions predicted to be redundant for calculating the rates of nonsynonymous substitutions are indicated with asterisks. The sizes of the circles correspond to the number of sequences featuring the same genotype. Bars on branches indicate the number of substitutions between genotypes. Thick red bars indicate the positions (in bp) of nonsynonymous substitutions.
Correlation between τ value and the proportion of nonsynonymous substitution (pN) of mtDNA sequences. We categorized 13 rodent (A) and 10 mole (B) haplogroups into three groups (I, II and III) based on their τ values for the Cytb gene sequence (see Tables 1 and 2 for details). In these plots we used τ values normalized to a length of 1140 bp in haplogroups f, j and m. Assignment of nonsynonymous and synonymous substitutions responsible for the time-dependent evolutionary rates of rodent mtDNA observed in Groups I, II and III (C). The time-dependent evolutionary rates have been assessed in previous studies: 0.11, 0.047 and 0.028 substitutions/site/myr for Groups I, II and III, respectively (Honda et al., 2019; Suzuki, 2021). Setting possible representative values of pN, partitioned evolutionary rates of non synonymous (Nμ) and synonymous (Sμ) substitutions were calculated for each group of haplogroups (see text for details).
We detected a trend of multiple occurrences of nonsynonymous substitutions at the same site. Three independent alanine-to-threonine substitutions at the same site (site 67) in the nucleotide sequence were revealed in the Mi. agrestis network (Fig. 1K). Substitutions at site 67 were visible in other haplogroups of both groups of rodents (MUS-1c, CAS-1, My. rutilus, My. smithii and My. rufocanus) and moles (Mwo-I and Mwo-IV). In moles, synonymous substitutions at Site 1001 were found in four haplogroups: Mwo-I, Mwo-II, Mwo-III and Mwo-IV (Fig. 2). For substitutions with multiple occurrences, a portion tended to be present in the interior branches.
To examine the trend of pN values with respect to new and old substitutions, we counted the numbers of both types of substitution on the exterior and interior branches in the four haplogroups belonging to Groups II and III and which had more nonsynonymous substitutions (N > 9). The results revealed a trend of relatively frequent nonsynonymous substitutions on the exterior branches directly connecting terminal haplotypes and less frequent nonsynonymous substitutions on interior branches (Supplementary Fig. S4). We performed a G-test and found significant and nearly significant trends indicating the frequent appearance of nonsynonymous substitutions in the exterior branches in three haplogroups: M. m. domesticus (DOM), Mi. agrestis (northern Britain) and Mo. wogura (Mwo-III) (Table 3).
Species | Haplogroup (Code)a | Exterior branches | Interior branches | G-test | |||||
---|---|---|---|---|---|---|---|---|---|
N | S | pNb | N | S | pNb | Statistic | P | ||
Mus musculus domesticus | DOM (f) | 18 | 57 | 0.24 | 1 | 17 | 0.06 | 3.79 | 0.05 |
Myodes rufocanus | Hokkaido (i) | 5 | 46 | 0.10 | 6 | 16 | 0.27 | 3.39 | 0.07 |
Microtus agrestis | Northern Britain (k) | 21 | 76 | 0.22 | 2 | 49 | 0.04 | 9.64 | 0.00 |
Mogera wogura | Mwo-III | 14 | 24 | 0.37 | 0 | 26 | 0.00 | 17.23 | 0.00 |
Total | – | 58 | 203 | 0.22 | 9 | 108 | 0.08 | 13.24 | 0.00 |
The primary reason for the time dependency of the evolutionary rate of animal mtDNA has been presumed to be long-term purifying selection acting on mildly deleterious substitutions (Nachman et al., 1994; Rand and Kann, 1996; Rand, 2001; Ho et al., 2005, 2011; Stewart et al., 2008; Endicott et al., 2009; Loogväli et al., 2009; Peterson and Masel, 2009). As nonsynonymous substitutions are the best candidates for substitutions that can be categorized as mildly deleterious in mtDNA protein-coding sequences, we focused on the dynamics of nonsynonymous substitutions by measuring pN using 23 haplogroups exhibiting a genetic signature of rapid expansion. We categorized the haplogroups into Groups I, II and III, which correspond to the post-LGM (the last 15,000 years), early MIS 3 (53,000 years ago) and post-PGM (130,000 years ago) time frames, respectively (Suzuki, 2021). The overall networks, especially those of the haplogroups in Groups II and III, were star-shaped, which can be explained by a primary rapid population expansion, followed by additional expansions in multiple different geographic locations. For example, in Group III, the first expansion occurred after the PGM, and subsequent bottlenecks and further expansions are assumed to have occurred after MIS 4 and the LGM.
The pN was high in recently generated haplogroups (Group I) and low in more ancient haplogroups (Groups II and III). In our analysis, no apparent difference was found between Groups II and III, with both producing pN values of approximately 0.15, whereas Group I produced a higher pN that peaked at 0.43 (Tables 1, 2). In addition, nonsynonymous substitutions were located more frequently on the exterior branches of the networks than on the interior branches (Table 3). These results suggest that a substantial proportion of newly introduced nonsynonymous substitutions can be categorized as mildly deleterious, which would render them more likely to be eliminated during evolution.
The pN values for Group I moles varied drastically and were not tightly associated with the τ values, in contrast with rodents. This may be a result of the small number of substitutions observed in each haplogroup, and some unknown factor may govern the accelerated level of nonsynonymous substitutions (Soares et al., 2009). However, the overall time-dependent pattern of pN values of moles (Fig. 3) and their evolutionary rate (Nakamoto et al., 2021) were similar to those of rodents despite their distant phylogenetic relationship, suggesting that common basic systems shape the dynamics of nonsynonymous substitutions in these two lineages.
A related question is when these mildly deleterious nonsynonymous substitutions were removed. The pNs ranged from 0.09 to 0.43 during the early stage after emergence (Group I) and were significantly higher than those in Groups II and III (0.08–0.22). Thus, Group I was further classified into two subgroups based on high (0.29–0.43) and low (0.09–0.20) rates of nonsynonymous substitutions, with turning points likely associated with the period from 10,000 to 15,000 years after the start of expansion. This finding indicates that a large portion of the nonsynonymous substitutions was rapidly eliminated. Overall, these results are consistent with the time-dependent evolutionary rate curve (Suzuki, 2021). On the other hand, some nonsynonymous substitutions will continue to be slowly eliminated over a long evolutionary time period (Subramanian, 2009). In the present study, nonsynonymous substitutions were also found to be present on older branches (Supplementary Fig. S4), suggesting a substantial difference in the levels of deleteriousness among the nonsynonymous substitutions; those with milder levels may require many millennia for elimination.
The evolutionary rate of mtDNA decreases from 0.11 to 0.028 substitutions/site/myr in the time frame from 10,000 to 130,000 years ago, based on calculation of genetic distance using p-distance. The values of pN at each stage were revealed in the present study, being approximately 0.4 and 0.15 for the new (Group I) and old (Groups II and III) divergences, respectively (Fig. 3C). The contribution of nonsynonymous and synonymous substitutions to the evolutionary rate can be estimated at each time point. The higher evolutionary rate of 0.11 substitutions/site/myr can be divided into two fractions of nonsynonymous (40%) and synonymous substitutions (60%); 0.044 and 0.066 substitutions/site/myr, respectively (Fig. 3C). On the other hand, the low evolutionary rate of 0.028 substitutions/site/myr is considered to be the combination of 0.004 and 0.024 substitutions/site/myr for nonsynonymous (15%) and synonymous (85%) substitutions, respectively. The evolutionary rates of nonsynonymous and synonymous substitutions representing Groups II and III are thus considered to be reduced by 91% and 36%, respectively, compared to Group I. This implies that in addition to most of the nonsynonymous substitutions, approximately one third of the synonymous substitutions are also removed at a later stage after their generation. In other words, a significant amount of the initially generated synonymous substitutions can be considered mildly deleterious. In fact, it has been suggested that some synonymous substitutions are mildly deleterious (Rand, 2001). Further research is needed to understand the evolutionary trends of the mitochondrial synonymous substitutions.
The evolutionary rate of the early stages of mtDNA remains controversial. In the present study, we included the haplogroup of Mi. agrestis from northern Britain in Group III based on the Cytb expansion parameter (τ = 7.35; Table 1). The pN is low (0.16, Table 1), implying an old generation of the haplogroup. The network pattern was complex and appeared to comprise haplotypes that were affected by two or more rapid expansion events at different periods (Fig. 1K). An evolutionary rate of 0.028 substitutions/site/myr gives an estimate of 115,000 years ago for the expansion start time, which is consistent with the ancient origin. By contrast, in a previous study of Mi. agrestis, the expansion start time was assigned to be the abrupt warming period (11,000 years ago) after the end of the Younger Dryas, resulting in the evolutionary rate of ~0.4 substitutions/site/myr (Herman and Searle, 2011). Because the fraction of nonsynonymous substitutions of this haplogroup is low (pN = 0.16), this high evolutionary rate seems to assume that most of the synonymous substitutions will disappear at a later time. A more detailed understanding of the dynamics of synonymous substitutions is required to better understand the time-dependent evolutionary rate.
The present analysis suggests that valuable zoogeographical information can be obtained by analyzing the pN of a given haplogroup. For example, the mole haplogroup Mwo-Ib, which was derived from samples obtained along the east Fuji River, Tokai District (Nakamoto et al., 2021), had high N rates (Fig. 2), suggesting that the haplotypes have diversified relatively recently. High pN values were observed in the house mouse subclusters HG-E and HG-F (Fig. 1F) (Fornuskova et al., 2014), corresponding to the introduction of agriculture in western Europe during the mid-Holocene (6000–8000 years ago) (Shennan et al., 2013; Suzuki, 2021). Thus, it may be possible to use the pN as an indicator to identify a recent population expansion event during the Holocene period when a variety of factors are expected to have been involved, such as climate change, seaward movement, giant volcanic eruptions, the development of agriculture and dairy farming, and anthropogenic migration.
This study provides important information regarding the time-dependent evolutionary rate of animal mtDNA in small mammals. The proportion of nonsynonymous substitutions decreased over time, indicating purifying selection acting on mildly deleterious substitutions during evolution. A higher pN was associated with newly introduced substitutions, such as during the last 10,000 years, whereas extra nonsynonymous substitutions were likely eliminated by 15,000 years after their generation. Hence, our results provide a rationale for the evolutionary rate curve that we have previously proposed for rodents and moles (Suzuki et al., 2015; Hanazaki et al., 2017; Nakamoto et al., 2021; Suzuki, 2021). Some nonsynonymous substitutions remained and were not excluded even over a long time period, suggesting a substantial difference in the levels of deleteriousness among the nonsynonymous substitutions. The removal of nonsynonymous substitutions alone is insufficient to explain the decrease in the evolutionary rate from 0.11 to 0.03 substitutions/site/myr; future studies should address this issue.
We used mtDNA (Cytb) haplogroups exhibiting population expansion signals and sequence data that we had previously analyzed (Suzuki et al., 2015; Hanazaki et al., 2017; Honda et al., 2019; Suzuki, 2021). The dynamics of nonsynonymous substitutions were examined in 13 haplogroups from seven rodent species: Apodemus argenteus from Hokkaido (Suzuki et al., 2015); Apodemus speciosus from Hokkaido, Honshu, Shikoku and Kyushu (Suzuki et al., 2015); Myodes rufocanus from Hokkaido (Honda et al., 2019); Myodes rutilus from Hokkaido (Honda et al., 2019); Myodes smithii from western and central Honshu (Honda et al., 2019); and Mus musculus haplogroups “MUS-1c,” “CAS-1” and “DOM” (Suzuki et al., 2013; Fornuskova et al., 2014) (Table 1). In addition, sequence data for Apodemus sylvaticus from mainland Italy, the Balkans and Sicily (Michaux et al., 2003), and Microtus agrestis from northern Britain (Herman and Searle, 2011), were obtained from the GenBank, European Molecular Biology Laboratory (EMBL) and DNA Data Bank of Japan (DDBJ) databases. We used one sequence per haplotype for each location where A. sylvaticus was collected. We examined 10 haplotypes in the mole Cytb sequence datasets that were associated with significant population expansion signals (Nakamoto et al., 2021).
Demographic analyses and nonsynonymous substitution rate measurementNucleotide diversity (π) was calculated and deviation from the standard neutral model was tested by calculating Tajima’s D and Fu’s Fs, using Arlequin v3.5.1.2 software. Mismatch distribution analysis was performed to test the validity of the sudden expansion model using a parametric bootstrap approach with 1,000 replicates, providing the expansion parameter τ, an estimate of the mode of mismatch distribution. Times (T) for the start of the population expansion were estimated based on the parameter τ (T = τ/2kμ), where k is the length of sequence and μ is the evolutionary rate in years.
In previous studies on Mus, Apodemus and Myodes, critical periods were assigned to those of the abrupt warming of post-LGM (< 15,000 years ago), early MIS 3 (53,000 years ago) and post-PGM (130,000 years ago), in which time-dependent evolutionary rates of 0.11, 0.047 and 0.028 substitutions/site/myr were predicted (Suzuki et al., 2015; Hanazaki et al., 2017; Honda et al., 2019; Nakamoto et al., 2021; Suzuki, 2021). In this study, we grouped the 23 haplogroups into three groups for convenience to assess the dynamics of nucleotide substitutions over time.
Nucleotide sites with nonsynonymous and synonymous substitutions were detected in the aligned mtDNA sequences with MEGA7 software (Kumar et al., 2016). In each of the haplogroups, we constructed a MJ network (Bandelt et al., 1999) with PopART v1.7 software (Leigh and Bryant, 2015) and counted total numbers of nonsynonymous (N) and synonymous (S) substitutions. To specify the locations of nonsynonymous substitution sites in the networks, we obtained location information for each substitution site using Network software version 4.6 (https://www.fluxus-engineering.com). Substitutions on branches inferred to be redundant (e.g., those on “four-cycles” in networks) were not considered when counting N and S. We calculated the proportion, pN, of nonsynonymous substitutions of the total substitutions counted (N + S).
We compared pN of neighboring exterior and interior branches in each of the constructed networks. The branches connected to the highest number of terminal haplotypes in a network were defined as exterior branches and the remaining branches were treated as interior branches. The frequencies of nonsynonymous substitutions were compared statistically between exterior and interior branches in each haplogroup using the G test (Sokal and Rohlf, 1981).
We thank Takashi Noda, Yutaro Suzuki and Kaori Hanazaki for providing valuable comments on an earlier draft of this manuscript. We thank two reviewers and the editorial board for helpful comments. This study was supported by a Japan Society for the Promotion of Science Grant-in-Aid for Scientific Research on Innovative Areas (JS18H05508 to H. S.).