Signature of positive selection in mitochondrial DNA in Cetartiodactyla

Acceleration of the amino acid substitution rate is a good indicator of positive selection in adaptive evolutionary changes of functional genes. Genomic infor-mation about mammals has become readily available in recent years, as many researchers have attempted to clarify the adaptive evolution of mammals by exam-ining evolutionary rate change based on multiple loci. The order Cetartiodactyla (Artiodactyla and Cetacea) is one of the most diverse orders of mammals. Species in this order are found throughout all continents and seas, except Antarctica, and they exhibit wide variation in morphology and habitat. Here, we focused on the metabolism-related genes of mitochondrial DNA (mtDNA) in species of the order Cetartiodactyla using 191 mtDNA sequences available in databases. Based on comparisons of the dN / dS ratio ( ω ) in 12 protein-coding genes, ATP8 was shown to have a higher ω value ( ω = 0.247) throughout Cetartiodactyla than the other 11 genes ( ω < 0.05). In a branch-site analysis of ATP8 sequences, a markedly higher ω value of 0.801 was observed in the ancestral lineage of the clade of Cetacea, which is indicative of adaptive evolution. Through efforts to detect positively selected amino acids, codon positions 52 and 54 of ATP8 were shown to have experienced positive selective pressure during the course of evolution; multiple substitutions have occurred at these sites throughout the cetacean lineage. At position 52, glutamic acid was replaced with asparagine, and, at position 54, lysine was replaced with non-charged amino acids. These sites are conserved in most Artiodactyla. These results imply that the ancestor of cetaceans underwent accelerated amino acid changes in ATP8 and replacements at codons 52 and 54, which adjusted metabolism to adapt to the marine environment.


INTRODUCTION
Identification of genetic changes leading to morphological and physiological adaptations is one of the central goals of evolutionary biology. An accelerated rate of amino acid substitutions in a particular gene during a particular evolutionary period is evidence that the gene plays an important role for adaptive evolution in a given species (Messier and Stewart, 1997). An accelerated rate of amino acid substitutions in protein-coding genes can be assessed by the dN/dS ratio (ω ), which is the ratio of the nonsynonymous substitution rate (dN) to the synonymous substitution rate (dS) (Yang, 1998;Yang and Nielsen, 1998). The ω value is used as an indicator of selective pressure acting on the protein-coding genes; values of ω < 1, ω = 1, and ω > 1 indicate purifying selection, neutral evolution, and positive selection, respectively (Yang, 2007). Recent studies in mammals have attempted to identify genes that contributed to adaptive evolution in particular species by comparing the substitution rates in multiple loci. For example, Chikina et al. (2016) detected accelerated substitution rates in genes involved in sensory systems, structure and metabolism in marine mammals, providing clues to better understand their adaptive evolution to a particular environment.
Natural selection may not operate over an entire coding region but rather at specific amino acid sites that are essential for changing the function of a given gene. Therefore, in the process of elucidating the genes that are involved in adaptive evolution, care should be taken to identify specific amino acid substitutions that effect functional changes in the genes (positively selected sites). Most studies only discuss acceleration of the substitution rate, although some have detected positive selection and positively selected sites (Finch et al., 2014;Tian et al., 2016).
Sequences of mitochondrial genomes, which are circular molecules of 14,000-20,000 bp in animals (Kolesnikov and Gerasimov, 2012), are now available in databases for many organisms. Mitochondrial DNA (mtDNA) in mammals encodes 13 proteins that constitute oxidative phosphorylation (OXPHOS) complexes: ND1-4, 4L, 5 and 6 of complex I (NADH dehydrogenase); cytochrome b (Cytb) of complex III (bc1 complex); COX1-3 of complex IV (cytochrome c oxidase); and ATP6 and ATP8 of complex V (ATP synthase) (Wallace, 2007). Most cellular energy is produced through the OXPHOS pathway, which takes place in the protein complexes embedded in the inner mitochondrial membrane (Saraste, 1999). Mutations of mtDNA greatly affect the metabolic activity of organisms, suggesting that mtDNA plays an important role in morphological evolution and environmental adaptation. Purifying selection is known to be one of the dominant forces of evolution on mitochondrial OXPHOS genes (Tomasco and Lessa, 2011). On the other hand, evidence of adaptive evolution acting on mtDNA has already been detected in previous reports. For instance, da Fonseca et al. (2008) carried out comparative sequence analysis on protein-coding genes of mitochondrial genomes across 41 mammal species and detected a number of substitutions that alter the biochemical properties of functional sites of specific proteins. Because mammals adapt to different habitats by changing their metabolic processes, an accelerated substitution rate in the mtDNA may be tightly linked with the adaptive evolution of some metabolic processes.
One of the most diverse orders of mammals, Cetartiodactyla, has 332 extant species grouped into 132 genera (Hassanin et al., 2012). This order includes artiodactyls (ruminants, pigs, peccaries, hippos, camels and llamas) and cetaceans (whales, dolphins and porpoises), which are found throughout all continents and seas, except Antarctica. Within this order, there is large morphological variation and great habitat diversity, as seen with cetaceans. As the ancestor of Cetacea originally lived on land (Gatesy et al., 2013), an adaptation to water with a change in metabolism was required to make the transition to aquatic living (Tomanek, 2014). Other examples are found within the tribes Lamini (belonging to the family Camelidae) and Caprini (family Bovidae); most species in these tribes live in high-altitude mountains and have adapted to low levels of oxygen, cold temperature and scarce food supply. Such stresses may promote specific directional evolution of mtDNA (Hassanin et al., 2009). Large variation in body size is also widely observed in this order, which includes very small species such as the family Tragulidae (weighing less than 3 kg) (Rössner, 2007) and large-bodied animals such as Giraffini (adult giraffes weigh in excess of 1,000 kg) (Brown et al., 2007) and whales (some species have a length greater than 10 m) (Ridgway, 1997). In general, difference in body size is strongly related to metabolism (Martin and Palumbit, 1993). Therefore, the order Cetartiodactyla is a good model for inferring the molecular evolution of mtDNA that is associated with environmental adaptation and morphological evolution, which are linked to metabolic processes. A few previous studies have reported adaptive evolution of mtDNA with a focus on specific taxa belonging to Cetartiodactyla. Although Hassanin et al. (2009) showed that the ω of ATPase increased during the evolution of Caprini, which live at high altitudes, they did not demonstrate the statistical significance of this increase. In addition, Caballero et al. (2015) identified that codon site 297 in the ND2 gene in three "river dolphins" (families Pontopoeidae, Lipotidae and Inidae) is under positive selection related to adaptation to the freshwater environment. However, significant positive selection in Cetartiodactyla has not been reported and comprehensive studies of this order are lacking.
In this study, we used ω to identify genes that show an accelerated evolutionary rate and positive selection of mtDNA in Cetartiodactyla. Subsequently, the relationship between molecular evolution of mtDNA and environmental adaptation, such as change of body size and habitat, was elucidated. Amino acid substitutions under positive selection were identified to infer functional changes in genes that correlated with adaptive evolution in specific lineages of Cetartiodactyla.

MATERIALS AND METHODS
Phylogenetic analyses The complete mitochondrial genomes from 210 species described in Hassanin et al. (2009) covering most of the family Cetartiodactyla were retrieved from the NCBI database (Supplementary Table  S1). Additionally, we downloaded two mitochondrial genomes, Orcinus orca and Equus caballus (Xiufeng and Arnason, 1994;Morin et al., 2010). From these sequences, we excluded 21 species for which gene annotations were incomplete and portions of protein-coding sequences were missing. We used 191 well-annotated mitochondrial genomes for further analysis. These whole sequences were aligned using MAFFT with default settings (Katoh and Standley, 2013). A phylogenetic tree was constructed using maximum likelihood (ML) analysis implemented in RAxML v8.2 (Stamatakis, 2014) with the GTR + CAT model. Bootstrap probabilities were computed using 100 replicates. Based on the phylogenetic tree generated from a full-length mtDNA alignment, evolutionary rates were investigated.
Identification of positive selection sites Among the phylogenetic branches of Cetartiodactyla, dN/dS analyses were carried out to infer selective pressure on each of 12 of the 13 protein-coding genes of mtDNA: ND1, ND2, COX1, COX2, ATP8, ATP6, COX3, ND3, ND4L, ND4, ND6 and Cytb. Because we identified many inappropriate annotations in the ND5 genes, such as the insertion of termination codons, we excluded this gene from analysis. The 191 sequences of each gene were aligned using the CLUSTALW algorithm with default settings in MEGA software (Thompson et al., 1994;Kumar et al., 2016). The ω values were calculated by the CODEML program implemented in the PAML v. 4.8 package (Yang, 2007). Two types of analyses were conducted to evaluate evolutionary rates by the following models. First, ω was calculated for each gene to identify genes showing a relatively high evolutionary rate. Site models M0 (one ratio model, model = 0, NSsite = 0, fix omega = 0) and M1a (neutral model, model = 0, NSsite = 1, fix omega = 0), which compute a single ω for all branches, were computed. Using model M0, dN and dS values for each branch were calculated assuming an identical ω among all branches. Model M1a was also computed using an identical ω among all branches under two categories: 1) sites under purifying selection (0 < ω < 1) and 2) sites under neutral evolution (ω = 1) . Next, focusing on the accelerated genes, we examined whether these genes were positively selected using the likelihood ratio test (LRT) between the M8 (selection model: model = 0, NSsite = 8, fix omega = 0) and M8a (neutral model: model = 0, NSsite = 8, fix omega = 1) site models (Swanson et al., 2003). Model M8 can detect positive selection for all branches allowing codons to evolve with ω > 1. Model M8a was used to calculate ω values for all branches under two categories: 1) sites under purifying selection (0 < ω < 1) and 2) sites under neutral evolution (ω = 1). The LRT between the M8 and M8a models was conducted to confirm positive selection in the selected genes.
We evaluated the evolutionary rate for each branch on the phylogenetic tree. For genes that showed high ω values in the M0 and M1a model analyses, we conducted ML analyses using the branch-site model (two-ratios model: model = 2, NSsite = 0, fix omega = 0) to calculate ω specific to the lineages. In this study, we focused on the branches that showed body size evolution, adaptation to high-altitude environment and adaptation to the aquatic environment. To infer positive selection on specific branches, further branch-site model analyses were conducted using the MA model (model = 2, NSsite = 2, fix omega = 0) and MA null model (model = 2, NSsite = 2, fix omega = 1) . The LRT between the MA and M1a models was conducted to assign significant increases in evolutionary rates to specific branches (Zhang et al., 2005). LRTs between MA and MA null models were conducted to infer positive selection on targeted branches . In the LRTs, we conducted chi-squared tests with the degrees of freedom set to 1 to examine the statistical significance of differences between the models using a threshold of P < 0.05. Because low dS values and saturation of substitutions violate the dN/ dS estimation, we excluded branches showing dS < 0.001 and dS or dN > 2 for all dN/dS estimations according to the criteria of Villanueva-Cañas et al. (2013).
In addition, amino acid sites under positive selection in the branches that showed significant accelerated evolutionary rates or positive selection were detected using Bayes empirical Bayes (BEB) analysis to identify sites under positive selection with posterior probabilities > 0.80 Tian et al., 2016) in the MA model of the CODEML analysis. As the CODEML model has the potential to be biased and tends to detect positive selection operating on conserved genes (Foote et al., 2011), we analyzed physicochemical changes due to amino acid replacement by TreeSAAP (Woolley et al., 2003), which measures the influence of the amino acid replacements based on 31 structural and biochemical amino acid property changes. Amino acids showing property changes in category-7 and -8 (two categories with radical property changes) with P ≤ 0.001 are considered to be under a strong difference in selective pressure.

RESULTS
Phylogenetic relationships Phylogenetic analyses were conducted using the complete mitochondrial sequences (20,866 bp) of 191 species of Cetartiodactyla (Fig. 1). Most nodes had a high bootstrap value (100%), and the topology was in good accord with previously reported trees (Hassanin et al., 2012).
Accelerated evolutionary rate of ATP8 in Cetartiodactyla The ω for the phylogenetic tree shown in Fig. 1 was calculated using the M0 and M1a models for the dS and dN rates in each of the 12 mitochondrial OXPHOS genes and was, overall, small (ω < 1) across all genes, suggesting the conservative evolution of mitochondrial genes ( Table 1). The ω values across all genes varied greatly, with relatively high values (M0: 0.247, M1a: 0.317) for ATP8 compared to lower values for the remaining genes (M0: ω < 0.05, M1a: ω < 0.10) ( Table 1). In the M0 model, the dN component of ω was also the highest (M0: 0.020) for ATP8 (Fig. 2, Table 1), indicating that the substitution rate in ATP8 is accelerated. However, no significant differences were detected in LRT between M8 and M8a (P = 0.078, Table 1), indicating that positive selection did not occur across all branches of the tree.
Evolutionary rate of ATP8 in each branch of the Cetartiodactyla tree Using the branch-site model (two-ratios model), we calculated ω values for ATP8 at each branch. The interior branch from the split of Hippopotamidae to the last common ancestor of Cetacea (node 205 to 204) showed a higher ω value (ω = 0.801) than the other branches (Fig. 3, Table 2). The LRT between M1a and MA on branches leading to the last common ancestor of Cetacea showed a significant difference (P < 0.05) ( Table 3), indicating that the evolutionary rate of ATP8 was accelerated after the split of Hippopotamidae and before the divergence of Cetacea.
The correlation between evolutionary rates and morphological and/or environmental adaptations in this order showed that the cetacean ancestral branch associated with marine adaptation had a significantly accelerated evolutionary rate (Fig. 3, Table 2). On the other hand, there was no increase in ω at the branch associated with highland adaptation and body size change. For example, the ω values on the branches leading to the Lamini (node 213 to Lgua_NC_011822: ω = 0.142) and the Caprini (node 270 to 271: ω = 0.118) ( Table 2) were lower than the ω across all the branches (M0: ω = 0.247). Likewise, although a drastic increase in body size occurred in some branches (Mysticeti, node 202 to 203; Giraffini, node 222 to 223; Pecora, node 220 to 221) (Hassanin and Douzery, 2003;Mitchell and Skinner, 2003;Gatesy et al., 2013), no significant increase in evolutionary rate related to the change in body size was detected among the branches. Because of the low substitution rates, neither of these branches showed a significant increase in ω values (Fig. 3, Table 2).
Positive selection of ATP8 in the cetacean branch Further analysis to confirm whether positive selection or relaxation of purifying selection caused acceleration of the evolutionary rate in the ATP8 gene at the cetacean ancestral branch was conducted by comparing the results from the neutral branch-site model (MA null) and the branch-site model allowing positive selection (MA) Zhang et al., 2005). A significant difference was detected in the LRT between these models at the cetacean ancestral branch (P < 0.05) ( Table 3). Specifically, positive selection was detected in ATP8 at this branch. There was no evidence of relaxed or positive selection in the descendant branches leading to each cetacean species. Thus, only the cetacean ancestral branch showed positive selection in the ATP8 gene.
Identifying sites under positive selection in ATP8 amino acid sequences Ancestral amino acid sequences at nodes 205 to 204 were calculated by CODEML in the PAML package. CODEML analysis with the MA model detected two substitutions (sites 52 and 54) that were under positive selection at the cetacean ancestral branch (BEB analysis, P > 0.9) ( Table 3).
Two substitutions at sites 52 and 54 were also positively selected in the cetacean ancestral branch based on analysis of the 21 samples, which included one representative sample for almost all families represented by the 191 species. Physicochemical analysis by TreeSAAP was performed on the extracted 21 samples. There were 11 amino acid substitutions in ATP8 (Supplementary Fig.  S1). Significant physicochemical amino acid changes in sequences at the cetacean ancestral branch in ATP8 were also identified by the algorithm implemented in TreeSAAP. At least one radical amino acid change (category-7 and category-8 on a scale of 1-8; at P ≤ 0.001) was detected at sites 52, 54 and 63 ( Supplementary Fig.  S1). Prominent substitutions were found at sites 52 and 54   in cetacean ATP8. Although these sites were highly conserved in the Artiodactyla included in this study, except for cetaceans (Fig. 4), substitutions only occurred at cetacean lineages resulting in different amino acid sequences  (dN) and ω (dN/dS ratio) of ATP8 in each branch. These values were computed by the branch-site model (two ratios model). Node numbers correspond to Fig. 1. Branches for which ω could not be calculated due to extremely low nonsynonymous or synonymous substitution rates are not displayed. dN and ω are represented by gray and black bars, respectively. A list of ω scores is given in Table 2.  in cetaceans. For example, site 52 is glutamic acid and negatively charged in artiodactyl species, but becomes asparagine and is not charged in cetaceans. Likewise, site 54 is conserved lysine and positively charged in Arti-odactyla. However, following multiple substitutions at this site along the cetacean lineage, the site is no longer positively charged and has undergone substitutions to one of the non-charged amino acids (alanine, threonine and methionine) in cetacean species.

DISCUSSION
Acceleration of evolutionary rate in mtDNA of Cetartiodactyla In this study, we found an accelerated dN rate and an increased ω in ATP8 in the order Cetartiondactyla. Previous studies covering wide samplings of mammals also noted a high nucleotide substitution rate in ATP8 (Pesole et al., 1999) and showed a high incidence of radical amino acid property changes per residue (da Fonseca et al., 2008). However, these studies targeted many phylogenetically distant species and results may therefore overlook sensitive changes in the evolutionary mode of this order. This study clearly showed that the substitution rate of the ATP8 gene could have been accelerated during diversification of the order Cetartiodactyla. ATP8 consists of mitochondrial ATP synthase (mtATPase), complex V. MtATPase consists of two functional domains, the membrane-extrinsic F 1 sector and the membrane-bound F 0 sector, and these domains constitute the rotor of mtATPase, which rotates during ATP synthesis/hydrolysis. ATP is produced through phosphorylation of ADP by rotation of F 1 and utilizing the energy of proton transfer through the membrane-embedded channel of F 0 (Jonckheere et al., 2012). Some of the subunits of F 0 form the stator stalk, which prevents futile rotation of mtATPase during ATP synthesis/hydrolysis (Stephens et al., 2003). A stator stalk consists of multiple subunits, and ATP8 is only one part of the integral components of the stator stalk ( Supplementary Fig. S2). Compared to other proteins composing the core subunits of OXPHOS complexes, ATP8 does not appear to have a pivotal function in complex V. Because ATP8 has no direct influence in the function of complex V, the functional restriction of this protein seems to be reduced. The relatively high substitution rate of ATP8 in Cetartiodactyla is thought to be due to the relaxed constraints of the protein, and the limited function of the ATP8 gene would allow accumulation of substitutions.
Although an unusual feature of the ATP8 gene may also be responsible for increased ω , it seems reasonable that acceleration of the evolutionary rate of the gene in Cetartiodactyla is not related to this feature. In the Cetartiodactyla, ATP8 overlaps with the ATP6 gene, and the length of the overlapping region is dependent on species (6 -46 nucleotides). With respect to the reading frame ( + 1) of ATP8, the ATP6 gene starts at the + 3 reading frame (Supplementary Fig. S3). Due to this feature, the dS value of the ATP8 gene is relatively low (Table  1). Because a substitution may be simultaneously syn-onymous and nonsynonymous in overlapping regions, the commonly used methods to calculate dS, dN and dN/dS are not suitable (Wei and Zhang, 2014). However, the length of overlapping regions of ATP8/6 is only up to 46 nucleotides in the datasets generated in this study, and ω of ATP8 excluding the overlapping region was also significantly higher than that of other genes. Thus, the evolutionary rate of the gene seems definitely to be accelerated in Cetartiodactyla.
No branch-specific acceleration or positive selection of ATP8 in Artiodactyla (terrestrial) Although the ω of ATP8 across all branches of the Artiodactyla phylogenetic tree was higher than that of other genes, no association with morphological evolution or environmental adaptation at specific branches was found. An accelerated evolutionary rate would be expected in branches associated with the evolution of increased body size, such as in the branch leading to Giraffini. However, these branches showed relatively low ω in ATP8. Regarding this gene, the increase in body size and acceleration in evolutionary rate were not correlated. Likewise, there was no significant increase of evolutionary rate in the branches related to highland adaptation. For proteincoding genes other than ATP8, analysis only with a model that assumes a single, constant ω among all branches was performed. There was no correlation between the increase in the substitution rate of ATP8 and body size evolution or highland adaptation.
Positive selection of ATP8 in the cetacean ancestral branch may be related to marine adaptation In this study, in addition to detecting positive selection of ATP8 in the cetacean lineage based on ω, it was confirmed by CODEML and TreeSAAP that substitutions at sites 52 and 54 were subjected to positive selective pressure. Amino acid sites 52 and 54 are mostly conserved in the Artiodactyla (glutamic acid, E and lysine, K). ATP8, constituting an integral component of the stator stalk, is related to the assembly of multiple subunits (Hadikusumo et al., 1988;Jonckheere et al., 2012). In particular, positively charged amino acid regions at the C terminus are strongly involved in both the assembly and function of the F 0 sector (Stephens et al., 2003). Residue 54K, which is considered to be under positive selective pressure in this study, is a positively charged amino acid present in the C terminus region. The substitution at 54K in a cetacean ancestor may have caused functional changes to this protein. Although the function of this residue is unknown, a substitution at 52E is also observed in the ancestor of cetaceans and carries a change in charge (glutamic acid, E to asparagine, N). It seems that this substitution is strongly related to the functional change in ATP8. Accumulation of substitutions, and especially substitutions at 52E and 54K, in cetacean ancestors may have contributed to cetacean-specific functional changes in ATP8.
We suspect that these changes are related to marine adaptation in cetaceans. Because the common ancestor of hippos and cetaceans had a change in habitat from land to sea, their ancestors may have been terrestrial or semiaquatic species (Gatesy et al., 2013). Thus, it is expected that physiological adaptation to the aquatic environment occurred in the cetacean ancestors. Because mammals have pulmonary respiration, hypoxia (a reduction in convective oxygen delivery in blood and tissues) is recognized as one of the biggest hurdles to adapting to the aquatic environment, and aerobic metabolism is known to be reduced under hypoxic conditions (Kooyman et al., 1981;Tomanek, 2014). In addition to hypoxia tolerance, brain size expansion and muscle functional changes that promote metabolism under anaerobic conditions compared to terrestrial mammals are characteristic phenotypic changes observed in the evolution of cetaceans (Kooyman et al., 1981;Marino et al., 2007). Since brain and muscle activity is directly related to metabolism and ATP synthesis (Erecinska and Silver, 1989;Korzeniewski, 1998), genes involved in this energy synthesis may be strongly related to marine adaptation in cetaceans. An increase in the mutation rate of several nuclear genes involved in metabolism has already been detected in marine mammals (Foote et al., 2015;Chikina et al., 2016). Positive selection of ATP8 located in mtDNA is involved in ATP synthesis and was detected here, and these results strongly suggest that changes in metabolic activity are important for marine adaptation in cetaceans.
Finally, hypotheses of how functional changes in ATP8 contribute to adaptation to the marine environment are considered. First, due to functional changes of ATP8, the function of the stator stalk to prevent futile rotation of mtATPase was strengthened. In cetaceans, reduction of oxygen consumption is critical because of hypoxic conditions. In the Cetacea, the functional change of ATP8 may have occurred to prevent futile rotation of mtATPase. This change would strongly suppress consumption of oxygen. Second, stator stalk function may have been lost due to the functional change of ATP8. In cetaceans, aerobic respiration becomes inactive due to hypoxia (Tomanek, 2014), making the function of limiting futile rotation of mtATPase less critical. Although it is difficult to elucidate the functional changes of ATP8 in detail based on the results of this study alone, the accumulation of amino acid substitutions, including substitutions at 52E and 54K of ATP8, is strongly suggested to be involved in the evolution of energy synthesis in cetaceans, and may have been utilized for adaptation to the marine environment.

CONCLUSION
In this study, the evolutionary rate of ATP8 was found to be relatively higher than that of other mitochondrial genes in Cetartiodactyla and positive selection was detected in the cetacean ancestral branch. Despite high conservation of mtDNA, ATP8 was exposed to relaxed selective pressures. Accumulation of amino acid substitutions and point mutations (at sites 52 and 54) under positive selective pressure of ATP8 was detected in the cetacean ancestors. These functional changes in cetacean ATP8 are suggested to have made changes to ATP synthesis status possible, and these changes are related to adaptation to the aquatic environment. Further research focusing on amino acid substitution rates in the ATP8 gene across mammalian species will contribute to a better understanding of the molecular evolution of mtDNA linked to adaptive evolution in mammals.