Genes & Genetic Systems
Online ISSN : 1880-5779
Print ISSN : 1341-7568
ISSN-L : 1341-7568
Full papers
Unveiling the expansion of keratin genes in lungfishes: a possible link to terrestrial adaptation
Yuki Kimura, Masato Nikaido

2023 Volume 98 Issue 5 Pages 249-257

ABSTRACT

Keratins are intermediate filament proteins that are important for epidermal strength and protection from desiccation. Keratin genes are highly duplicated and have diversified by forming two major clusters in the genomes of terrestrial vertebrates. The keratin genes of lungfishes, the fish most closely related to tetrapods, have not been studied at the genomic level, despite the importance of lungfishes for understanding terrestrial adaptation. Here, we identified keratin genes in the genomes of two lungfish species and performed syntenic and phylogenetic analyses. Additionally, we identified keratin genes from two gobies and two mudskippers, which inhabit underwater and terrestrial environments, respectively. We found that in lungfishes, keratin genes were duplicated and diversified within two major clusters, similar to but independent of terrestrial vertebrates. By contrast, keratin genes were not notably duplicated in mudskippers. The results indicate that keratin gene duplication occurred repeatedly in lineages close to tetrapods, but not in teleost fish, even in species adapted to terrestrial environments.

INTRODUCTION

The structure of the epidermis changed dramatically during evolution from fish to tetrapods, which live in different environments (water and land, respectively) (Alibardi, 2022). One remarkable difference between fish and tetrapods is the presence of keratinocytes. Although the origin of keratinocytes is unclear, they are found in the epidermis of lungfishes, which are the extant fish most closely related to tetrapods (Smith and Coates, 1937; Alibardi and Joss, 2003). The African and South American lungfishes (Sarcopterygii) can survive dry seasons because they breathe with their lungs and form cocoons in the mud (Sturla et al., 2002). The epidermis of the Australian lungfish (Neoceratodus forsteri) can be labeled with AE2, a monoclonal antibody known as a keratinization marker in mammalian epidermis (Tseng et al., 1984), implying the existence of keratinocytes in lungfishes (Alibardi, 2001; Alibardi and Joss, 2003). Studies have shown that flat epithelial cells arise in the epidermis of the African lungfish (Protopterus annectens) during estivation (Smith and Coates, 1937; Sturla et al., 2002; Heimroth et al., 2018, 2021). These cells were proposed to be similar to those observed in the skin of amphibians (Sturla et al., 2002).

The stratum corneum is the outermost layer of the epithelium in tetrapods. Keratin is the main component of keratinocytes. Keratin proteins form intermediate filaments, which are classified as types I and II. Genes that encode keratin form clusters in tetrapod genomes (Zimek and Weber, 2005; Vandebergh and Bossuyt, 2012; Ehrlich et al., 2019). In teleost fish, the third whole-genome duplication resulted in the loss of the keratin gene cluster (Zimek and Weber, 2005). Recently, analysis of a wide range of fish genomes revealed that keratin gene clusters are present in amphibious non-teleost fish (Kimura and Nikaido, 2021). Furthermore, a relationship between the repertoire of keratin genes and habitat (i.e., underwater or terrestrial environment) in vertebrates has been proposed. For example, specific subclusters of keratin genes have been found to be expressed only during adult life stages in frogs (Suzuki et al., 2017). Certain keratin genes have been lost independently in mammals that have adapted to aquatic environments (Ehrlich et al., 2019). Although the investigation of keratin genes in lungfishes is important to understand vertebrate evolution from fish to tetrapods, no such study has been performed to date. Indeed, research on the keratin genes of P. annectens has been limited to gene expression, and has not covered the repertoire in the genome (Schaffeld et al., 2005, 2007; Kimura and Nikaido, 2021). The large size of the lungfish genome has hampered whole-genome-level analysis. Recently, however, advances in long-read sequencing technology have made it possible to determine large-size genomes, leading to the determination of the whole-genome sequences of N. forsteri and P. annectens (Meyer et al., 2021; Wang et al., 2021).

In addition to lungfishes, we have focused on mudskippers, which are amphibious fish belonging to the goby group of teleost fish (Actinopterygii). Two instances of independent terrestrial adaptation have occurred in mudskippers (Steppan et al., 2022). Periophthalmus magnuspinnatus, a species of mudskipper that is adapted to terrestrial environments, is recognized for its flattened cells that are organized in the outermost layer of the epidermis (Park, 2002; Zhang et al., 2003). Examining the keratin genes of mudskippers may provide insights into convergent evolution and the possible link between keratin gene repertoires and terrestrial adaptation. In this study, we identified keratin genes from the genomes of mudskippers as well as two lungfish species, and clarified their phylogenetic relationship with those of tetrapods. We found that the expansion of keratin genes within two major clusters occurred in lungfishes but not in mudskippers, indicating that the process of terrestrial adaptation differs between lungfishes and mudskippers in terms of keratin genes.

RESULTS

Keratin genes are duplicated in clusters in lungfish genomes

Keratin genes of lungfishes have previously been identified only from mRNA expression data, because of the absence of whole-genome data (Schaffeld et al., 2005; Kimura and Nikaido, 2021). In this study, we identified keratin genes in the recently released genome assemblies of the Australian lungfish (N. forsteri) and the African lungfish (P. annectens) (Fig. 1). The keratin gene clusters of lungfishes were flanked by the genes smarce1 and eif1 (cluster 1) and faim2 and eif4b (cluster 2) (Fig. 1). These syntenic relationships were conserved among lungfishes, tetrapods, coelacanth, reedfish and gar (Fig. 1), but not in teleost fish, which diverged after the third round of whole-genome duplication (Supplementary Table S1). Exceptions include krt44.5 of the Australian lungfish and krt48.2 of the African lungfish, which are located further from the clusters.

Fig. 1. Comparison of keratin gene clusters in vertebrates, including reedfish (Erpetoichthys calabaricus), spotted gar (Lepisosteus oculatus), coelacanth (Latimeria chalumnae), Australian lungfish (Neoceratodus forsteri), African lungfish (Protopterus annectens), a caecilian (Rhinatrema bivittatum), clawed frog (Xenopus tropicalis) and human (Homo sapiens). Non-keratin genes are indicated by black triangles, whereas type I and type II keratin genes are indicated by turquoise and pink, respectively. Genomic scaffolds or chromosomes are represented by black lines, and non-keratin orthologous genes between species are connected by blue dotted lines. Chr.3* of reedfish is unlocalized scaffolds of chromosome 3. The definition of hard keratin follows previous research (Ehrlich et al., 2020).

The distances between keratin genes in the clusters were much longer in lungfishes (Australian lungfish: up to 1.4 Mbp; African lungfish: up to 2.3 Mbp) than in humans (up to 0.4 Mbp) (Supplementary Table S1). The number of keratin genes in lungfishes was substantially higher than that in the coelacanth, their close relative in Sarcopterygii. The number of keratin genes was also higher in the African lungfish (33) than in the Australian lungfish (22) (Fig. 1, Fig. 2). A common feature of the two lungfishes was that krt18 has duplicated and is located at the end of cluster 2 (Fig. 1). The krt48 genes are conserved from cartilaginous fish to caecilians (Kimura and Nikaido, 2021), including lungfishes; however, the syntenic relationship between krt48 genes in terms of flanking genes was not conserved in all species (Fig. 1). In particular, one of the krt48 genes of the African lungfish is located on the same chromosome as cluster 2 but is not close to the cluster, being separated from it by 450 Mbp.

Fig. 2. Summary of the number of keratin genes. Colors in the graph are as follows: cluster 1 (pink) and cluster 2 (turquoise), subcluster 3 (green), other type I keratin (dark red), and other type II keratin (blue). The arrow indicates that the third round of whole-genome duplication (WGD) occurred in the common ancestor of teleost fish. The estimated keratin gene duplication events are indicated by the three green lines.

Remarkably, the number of type II keratin genes in cluster 2 in lungfishes was higher than that in other fish (Fig. 1, Fig. 2). The total number of keratin genes in the African lungfish was comparable to that in the caecilian species Rhinatrema bivittatum (Fig. 2).

Phylogeny of keratin genes in sarcopterygians

A study using mRNA sequences of lungfish keratin genes showed that most lungfish keratins form a single monophyletic clade (Schaffeld et al., 2005). However, that study may have overlooked genes that are not expressed in particular tissues or at certain developmental times; therefore, whole-genome analyses are necessary. In the present study, we constructed a phylogenetic tree of keratin genes identified from the whole genomes of sarcopterygians, including coelacanth, lungfishes and tetrapods (Fig. 3). krt18, krt8 and krt80 were conserved among all sarcopterygians, except for the Australian lungfish. Indeed, our homology search did not find krt8 in the genome of the Australian lungfish, although the gene has been found in all vertebrates investigated to date, adjacent to krt18 and in the opposite orientation (Zimek and Weber, 2005; Vandebergh and Bossuyt, 2012; Ehrlich et al., 2019; Kimura and Nikaido, 2021). Thus, the absence of krt8 only in the Australian lungfish is evolutionarily implausible and probably reflects an error in the genome assembly.

Fig. 3. A phylogenetic tree of keratin genes in Sarcopterygii. The species included in the tree are as follows: coelacanth (Latimeria chalumnae; L.cha, dark blue), Australian lungfish (Neoceratodus forsteri; N.for, yellowish brown), African lungfish (Protopterus annectens; P.an, green), a caecilian (Rhinatrema bivittatum; R.bivi, pink), clawed frog (Xenopus tropicalis; X.tr, red), and human (Homo sapiens; H.sa, light purple). Three branches for which selection analyses were conducted are colored (see Table 1 for details). The results of selection analyses (dN/dS and P-values; ω and p) are shown near the branches. See main text for letters (A–G) in particular nodes of the tree. Circles on each node indicate bootstrap support. The scale bar indicates the number of amino acid substitutions per site.

We found three clades, named krt44, krt70 and krt100, in which keratin genes were highly duplicated specifically in lungfishes (Fig. 3). The krt44 clade is composed predominantly of keratin genes of the African lungfish (11 in the African lungfish and four in the Australian lungfish), whereas the krt70 clade is predominantly composed of those of the Australian lungfish (Fig. 3). The krt100 clade is composed of almost the same number of keratin genes in both African and Australian lungfishes.

Phylogeny and number of keratin genes in gobies and mudskippers

In addition to the lungfishes, we focused on mudskippers, in which terrestrial adaptation occurred independently of Sarcopterygii. We constructed a phylogenetic tree of keratin genes of two species each of aquatic gobies (Gobiidae) and terrestrial mudskippers (Oxudercidae) (Fig. 4). We included keratin genes of reedfish and zebrafish in the analysis to identify the orthology and timing of gene duplication in gobies and mudskippers during the evolution of Actinopterygii (ray-finned fish). In the krt18, krt8 and krt222 clades, orthologs were identified in most of the species analyzed. In particular, keratin genes orthologous to krt18.1_E.cal (A) and krt18.3_E.cal (B) were conserved among most actinopterygians (Fig. 4). The keratin genes were highly diversified in the clade including krt49 of reedfish; however, extensive duplications were not observed in gobies or mudskippers.

Fig. 4. A phylogenetic tree of keratin genes in Actinopterygii, including gobies and mudskippers. The species included in the tree are as follows: reedfish (Erpetoichthys calabaricus; E.cal, blue), zebrafish (Danio rerio; D.rer, black), gobies (Proterorhinus semilunaris; P.sem, red, and Rhinogobius similis; R.sim, pink) and mudskippers (Boleophthalmus pectinirostris; B.pec, green, and Periophthalmus magnuspinnatus; P.mag, yellow). See main text for letters (A and B) in particular nodes of the tree. Circles on each node indicate bootstrap support. The scale bar indicates the number of amino acid substitutions per site.

The total numbers of keratin genes identified in Actinopterygii were as follows: zebrafish (29), aquatic gobies Proterorhinus semilunaris (22) and Rhinogobius similis (19), and terrestrial mudskippers P. magnuspinnatus (22) and Boleophthalmus pectinirostris (23) (Fig. 2).

Selection analysis

The dN/dS ratio is useful to examine natural selection operating on protein-coding genes. In general, the dN/dS ratio of functional protein-coding genes, which are under purifying selection, is < 1. The dN/dS ratios of genes that are nonfunctional or under relaxed purifying selection are elevated towards 1. A dN/dS ratio > 1 indicates that a gene is under positive selection.
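
As a rough illustration of the thresholds described above, the following Python sketch maps a dN/dS value to the qualitative reading given in the text. The function name and the labels are hypothetical and are not part of the authors' pipeline. A ratio above 1 on its own does not establish positive selection; it still needs a statistical test such as a likelihood ratio test.

```python
def interpret_dnds(omega: float) -> str:
    """Map a dN/dS ratio to the qualitative reading used in the text."""
    if omega < 0:
        raise ValueError("dN/dS cannot be negative")
    if omega > 1:
        return "candidate positive selection"
    if omega == 1:
        return "neutral"
    # Below 1: purifying selection; values near 1 suggest relaxed constraint
    # or loss of function.
    return "purifying selection (relaxed as the ratio approaches 1)"


# dN/dS values reported in this study for the krt44 clade and its background.
for name, omega in [("krt44", 3.8629), ("background", 0.2767)]:
    print(name, interpret_dnds(omega))
```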

In this study, we calculated the dN/dS ratio for the three clades in which keratin genes were highly duplicated (krt44, krt70 and krt100) in lungfishes (Fig. 1 and Fig. 3) to examine the possibility that these genes are under positive selection. The selection analysis is summarized in Fig. 3 and Table 1. The dN/dS ratio of the background was 0.25–0.28, suggesting that most of the keratin genes are under purifying selection. The dN/dS ratio of the krt44 clade (3.86) was significantly higher than that of the background (0.28), suggesting that this clade is under positive selection. The dN/dS ratios of the krt70 (0.37) and krt100 (0.30) clades were not significantly higher than those of the background (0.28 and 0.25, respectively).

Table 1. Results of selection analysis and likelihood ratio tests

Clade    Hypothesis     np    lnL             P-value     dN/dS
krt44    Null           252   −142274.4661    -           0.2767
krt44    Alternative    253   −142269.3358    0.001359    3.8629
krt70    Null           252   −142274.4661    -           0.2767
krt70    Alternative    253   −142274.1833    0.452049    0.3699
krt100   Null           154   −91932.32944    -           0.2486
krt100   Alternative    155   −91932.11602    0.51355     0.3007

np: number of free parameters; lnL: log-likelihood; P-value: P-value of the likelihood ratio test comparing the null and alternative hypotheses.
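
The P-values in Table 1 can be checked from the log-likelihoods. The sketch below assumes a standard likelihood ratio test, in which twice the log-likelihood difference is compared with a chi-square distribution whose degrees of freedom equal the difference in np (here 1). That the authors used exactly this procedure is an assumption; the source states only that likelihood ratio tests were conducted as in the previous study. The helper name `lrt_p_value` is hypothetical.

```python
import math


def lrt_p_value(lnl_null: float, lnl_alt: float) -> tuple[float, float]:
    """Likelihood ratio test for nested models differing by one parameter.

    Returns (test statistic, P-value). For one degree of freedom the
    chi-square survival function reduces to erfc(sqrt(x / 2)).
    """
    statistic = max(2.0 * (lnl_alt - lnl_null), 0.0)
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Log-likelihoods (null, alternative) taken from Table 1.
table1 = {
    "krt44": (-142274.4661, -142269.3358),
    "krt70": (-142274.4661, -142274.1833),
    "krt100": (-91932.32944, -91932.11602),
}

for clade, (lnl_null, lnl_alt) in table1.items():
    statistic, p_value = lrt_p_value(lnl_null, lnl_alt)
    print(f"{clade}: 2*delta lnL = {statistic:.4f}, P = {p_value:.6f}")
```

Under these assumptions the recomputed values should land close to the P-values in Table 1, with only the krt44 comparison falling below the conventional 0.05 threshold.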

DISCUSSION

Lungfish-specific duplicated keratin genes

In this study, we revealed the syntenic and phylogenetic relationships of keratin genes between lungfishes and terrestrial vertebrates identified from whole-genome sequences (Fig. 1, Fig. 3). We showed that the keratin genes of lungfishes were highly duplicated in clusters 1 and 2, which are major clusters in tetrapods (Fig. 1). However, these duplicated genes in lungfishes were not orthologous to tetrapods (Fig. 3). These results suggest that keratin genes diversified independently in lungfishes and tetrapods. In the African lungfish, flattened cells are found in the epidermis during estivation, suggesting a similarity between the flattened cells of lungfishes and keratinocytes of tetrapods (Smith and Coates, 1937; Sturla et al., 2002; Heimroth et al., 2018, 2021). These duplicated and diversified keratin genes are presumably expressed in cells to protect against desiccation. Similarly, flattened cells are observed in the epidermis of mudskippers. The flattened cells of the epidermis mostly comprise a single layer in lungfishes (Heimroth et al., 2018, 2021), while they constitute one to five layers in mudskippers (Park, 2002). Since the epidermis of lungfishes is covered by a cocoon during estivation, the contribution of flattened cells to moisture retention is expected to be smaller than in mudskippers, which do not have a cocoon. Identification of keratin genes that are expressed in these flattened cells will be important in elucidating their functional role in defense against skin drying.

Keratin genes named “amphibian adult epidermal keratin” were expressed in the epidermis of amphibians during the adult stage and formed subclusters in their genomes (Suzuki et al., 2017). These keratin genes were proposed to have diversified in function to adapt to terrestrial environments (Kimura and Nikaido, 2021). Our phylogenetic analysis revealed that the highly duplicated keratin gene krt44 in lungfishes duplicated and diversified independently of amphibian adult epidermal keratin genes, although they are commonly located in cluster 1 (Fig. 1 and Fig. 3). Duplications of keratin genes were observed independently in clades A, B, E and F of humans; C of a caecilian; and D and G of the clawed frog (Fig. 3). These findings suggest that the tandem duplications and diversifications of keratin genes in the clusters occurred repeatedly and independently during the adaptation of vertebrates to terrestrial environments (Fig. 2).

Analysis of prior studies suggests a relaxation of purifying selection in some clades of keratin genes in the reedfish and amphibians, both of which are capable of surviving in terrestrial environments (Fig. 1) (Kimura and Nikaido, 2021). In this study, clade krt44 of lungfishes appears to be under positive selection (Fig. 3), implying that duplicated keratin genes acquire novel functions and/or expression patterns during evolution (Zhang, 2003). Given that keratin gene duplication and diversification occur repeatedly in amphibious fish and tetrapods, it is plausible that the expansion of keratin genes is linked to terrestrial adaptation.

Notably, the distance between keratin genes in the cluster is remarkably extended in lungfishes compared with other species (Supplementary Table S1). Accumulation of transposable elements in the intergenic regions may have led to the extension of the cluster. Indeed, lungfishes have extremely large genomes compared with other vertebrates, which was shown to be due to the accumulation of transposable elements (Meyer et al., 2021; Wang et al., 2021). Despite the drastic extension of the keratin gene cluster in lungfishes, the structure itself is mostly conserved (Fig. 1).

In our previous study, we found that a cysteine residue (Cys401) is conserved in keratins expressed in the epidermis of mammals and of adult amphibians (Kimura and Nikaido, 2021). Cys401 is not conserved in any of the lungfish and mudskipper keratin sequences. It appears to have arisen once in amphibians and later tetrapods.

No apparent link between terrestrial adaptation and diversification of keratin genes in mudskippers

In addition to lungfishes, we have elucidated the phylogenetic relationships of keratin genes by focusing on gobies and mudskippers (Fig. 4) to examine the link between terrestrial adaptation and duplication of keratin genes in the clusters in teleost fish. The two mudskippers used in this analysis, B. pectinirostris and P. magnuspinnatus, have adapted to land independently from aquatic species (Steppan et al., 2022). Although some genes (e.g., krt49.6_B.pec and krt49.7_B.pec) show species-specific duplications, no notable duplications were found in mudskippers. Indeed, the total numbers of keratin genes in mudskippers were lower than that in zebrafish (Fig. 2), suggesting no apparent link between terrestrial adaptation and duplication of keratin genes in teleost fish. Medaka and pufferfish also have fewer keratin genes than zebrafish (Vandebergh and Bossuyt, 2012; Kimura and Nikaido, 2021), implying that species-specific gene duplication increased the number of keratin genes only in zebrafish among teleost fish analyzed so far (Fig. 4). In addition, the major keratin genes in teleost fish (indicated as other types I and II in Fig. 2) were distinct from those of tetrapods in that they did not form clusters. Previous studies have discussed the possibility that the whole-genome duplication and subsequent chromosomal rearrangements which occurred in the common ancestor of teleost fish resulted in scattering of the keratin clusters (Zimek and Weber, 2005; Kimura and Nikaido, 2021). Notably, teleost fish possess a substantial number of keratin genes, although their number is stable throughout evolution. The conserved copy number among teleost fish implies that the keratin genes were under strict purifying selection for survival, which is in contrast with those of tetrapods and their relatives (Kimura and Nikaido, 2021).

CONCLUSION

In this and other studies (Kimura and Nikaido, 2021), our group has revealed remarkable and parallel duplications of keratin genes in tetrapods, lungfishes and reedfish, but not in mudskippers, all of which are adapted to terrestrial environments. A possible link between terrestrial adaptation and duplication of the keratin gene repertoire would thus be applicable to Sarcopterygii (tetrapods and lungfishes) and basal Actinopterygii (reedfish), but not to derived Actinopterygii (mudskippers). These results imply that strategies for terrestrial adaptation in the epidermis of vertebrates vary among species. Further comparative genomic analysis will provide important insights into the evolution of epidermal barrier mechanisms during terrestrial adaptation.

MATERIALS AND METHODS

Keratin gene sequences

The nucleotide sequences of keratin genes of reedfish (Erpetoichthys calabaricus), coelacanth (Latimeria chalumnae), a caecilian (Rhinatrema bivittatum) and clawed frog (Xenopus tropicalis) were taken from a previous study (Kimura and Nikaido, 2021). The keratin genes of zebrafish (Danio rerio), two mudskippers (B. pectinirostris and P. magnuspinnatus), African lungfish (P. annectens) and humans were obtained from NCBI Gene (GRCz11, GCA_000788275.1, GCA_009829125.1, GCA_019279795 and GRCh38.p14). The keratin genes of Australian lungfish (N. forsteri) and two gobies (P. semilunaris and R. similis) were identified from whole-genome data (GCA_016271365.1, GCA_021464625.1 and GCA_019453435.1) using our original software FATE (https://github.com/Hikoyu/FATE) with parameter ‘-p 30 -g genewise -o 120 -v 5 -h tblastn’, with the keratin genes of the clawed frog and the African lungfish as queries. To evaluate whether the genes were intact, these sequences were aligned with known keratin genes using MAFFT online ver. 7 (Katoh et al., 2019) with default settings. Pseudogenes and truncated genes, which were judged by the absence of conserved rod domains, were removed from the analyses. The intact genes were translated into amino acid sequences using Open Reading Frame Finder (https://www.ncbi.nlm.nih.gov/orffinder/) and verified as keratin using SMART BLAST (https://blast.ncbi.nlm.nih.gov/smartblast/smartBlast.cgi).

The nomenclature of keratin genes of lungfishes was based on Xenopus, for which several clades were defined in Xenbase (https://www.xenbase.org). The keratin genes specific to lungfishes were named krt44 and krt100, which do not overlap with any previously named genes. The keratin genes of gobies and mudskippers were named based on the nomenclature of reedfish. The names of keratin genes of zebrafish were based on The Zebrafish Information Network (https://zfin.org). The definition of keratin gene clusters 1 and 2 was based on the synteny relationships with several marker genes: smarce1 or eif1 of cluster 1 and faim2 or eif4b of cluster 2. The definition of subcluster 3 refers to another study (Kimura and Nikaido, 2021).
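
The marker-based cluster definition above can be expressed as a small lookup. The following is a minimal sketch, not the authors' procedure: the function name, the idea of passing a list of neighboring gene names, and the handling of ambiguous cases are all assumptions, since the text does not state how large a genomic neighborhood was inspected.

```python
# Marker genes taken from the text; either marker is sufficient for a cluster.
CLUSTER_MARKERS = {
    "cluster 1": {"smarce1", "eif1"},
    "cluster 2": {"faim2", "eif4b"},
}


def assign_cluster(neighbor_genes):
    """Return the cluster whose marker genes occur among the neighbors.

    neighbor_genes: names of non-keratin genes flanking a keratin locus
    (hypothetical input; how the neighborhood is chosen is left to the caller).
    Returns None when no marker is present.
    """
    names = {gene.lower() for gene in neighbor_genes}
    hits = [cluster for cluster, markers in CLUSTER_MARKERS.items()
            if names & markers]
    if not hits:
        return None
    if len(hits) > 1:
        raise ValueError("markers of both clusters found; inspect manually")
    return hits[0]


print(assign_cluster(["smarce1", "eif1"]))  # cluster 1
print(assign_cluster(["EIF4B"]))            # cluster 2
```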

Phylogenetic analysis

To elucidate the phylogenetic tree of keratin genes of Sarcopterygii, all amino acid sequences of keratin and vimentin (type III intermediate filaments) genes of coelacanth and the Australian lungfish were aligned using MAFFT online ver. 7.520 (Katoh et al., 2019) with default settings. The phylogenetic tree was constructed by IQ-TREE ver. 1.6.12 (Nguyen et al., 2015) with the JTT+F+R7 model, which was selected by ModelFinder (Kalyaanamoorthy et al., 2017). Vimentin genes were used as an outgroup. The reliability of the tree was estimated by 1,000 ultrafast bootstraps and 1,000 SH-aLRT tests (Guindon et al., 2010; Hoang et al., 2018). To elucidate the phylogenetic tree of keratin genes of Actinopterygii, amino acid sequences of keratin genes of all species and vimentin genes of reedfish were used for analyses. Alignments and phylogenetic analyses were performed as described with the JTT+F+R5 model.
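
For readers who wish to reproduce a comparable tree, the settings above correspond roughly to the IQ-TREE 1.x command line assembled below. This is a sketch under stated assumptions: the authors' exact invocation is not given in the text, the alignment file names are hypothetical, and the executable is assumed to be installed as `iqtree`. The options `-s`, `-m`, `-bb` and `-alrt` are IQ-TREE 1.x options for the alignment, the substitution model, ultrafast bootstrap replicates and SH-aLRT replicates, respectively.

```python
def iqtree_command(alignment, model, bootstraps=1000, alrt=1000,
                   executable="iqtree"):
    """Build an IQ-TREE 1.x argument list (the command is not executed here)."""
    return [
        executable,
        "-s", alignment,        # input amino acid alignment (e.g. from MAFFT)
        "-m", model,            # fixed substitution model
        "-bb", str(bootstraps), # ultrafast bootstrap replicates
        "-alrt", str(alrt),     # SH-aLRT replicates
    ]


# Hypothetical file names; the models are those reported in the text.
sarcopterygii = iqtree_command("sarcopterygii_keratin.aln.fasta", "JTT+F+R7")
actinopterygii = iqtree_command("actinopterygii_keratin.aln.fasta", "JTT+F+R5")
print(" ".join(sarcopterygii))
print(" ".join(actinopterygii))
```

The argument list can be passed to `subprocess.run(command, check=True)` on a machine where IQ-TREE is available. The outgroup (vimentin) is then typically applied when rooting the resulting tree in a viewer.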

Selection analysis

We calculated dN/dS values (the ratio of the non-synonymous substitution rate to the synonymous substitution rate) for three clades in which the keratin genes of lungfishes were extensively duplicated, to evaluate the operation of natural selection. The dN/dS values were calculated using the codeml program in PAML (Yang, 2007), and likelihood ratio tests were conducted, as described (Kimura and Nikaido, 2021).

ACKNOWLEDGMENTS

We thank Yujiro Kawabe for the animal illustration. We also thank Zicong Zhang of Kyoto University for technical assistance during selection analyses. This work was supported by Grant-in-Aid for JSPS Fellows Grant Number 21J21544.

© 2023 The Author(s).

This is an open access article distributed under the terms of the Creative Commons BY 4.0 International (Attribution) License (https://creativecommons.org/licenses/by/4.0/legalcode), which permits the unrestricted distribution, reproduction and use of the article provided the original source and authors are credited.