2023 Volume 98 Issue 2 Pages 93-99
Cichlid fishes are textbook examples of explosive speciation and adaptive radiation, providing a great opportunity to understand how the genomic substrate yields extraordinary species diversity. Recently, we performed comparative genomic analyses of three Lake Victoria cichlids to reveal the genomic substrates underlying their rapid speciation and adaptation. We found that long divergent haplotypes derived from large-scale standing genetic variation, which originated before the adaptive radiation of Lake Victoria cichlids, may have contributed to their rapid diversification. In addition, the present study on genomic data from other East African cichlids suggested the reuse of alleles that may have originated in the ancestral lineages of Lake Tanganyika cichlids during cichlid evolution. Therefore, our results highlight that the primary factor that could drive repeated adaptive radiation across East African cichlids was allelic reuse from standing genetic variation to adapt to their own specific environment. In this report, we summarize the main results and discuss the evolutionary mechanisms of cichlids, based on our latest findings.
Cichlid fishes exhibit tremendous species diversity via adaptive radiation, a phenomenon in which many species are rapidly established from a few ancestral lineages by adapting to ecological niches (Kocher, 2004). In East Africa, particularly, several hundred endemic cichlid species that differ in morphology and ecology were rapidly generated 5–10 million years ago in Lake Tanganyika (Ronco et al., 2021), 800,000 years ago in Lake Malawi (Ivory et al., 2016; Malinsky and Salzburger, 2016), and 14,600 years ago in Lake Victoria (Johnson et al., 2000) (Fig. 1). While these cichlids are genetically related in each lake, morphologically similar species have emerged across lakes and rivers, providing a good example of parallel evolution (Kocher et al., 1993; Brawand et al., 2014). Therefore, cichlids are an important model system for understanding evolutionary mechanisms that have enabled extraordinary species diversity (Kocher, 2004; Brawand et al., 2014; Salzburger, 2018; Svardal et al., 2021). In our latest paper (Nakamura et al., 2021), we focused on the genomic substrate that facilitated the explosive adaptive radiation of East African cichlids, particularly in Lake Victoria where more than 500 endemic species emerged only 14,600 years ago.
Fig. 1. Phylogeny of East African cichlids based on the results of Meier et al. (2017) and Ronco et al. (2021). In the present report, we designated a subgroup of Haplochromini, which diverged after the establishment of Tropheini, a tribe that returned to Lake Tanganyika after entering the river, as “modern Haplochromini”. Blue lines represent the lineage of modern haplochromines (sensu Salzburger et al., 2005). The photos show three Lake Victoria cichlids, Haplochromis chilotes, H. sauvagei and Lithochromis rufus, that were analyzed by Nakamura et al. (2021).
Recently, standing genetic variation (SGV), a genetic polymorphism that is already present at a certain frequency in a population, has attracted attention as a key factor driving rapid adaptation and speciation (Hermisson and Pennings, 2005; Barrett and Schluter, 2008; Schluter and Rieseberg, 2022). This is because the adaptive alleles for a particular environment are already present as SGV within a population, allowing the population to adapt rapidly to a novel environment. Many studies have shown that the acquisition of SGV by hybridization occurred before adaptive radiation in various organisms, including cichlids (Marques et al., 2019). A previous study on the adaptive radiation of Lake Victoria cichlids suggested that many SGVs acquired by hybridization between two parental lineages underlay the subsequent explosive diversification (Meier et al., 2017).
It is clear that allelic polymorphisms are shared among different species of Lake Victoria cichlids due to recent adaptive radiation from the ancestral population with SGV. However, in addition to Lake Victoria, the sharing of allelic polymorphisms is also observed across East African lakes and rivers. For example, it was shown that some highly differentiated single nucleotide polymorphisms (SNPs) in Lake Malawi and Lake Victoria cichlids were shared with cichlids in other lakes and rivers (Loh et al., 2013; Brawand et al., 2014). Furthermore, long divergent haplotypes have been reported to be shared across lakes and rivers. A prominent example is V1R2, encoding a pheromone receptor. Lake Victoria cichlids contain two major divergent alleles of V1R2, which differ by 14 amino acids and are tightly linked by recombination suppression (Nikaido et al., 2014). Moreover, the origin of the derived allele was estimated to predate the adaptive radiation of the modern Lake Tanganyika cichlids (Nikaido et al., 2014). The allelic diversity of genes related to sensory perception, including vision and olfaction, is crucial for rapid speciation, because perception directly contributes to reproductive communication (Seehausen et al., 2008; Keller-Costa et al., 2014). Therefore, this finding pointed to the possibility that the divergent alleles that might contribute to the adaptive radiation of Lake Victoria cichlids (14,600 years ago) had built up before the radiation of Lake Tanganyika cichlids (5–10 million years ago). Although more than a dozen genes with divergent alleles derived from SGV have been identified (Nikaido et al., 2014; Meier et al., 2017; Takuno et al., 2019; McGee et al., 2020; Urban et al., 2021), they might be only a small part of the genomic substrate that yielded cichlid radiation. 
Nevertheless, it had remained to be investigated which genes had large-scale SGV, such as the allelic diversity of V1R2, and to what extent large-scale SGV accounted for the genetic factors driving rapid speciation and adaptation of East African cichlids.
Nakamura et al. (2021) described 99 novel candidate genes with divergent alleles derived from SGV through comparative genomic analyses of six males from each of three Lake Victoria cichlids, Haplochromis chilotes, H. sauvagei and Lithochromis rufus, which differ in their diet and habitat. To identify candidate genes with divergent alleles derived from SGV that contributed to their speciation and adaptation, we first explored highly differentiated regions (HDRs) between pairs of species by calculating FST and dXY, which are measures of genetic differentiation, for 10-kb overlapping windows. We then examined the origins of divergent alleles in 304 candidate genes located in HDRs with the top 0.5% FST values by molecular phylogenetic analyses using the assembled genomes of nine East African cichlids. Finally, we identified 99 candidate genes with divergent alleles, whose origins predate the adaptive radiation of Lake Victoria cichlids. The results showed that these genes were located in HDRs with higher dXY values, such as those with the top 0.5% dXY values, and harbored long divergent haplotypes derived from large-scale SGV. However, we did not investigate the origins of divergent alleles in the non-coding regions that regulate gene expression. Therefore, our results highlighted the possibility that hundreds or even thousands of SGVs composed of long haplotypes, not just SNPs, that had originated before the adaptive radiation of Lake Victoria cichlids, drove rapid speciation and adaptation.
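The window-based scan described above can be sketched as follows. This is a minimal illustration, not the pipeline used in Nakamura et al. (2021): it assumes biallelic sites with alleles coded 0/1 per sampled chromosome, uses Hudson's estimator for FST (the estimator is not stated in the text), and takes a hypothetical 5-kb step between overlapping 10-kb windows. The names `site_stats` and `windowed_stats` are illustrative. The per-bp dXY assumes that every position in a window is callable and that invariant sites contribute zero.

```python
from typing import List, Optional, Tuple

# (position, alleles sampled from population 1, alleles sampled from population 2)
Site = Tuple[int, List[int], List[int]]


def site_stats(hap1: List[int], hap2: List[int]) -> Tuple[float, float]:
    """Return (Hudson FST numerator, dXY) for one biallelic site.

    hap1 and hap2 hold 0/1 alleles, one entry per sampled chromosome
    (at least two chromosomes per population are required).
    """
    n1, n2 = len(hap1), len(hap2)
    p1, p2 = sum(hap1) / n1, sum(hap2) / n2
    # dXY: probability that one chromosome drawn from each population differs.
    dxy = p1 * (1 - p2) + p2 * (1 - p1)
    # Hudson numerator: squared frequency difference minus sampling corrections.
    num = (p1 - p2) ** 2 - p1 * (1 - p1) / (n1 - 1) - p2 * (1 - p2) / (n2 - 1)
    return num, dxy


def windowed_stats(
    sites: List[Site], window: int = 10_000, step: int = 5_000
) -> List[Tuple[int, Optional[float], float]]:
    """Return (window_start, FST, dXY per bp) for each overlapping window.

    `step` is an assumed value; the source only says the windows overlap.
    FST is a ratio of sums over the sites in a window (None if undefined).
    """
    if not sites:
        return []
    last = max(pos for pos, _, _ in sites)
    per_site = [(pos, *site_stats(h1, h2)) for pos, h1, h2 in sites]
    out = []
    start = 0
    while start <= last:
        inside = [(n, d) for pos, n, d in per_site if start <= pos < start + window]
        num_sum = sum(n for n, _ in inside)
        dxy_sum = sum(d for _, d in inside)
        fst = num_sum / dxy_sum if dxy_sum > 0 else None
        out.append((start, fst, dxy_sum / window))
        start += step
    return out
```

With six diploid males per species, each population contributes 12 chromosomes per site. Windows could then be ranked by FST or dXY to select the top 0.5%, as in the text.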
Here, we offer interpretations of the evolutionary mechanisms of East African cichlids. First, it is noteworthy that the SGV found among Lake Victoria cichlids in our study was composed of long divergent haplotypes, such as those of V1R2, as well as differentiated SNPs. Two COL6A6 genes located in tandem, COL6A6_a and COL6A6_b, are examples of such large-scale SGV (Fig. 2). Both COL6A6 genes contained long divergent haplotypes that exhibited strong linkage disequilibrium, and the alleles had originated before the adaptive radiation of Lake Victoria cichlids, particularly before that of modern Lake Tanganyika cichlids in the case of COL6A6_a. However, the origin of the allelic diversity may be older than that estimated by Nakamura et al. (2021) because of the limited number of species used. Therefore, to ascertain the patterns of allelic polymorphisms among a wide range of East African cichlids, including all tribes of Lake Tanganyika cichlids, we newly investigated the genetic polymorphisms among 144 individuals of East African cichlids using whole-genome resequencing data from the NCBI database (see Supplementary Table S1 for details), adopting a simple mapping-based approach. We first removed low-quality reads using fastp (Chen et al., 2018) and then mapped reads to the assembled genome of H. chilotes (Nakamura et al., 2021) using bwa-mem (Li, 2013). After variant calling using bcftools (Li, 2011), SNP filtering was performed on each individual. We retained only SNPs with a mapping quality of 25, with a minimum depth of 10 or 5 (in cases where the median depth in the scaffold was below 10) and a maximum depth of twice as much as the top 5% of the site depth in the same scaffold, using vcffilter (Garrison et al., 2022) and vcftools (Danecek et al., 2011). We then investigated the SNP genotypes of the two COL6A6 genes using the option “--012” in vcftools and obtained the SNP allele frequencies in each lineage of cichlids.
Finally, we identified highly differentiated SNPs, in which the differences in allele frequencies of a pair of Lake Victoria cichlids were more than 0.5, and investigated allelic polymorphisms of these SNPs in other East African cichlids.
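The per-scaffold depth and mapping-quality thresholds described above can be sketched as below. This is an illustration under stated assumptions rather than the actual vcffilter/vcftools invocation: "the top 5% of the site depth" is read here as the 95th percentile (nearest-rank), "a mapping quality of 25" is read as a minimum of 25, and the function names `depth_bounds` and `keep_snp` are hypothetical.

```python
import statistics
from typing import List, Tuple


def depth_bounds(site_depths: List[int]) -> Tuple[int, int]:
    """Return (min_depth, max_depth) for one scaffold of one individual.

    min_depth is 10, or 5 when the scaffold's median depth is below 10.
    max_depth is twice the 95th-percentile depth (assumed reading of
    "twice as much as the top 5% of the site depth").
    """
    median = statistics.median(site_depths)
    min_depth = 10 if median >= 10 else 5
    ordered = sorted(site_depths)
    # Nearest-rank 95th percentile, computed with integer arithmetic.
    rank = max(1, (95 * len(ordered) + 99) // 100)
    p95 = ordered[rank - 1]
    return min_depth, 2 * p95


def keep_snp(depth: int, mapq: int, bounds: Tuple[int, int], min_mapq: int = 25) -> bool:
    """Apply the mapping-quality and depth filters to a single SNP call."""
    lo, hi = bounds
    return mapq >= min_mapq and lo <= depth <= hi
```

For example, a well-covered scaffold keeps the stricter minimum depth of 10, whereas a scaffold whose median depth is 4 falls back to a minimum of 5.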
Fig. 2. Allelic polymorphisms of two COL6A6 genes in a wider range of East African cichlids observed using a mapping-based approach. To investigate when the divergent alleles among three Lake Victoria cichlids were established during cichlid evolution, the sites with highly differentiated SNPs (i.e., the differences in allele frequencies between H. chilotes and L. rufus were more than 0.5) were extracted. Sites with missing data for more than three individuals among three species were excluded. (A–B) show the results for COL6A6_a and (C–D) show those for COL6A6_b. (A, C) Genotypes of three Lake Victoria cichlids. Each SNP allele was discriminated based on the differences in allele frequencies between species. Each genotype was then assigned a different color. (B, D) Allele frequencies of each evolutionary lineage of East African cichlids from Lake Tanganyika to Lake Victoria (see Fig. 1 for details). Each tile was assigned a color according to the SNP allele frequencies in each lineage. The approximate positions of the highly differentiated SNPs are indicated at the bottom of (B) and (D).
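The extraction of highly differentiated SNPs described above can be sketched as follows. The sketch assumes genotypes in the encoding written by the vcftools `--012` option (0, 1 or 2 copies of the non-reference allele, with -1 for a missing call); the data layout, the species labels and the function names `alt_freq` and `highly_differentiated` are hypothetical and not taken from the original analysis.

```python
from typing import Dict, List, Optional


def alt_freq(genotypes: List[int]) -> Optional[float]:
    """Non-reference allele frequency from 012-coded diploid genotypes.

    Missing calls (-1) are skipped; returns None if nothing was called.
    """
    called = [g for g in genotypes if g >= 0]
    if not called:
        return None
    return sum(called) / (2 * len(called))


def highly_differentiated(
    sites: Dict[int, Dict[str, List[int]]],
    pop_a: str,
    pop_b: str,
    min_diff: float = 0.5,
    max_missing: int = 3,
) -> List[int]:
    """Return positions whose allele-frequency difference exceeds min_diff.

    sites maps position -> species -> 012 genotypes. A site is dropped when
    more than max_missing individuals across all species lack a call.
    """
    hits = []
    for pos, by_species in sorted(sites.items()):
        missing = sum(g < 0 for gts in by_species.values() for g in gts)
        if missing > max_missing:
            continue
        fa, fb = alt_freq(by_species[pop_a]), alt_freq(by_species[pop_b])
        if fa is None or fb is None:
            continue
        if abs(fa - fb) > min_diff:
            hits.append(pos)
    return hits
```

The same `alt_freq` helper could be applied per lineage to tabulate the allele frequencies shown in panels (B) and (D).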
The results re-emphasized the differences in the evolutionary patterns between the two COL6A6 genes. COL6A6_a had two long divergent haplotypes in the HDRs between H. chilotes and L. rufus in Lake Victoria, consisting of chilotes-type and rufus-type SNP alleles, respectively (Fig. 2A). Interestingly, ancient haplotypes similar to these divergent haplotypes were found to be fixed in each tribe of Lake Tanganyika cichlids (Fig. 2A, 2B). The ancient chilotes-type haplotype was observed in Boulengerochromini, Trematocarini, Bathybatini, Cyphotilapiini, Cyprichromini, Benthochromini and Perissodini (Fig. 2B). In contrast, the ancient rufus-type haplotype was observed in Lamprologini, Limnochromini, Ectodini, Eretmodini and Tropheini (Fig. 2B). This result suggests that the divergent haplotypes had already been built up before the adaptive radiation of Lake Tanganyika cichlids and that a selective sweep occurred in each lineage. Moreover, we found that both alleles already coexisted in Haplochromini, a riverine tribe of Lake Tanganyika cichlids, implying that the allelic diversity has been retained for a long time or that occasional hybridization has occurred between distinct tribes of Lake Tanganyika cichlids. For example, Irisarri et al. (2018) demonstrated the genomic signature of gene flow between Cyphotilapiini and modern haplochromines (sensu Salzburger et al., 2005).
In contrast, for COL6A6_b, chilotes/sauvagei-type and rufus-type haplotypes were observed in Lake Victoria cichlids, whereas only one major haplotype was found in Lake Tanganyika cichlids (Fig. 2C, 2D). However, we found that modern Haplochromini had a higher proportion of allelic polymorphisms than other ancestral cichlids (Fig. 2D). The phylogenetic analysis by Nakamura et al. (2021) showed that coding sequences of H. burtoni, a species in modern Haplochromini, and L. rufus were monophyletic. Taken together, these results suggest that the allelic polymorphisms on which selective sweeps might subsequently act were acquired in modern Haplochromini. Additionally, they imply that the allelic polymorphisms observed in Lake Victoria cichlids existed in the Upper Nile lineage, which is one of the parental lineages (Fig. 2D). Notably, our results indicated that the timing of the acquisition of the genetic resource in modern Haplochromini is inconsistent with that of the establishment of divergent haplotypes after the admixture of the Upper Nile lineage with the Congolese lineage, which could be a sign that the SGV contributed to the rapid adaptation when cichlids were exposed to sudden environmental changes (Fig. 1; Fig. 2D).
However, whether the divergent alleles derived from SGV facilitated the rapid speciation and adaptation of Lake Victoria cichlids is debatable. For example, Guerrero and Hahn (2017) stated that genetic drift acting on SGV that is sieved during speciation (i.e., SGV unequally allocated to diverging populations) can result in a genomic signature similar to that under natural selection. Recent hybridization between long-isolated populations could also result in the establishment of long divergent haplotypes by genetic drift alone. Therefore, it is necessary to verify that parallel evolution via adaptation to similar environments resulted in the sharing of divergent alleles among East African cichlids. Note that long divergent haplotypes among Lake Victoria cichlids, such as those of COL6A6_a and V1R2 (Nikaido et al., 2014), have rarely been broken up by recombination despite a long period of cichlid evolution. One possible scenario is that these large-scale fragments including the beneficial alleles have been reused as bases for rapid adaptation when a population was exposed to a similar environment, resulting in parallel evolution. An alternative scenario is that one divergent haplotype came to predominate in each species by genetic drift because of its high frequency after hybridization. In the latter case, we ask whether each long divergent haplotype has constantly been under selection, leading to the avoidance of collapse by mutation and recombination during a long period of cichlid evolution. For example, recombination between divergent haplotypes after the acquisition of novel functions for adaptation might have reduced fitness owing to an unstable higher-order protein structure.
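The drift-only alternative raised above can be made concrete with a toy simulation. The following is a minimal neutral Wright-Fisher sketch, not an analysis from the source: a single ancestral polymorphism is passed to two isolated daughter populations, and we count how often the two populations end up fixed for alternative alleles without any selection. Population size, number of generations, number of replicates and the function names are all hypothetical choices for illustration.

```python
import random


def drift(p0: float, n_diploid: int, generations: int, rng: random.Random) -> float:
    """Neutral Wright-Fisher drift of one allele in one isolated population.

    Each generation, 2N allele copies are resampled from the current
    frequency. The loop stops early once the allele is lost or fixed.
    """
    p = p0
    copies = 2 * n_diploid
    for _ in range(generations):
        if p in (0.0, 1.0):
            break
        count = sum(rng.random() < p for _ in range(copies))
        p = count / copies
    return p


def sorted_fraction(
    p0: float, n_diploid: int, generations: int, replicates: int, seed: int = 1
) -> float:
    """Fraction of replicate loci at which two daughter populations that
    started from the same ancestral frequency fix alternative alleles."""
    rng = random.Random(seed)
    opposite = 0
    for _ in range(replicates):
        pa = drift(p0, n_diploid, generations, rng)
        pb = drift(p0, n_diploid, generations, rng)
        if {pa, pb} == {0.0, 1.0}:
            opposite += 1
    return opposite / replicates
```

Under these assumptions, a variant segregating at intermediate frequency in the ancestor is expected to become a fixed difference between the daughter populations in a substantial fraction of replicates, which is why highly differentiated sites alone cannot distinguish drift from selection.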
In Nakamura et al. (2021), we did not test whether the mechanism facilitating interspecies differentiation for each SGV was selection or genetic drift. In addition, the functions of many candidate genes with divergent alleles derived from SGV, including the two COL6A6 genes, are still unknown. A recent finding suggested that V1R2-expressing neurons became activated when cichlids were exposed to male urine, implying the contribution of divergent alleles of V1R2 to assortative mating, which led to speciation via prezygotic isolation (Kawamura and Nikaido, 2022). Thus, in the future, it will be crucial to thoroughly test whether the divergent alleles derived from SGV have advantageous effects on the adaptation and speciation of Lake Victoria cichlids by examining their functions via in situ hybridization and gene knockout.
Meier et al. (2017) postulated that the common ancestor of Lake Victoria cichlids obtained a large allelic diversity for subsequent adaptive radiation through hybridization between two distinct parental lineages. However, our analyses showed that divergent haplotypes that had already originated before this hybridization might have contributed to the adaptive radiation of Lake Victoria cichlids. Strikingly, they suggest that some genes with divergent alleles derived from SGV arose before the adaptive radiation of Lake Tanganyika cichlids, including the V1R2 and COL6A6_a genes, implying a large genetic diversity of the ancestors of Lake Tanganyika cichlids. The “Melting-pot Tanganyika” hypothesis proposed by Weiss et al. (2015) also mentions a possible scenario resulting in large genetic variation in the ancestral lineage of Lake Tanganyika cichlids (Fig. 3). Taking these considerations together, we speculate that many long divergent haplotypes among Lake Victoria cichlids were established much earlier than the ancient hybridization shown by Meier et al. (2017), and that the acquisition of the genetic variation predated the formation of the common ancestor of Lake Tanganyika cichlids. Through repeated hybridization among East African cichlids, as indicated by several previous studies (Meier et al., 2017; Meyer et al., 2017; Irisarri et al., 2018; Svardal et al., 2020; Astudillo-Clavijo et al., 2023), these haplotypes might have eventually been inherited by the ancestor of Lake Victoria cichlids while being occasionally reused for adaptation (Fig. 3). This scenario can be extended to the rapid diversification of other cichlids, including Lake Malawi cichlids. Conversely, large-scale SGV observed in recently emerged species, including Lake Malawi and Lake Victoria cichlids, should provide valuable clues for exploring not only recent but also ancient hybridization and rapid adaptation.
Fig. 3. A possible evolutionary scenario under the “Melting-pot Tanganyika” hypothesis proposed by Weiss et al. (2015) with SGV that could repeatedly trigger the adaptive radiation in East African cichlids. The phylogeny is depicted based on the results of Meier et al. (2017) and Ronco et al. (2021), as shown in Fig. 1. Gray arrows indicate possible hybridization between lineages: Cyphotilapiini and modern haplochromines, inferred by Irisarri et al. (2018); modern Haplochromini and the ancestor of Lake Malawi cichlids, inferred by Svardal et al. (2020); and the Upper Nile lineage and the ancestor of Lake Victoria cichlids, inferred by Meier et al. (2017). The patterns in the circles represent different alleles. Hybridization among the distinct precursor lineages yielded large genetic variation in the ancestral lineage of Lake Tanganyika cichlids, leading to rapid speciation and adaptation. The allelic polymorphism, SGV, has been maintained in East African cichlids via occasional hybridization (gray arrows) to the ancestral species in each lake, ultimately contributing to the adaptive radiation.
Furthermore, the reuse of adaptive alleles derived from SGV facilitates not only repeated adaptive radiation but also parallel evolution of East African cichlids (Waters and McCulloch, 2021). This is because when distinct populations (e.g., populations in different lakes) adapt to similar ecological niches, the same adaptive allele may be used for their rapid specialization if it is retained as an SGV within each population, thereby leading to convergent evolution at the genetic and phenotypic levels. For example, large indels were observed to be shared among species that inhabit different lakes but have the same diet, implying repeated allelic reuse for trophic specialization (McGee et al., 2020). Similarly, previous studies revealed that an allele responsible for exhibiting stripe patterns, which is a convergent trait among East African cichlids, has been sorted to Lake Victoria cichlids from SGV during radiation (Kratochwil et al., 2018; Urban et al., 2021). Therefore, the combinations of genes with divergent alleles derived from SGV found in this study may have caused the tremendous species diversity of East African cichlids with complex traits that sometimes converge across lakes and rivers.
In summary, we identified genes with divergent alleles that originated before the adaptive radiation of Lake Victoria cichlids, suggesting the importance of SGV in their ancestral populations for the frequent adaptive radiation of East African cichlids, and not just Lake Victoria cichlids. We can now access a large amount of genomic data on cichlids (Malinsky et al., 2018; McGee et al., 2020; Ronco et al., 2021) for further exploration of SGV that has contributed to rapid speciation and adaptation of cichlids. Our study provides a cornerstone for understanding the complex evolution of cichlids driven by SGV. Further investigations of both the origins and functions of divergent alleles will illuminate the evolutionary mechanism that could repeatedly trigger explosive speciation in East African cichlids.
This work was supported by JSPS KAKENHI (17H04606, 20KK0167 to M. N., 20J13861 to H. N.) and MEXT KAKENHI (221S0002 to M. N.).