Genes & Genetic Systems
Online ISSN : 1880-5779
Print ISSN : 1341-7568
ISSN-L : 1341-7568
Full papers
Promoter generation for the chimeric sex-determining gene dm-W in Xenopus frogs
Shun Hayashi, Kei Tamura, Daisuke Tsukamoto, Yusaku Ogita, Nobuhiko Takamatsu, Michihiko Ito

2023 Volume 98 Issue 2 Pages 53-60

ABSTRACT

Many sex-determining genes (SDGs) were generated as neofunctionalized genes through duplication and/or mutation of gonadal formation-related genes. We previously identified dm-W as an SDG in the African clawed frog Xenopus laevis and found that a partial duplication of the masculinization gene dmrt1 created the neofunctionalized dm-W after allotetraploidization by interspecific hybridization. The allotetraploid Xenopus species have two dmrt1 genes, dmrt1.L and dmrt1.S. Xenopus laevis dm-W has four exons: two dmrt1.S-derived exons (exons 2 and 3) and two other exons (noncoding exon 1 and exon 4). Our recent work revealed that exon 4 originated from a DNA transposon, hAT-10. Here, to clarify when and how the noncoding exon 1 and its coexisting promoter evolved during the establishment of dm-W after allotetraploidization, we newly determined nucleotide sequences of the dm-W promoter region from two other allotetraploid species, X. largeni and X. petersii, and performed an evolutionary analysis. We found that dm-W acquired a new exon 1 and TATA-type promoter in the common ancestor of the three allotetraploid Xenopus species, resulting in the deletion of the dmrt1.S-derived TATA-less promoter. In addition, we demonstrated that the TATA box contributes to dm-W promoter activity in cultured cells. Collectively, these findings suggest that this novel TATA-type promoter was important for the establishment of dm-W as a sex-determining gene, followed by the degeneration of the preexisting promoter.

INTRODUCTION

Sex-determining genes (SDGs), which include various protein-coding genes for transcription factors, membrane receptors and extracellular ligands, are often generated by neofunctionalization (Mawaribuchi et al., 2012; Ito and Mawaribuchi, 2013). We previously discovered the female genome-specific dm-W as an SDG, which emerged from partial duplication of the masculinizing gene dmrt1, in the allotetraploid frog Xenopus laevis (Yoshimoto et al., 2008, 2010; Yoshimoto and Ito, 2011). Moreover, we reported that dm-W was generated after allotetraploidization by hybridization between two related species with L and S diploid genomes approximately 17–18 million years ago (Session et al., 2016; Mawaribuchi et al., 2017b). The allotetraploid X. laevis has two dmrt1 genes, dmrt1.S and dmrt1.L. dmrt1.L has two distinct promoters derived from noncoding exon (Ex) 1 and intron 1 for germ and somatic cell expression, respectively, as does X. tropicalis dmrt1 (Mawaribuchi et al., 2017a). In contrast, dmrt1.S has no noncoding Ex1, possibly due to degradation after allotetraploidization, which results in only somatic cell-specific expression. In addition, dmrt1.L shows significantly lower expression than dmrt1.S in gonadal somatic cells of X. laevis ZZ gonads throughout development (Mawaribuchi et al., 2017a). These findings suggest that dmrt1.L and dmrt1.S are subfunctionalized for germ cell development and somatic cell masculinization, respectively.

The dm-W gene on chromosome 2L emerged as an SDG from the partial duplication of dmrt1.S on chromosome 1S soon after allotetraploidization (Bewick et al., 2011; Mawaribuchi et al., 2017b). Interestingly, dm-W has not only the two dmrt1.S-derived exons, Ex2 and Ex3, but also a noncoding Ex1 and Ex4 (Yoshimoto et al., 2008). Very recently, we demonstrated that Ex4 originated from a hAT-10 DNA transposon, indicating that dm-W is a chimeric gene (Hayashi et al., 2022). However, the origin of the noncoding Ex1 remains unknown. In this study, to understand how dm-W Ex1 emerged and became functional, we newly isolated the region corresponding to X. laevis dm-W Ex1 from two allotetraploid-derived Xenopus species, X. petersii and X. largeni, and investigated the promoter evolution of neofunctionalized dm-W. Evolutionary analysis revealed that dm-W newly acquired a TATA-type promoter and noncoding Ex1 in the common ancestor of the three Xenopus species.

RESULTS

Generation of the noncoding Ex1 from dm-W

The allotetraploid X. laevis (Xl) dmrt1.L and diploid X. tropicalis (Xt) dmrt1 have two distinct promoters derived from the noncoding Ex1 and intron 1, whereas Xl dmrt1.S and Xl dm-W have only one promoter (Yoshimoto et al., 2008; Mawaribuchi et al., 2012, 2017a). To understand the molecular evolution of the orphan noncoding Ex1 and its coexisting promoter, we first examined the evolutionary relationships of the Ex1- or intron 1-derived promoters and their surrounding sequences among Xl dmrt1.L/S, Xt dmrt1 and Xl dm-W by mVISTA (http://genome.lbl.gov/vista) analysis. We detected some homologous regions in and around the noncoding exon 1s between Xl dmrt1.L and Xt dmrt1 (Fig. 1 and Supplementary Fig. S1). Some homologous sequences in the far upstream regions of the coding Ex1 of Xl dmrt1.S were shared in the upstream regions of the noncoding exon 1s of Xl dmrt1.L and Xt dmrt1, indicating that Ex1 of the ancestral dmrt1.S was lost after allotetraploidization. Notably, the sequences in and around Ex1 of Xl dm-W had no homology with those of the upstream regions of the three Xenopus dmrt1s (Fig. 1B, lower panel), indicating that the origin of the noncoding Ex1 of Xl dm-W differed from that of Xl dmrt1.L and Xt dmrt1.

Fig. 1.

Intron 1-derived TATA-less promoter of dmrt1 for gonadal somatic cell expression. (A) Phylogenetic relationship among four Xenopus species used in this study. Ploidy and gene information are shown on the right. (B) mVISTA plots of X. laevis (Xl) dmrt1.L (upper) or Xl dm-W (lower) with the other three subfamily genes, using the genomic regions covering the noncoding Ex1 and Ex2. Dark and light blue colors indicate protein-coding and noncoding exons, respectively. Pink shows conserved noncoding sequences. (C) Sequence alignment of the corresponding regions to the upstream region and Ex2 of Xl dmrt1.L among the four. Purple and yellow indicate GC and CAAT boxes, respectively. Arrows indicate the 5′ site of Xl dm-W Ex2 or the transcription start site (TSS) of Xl dmrt1.L/S, and asterisks indicate identical nucleotides among all the sequences. Red shows a deletion in dm-W.

The intron 1-derived TATA-less type promoter in the ancestral dm-W was lost by its deletion

We next examined the intron 1-derived promoters for gonadal somatic cell expression in the tetraploid X. laevis and diploid X. tropicalis by aligning the three dmrt1 genes and searching for the eight core promoter elements (see Materials and Methods). All three promoters of Xl dmrt1.L, Xl dmrt1.S and Xt dmrt1 contained three GC boxes at and just upstream of the TSSs, which were determined by 5′-RACE using adult testis RNAs (Mawaribuchi et al., 2017a), but no TATA boxes (Fig. 1C), indicating that they belong to the TATA-less promoter category. Xl dm-W had no region corresponding to the TATA-less promoters: approximately 300 bp of the core promoter region containing the three GC boxes and TSS were deleted. Because mVISTA analysis revealed that all three dmrt1 genes shared homologous sequences with the upstream region of the dm-W Ex2 (Fig. 1B and Supplementary Fig. S1), the intron 1-derived promoter must have existed in the prototype of dm-W (Fig. 1C and Supplementary Fig. S2). These findings indicated that the acquisition of a new promoter in the ancestral dm-W was connected with the expendable nature of the intron-derived promoter, resulting in its deletion.

dm-W acquired a TATA-type promoter and noncoding Ex1 in the common ancestor of the three allotetraploid Xenopus species

To clarify when and how a new promoter with a noncoding Ex1 of dm-W emerged, we next isolated its corresponding regions from the other two allotetraploid Xenopus species, X. petersii and X. largeni (see Fig. 1A and Materials and Methods); approximately 300 bp of the amplified DNA fragments were sequenced (GenBank accession numbers LC654432 and LC654431, respectively). Sequence comparison showed 89% nucleotide identity among the three species (Fig. 2B). A search for the eight core promoter elements indicated that all three sequences shared three core elements: a GC box, a TATA box and a downstream core promoter element (DPE), from −60 to −53, −41 to −35 and +29 to +33, respectively, around the TSS of X. laevis dm-W (Fig. 2B). These results indicated that dm-W acquired a TATA-type promoter in the common ancestor of the three allotetraploid species.
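The identity figure and the TSS-relative coordinates quoted above can be illustrated with a minimal Python sketch. It assumes that identity is counted as the share of alignment columns in which all sequences carry the same base, with gapped columns counted as non-identical; the text does not state how the 89% value was calculated, so this definition is an assumption. The three short sequences are hypothetical toy strings, not the deposited LC654431/LC654432 sequences, and the function names are illustrative only.

```python
def percent_identity(aligned):
    """Percentage of alignment columns where every sequence has the same base.

    Assumption: a column with a gap ('-') in any sequence counts as non-identical.
    """
    if not aligned:
        raise ValueError("need at least one sequence")
    length = len(aligned[0])
    if length == 0:
        raise ValueError("alignment is empty")
    if any(len(s) != length for s in aligned):
        raise ValueError("aligned sequences must have equal length")
    identical = sum(
        1
        for column in zip(*aligned)
        if "-" not in column and len(set(column)) == 1
    )
    return 100.0 * identical / length


def to_tss_coordinate(index, tss_index):
    """Convert a 0-based string index to a TSS-relative position.

    The TSS itself is +1 and the base just upstream is -1 (there is no position 0).
    """
    offset = index - tss_index
    return offset + 1 if offset >= 0 else offset


# Hypothetical toy alignment (NOT the real dm-W promoter sequences).
toy_alignment = [
    "ACGTTATAAAGG",
    "ACGTTATAAAGC",
    "ACCTTATAAAGG",
]
print(round(percent_identity(toy_alignment), 1))
print(to_tss_coordinate(0, 60))
```

With this numbering, an element reported as −41 to −35 spans seven bases, and a 16-bp stretch running from −15 to +1 ends exactly on the TSS.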

Fig. 2.

Noncoding Ex1 (ncEx1)-derived TATA-type promoter of dm-W for gonadal somatic cell expression. (A) Schematic diagram of a somatic TATA-less promoter region from X. laevis dmrt1.S and the corresponding regions from X. laevis and X. largeni dm-Ws. Red letters correspond to the translation initiation codon AUG. (B) Sequence comparison of TATA-type promoter regions among X. laevis, X. petersii and X. largeni dm-Ws. Purple, green and blue boxes indicate the GC box, TATA box and DPE, respectively. Red letters and arrows show tandem repeats. (C) Distribution of TEs in and around the ncEx1 of X. laevis, identified using the GIRI CENSOR program. Blue and orange boxes indicate DNA transposon- and retrotransposon-derived fragments, respectively. Noncoding and coding exons are shown as white and gray boxes, respectively. The 325-bp sequence containing the ncEx1, which was not recognized as possessing TEs, includes the core promoter elements, GC box (purple) and TATA box (green).

Transposable elements (TEs) around the noncoding Ex1 of dm-W

We previously reported that TEs occupy over half of the dm-W-containing W-specific region (Mawaribuchi et al., 2017a) and that dm-W Ex4 emerged from a hAT-10 transposon under activation of major DNA transposons including Kolobok and hAT superfamilies just after hybridization (Hayashi et al., 2022; Suda et al., 2022). To elucidate the relationships between such TE migrations and generation of the new TATA-type promoter with the noncoding Ex1, we searched TE distribution in about 29,000 bp (−15,400 to +13,683) including the upstream promoter region, noncoding Ex1 and intron 1 of X. laevis dm-W using the CENSOR program (Fig. 2C and Supplementary Fig. S3). TEs accounted for about 87% of this region. Although 325 bp around the noncoding Ex1 was not recognized as a TE, there were numerous DNA transposon-derived fragments near the noncoding Ex1 (Fig. 2C). Although the CENSOR program did not detect short TE sequences (Fig. 2C), we manually found two tandem repeats of 8 bp (5′-AGCACAGA-3′) from −15 to +1 including the TSS in X. laevis, which completely coincided and shared partial identities with those from X. petersii and X. largeni (Fig. 2B). Such tandem repeats may be residual sequences of transposons, which are known as target site duplications (TSDs). Animal and plant hAT transposon superfamily members often contain a TSD consisting of an 8-bp tandem repeat sequence at both of their ends (Kawakami et al., 2000; Xu and Dooner, 2005). We then searched for the 8-bp sequence 5′-AGCACAGA-3′ at the ends of X. laevis hAT transposons in the X. laevis genome and found completely identical sequences on both sides in two subfamilies, hAT-N11_XT and hAT-N12_XT, or on one side in six subfamilies, hAT-N12_XT, hAT-N2_XL, hAT-N1_XL, hAT-N12_XL, hAT_N29_XL and OCR-a_XL. Moreover, we detected 15 sites comprising 8-bp tandem repeat sequences ([5′-AGCACAGA-3′] × 2) in the X. laevis genome, which may be residual sequences of these hAT transposon subfamilies. 
These findings suggest that the dm-W promoter region includes transposon-derived sequences. By contrast, the tandem repeat sequences created one 5′-AGAARVAG-3′-like sequence as follows: 5′-AGCACAGAAGCACAGA-3′. Because the sequence is involved in the transcription of many genes in X. tropicalis (van Heeringen et al., 2011), it is possible that the tandem sequences participate in the transcriptional machinery through the 5′-AGAAGCA-3′. We next searched for and found candidate regulatory cis-elements, to which several gonadal somatic cell transcription factors could bind, in 4 kbp of the transposon-rich region upstream of the TSS of dm-W (Supplementary Fig. S4). It is possible that some transposon-derived sequences evolved as cis-elements for dm-W transcription.
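The two sequence observations in this section, the adjacent 8-bp repeat and the AGAARVAG-like motif it creates, can be reproduced with a short Python sketch. This is an illustration under stated assumptions rather than the procedure used in the study (the repeats were found manually): the function names are hypothetical, and "like" is interpreted here as the window with the fewest mismatches to the IUPAC consensus.

```python
# Standard IUPAC nucleotide codes mapped to the bases they allow.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def find_tandem_repeats(seq, unit_len=8):
    """Return (start, unit) wherever a unit_len-mer is directly followed by an identical copy."""
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - 2 * unit_len + 1):
        unit = seq[i:i + unit_len]
        if seq[i + unit_len:i + 2 * unit_len] == unit:
            hits.append((i, unit))
    return hits


def best_motif_match(seq, motif):
    """Slide an IUPAC motif along seq; return (start, window, mismatches) for the best window.

    Ties are resolved in favor of the leftmost window.
    """
    seq = seq.upper()
    best = None
    for i in range(len(seq) - len(motif) + 1):
        window = seq[i:i + len(motif)]
        mismatches = sum(
            base not in IUPAC[code] for base, code in zip(window, motif)
        )
        if best is None or mismatches < best[2]:
            best = (i, window, mismatches)
    return best


# The 16-bp sequence quoted in the text: two copies of AGCACAGA.
repeat_region = "AGCACAGAAGCACAGA"
print(find_tandem_repeats(repeat_region))           # [(0, 'AGCACAGA')]
print(best_motif_match(repeat_region, "AGAARVAG"))  # (5, 'AGAAGCAC', 1)
```

The best window, AGAAGCAC, spans the junction between the two repeat copies and differs from AGAARVAG only at the final position, which is consistent with describing it as an AGAARVAG-like sequence whose first seven bases are AGAAGCA.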

The GC and TATA boxes upstream of the noncoding Ex1 in X. laevis dm-W contribute to core promoter activity in cultured cells

To evaluate the promoter activity of X. laevis dm-W, we constructed a promoter-reporter plasmid containing 325 bp from −176 to +149, including the putative TATA and GC boxes in the Xl dm-W gene, upstream of a luciferase gene and its derivative reporters harboring mutations of the TATA box (TATA box MT) and the GC box (GC box MT) (see Materials and Methods; Fig. 3). Because there were no dm-W-expressing cultured cells among Xenopus cell lines, we examined their core promoter activities by transfecting the reporter plasmids into three cell lines, human embryonic kidney-derived (HEK) 293T cells, X. laevis A6 kidney epithelial cells and X. laevis XL-B4 myoblast cells (Tamura et al., 2004, 2015; Fujitani et al., 2020). We detected approximately five times higher luciferase activity using the wild-type reporter than the empty control (Fig. 3). The activities from both mutant reporters, TATA box MT and GC box MT, were significantly lower than those from the corresponding wild-type reporters. These results indicate that the upstream sequence of the noncoding Ex1 has the potential to function as a core promoter supported by the two core promoter elements, namely the TATA and GC boxes.

Fig. 3.

Core promoter activity through the TATA and GC boxes of X. laevis dm-W in transfected cultured cells. (A) Schematic diagram of the luciferase reporter plasmids containing the dm-W sequence from −176 to +149 (wild-type) and its mutants of the TATA box (TATA box MT) and GC box (GC box MT). (B–D) Promoter assay of X. laevis dm-W was performed by transfecting the wild-type, TATA box MT or GC box MT luciferase reporter into human embryonic kidney (HEK293T) cells (B), X. laevis A6 cells (C) or X. laevis XL-B4 cells (D). Transfections were done three times in each cell line; error bars represent the standard errors of the means. a–d: means with different letters are significantly different from each other, tested using one-way ANOVA, followed by the Tukey–Kramer HSD test (P < 0.05).

DISCUSSION

In this study, we identified a TATA-type promoter upstream of the noncoding Ex1 of X. laevis dm-W (Fig. 2 and 4A) and confirmed its activity in cultured cells (Fig. 3). This TATA-type promoter for gonadal somatic cell expression was evolutionarily and functionally distinct from the germ cell promoters upstream of the noncoding exon 1s of X. laevis dmrt1.L and X. tropicalis dmrt1 (Fig. 2B); the upstream regions of both the noncoding Ex1s contained three GC boxes and one CAAT box, but no TATA box, which indicated that they had TATA-less promoters (Fig. 4A; Supplementary Fig. S5).

Fig. 4.

Proposed model for promoter evolution of Xenopus dmrt1 subfamily genes. (A) Promoter variation in Xenopus dmrt1 subfamily genes. Although one dmrt1.S and two dmrt1.L promoters contain no TATA box (Fig. 1C and Supplementary Fig. S5), dm-W acquired a TATA-type promoter for gonadal somatic cell expression (Fig. 2B). (B) An evolutionary model for the emergence of dm-W through three independent insertions.

Based on the findings of this study, we propose a model for promoter generation of dm-W (Fig. 4B). Approximately 17–18 million years ago, two closely related Xenopus species, each of which had a distinct diploid L or S genome, hybridized, resulting in allotetraploidization (Session et al., 2016), which may have caused instability of sex-determining systems, possibly because of the mixture of the two systems. In such a situation, a new SDG could have emerged in the ancestor of the allotetraploid frogs. The duplicates of Ex2 and Ex3 of dmrt1.S on chromosome 1S were transferred into chromosome 2L just after hybridization (Bewick et al., 2011; Mawaribuchi et al., 2017b). We recently reported an evolutionary analysis suggesting that DNA transposons including the hAT superfamily became activated just after hybridization (Hayashi et al., 2022; Suda et al., 2022). Moreover, we found that Ex4 originated from a noncoding portion of the hAT-10 subfamily of DNA transposons and that Ex4 was generated before the diversification of most allotetraploid Xenopus species (Hayashi et al., 2022). Under the activation of DNA transposons just after hybridization, a new TATA-type promoter in the dm-W ancestor was generated. Here, we detected 8-bp tandem repeat sequences in the promoter region including the noncoding Ex1 (Fig. 2B), which might be residual sequences of the hAT superfamily. This promoter generation may therefore be related to TEs (Supplementary Fig. S3), although what the promoter region originated from remains unknown.

Yang et al. (2007) reported that housekeeping and non-housekeeping genes often have TATA-less and TATA promoters, respectively. Many TATA-type promoters are involved in the enhanced spatiotemporal expression of neofunctionalized genes such as dm-W. We previously determined that dm-W is expressed prior to dmrt1.S around the sex determination stages (Yoshimoto et al., 2008; Mawaribuchi et al., 2017a; Fig. 4A). Taken together, these observations indicate that the TATA-type promoter and its associated enhancers could induce earlier expression of dm-W than of dmrt1.S in gonadal somatic cells. Therefore, these newly emerged TATA-type promoters may have been necessary for the establishment of dm-W as an SDG in the common ancestor of at least three allotetraploid Xenopus species.

MATERIALS AND METHODS

Bioinformatics analysis

Nucleotide sequences of X. tropicalis dmrt1, X. laevis dmrt1.S, X. laevis dmrt1.L and X. laevis dm-W analyzed in this study were obtained from the X. laevis and X. tropicalis genomes v9.2 and v10.0, respectively (http://www.xenbase.org/entry/ and https://www.ncbi.nlm.nih.gov/). Broad comparative analysis of Xenopus dmrt1 and dm-W genomic regions was performed using mVISTA (http://genome.lbl.gov/vista) with the LAGAN alignment program (Frazer et al., 2004). For detailed sequence comparison, alignments of multiple sequences were performed using MUSCLE with default settings (Edgar, 2004). TEs were detected using CENSOR software (http://www.girinst.org/censor/index.php) (Kohany et al., 2006) and BLAST searches. The core promoter elements, namely TATA box, CAAT box, GC box, initiator, TFIIB recognition element, DPE, motif ten element and downstream core element I–III, were predicted based on JASPAR (http://jaspar.genereg.net/).

Isolation of dm-W exon 1 and exon 2 from two Xenopus species

Adult livers of two Xenopus species, X. largeni and X. petersii, were obtained from the Cryogenic Collection, Museum of Comparative Zoology, Harvard University. Their genomic DNAs were purified using the phenol–chloroform extraction method. PCR was performed using KOD FX DNA Polymerase (Toyobo) for cloning of the 1st and 2nd exons. For exon 1, the primers 5′-AAGCTTACAAAATGTAGCAAATTTA-3′ and 5′-TTCTCAGCTTTCACTTTCCAGC-3′ were used. For exon 2, 5′-CATTGTTTCCTGTTCTGAAAA-3′ and 5′-TAATATACATCACTGCAGGTC-3′ or 5′-TTGTTTCCTGTTCTGAAAATG-3′ and 5′-GTCTATCAGTGTTAATCTCAC-3′ were used for the 1st or nested PCR, respectively. DNA fragments were purified using NucleoSpin Gel and a PCR Clean-up kit (Macherey-Nagel), and then inserted into the EcoRV site of pBluescript KS (+). DNA sequencing was performed using the BigDye Terminator v3.1 Cycle Sequencing Kit (Applied Biosystems).

Plasmid constructs for promoter reporters

We isolated 325 bp in and around X. laevis exon 1 using PCR with genomic DNA and specific primer pairs as follows: 5′-TCCTGATGTAGCCTGCACTGAAG-3′ and 5′-TTTCACTTTCCAGCCCTTAAACTG-3′ for the 1st PCR and 5′-GAGCTAACATAACCCTGAAGCTTACAAAATGTAGC-3′ and 5′-AGCTCGGTACCTCCCAACTGGTTAGCTCACCAAAC-3′ for the nested PCR. The obtained fragments were then inserted into the promoter-less luciferase vector pGV-B (Toyo Bnet Bio Products), and the resultant plasmid was named pncEx1-luc. Mutated reporter plasmids containing mutations in putative TATA and GC boxes, named pncEx1-TATAmt-luc and pncEx1-GCmt-luc, respectively, were constructed using the following primer pairs: 5′-GGGAGAAGACTGGGCATGCAGTG-3′ and 5′-TTCTCCCAAAAACATCACCCACGC-3′ for the former and 5′-GGAAGTGGGTGATGTTTTTTTTA-3′ and 5′-CCACTTCCACAAGACTTATTCA-3′ for the latter.

Cell lines, cell culture, transfection and luciferase reporter assay

Xenopus laevis A6 kidney epithelial cells, X. laevis XL-B4 myoblast cells and human embryonic kidney (HEK) 293T cells were cultured as described previously (Tamura et al., 2004, 2015; Fujitani et al., 2020). All three cell lines were tested for mycoplasma contamination. HEK293T or A6 and XL-B4 cells were recently authenticated by STR analysis or genetic analysis, respectively. The cultured cells from each cell line were plated at a density of 1 × 10⁵ cells per well in a 24-well plate. After 24 h, the cells were transfected with firefly luciferase reporter plasmid (100 ng) and Renilla luciferase vector pRL-SV40 (20 ng; Promega) using TransIT-LT1 (Mirus). The total DNA was maintained at 500 ng per transfection with the pcDNA3 empty vector. Luciferase activity was measured using a Luminocounter 700 (NITI-ON) at 24 h post-transfection. Each firefly luciferase activity was normalized to Renilla luciferase activity using the dual luciferase assay system (Promega).
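The normalization step described above amounts to simple arithmetic, sketched below in Python. The readings are hypothetical toy numbers chosen only to illustrate the calculation, not measured data, and the function names are illustrative. The sketch assumes that each replicate is first expressed as a firefly/Renilla ratio, that replicates are summarized as mean and standard error of the mean, and that fold activity is the ratio of the sample mean to the empty-vector mean.

```python
from math import sqrt
from statistics import mean, stdev


def relative_activity(firefly, renilla):
    """Normalize one firefly reading to its co-transfected Renilla reading."""
    if renilla <= 0:
        raise ValueError("Renilla reading must be positive")
    return firefly / renilla


def summarize(replicates):
    """Return (mean, SEM) of firefly/Renilla ratios for (firefly, renilla) pairs."""
    ratios = [relative_activity(f, r) for f, r in replicates]
    sem = stdev(ratios) / sqrt(len(ratios)) if len(ratios) > 1 else 0.0
    return mean(ratios), sem


def fold_over_control(sample, control):
    """Mean normalized activity of a reporter relative to the empty-vector control."""
    sample_mean, _ = summarize(sample)
    control_mean, _ = summarize(control)
    return sample_mean / control_mean


# Hypothetical readings from three transfections each (arbitrary light units).
empty_vector = [(1000, 10000), (1100, 10000), (900, 10000)]
wild_type = [(5000, 10000), (5500, 10000), (4500, 10000)]

wt_mean, wt_sem = summarize(wild_type)
print(round(wt_mean, 3), round(wt_sem, 3))
print(round(fold_over_control(wild_type, empty_vector), 2))
```

Dividing by the Renilla signal corrects each well for differences in transfection efficiency and cell number before reporters are compared.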

Transcription factor binding sites

Transcription factor binding sites were predicted by the MEME FIMO program (https://meme-suite.org/meme/tools/fimo). The matrix data used for the analysis are MA0077.1 (Sox9), MA0482.2 (Gata4), MA0514.1 (Sox3), MA0886.1 (Emx2), MA1603.1 (Dmrt1) and MA1607.1 (Foxl2) in JASPAR (http://jaspar.genereg.net/). These six transcription factors were selected because their genes are expressed in undifferentiated and differentiating gonads from stage 50 to 62 in X. laevis (Piprek et al., 2018).
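The idea behind a matrix-based scan of this kind can be sketched in a few lines of Python. This is a simplified illustration, not FIMO itself: FIMO reports p-values for each match, whereas the sketch below applies a raw log-odds score cutoff. The 4-bp GATA-like count matrix is a hypothetical toy example and is not the JASPAR MA0482.2 matrix; the pseudocount, uniform background and threshold are likewise assumptions.

```python
from math import log2

BASES = "ACGT"


def counts_to_log_odds(counts, pseudocount=0.5, background=0.25):
    """Turn per-position base counts into log2 odds against a uniform background."""
    width = len(counts["A"])
    pwm = []
    for pos in range(width):
        total = sum(counts[b][pos] for b in BASES) + 4 * pseudocount
        pwm.append({
            b: log2(((counts[b][pos] + pseudocount) / total) / background)
            for b in BASES
        })
    return pwm


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def scan(seq, pwm, threshold):
    """Score every window on both strands; return hits scoring at or above threshold.

    Each hit is (start on the forward strand, strand, matched window, score).
    """
    seq = seq.upper()
    width = len(pwm)
    hits = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for i in range(len(s) - width + 1):
            window = s[i:i + width]
            if any(b not in BASES for b in window):
                continue  # skip windows containing ambiguous bases
            score = sum(pwm[j][b] for j, b in enumerate(window))
            if score >= threshold:
                start = i if strand == "+" else len(s) - width - i
                hits.append((start, strand, window, round(score, 2)))
    return hits


# Hypothetical toy counts for a GATA-like motif (NOT a JASPAR matrix).
toy_counts = {
    "A": [0, 10, 0, 10],
    "C": [0, 0, 0, 0],
    "G": [10, 0, 0, 0],
    "T": [0, 0, 10, 0],
}
toy_pwm = counts_to_log_odds(toy_counts)
print(scan("CCGATACC", toy_pwm, threshold=5.0))
print(scan("CCTATCCC", toy_pwm, threshold=5.0))
```

Scanning the reverse complement as well as the forward strand matters because a transcription factor can bind a site in either orientation.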

DECLARATIONS

Authors’ contributions: Conceptualization, S. H. and M. I.; formal analyses, S. H., K. T., D. T., N. T., Y. O. and M. I.; methodology, S. H.; data curation, S. H. and M. I.; writing, S. H. and M. I.; supervision, M. I.; funding acquisition, M. I. All authors have read and agreed to the manuscript.

Informed consent statement: Not applicable.

Conflicts of interest: The authors declare no conflict of interest.

ACKNOWLEDGMENTS

We would like to thank Dr. Breda Zimkus, the Cryogenic Collection, Museum of Comparative Zoology, Harvard University, for providing the Xenopus samples. All experimental procedures were approved by the Institutional Animal Care and Use Committee of Kitasato University.

© 2023 The Author(s).

This is an open access article distributed under the terms of the Creative Commons BY 4.0 International (Attribution) License (https://creativecommons.org/licenses/by/4.0/legalcode), which permits the unrestricted distribution, reproduction and use of the article provided the original source and authors are credited.