2023 Volume 98 Issue 3 Pages 155-160
Eukaryotic cells contain multiple types of duplicated sequences. Typical examples are tandem repeat sequences including telomeres, centromeres, rDNA genes and transposable elements. Most of these sequences are unstable; thus, their copy numbers or sequences change rapidly in the course of evolution. In this review, I will describe roles of subtelomere regions, which are located adjacent to telomeres at chromosome ends, and recent discoveries about their sequence variation.
Eukaryotic cells possess linear chromosomes. Telomeres, which exist at the ends of linear chromosomes, are composed of species-specific tandem repeat DNA (for instance, [TTAGGG]n in most vertebrates) and various associated proteins. Telomeres play crucial roles in genome integrity, such as protection of chromosome ends and regulation of chromosome dynamics during mitosis and meiosis (Chikashige et al., 1994; Miyoshi et al., 2008; Fujita et al., 2012; de Lange, 2018). Subtelomeres generally lie adjacent to telomeres. DNA sequences of subtelomeres differ from those of telomeres: subtelomeric DNA does not consist of simple tandem repeat sequences, but contains multiple segments that are highly conserved among the subtelomeres of each species. Human subtelomeres are mosaics of ~50 types of common segments, and the copy number of each segment varies between subtelomeres and individuals (Linardopoulou et al., 2005; Riethman et al., 2005; Stong et al., 2014). Similarly, the budding yeast Saccharomyces cerevisiae contains X and Y′ elements in its subtelomere DNA, and the copy numbers of these elements vary among strains (Louis, 1995). The unstable and repetitive nature of subtelomeres causes technical difficulties in DNA sequencing. In fact, the sequence information of subtelomeres remains incomplete in many model organisms even two decades after most other parts of their genome sequences were reported. However, the whole subtelomere sequences of the fission yeast Schizosaccharomyces pombe and of humans have been reported recently (Oizumi et al., 2021; Nurk et al., 2022). In this review, recent discoveries about the subtelomere sequences in S. pombe will be described.
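As a small illustration of what a "simple tandem repeat" such as the vertebrate telomere repeat [TTAGGG]n looks like at the sequence level, the following minimal sketch scans a DNA string for runs of a repeat unit. The function name `find_tandem_repeats`, the `min_copies` cutoff and the toy sequence are assumptions made for illustration only; they are not taken from any of the cited studies, and real telomere repeats (including those of S. pombe) are often degenerate and would need a more tolerant pattern.

```python
import re

def find_tandem_repeats(seq, unit="TTAGGG", min_copies=3):
    """Return (start, end, copies) for each run of at least `min_copies`
    perfect, head-to-tail copies of `unit` in `seq`.

    Illustrative only: perfect matches on one strand, no degeneracy.
    """
    pattern = re.compile("(?:%s){%d,}" % (re.escape(unit), min_copies))
    hits = []
    for m in pattern.finditer(seq.upper()):
        copies = (m.end() - m.start()) // len(unit)
        hits.append((m.start(), m.end(), copies))
    return hits

# Toy sequence (hypothetical): 4 bases, then four TTAGGG copies, then 3 bases.
toy = "ACGT" + "TTAGGG" * 4 + "CCC"
print(find_tandem_repeats(toy))
```

Subtelomeric DNA, by contrast, would not be captured by such a single-unit pattern, because it is built from a mosaic of longer shared segments rather than one short repeated unit.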
Schizosaccharomyces pombe possesses only three linear chromosomes (chromosomes 1, 2 and 3) and grows as a haploid under normal nutrient-rich conditions. In the standard wild-type strain (972), chromosomes 1 and 2 contain approximately 50 kb of subtelomeric homologous (SH) sequences adjacent to the telomeres (the telomere repeat sequence: [TTACAG2-5]n). There are no SH sequences on chromosome 3, although some S. pombe strains possess a partial (~16-kb) SH sequence between the left and/or right telomeres and rDNA (ribosomal DNA) repeats on chromosome 3; thus, S. pombe generally has only 4–6 SH sequences in total (Ohno et al., 2016; Tashiro et al., 2017; Oizumi et al., 2021) (Fig. 1). Telomere-binding proteins and the RNAi machinery, which acts on part of the SH sequence, independently recruit the histone methyltransferase Clr4 to the SH regions, resulting in condensed heterochromatin formation around the SH regions (Cam et al., 2005; Kanoh et al., 2005; Sugiyama et al., 2007). Heterochromatin formed in the SH region, like heterochromatin in other chromosomal regions, has a strong transcriptional repressive effect; however, most of the genes in the SH region are pseudogenes (see PomBase, https://www.pombase.org/), and the only genes that have been characterized so far are the RecQ-type DNA helicase genes (tlh1–4), which are located in all of the SH regions of strain 972. The tlh1–4 genes are essential for RNAi to function in heterochromatin formation, as part of their transcript is the source of the siRNA that RNAi acts on, as described above (Cam et al., 2005; Kanoh et al., 2005). While the mechanisms of subtelomeric heterochromatin formation have been elucidated in detail, the intracellular function of subtelomeric heterochromatin is currently not well understood.
Fig. 1. Schematic illustration of the structures of subtelomeres in strain 972. Strain 972 contains four SH sequences, SH1L and SH1R on the left and right arms of chromosome 1 (Ch1), respectively, and SH2L and SH2R on the left and right arms of chromosome 2 (Ch2), respectively (indicated by pale green boxes), adjacent to the telomeres. The SH region (~50 kb) adopts a heterochromatin structure. On the other hand, the SH-adjacent SU region (~50 kb) does not show high sequence identity with other subtelomeres, but forms a common condensed chromatin structure, the knob. Dark green semicircles, telomeres; red boxes, centromeres.
The region of approximately 50 kb adjacent to the SH sequence of chromosomes 1 and 2 of 972 is called the SU (subtelomeric unique) region because it consists of almost unique sequences except for some common sequences, such as retrotransposon LTR (long terminal repeat) sequences and L-asparaginase genes (Tashiro et al., 2017). The reason why SU regions are included in the subtelomeres is that they share a common condensed chromatin structure, a knob, that is stained by DAPI more intensely than subtelomeric heterochromatin and shows low levels of methylation at histone H3K4, H3K9 and H3K36 (Matsuda et al., 2015), even though the sequence similarities between SU regions are not high overall. Interestingly, knobs are formed only during interphase in the cell cycle (Matsuda et al., 2015). This is because the Sgo2 protein, which is localized at centromeres and contributes to accurate chromosome segregation during the M phase (Kitajima et al., 2004; Kawashima et al., 2007, 2010), is recruited to the subtelomere regions specifically in interphase and is essential for knob formation (Tashiro et al., 2016; Kanoh, 2017). The functions of knobs are not yet clearly understood, but at least Sgo2 is important for moderately repressing gene expression in the SU region and controlling the timing of DNA replication in the SH and SU regions (Tashiro et al., 2016). In summary, the subtelomeres of chromosomes 1 and 2 of 972 contain SH regions of common sequences and SU regions of almost unique sequences, in which heterochromatin and knob structures are formed, respectively (Fig. 1).
In addition to the low copy number of the SH sequences and the fact that S. pombe grows stably in haploid form, the subtelomeres of S. pombe do not contain genes that are essential for growth under normal culture conditions; thus, complete or partial deletion of the SH and SU sequences from the genome is feasible. In fact, the strain with all SH sequences deleted (SD5: subtelomere deletion 5) has no growth defects: it grows at the same rate as the wild-type strain and maintains normal telomere DNA length. However, heterochromatin, which is formed in the SH region in the wild-type strain, invades the entire SU region in the SD5 strain and strongly suppresses gene expression in the SU region, resulting in abnormal stress responses (Tashiro et al., 2017). This indicates that the SH region at least serves as a buffer zone to prevent heterochromatin from invading the SU region. Furthermore, heterochromatin does not spread out of the SU region even when about half of the SU region is deleted in the SD5 strain (Tashiro et al., 2017), suggesting that there is a boundary mechanism at the edge of the SU region that blocks the propagation of heterochromatin towards the centromere.
On the other hand, DNA sequences in the SH and SU regions are critical for survival in situations where telomeric DNA is shortened and unprotected. In fact, each SH sequence in strain 972 contains five homologous sequences (10 in total) that are in the same orientation when the chromosome undergoes self-circularization, and the single-strand annealing (SSA) reaction using one of these homologous sequences causes the chromosome to self-circularize and maintain a relatively stable circular chromosome conformation (Wang and Baumann, 2008). Chromosome 3 of strain 972, which does not possess SH sequences, is thought to undergo self-circularization by chromosome end fusion at the rDNA repeats adjacent to the telomeres at both ends of the chromosome. In fission yeast, this chromosomal self-circularization is a major survival strategy when telomeres become short. In contrast, telomere elongation by homologous recombination, which is common in other species when telomerase is deficient, is rather minor in fission yeast (Nakamura et al., 1998; Tashiro et al., 2017). So, when telomeric DNA is lost in the SD5 strain, which lacks all SH sequences, can the chromosomes still self-circularize? Interestingly, SD5 cells can survive by fusing chromosome ends by the SSA reaction using homologous sequences (such as LTRs and L-asparaginase genes, mentioned above) present in or near the SU regions (Tashiro et al., 2017). Surprisingly, the SD5 strain can survive not only by forming self-circularized chromosomes, but also by fusing chromosomes 1 and 2 to produce a single circular chromosome. Such a circular chromosome with two centromeres should be unstable due to problems during chromosome segregation, but the SD5-derived chromosome 1–2 fusion overcomes this by inactivating one of the centromeres (Tashiro et al., 2017).
Thus, the SH and SU regions (and SU vicinities) play important roles in maintaining gene expression in the SU region and further inside the chromosomes, and in circularizing the chromosomes for survival in the event of telomere shortening.
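The outcome of SSA-mediated self-circularization described above can be pictured with a deliberately simplified string model. In this sketch, a linear chromosome carries the same homologous sequence in the same orientation near both ends; annealing between the two copies is modelled as keeping one copy plus everything between them, and discarding the distal end sequences (including any telomeric DNA). The function name `circularize_by_ssa` and the toy sequences are hypothetical, and the model ignores end resection, flap removal, strand polarity and all protein factors.

```python
def circularize_by_ssa(chromosome: str, repeat: str) -> str:
    """Toy model of self-circularization by single-strand annealing (SSA).

    `chromosome` is a linear sequence and `repeat` is a homologous sequence
    assumed to occur in the same orientation near both ends. The returned
    string represents the circular product, written linearly starting at the
    single retained repeat copy: its last base is joined to its first.
    """
    first = chromosome.find(repeat)   # copy nearest the left end
    last = chromosome.rfind(repeat)   # copy nearest the right end
    if first == -1 or first + len(repeat) > last:
        raise ValueError("need two non-overlapping direct copies of the repeat")
    # Keep the left copy and the chromosome body; the right copy is merged
    # into the left one, and both distal ends are lost.
    return chromosome[first:last]

# Hypothetical toy chromosome: left end, repeat, body, repeat, right end.
toy = "TTT" + "GATTACA" + "CCCCGGGG" + "GATTACA" + "AAA"
print(circularize_by_ssa(toy, "GATTACA"))
```

In this picture, any sequence that is present in direct orientation near both chromosome ends can play the role of `repeat`, which fits the observation that SD5 cells can use LTRs or L-asparaginase genes in or near the SU regions when the SH sequences are absent.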
It is generally very difficult to analyze long, repetitive and/or duplicated sequences such as telomeres, subtelomeres, centromeres and rDNA repeats, even using the newest DNA sequencers that are capable of sequencing relatively long DNA fragments. In the case of subtelomeres, there are two major obstacles to sequencing. One is that the SH sequences are so similar to one another that it is difficult to determine which chromosome arm a given SH sequence derives from; the other is that there are multiple duplicated sequences within each SH sequence, which makes it difficult to determine the exact DNA sequence. These problems also apply to the subtelomeres of S. pombe, whose sequences remained undefined for a long time until we determined them.
To distinguish the duplicated SH regions, we constructed the SD4 strains, in which one of the SH regions remains and the others are deleted in strain 972, and the complete SH sequences containing multiple duplicons were successfully determined by sequencing serially deleted fragments of each SH (Oizumi et al., 2021) (Fig. 2). Classification of telomere-proximal parts of SH (SH-P) into multiple common segments showed that the SH-P sequences of 972 have different lengths and composition of common segments for each subtelomere (Oizumi et al., 2021) (Fig. 3A). Moreover, telomere-distal parts of SH (SH-D) also exhibited features similar to those of SH-P, but were more stable than SH-P (Oizumi et al., 2021). Thus, the SH sequences of S. pombe show highly variable and mosaic structures of the common segments, like those of human subtelomeres.
Fig. 2. Strategy for DNA sequencing of SH regions. SD4 mutants containing only one SH were produced, and each SH (SH-P or partial SH-D) was amplified by PCR and cloned into a vector. Each plasmid was then serially deleted and re-circularized. SH DNAs were sequenced using common primers (see Oizumi et al., 2021 for details).
Fig. 3. High variation in SH-P regions in S. pombe strains. (A) The SH-P regions of the 972 isolate that is used in Junko Kanoh’s laboratory in Japan were classified into common segments (A–X: >95% identity) and their variants (e.g., A1 and A2, and E and E’: 100% identity). For details of the segment classification, see Oizumi et al., 2021. The total length of each SH-P region is indicated. (B) SH-P regions of the 972 isolate (also known as JB22) that is used in Dr. Jürg Bähler’s laboratory in the UK were classified into common segments. Note that the subtelomere sequence information of this strain lacks accuracy; only the segment, not variant, patterns are indicated. (C) An example of the SH-P configurations of a natural isolate of S. pombe. Strain JB858 (also known as CBS10464), collected in Brazil, shows the same patterns of the common segments among its subtelomeres. Δ indicates that the sequence of the associated segment is partial.
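The identity-based grouping mentioned in the legend above (common segments defined by >95% identity) can be sketched roughly as follows. This is not the procedure used by Oizumi et al. (2021), which should be consulted for the actual classification; the sketch simply assumes that the sequences being compared have already been aligned to equal length, and the function names, the `threshold` parameter and the toy sequences are hypothetical.

```python
def percent_identity(a: str, b: str) -> float:
    """Percentage of aligned positions at which two equal-length sequences
    carry the same base. Gap characters ('-') never count as matches."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y and x != "-" for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)

def classify_segment(query, references, threshold=95.0):
    """Return the name of the best-matching reference segment whose identity
    to `query` exceeds `threshold` (in percent), or None if there is none.
    References of a different length are skipped (assumed not alignable here)."""
    best_name, best_identity = None, threshold
    for name, ref in references.items():
        if len(ref) != len(query):
            continue
        identity = percent_identity(query, ref)
        if identity > best_identity:
            best_name, best_identity = name, identity
    return best_name

# Hypothetical 40-base reference segments.
refs = {"A": "ACGT" * 10, "B": "GGCC" * 10}
one_mismatch = "T" + refs["A"][1:]       # 39/40 positions identical
three_mismatches = "TTT" + refs["A"][3:]  # 37/40 positions identical
print(classify_segment(one_mismatch, refs), classify_segment(three_mismatches, refs))
```

With a strict "greater than" comparison, a sequence at exactly the threshold is left unclassified; whether the boundary should be inclusive is a design choice that this sketch does not take from the source.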
Recently, raw genome sequencing data that were determined by next-generation sequencers have become publicly available. Detailed analysis of published (Jeffares et al., 2015; Tusso et al., 2019) but uncharacterized subtelomere sequences of various S. pombe strains revealed interesting features of SH (Oizumi et al., 2021). The SH-P sequences of strain 972 vary from lab to lab; that is, the lengths and composition of common segments differ somewhat among the progeny of 972 (Fig. 3B and see Oizumi et al., 2021 for SH-D). Strain 972 in Bähler’s laboratory in the UK showed the same (SH2R), longer (SH1L and SH2L) or shorter (SH1R) SH-P sequences compared with those of strain 972 in Kanoh’s laboratory in Japan, whereas other chromosomal regions were almost identical (Oizumi et al., 2021). This means that since 972 was first isolated in a Dutch farm (Leupold, 1950), its genomic DNA, especially SH, has gradually changed as it has been distributed to many laboratories.
Furthermore, S. pombe natural isolates other than 972 that have been collected in various countries differ greatly in the arrangement and overall length of SH sequences, although they share some SH segments with 972. Some natural isolates exhibit the same pattern of SH-P segments among their subtelomeres (Fig. 3C), but other strains show different SH configurations and lengths between their subtelomeres, as 972 does. In addition, mitochondrial genome sequences and LTR sequences are found in some subtelomeres of the natural isolates. Furthermore, sequences in SU of 972 are shared with multiple subtelomeres and are classified as SH in some strains (Oizumi et al., 2021). On the other hand, some unique genes in SU of 972 are duplicated in or missing from the subtelomeres of some strains, probably due to chromosome rearrangements, which may have resulted in the acquisition of traits different from 972 (our unpublished observations). It will be interesting to see how the various strains differ in the aforementioned functions of SH and SU, i.e., heterochromatin and knob formation, chromosome circularization during telomere shortening, and gene expression in the SU regions. It should be noted that the subtelomeres are particularly changeable among S. pombe strains, while other parts of the genome are not (Oizumi et al., 2021). Therefore, subtelomeres are hotspots for genome evolution.
Complete sequencing of the SH regions in S. pombe revealed that subtelomeres are highly variable compared with other parts of the genome. Why are subtelomeres so variable? Although the cause has not yet been clarified, the following possibilities are conceivable. First, repeat units within an SH may be amplified or deleted by homologous recombination (HR). Second, a high frequency of HR between the SHs of different subtelomeres (inter-chromosomal repair) may result in gross rearrangement of subtelomeres. Third, mutations and chromosome rearrangements may be induced by break-induced replication, a relatively imprecise repair mechanism (Kramara et al., 2018) that repairs collapsed replication forks at chromosome ends (telomeres and subtelomeres), which contain repeated sequences. In addition to these direct causes, there may be indirect causes. One of them is that subtelomeres are duplicated between chromosome arms, so that even if a small mutation or deletion occurs, there is a good chance that it will not be lethal and the cell will survive. If so, the genomic changes would be inherited by the next generation. In the case of subtelomeres in S. pombe, there are no genes essential for proliferation in normal nutrient-rich medium, and no matter how much the subtelomere changes, the changes are stably inherited by daughter cells.
Finally, I will mention the subtelomeres of hominids. The human SH sequence contains many gene sequences, whose copy number is known to vary among humans (Linardopoulou et al., 2005). The variation in copy number may cause the development of various diseases, but is also highly likely to generate human diversity. On the other hand, regions adjacent to some of the telomeres in chimpanzees, bonobos and gorillas, which are evolutionarily most closely related to humans, contain a huge 32-base repeat sequence called a subterminal satellite (StSat), which is absent in humans (Royle et al., 1994). Whether the presence or absence of StSat contributes to the differences between humans and great apes is a very interesting question.
In summary, the instability of subtelomeres may not necessarily indicate weakness as an organism, but rather strength, i.e., a faster rate of evolution. In other words, could it be that the high frequency of subtelomere changes is not a mistake for the cell, but a strategy for evolution? In any case, subtelomeres are a very important chromosomal region for exploring the history of genome evolution.
This work was supported by the Japan Society for the Promotion of Science KAKENHI (JP20H03185, JP21K19208, JP21H00244, JP22H04685 and JP23H02408), the Ohsumi Frontier Science Foundation and the Takeda Science Foundation to J. K.