2024 Volume 99 Article ID: 24-00061
Mycoplasmas, autonomously culturable bacteria with the smallest genomes, are important organisms for understanding the minimal form of life. Mutagenesis is a useful methodology for understanding the essential regions of genomic information. Ultraviolet light (UV) and trimethyl psoralen (TMP) are mutagens known to induce various mutations; the latter is reported to specifically induce deletions in nematodes. However, their mutagenic effects on mycoplasma are not known. Here, we exposed Metamycoplasma salivarium to UV-C light or TMP and UV-A as mutagens, and analyzed the mutational pattern after serial cultivation ranging from 34 to 56 rounds for different lineages. Our results showed that more deletions, but fewer point mutations, were induced with TMP and UV-A than with UV-C, indicating the usefulness of TMP in inducing deletions. In addition, we compared our results with mutational data from other studies, which suggested that both the combination of TMP and UV-A and UV-C exposure induced point mutations that were highly biased toward C→T and G→A transitions. These data provide useful basic knowledge for mutational studies on M. salivarium.
Mutagens are used to generate mutational variants to understand gene functions (Cordes, 2005). Although directed mutagenesis has become more accessible using CRISPR/Cas9 (Doudna and Charpentier, 2014), random mutagenesis is effective in obtaining information at the whole-genome scale. To induce point mutations randomly, N-ethyl-N-nitrosourea has been a popular choice when conducting mouse studies (Stottmann and Beier, 2014). When targeting Caenorhabditis elegans, Drosophila melanogaster and plants, ethyl methanesulfonate has been used as a chemical mutagen (Gengyo-Ando and Mitani, 2000; Lin et al., 2014; Chen et al., 2023). Understanding the mutation trends induced by each mutagen is crucial owing to their differential effects. Ultraviolet (UV) radiation is a physical mutagen that induces many types of mutations, such as base substitutions, frameshifts and deletions (Schaaper et al., 1987), by inducing DNA lesions with pyrimidine dimers and oxidative damage (Pfeifer, 2020). The combination of trimethyl psoralen (TMP) exposure and UV-A irradiation is known to elicit small deletions in the genome and base substitutions without any transversion or transition bias. The mutational effect of psoralens with UV-A radiation was studied in the late 1970s (Seki et al., 1978). The method was originally applied to C. elegans by Yandell et al. (1994), resulting in deletions between 0.15 and 10.0 kbp. This method has demonstrated efficacy, especially when targeting C. elegans, by promoting DNA monoadducts and interstrand cross-links.
Mutagens are potentially applicable in the pursuit of minimal genomes. Point mutations caused by UV exposure can disrupt the function of the mutated genes, resulting in a functionally minimal genome (Shibai et al., 2017). Moreover, deletions of nonessential genes should more directly reveal the minimal genome.
Mycoplasma is known to have the smallest genome among autonomously viable life forms, making it an ideal organism for understanding the minimum form of life (Morowitz, 1984). Hutchison et al. (2016) designed an artificial minimal mycoplasma genome and integrated it into a cell to create an independent life form with the smallest genome. Deleterious mutagens like UV or TMP hold promise in reducing genomic function or size with simpler methodology. Comparative genomic studies of mycoplasmas suggest that DNA lesions caused by these mutagens are repaired through the nucleotide excision repair pathway (Hakim et al., 2021). However, the mutagenic effects of UV and TMP on mycoplasmas are yet to be observed at the whole-genome scale.
Here, we investigated the mutagenic effect of UV-C (UVC) or TMP combined with UV-A exposure (TMP-UVA) on a mycoplasma species (Metamycoplasma salivarium) at the whole-genome scale during serial cultivation. A strain of M. salivarium was isolated from the human oral cavity and was found to have a smaller genome than a previously reported strain. Genomic DNA was analyzed after serial cultivation in the range of 34 to 56 rounds with UVC or TMP-UVA treatment. The mutation patterns were compared with those reported in previous mutagenesis studies. The results showed that TMP-UVA and UVC exposure yielded unique mutational patterns. In particular, the deletion rate was higher with TMP-UVA than with UVC treatment, indicating the usefulness of TMP-UVA for obtaining a minimal genome.
We chose M. salivarium as a target mycoplasma for mutagenesis because it has a relatively small genome (728.3 kbp, NCBI RefSeq GCA_900660445) among culturable bacteria and can be easily isolated from the healthy human oral cavity (see Materials and Methods). First, the genome sequence of the isolated strain (KS-1) was analyzed using MGI DNBSEQ, with a minimum read coverage depth of over 100. We assembled the reads and successfully obtained the whole genome sequence. The size (678.9 kbp) was smaller than that of the previously reported strain, NCTC10113 (728.3 kbp) (Table 1). Taxonomy was checked using a BLASTN search with the 16S rRNA sequence as a query, which resulted in 100% identity with M. salivarium strain NCTC10113. The average nucleotide identity (ANI) against M. salivarium was also calculated and was 99%. Because this exceeds the 95% threshold used to delineate species (Jain et al., 2018), the two strains can be considered the same species. We mapped the reads against the reference sequence to examine differences. The genome of the KS-1 strain had 64 deletions throughout the genome (Fig. 1A). These deletions affected 61 genes, including 21 completely removed genes (Table 2, Table 3, Supplementary Table S1). Insertions and single-nucleotide substitutions (SNSs) were observed throughout the genome (Fig. 1A).
Genome statistics | Reference | KS-1 |
---|---|---|
Total sequence length (bp) | 728,347 | 678,897 |
GC content (%) | 26.4 | 26.6 |
Number of CDSs | 627 | 587 |
Average protein length | 359 | 355 |
Coding ratio (%) | 92.6 | 92.1 |
Number of rRNAs | 3 | 3 |
Number of tRNAs | 33 | 33 |
Number of CRISPRs | 1 | 0 |
Gene product |
---|
ABC transporter ATP-binding protein |
ABC transporter ATP-binding protein/permease |
Cof-type HAD-IIB family hydrolase |
MFS transporter |
UPF0236 family protein |
And 16 hypothetical proteins.
Gene product | Gene name |
---|---|
Proline--tRNA ligase | proS |
Ribonuclease III | rnc |
50S ribosomal protein L4 | rplD |
WhiA | whiA |
ABC transporter ATP-binding protein/permease | - |
Abi family protein | - |
APC family permease | - |
F0F1 ATP synthase subunit epsilon | - |
HNH endonuclease | - |
Lipoprotein 17-related variable surface protein | - |
P80 family lipoprotein | - |
Replication-associated recombination protein A | - |
TatD family hydrolase | - |
TM2 domain-containing protein | - |
YitT family protein | - |
And 21 hypothetical proteins.
Next, to investigate the effects of mutagens, a clone of the KS-1 strain was exposed to UV-C (UVC) or TMP and UV-A (TMP-UVA) on an agar plate and incubated for 3–14 days until microcolonies appeared. This process of exposure to mutagen and incubation was repeated for every round of serial cultivation. The serial cultivation was performed for four independent lineages until the final round, which was 54, 55 and 56 for UVC lineages noted as UVC1, UVC2 and UVC3, respectively (insufficient DNA was isolated from the fourth lineage), and 34, 34, 37 and 37 for TMP-UVA lineages noted as TMP-UVA1, TMP-UVA2, TMP-UVA3 and TMP-UVA4, respectively. After the final round, the genome sequences of the original KS-1 strain and the three (UVC) or four (TMP-UVA) final mutants were analyzed using MGI DNBSEQ. The minimum read coverage depth was over 100 for all mutants.
The sequence reads of the serially cultivated mutants were then mapped against the de novo-assembled sequence of strain KS-1. After cultivation with mutagenesis, many mutations (deletions and substitutions) were detected under both UVC and TMP-UVA conditions throughout the whole genome (Fig. 1B). No insertions were detected in any of the lineages.
The mutations introduced during serial cultivation were further analyzed by mutation type for each lineage. Most mutations were SNSs for all lineages, irrespective of mutagens (Fig. 2A). The substitutions were categorized according to the regions in which they occurred (Fig. 2B). Most of the substitutions were introduced in coding sequences (CDSs), and their numbers differed between UVC and TMP-UVA conditions. The substitutions found in CDSs were further categorized into nonsense, nonsynonymous and synonymous mutations (Fig. 2C). Nonsynonymous mutations were the most frequent under both conditions, and synonymous mutations were relatively more frequent under UVC conditions.
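The three CDS categories can be illustrated with a minimal sketch. It assumes NCBI translation table 4 (the mycoplasma code, in which TGA encodes tryptophan rather than a stop); the text does not state which table the annotation used, and the function name `classify_codon_change` is hypothetical.

```python
from itertools import product

# Assumption: NCBI translation table 4 (TGA = Trp), written in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CCWW"   # first base T
    "LLLLPPPPHHQQRRRR"   # first base C
    "IIIMTTTTNNKKSSRR"   # first base A
    "VVVVAAAADDEEGGGG"   # first base G
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_codon_change(ref_codon: str, alt_codon: str) -> str:
    """Label a single-codon substitution as synonymous, nonsense or nonsynonymous."""
    ref_aa = CODON_TABLE[ref_codon]
    alt_aa = CODON_TABLE[alt_codon]
    if ref_aa == alt_aa:
        return "synonymous"
    if alt_aa == "*":
        return "nonsense"
    return "nonsynonymous"


if __name__ == "__main__":
    # Illustrative codon pairs, not data from the study.
    for ref, alt in [("GAA", "AAA"), ("CAA", "TAA"), ("TTA", "CTA"), ("TGG", "TGA")]:
        print(ref, "->", alt, classify_codon_change(ref, alt))
```

Under table 4, a TGG→TGA change is synonymous, whereas under the standard code it would be counted as nonsense, so the choice of table affects the nonsense count.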
Next, the detected substitutions were further analyzed according to the nucleotide type before and after the cultivation experiment (Fig. 3A). The commonest substitutions in both conditions (UVC or TMP-UVA) were transitions, specifically C→T followed by G→A. Only a few substitutions in opposite directions (T→C or A→G) were observed. The changes in amino acids were also analyzed (Fig. 3B). The most frequently observed mutation in both conditions was glutamic acid to lysine, which is caused by the second most common base substitution of the G→A transition. The second most frequently observed mutations for the UVC and TMP-UVA lineages were serine to leucine and histidine to tyrosine, respectively, which can be explained by the most common base substitution of the C→T transition. Nonsense mutations were detected only in UVC lineages (Supplementary Table S2).
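As a minimal sketch of the classification used here, a substitution is a transition when both bases are purines or both are pyrimidines, and a transversion otherwise. The helper names are hypothetical. The second helper shows that C→T and G→A are the same chemical change read from opposite strands.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def substitution_class(ref: str, alt: str) -> str:
    """Return 'transition' or 'transversion' for a single-base substitution."""
    if ref == alt:
        raise ValueError("ref and alt must differ")
    pair = {ref, alt}
    if pair <= PURINES or pair <= PYRIMIDINES:
        return "transition"
    return "transversion"


def on_opposite_strand(ref: str, alt: str) -> tuple[str, str]:
    """Express the same substitution as seen on the complementary strand."""
    return COMPLEMENT[ref], COMPLEMENT[alt]


if __name__ == "__main__":
    for ref, alt in [("C", "T"), ("G", "A"), ("C", "A"), ("A", "T")]:
        print(f"{ref}->{alt}: {substitution_class(ref, alt)}")
    print(on_opposite_strand("C", "T"))
```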
The numbers of serial cultivation rounds differ among lineages from 34 to 56, which yields differences in the numbers of mutations. To normalize this effect, we compared the average rates of substitution and deletion per round of cultivation. The substitution rate under UVC conditions was approximately 2.2-fold higher than that under TMP-UVA conditions (Fig. 4A), whereas the deletion rate was approximately 2.3-fold lower (Fig. 4B). To clarify this difference, we calculated the ratio of substitution and deletion numbers to the total number of mutations, which generated a higher deletion ratio for the TMP-UVA condition (Fig. 4C), indicating that TMP-UVA is useful for selectively introducing deletions into M. salivarium.
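The normalization can be sketched as follows. The round numbers are those given in the text, but the mutation counts are hypothetical placeholders, not the study's data. The sketch also assumes that the condition average is the mean of per-lineage rates, which the text does not specify.

```python
from statistics import mean

# Final cultivation rounds per lineage, as stated in the text.
ROUNDS = {
    "UVC1": 54, "UVC2": 55, "UVC3": 56,
    "TMP-UVA1": 34, "TMP-UVA2": 34, "TMP-UVA3": 37, "TMP-UVA4": 37,
}

# HYPOTHETICAL (substitutions, deletions) counts, for illustration only.
COUNTS = {
    "UVC1": (108, 2), "UVC2": (110, 2), "UVC3": (112, 2),
    "TMP-UVA1": (34, 3), "TMP-UVA2": (34, 3), "TMP-UVA3": (37, 3), "TMP-UVA4": (37, 4),
}


def condition_summary(prefix: str) -> dict[str, float]:
    """Mean per-round rates and deletion share for lineages whose name starts with prefix."""
    names = [name for name in ROUNDS if name.startswith(prefix)]
    sub_rates = [COUNTS[n][0] / ROUNDS[n] for n in names]
    del_rates = [COUNTS[n][1] / ROUNDS[n] for n in names]
    total_subs = sum(COUNTS[n][0] for n in names)
    total_dels = sum(COUNTS[n][1] for n in names)
    return {
        "substitution_rate": mean(sub_rates),   # substitutions per round
        "deletion_rate": mean(del_rates),       # deletions per round
        "deletion_ratio": total_dels / (total_subs + total_dels),
    }


if __name__ == "__main__":
    uvc = condition_summary("UVC")
    tmp = condition_summary("TMP-UVA")
    print("UVC:", uvc)
    print("TMP-UVA:", tmp)
    print("substitution fold (UVC / TMP-UVA):",
          uvc["substitution_rate"] / tmp["substitution_rate"])
```

Dividing by the number of rounds before averaging keeps a lineage with more rounds from dominating the comparison simply because it had more opportunities to mutate.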
Comparison with mutational patterns in other studies
The mutagenic effects of UVC and TMP-UVA on serial cultivation were compared with those reported in previous studies. Mutation data for serially cultivated wild-type Escherichia coli with UV irradiation (Shibai et al., 2017) and for serially cultivated mycoplasma JCVI-syn3A without any mutagen (Sandberg et al., 2023) were selected because mutation data for the whole genomes were available.
The ratio of each base substitution observed in these studies was also calculated (Fig. 5). The mutation patterns were similar among UVC, TMP-UVA and E. coli, with transitions from C→T and G→A being the most frequently detected substitutions. These transitions were also frequent in JCVI-syn3A, but the high numbers of transversions C→A and A→T in JCVI-syn3A were different from the other three lineages. The M. salivarium UVC lineage had the highest rate of C→T or G→A type of substitution, accounting for over 90% of total base substitutions.
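A minimal sketch of how such a substitution spectrum can be computed from a list of (reference base, new base) pairs is given below. The input list is hypothetical and does not reproduce any dataset compared here; the function names are also illustrative.

```python
from collections import Counter


def spectrum_fractions(substitutions: list[tuple[str, str]]) -> dict[tuple[str, str], float]:
    """Fraction of each (ref, alt) substitution type among all substitutions."""
    counts = Counter(substitutions)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no substitutions given")
    return {kind: n / total for kind, n in counts.items()}


def ct_ga_share(substitutions: list[tuple[str, str]]) -> float:
    """Combined share of C->T and G->A among all substitutions."""
    fractions = spectrum_fractions(substitutions)
    return fractions.get(("C", "T"), 0.0) + fractions.get(("G", "A"), 0.0)


if __name__ == "__main__":
    # HYPOTHETICAL calls: 6 C->T, 3 G->A and 1 A->G.
    example = [("C", "T")] * 6 + [("G", "A")] * 3 + [("A", "G")]
    for kind, fraction in spectrum_fractions(example).items():
        print(f"{kind[0]}->{kind[1]}: {fraction:.2f}")
    print(f"C->T + G->A share: {ct_ga_share(example):.2f}")
```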
Searching for the simplest form of life is a possible strategy to understand the basic principles of a living system. One approach is to explore the effects of genetic mutations, including gene deletions and nonsense mutations. To understand the effect of random mutagenesis on mycoplasma, which possesses the smallest genome among autonomously culturable organisms, we serially cultivated M. salivarium. We exposed it to UVC or TMP-UVA in each round to induce mutations. Our genomic analysis revealed successful mutation induction by both treatments. Notably, TMP-UVA treatment resulted in a higher frequency of deletions than UVC alone, while UVC led to more point mutations. Additionally, both treatments predominantly induced transition mutations, particularly C→T and G→A. This elucidation of mutational patterns is useful for future mutational studies of mycoplasmas to understand the simplest architecture of life.
The strain KS-1 isolated for this study has a genome approximately 7% smaller than the reference genome. It is worth noting that ANI, by its principle, may not be sensitive to large deletions, but it still supports the high similarity in the rest of the genome. How the difference in genome size emerged in these mycoplasma strains – whether genes were deleted or acquired – is still unknown. Further analysis of deleted genes and their functions in other strains of M. salivarium may shed light on the genome diversity of these pathogens, and may also yield insights into the evolution of the genomes.
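The size difference follows directly from the sequence lengths and CDS counts in Table 1, as the short calculation below shows. The variable names are illustrative.

```python
# Values taken from Table 1.
REFERENCE_BP = 728_347
KS1_BP = 678_897
REFERENCE_CDS = 627
KS1_CDS = 587

missing_bp = REFERENCE_BP - KS1_BP            # bases absent relative to the reference
percent_smaller = 100 * missing_bp / REFERENCE_BP
missing_cds = REFERENCE_CDS - KS1_CDS         # difference in annotated CDS count

if __name__ == "__main__":
    print(f"KS-1 is {missing_bp} bp ({percent_smaller:.1f}%) smaller than the reference")
    print(f"KS-1 has {missing_cds} fewer annotated CDSs")
```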
The deletion rate was higher in TMP-UVA lineages than in UVC lineages. This was expected, because interstrand cross-links in DNA induced with TMP-UVA cause double-strand breaks, which are thought to cause deletions through the UvrABC system, a mechanism for repairing DNA (Sladek et al., 1989). The UvrABC system is the only complete DNA repair pathway that mycoplasmas possess (Hakim et al., 2021). The genes in the pathway, uvrA, uvrB and uvrC, were all found in the genome of strain KS-1, suggesting that this pathway may underlie the deletions observed in this experiment.
For reduction of the genome toward the minimal genome, more rounds of serial cultivation will be required when using TMP-UVA. This may seem time-consuming, but the method is very simple and also allows the evolution of mycoplasmas to adapt to shorter genome sizes. For comparison, other evolutionary experiments using mycoplasmas were performed for ~400 rounds (Sandberg et al., 2023) or even 2,000 generations (Moger-Reischer et al., 2023). Growth speed was an issue we faced with M. salivarium strain KS-1: it took 3–14 days for one cultivation round. Using faster-growing strains or species should lead to faster minimization. Several nonsense mutations were observed in UVC lineages (Supplementary Table S2). Depending on the position of the nonsense mutation within the gene, the protein may have lost its function, resulting in the nullification of the gene. This information is also useful to edit the genome for minimization, in addition to transposon mutagenesis used in mycoplasmas (Hutchison et al., 1999).
Comparison with other mutational studies provided insights into the effects of various mutagens on different target organisms. Our results revealed a high ratio of transition mutations, specifically C→T and G→A, in the UVC and TMP-UVA lineages. This observation can be attributed to several factors. First, the AT bias prevalent in bacteria may play a significant role (Hershberg and Petrov, 2010). Second, bias toward AT was also observed in evolutionary experiments on mycoplasma (Moger-Reischer et al., 2023; Sandberg et al., 2023). Sandberg et al. (2023) observed a similar mutational bias with JCVI-syn3A. They suggested that the low GC content, widely conserved across mycoplasma genomes, drives a trend toward genome evolution in a low GC (high AT) direction (Thompson et al., 2011). Moger-Reischer et al. (2023) referred to the deletion of uracil DNA glycosylase, which corrects the misincorporation of uracil, to explain the high AT bias in JCVI-syn3B. However, our isolated strain KS-1 possesses a uracil DNA glycosylase gene. A similar transition mutation bias was also observed in the serial cultivation of E. coli under UV irradiation, which resulted in particularly high C→T and G→A substitutions. UV exposure is known to induce the formation of pyrimidine dimers, leading to the conversion of cytosine to uracil through deamination (Burger et al., 2003), which explains the high frequency of C→T transitions. The exceptionally high rate (over 90%) of the two transitions (C→T and G→A) in the UVC lineage (Fig. 5) can be attributed to the combined effects of the intrinsic mutational bias of mycoplasma genomes and the mutagenic impact of UV irradiation. In contrast, the JCVI-syn3A data showed a notable prevalence of C→A and A→T mutations as well. The relatively low rate of C→A mutations in other studies can be interpreted as a consequence of the high incidence of C→T mutations, induced by UV exposure. However, the elevated rate of A→T mutations in JCVI-syn3A remains unexplained by our current findings.
The mycoplasma agar plate used in this study contained 1.5 g agar (Nacalai Tesque), 2.1 g PPLO broth (Becton Dickinson), 10 ml of 25% fresh yeast extract (Fujifilm Wako), 20 ml of horse serum (Sigma-Aldrich), 61 mg penicillin G (Sigma-Aldrich) and 0.2 g L-arginine per 100 ml. Human saliva samples, obtained from graduate students, were spread on a mycoplasma agar plate and incubated at 37 °C for 3–7 days under a semi-anaerobic condition using AnaeroPack Anaero (Mitsubishi Gas Chemical). Metamycoplasma salivarium was identified with the PCR Mycoplasma Detection Set (Takara Bio). The isolated strain was named KS-1.
Serial cultivation with exposure to mutagens
The serial cultivation experiment was performed as follows. A single colony of the isolated M. salivarium KS-1 was spread on mycoplasma agar medium. For the UVC exposure experiment, the plate was exposed under a standard UV light (Iwasakidenki, GL15) at a distance of 60 cm with an intensity of 11 µW/cm² for 5–10 s. The relative spectral power of UV light for UV-C wavelengths was over 95%, while other UV wavelengths had less than 10%, according to the manufacturer. For TMP-UVA exposure experiments, colonies were suspended in liquid mycoplasma medium (250 µl) lacking agar but containing 1 µg/ml trioxsalen (Sigma-Aldrich), and 10 µl of the suspension was placed on a mycoplasma agar plate. The plates were then placed directly on a compact UV lamp (UVP, UVGL-25) and exposed to UV-A (long wavelength mode) for 1–5 min at a distance of 1 cm with an intensity of approximately 130 µW/cm². Exposure was set to a period that slightly inhibited growth (i.e., the colony sizes decreased with UV exposure). To determine the exposure period in each round, we occasionally performed control experiments without UV exposure. The plates were incubated at 37 °C for 3–14 days under a semi-anaerobic condition using AnaeroPack Anaero, in four independent lineages, until microcolonies were detected with the naked eye. The colonies were scraped and spread on a new agar plate for each lineage for the next round of cultivation. We repeated this serial cultivation process for up to 56 rounds of UVC exposure and up to 37 rounds of TMP-UVA exposure.
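For orientation, the approximate dose per round can be estimated as irradiance multiplied by exposure time (1 µW·s = 1 µJ). The sketch below uses the intensities and durations stated above. It assumes constant irradiance at the plate surface and ignores absorption by the lid or medium, which the text does not quantify. The function name is hypothetical.

```python
def fluence_uj_per_cm2(irradiance_uw_per_cm2: float, seconds: float) -> float:
    """Dose (µJ/cm²) delivered by a constant irradiance (µW/cm²) over a given time (s)."""
    return irradiance_uw_per_cm2 * seconds


if __name__ == "__main__":
    # UV-C: 11 µW/cm² for 5-10 s.
    uvc_low = fluence_uj_per_cm2(11, 5)
    uvc_high = fluence_uj_per_cm2(11, 10)
    # UV-A: approximately 130 µW/cm² for 1-5 min.
    uva_low = fluence_uj_per_cm2(130, 1 * 60)
    uva_high = fluence_uj_per_cm2(130, 5 * 60)
    print(f"UV-C dose per round: {uvc_low:.0f}-{uvc_high:.0f} µJ/cm²")
    print(f"UV-A dose per round: {uva_low / 1000:.1f}-{uva_high / 1000:.1f} mJ/cm²")
```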
Genome assembly and variant calling
For genome extraction, colonies were collected and suspended in medium. Genomic DNA was extracted using a Nucleospin Microbial DNA kit (Takara Bio). For the UVC exposure experiment, we obtained sufficient amounts of genomic DNA from three out of the four lineages at 54, 55 and 56 rounds. For the TMP-UVA exposure experiment, we obtained sufficient amounts of genomic DNA from all four lineages after 34 or 37 rounds.
Genomic DNA was sequenced using DNBSEQ-G400. The reads were preprocessed using fastp v.0.23.2 (Chen et al., 2018) with Trimmomatic v.0.39 (Bolger et al., 2014) for additional adapter trimming. Data quality was checked using FastQC v.0.12.1 (http://www.bioinformatics.babraham.ac.uk/projects/fastqc/). All reads were sampled to 100-fold coverage using SeqKit v.0.13.2 (Shen et al., 2016). Reads from the ancestral strain were assembled using Platanus_B v.1.3.2 (Kajitani et al., 2020). Contigs were manually circularized by deleting overlapping ends. The assembled genome was annotated using DFAST server v.1.6.0 (Tanizawa et al., 2018). The sequence of 16S rRNA was manually extracted from the annotated genome, to be used as a query in a BLASTN search of the 16S ribosomal RNA database (Zhang et al., 2000). The ANI of the genome was calculated using FastANI (Jain et al., 2018), which is implemented in the DFAST server. Reads from serially cultivated lineages were mapped to the assembled genome of the ancestral strain before the variants were called. Both mapping and variant calling were executed by breseq v.0.37.1 (Deatherage and Barrick, 2014). Genomes were visualized as Circos plots using pyCircos (https://doi.org/10.5281/zenodo.6477641).
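The number of reads that corresponds to 100-fold coverage can be estimated as coverage × genome size / read length. The sketch below shows only this arithmetic, not the SeqKit invocation. The 150-bp read length is an assumption for illustration, as the read length is not stated here, and the function name is hypothetical.

```python
def reads_for_coverage(genome_bp: int, coverage: int, read_length_bp: int) -> int:
    """Smallest number of reads whose total length reaches the target coverage."""
    target_bases = genome_bp * coverage
    # Ceiling division with integers avoids floating-point rounding.
    return -(-target_bases // read_length_bp)


if __name__ == "__main__":
    KS1_GENOME_BP = 678_897      # from Table 1
    TARGET_COVERAGE = 100        # as stated in the text
    ASSUMED_READ_LENGTH = 150    # ASSUMPTION: not given in the text
    n_reads = reads_for_coverage(KS1_GENOME_BP, TARGET_COVERAGE, ASSUMED_READ_LENGTH)
    print(f"Reads needed for {TARGET_COVERAGE}x coverage: {n_reads}")
```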
Author contributions: K. S. and N. I. designed the project, analyzed the data and wrote the manuscript. K. O. analyzed the original strain sequence.
Conflicts of interest: The authors declare no competing interests.
Materials and correspondence: Correspondence and requests for materials should be addressed to N. I.
Data availability: The genomic sequencing data of originally isolated M. salivarium KS-1, mutated lineages UVC1, UVC2, UVC3, TMP-UVA1, TMP-UVA2, TMP-UVA3 and TMP-UVA4, and the complete genome sequence of M. salivarium KS-1 have been deposited in the Sequence Read Archive under the BioProject accession number PRJDB17793. The complete genome sequence of M. salivarium KS-1 is also deposited in DDBJ under the accession number AP031385.
Supporting information: Supplementary Table S1 shows the list of fully removed genes and partially removed genes in the original isolated strain. Supplementary Table S2 shows the list of genes with nonsense mutations detected in UVC lineages.
We are grateful to Dr. Sakatani Yoshihiro, who kindly offered his oral cavity for the sampling of mycoplasma. This work was supported by JST SPRING, grant number JPMJSP2108, CREST grant number JPMJCR20S1, Japan and Kakenhi grant number 22H05402.