2023 Volume 38 Issue 1 Article ID: ME22077
Current information on the diversity and evolution of eukaryotic RNA viruses is biased towards a few host lineages, such as animals, plants, and fungi. Although protists represent the majority of eukaryotic diversity, our understanding of the protist RNA virosphere is still limited. To reveal untapped RNA viral diversity, we screened RNA viruses from 30 marine protist isolates and identified a novel RNA virus named Haloplacidia narnavirus 1 (HpNV1). A phylogenetic analysis revealed that HpNV1 is a new member of the family Narnaviridae. The present study fills a gap in the known distribution of narnaviruses and implies their wide distribution in Stramenopiles.
High-throughput sequencing (HTS) technology has expanded our knowledge of RNA virus diversity and evolution (Shi et al., 2016; Dolja and Koonin, 2018; Wolf et al., 2018, 2020; Zayed et al., 2022). Metatranscriptomic analyses of aquatic and soil biospheres have detected numerous novel RNA viruses (Urayama et al., 2018; Starr et al., 2019; Wolf et al., 2020; Chen et al., 2022; Neri et al., 2022; Zayed et al., 2022). Furthermore, HTS-based RNA virus identification from organism samples has been intensively conducted for animals, plants, and fungi (Shi et al., 2016, 2018; Gilbert et al., 2019; Sutela et al., 2020; Kawasaki et al., 2021; Mifsud et al., 2022). The findings obtained have expanded our knowledge of RNA virus diversity; however, the target of HTS analyses of organism samples is biased towards a small fraction of eukaryotic taxa (Cobbin et al., 2021).
Protists are defined as eukaryotes other than animals, land plants, and true fungi (Burki et al., 2020), and some are ecologically and/or economically important. Eukaryotic algae and labyrinthulids play an essential role in the ecosystem as primary producers or decomposers (Field et al., 1998; Raghukumar et al., 2001). Moreover, harmful bloom-forming algae, some parasitic protists, and some oomycetes cause illnesses and death in livestock, fishes, crops, and humans (Derevnina et al., 2016; Mitra and Mawson, 2017; Brown et al., 2019; World Health Organization, 2019). Based on the importance of these protists, several RNA viruses have so far been screened and identified (Nagasaki and Yamaguchi, 1997; Takao et al., 2005; Grybchuk et al., 2018; Charon et al., 2019; Cai et al., 2012; Chiba et al., 2020b; Charon et al., 2021). Despite the vast diversity of protists, the number of studies conducted to date has been limited.
In the present study, to reveal untapped RNA virus diversity in protists, we examined RNA virus genome sequences using HTS from 30 isolates of marine heterotrophic protists belonging to Diplonemea (Euglenozoa), Thecofilosea and Imbricatea (Cercozoa), and Sagenista and Opalozoa (Bigyra, Stramenopiles) (Table S1 and Fig. S1). To the best of our knowledge, no RNA viruses have been reported in these lineages, except for the Aurantiochytrium single-stranded RNA virus (AuRNAV) identified from an isolate of Sagenista (Takao et al., 2005).
Detailed methods are described in the supplemental material. Briefly, 30 protists were cultured with Hemi medium (Tashyreva et al., 2018) or KLB medium (Yabuki and Tame, 2015) at 20°C, and cells were harvested from cultures by centrifugation at 2,400×g for 4 min. Taxonomic information on each protist is summarized in Table S1. In the present study, we constructed sequencing libraries from pooled cells and single-strain cells for RNA virus screening and complete RNA virus genome identification, respectively. In screening, 30 strains were pooled into pool-1 and pool-2, as shown in Table S1.
Since double‐stranded RNA (dsRNA) is a marker of RNA virus infection (Morris, 1979), dsRNA was purified from cells, and sequencing libraries were constructed using fragmented and primer-ligated dsRNA sequencing (FLDS) technology as previously described (Urayama et al., 2018; Hirai et al., 2021). The details of this method are described in the supplemental material. Libraries were sequenced using the Illumina NovaSeq 6000 platform with 150 bp paired-end sequences or the Illumina MiSeq platform with 300 bp paired-end sequences (Illumina). More than 400,000 reads were obtained for each library. Raw sequence reads are available in the Short Read Archive database (DDBJ Accession Nos. DRA014844 and DRA014881).
Raw sequence reads were processed as previously described (Hirai et al., 2021) with a custom Perl script (https://github.com/takakiy/FLDS), and cleaned reads were assembled de novo using CLC GENOMICS WORKBENCH version 11.0 (CLC Bio) (Urayama et al., 2016, 2018). To obtain full-length sequences, assembled contigs were manually extended using the assembler and Tablet viewer (Milne et al., 2010). We identified full-length sequences using a previously described method (Urayama et al., 2018). Full-length sequences and contigs were annotated by a BLASTX analysis against the NCBI non-redundant protein database and RNA viral protein sequences reported in recent RNA virome studies (Chen et al., 2022; Neri et al., 2022; Zayed et al., 2022). To identify more distantly related RNA viruses, we performed RNA virus detection using hidden Markov model (HMM) profiles, such as RVDB-prot (Bigot et al., 2019) and NeoRdRp (Sakaguchi et al., 2022).
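The annotation step above can be illustrated with a minimal sketch that selects, for each contig, the BLASTX hit with the lowest e-value. It assumes the search results were exported in the standard 12-column BLAST tabular format, which the text does not specify. The function name best_hits, the example rows, and the rule of keeping hits at or below the cutoff are illustrative assumptions; the default cutoff mirrors the 1E–5 significance level used later for the RNA2 ORFs.

```python
import csv
import io

# Standard column order of BLAST tabular output (-outfmt 6).
BLAST6_COLUMNS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]


def best_hits(blast6_text, max_evalue=1e-5):
    """Return {query: hit}, keeping the lowest-e-value hit at or below max_evalue.

    Keeping hits "at or below" the cutoff is an assumption of this sketch.
    """
    best = {}
    reader = csv.reader(io.StringIO(blast6_text), delimiter="\t")
    for row in reader:
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        hit = dict(zip(BLAST6_COLUMNS, row))
        hit["pident"] = float(hit["pident"])
        hit["evalue"] = float(hit["evalue"])
        if hit["evalue"] > max_evalue:
            continue  # not significant under the chosen cutoff
        current = best.get(hit["qseqid"])
        if current is None or hit["evalue"] < current["evalue"]:
            best[hit["qseqid"]] = hit
    return best


# Hypothetical example rows; these are not data from this study.
example = (
    "contig_1\tprotA\t30.0\t400\t280\t5\t10\t1209\t1\t400\t6e-57\t210\n"
    "contig_1\tprotB\t28.5\t380\t272\t6\t40\t1179\t15\t390\t2e-40\t160\n"
    "contig_2\tprotC\t25.0\t90\t68\t2\t5\t274\t1\t90\t0.8\t30.1\n"
)
hits = best_hits(example)
for query, hit in hits.items():
    print(query, hit["sseqid"], hit["evalue"], hit["pident"])
```

In this toy input, contig_2 has no hit passing the cutoff and is therefore absent from the result; such contigs would be candidates for the more sensitive HMM-based search.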
To identify the host organisms of RNA viruses detected in pooled sequencing, we conducted an RT-PCR analysis targeting the virus sequences. Total nucleic acids were individually extracted from the cells of each isolate with SDS-phenol and used as the template. Two specific primer pairs were used in the RT-PCR analysis, and the products were subjected to direct Sanger sequencing.
The phylogenetic positions of the identified RNA viruses were elucidated based on a maximum likelihood-based phylogenetic tree using the deduced amino acid sequences of the RNA-dependent RNA polymerase (RdRp) gene. Details on this method and the accession numbers of the sequences used are shown in the supplemental material.
In RNA virus screening, 30 strains were pooled into pool-1 and pool-2, as shown in Table S1. The FLDS analysis provided 45 and 30 contigs (>500 nt and >0.05% read abundance) from pool-1 and pool-2, respectively. These contigs were examined by BLASTX and HMM analyses, and a single RNA virus contig was identified in pool-2. In the BLASTX analysis, this viral contig showed the lowest e-value with the RdRp of Bremia lactucae associated narnavirus 2 (BlaNV2) (e-value: 6E–57; identity: 30%), a member of the family Narnaviridae. To identify the host of this narnavirus from pool-2, RT-PCR was performed with two sets of specific primers named Narna-P1 and Narna-P2 (Fig. 1A). PCR products were obtained only from strain YPF1522 (Haloplacidia sp. deposited in the National Institute for Environmental Studies collection as NIES-4585) (Fig. 1B), and their sequences were identical to the narnavirus contig sequence (data not shown). Based on the host species and sequence similarity, we named this novel virus Haloplacidia narnavirus 1 (HpNV1). Although the culture of YPF1522 contained prey bacterial cells, the host of HpNV1 appeared to be Haloplacidia sp., not the prey bacteria, because all known hosts of narnaviruses are eukaryotes.
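The contig selection criteria stated above (>500 nt and >0.05% read abundance) can be expressed as a short filter. This is a sketch under stated assumptions: the Contig record, its field names, and the definition of read abundance as the percentage of all reads in the library assigned to a contig are hypothetical, since the text does not describe how abundance was computed.

```python
from dataclasses import dataclass


@dataclass
class Contig:
    name: str          # hypothetical identifier
    length_nt: int     # contig length in nucleotides
    mapped_reads: int  # reads assigned to this contig


def filter_contigs(contigs, total_reads, min_length_nt=500, min_abundance_pct=0.05):
    """Keep contigs longer than min_length_nt with abundance above min_abundance_pct.

    Abundance is computed here as a percentage of total_reads (an assumption).
    Returns a list of (name, abundance_pct) tuples.
    """
    kept = []
    for c in contigs:
        abundance_pct = 100.0 * c.mapped_reads / total_reads
        if c.length_nt > min_length_nt and abundance_pct > min_abundance_pct:
            kept.append((c.name, abundance_pct))
    return kept


# Hypothetical example values; these are not data from this study.
example_contigs = [
    Contig("contig_a", 2500, 1200),  # long and abundant: kept
    Contig("contig_b", 450, 5000),   # too short: dropped
    Contig("contig_c", 900, 100),    # 0.025% abundance: dropped
]
kept = filter_contigs(example_contigs, total_reads=400_000)
print(kept)
```

Both thresholds are applied as strict inequalities, matching the ">" signs in the text.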
Fig. 1. Host identification and genomic organization of HpNV1. (A) Schematic representation of the HpNV1 genome and the positions of the two primer pairs used in the RT-PCR analysis. The boxes indicate open reading frames encoding >150 amino acid residues, and empty boxes show hypothetical proteins. The black arrows indicate the direction and location of each primer. (B) RT-PCR-based host identification of HpNV1 from pool-2. (C) Multiple alignments of the nucleotide sequences of both terminal regions in the coding strands of RNA1 and RNA2 of HpNV1. Black shading indicates nucleotide positions that are identical. Numbers represent nucleotide positions.
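As an illustration of the ORF criterion mentioned in the legend (open reading frames encoding >150 amino acid residues), the following sketch scans all six reading frames of a sequence. It assumes the standard genetic code and ORFs running from an ATG start codon to the first in-frame stop codon; neither assumption is stated in the text, and the function names are hypothetical.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code (NCBI translation table 1); codons ordered TTT, TTC, TTA, ...
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_orfs(seq, min_aa=150):
    """Return (strand, start, end, protein) for ATG-to-stop ORFs with more than min_aa residues.

    Coordinates are 0-based, end-exclusive, and include the stop codon. For the
    "-" strand they refer to the reverse-complemented sequence. ORFs that run
    off the end of the sequence without a stop codon are not reported.
    """
    seq = seq.upper().replace("U", "T")
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            protein = []
            for i in range(frame, len(s) - 2, 3):
                aa = CODON_TABLE.get(s[i:i + 3], "X")  # "X" for ambiguous codons
                if start is None:
                    if aa == "M":
                        start, protein = i, ["M"]
                elif aa == "*":
                    if len(protein) > min_aa:
                        orfs.append((strand, start, i + 3, "".join(protein)))
                    start = None
                else:
                    protein.append(aa)
    return orfs


# Toy sequence with a lowered threshold so the example stays short.
toy = "ATG" + "GCT" * 5 + "TAA"
toy_orfs = find_orfs(toy, min_aa=3)
print(toy_orfs)
```

For real viral segments, min_aa would be left at its default of 150, and a different translation table could be substituted if the host's genetic code deviates from the standard one.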
To obtain the complete genome sequence of HpNV1, a FLDS analysis of strain YPF1522 was performed. The results obtained for HpNV1 revealed a bisegmented genome consisting of RNA1 and RNA2 (Fig. 1A) (GenBank accession: LC728461 and LC730475). These two full-length sequences shared both terminal sequences (Fig. 1C). Conserved terminal sequences are a hallmark of genomic segments belonging to a single RNA virus (Hutchinson et al., 2010). The predicted amino acid sequence of the open reading frame (ORF) of RNA1 showed the lowest e-value with the RdRp of BlaNV2, as described above. However, the ORFs of RNA2 did not show significant similarity to known protein sequences at an e-value threshold of 1E–5.
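The observation that two segments share both terminal sequences can be checked with a simple comparison of sequence ends. The sketch below counts exactly matching nucleotides from the 5' end and from the 3' end of two coding-strand sequences. It is a simplification that assumes no insertions or deletions within the termini, whereas Fig. 1C is based on an alignment; the function names and toy sequences are hypothetical.

```python
def shared_prefix_length(a, b):
    """Count identical leading characters of two strings."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def shared_termini(seg1, seg2):
    """Return (5' shared length, 3' shared length) for two coding-strand sequences."""
    seg1, seg2 = seg1.upper(), seg2.upper()
    five_prime = shared_prefix_length(seg1, seg2)
    # Reversing both sequences turns the shared 3' end into a shared prefix.
    three_prime = shared_prefix_length(seg1[::-1], seg2[::-1])
    return five_prime, three_prime


# Toy sequences for illustration only; these are not the HpNV1 termini.
rna1 = "GGGGCAUUACGUACGAUCCCC"
rna2 = "GGGGCUUGGAACCCC"
print(shared_termini(rna1, rna2))  # → (5, 4)
```

Short shared runs can occur by chance, so in practice the lengths would be interpreted alongside an alignment of the terminal regions.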
To identify the phylogenetic position of HpNV1, we constructed a phylogenetic tree using the RdRp amino acid sequence of HpNV1 and viruses in the family Narnaviridae (Fig. 2). Narnaviridae currently includes one genus (Narnavirus), and only two viruses (Saccharomyces 20S RNA narnavirus and Saccharomyces 23S RNA narnavirus) are defined as species in this genus by the International Committee on Taxonomy of Viruses (https://talk.ictvonline.org/). The phylogenetic tree revealed that HpNV1 was distantly related to the genus Narnavirus. However, HpNV1 clustered together with known viruses in the family Narnaviridae detected in isolates of Trypanosomatid (Euglenozoa), Peronosporomycetes (Stramenopiles), and Plasmodium (Alveolata) and in environmental samples of invertebrates, fungi, and Peronosporomycetes. This result suggests that HpNV1 is a new member of an undefined genus in the family Narnaviridae.
Fig. 2. Phylogenetic tree for RdRp from the family Narnaviridae. Branch labels indicate bootstrap support (%) from 1,000 RAxML bootstrap samplings; only values above 50% are shown. Host taxa are shown by symbols. The scale bar indicates the number of amino acid substitutions per site. The best-fitting amino acid substitution model was [rtREV+F+G]. Full virus names and accession numbers are listed in Table S3. Viruses included in each collapsed node are indicated in Table S3. The newly described virus is shown in orange.
RNA virus screening from various protist species was performed in the present study, and an RNA virus was detected from only one sample. Not all strains of the same species have RNA viruses, and the prevalence of RNA viruses differs among host species (Chiba et al., 2020a, 2021). This may explain the low frequency of virus-infected samples in this study.
Haloplacidia sp. YPF1522 is a heterotrophic marine protist belonging to Stramenopiles (Rybarski et al., 2021). Stramenopiles is a major eukaryotic group rooted between Bigyra and Gyrista (Adl et al., 2019). Although narnaviruses have been identified in two groups of Gyrista (Ochrophyta and Peronosporomycetes) (Cai et al., 2012; Charon et al., 2021), none have been reported in Bigyra, which includes Haloplacidia (Rybarski et al., 2021). Therefore, although we cannot rule out the possibility that the host of HpNV1 is prey bacteria, this is the first study of a narnavirus potentially infecting Bigyra. This result suggests that narnaviruses are more widely distributed in Stramenopiles than previously reported (Cai et al., 2012; Charon et al., 2021). In addition to previous findings (Charon et al., 2021), the present results support the ubiquity of Narnaviridae in eukaryotes and the evolutionary hypothesis of the phylum Lenarviricota including the families Narnaviridae, Leviviridae, and Mitoviridae (Dolja and Koonin, 2018). Narnaviridae and Mitoviridae are eukaryotic RNA viruses that are proposed to originate from RNA bacteriophages in Leviviridae. The RNA bacteriophage that infected the ancestors of mitochondria was assumed to be brought into a eukaryotic ancestor during eukaryogenesis, and this is the ancestor of Mitoviridae. Narnaviridae is proposed to originate from mitoviruses that escaped to the cytosol. This scenario implies the ubiquity of Narnaviridae and Mitoviridae in eukaryotes.
Although the typical genomic organization of viruses in the family Narnaviridae is mono-segmented (King et al., 2012), recent studies suggest that some narnaviruses have a bisegmented genome (Grybchuk et al., 2018; Chiba et al., 2020a; Jia et al., 2021). Among them, Aspergillus lentulus narnavirus 1 (AleNV1) and Leptomonas seymouri narna‑like virus 1 (LepseyNLV1) clustered together with HpNV1 (Fig. 2), and their genome organizations are consistent with that of HpNV1 (Fig. S2). Therefore, although the relative dsRNA abundance of RNA1 and RNA2 differed (Table S2), we concluded that RNA2 is a genomic segment of HpNV1.
Narnaviruses lack an extracellular phase in their life cycle and persistently infect their hosts without lysis (King et al., 2012). Viruses with this type of life cycle are called persistent-type viruses (Márquez and Roossinck, 2012; Urayama et al., 2022). They are detected in various environmental samples and may be widely distributed in the RNA viral sequence space (Urayama et al., 2022). However, the hosts of the majority of persistent-type viruses in the RNA virus sequence space have not yet been identified. Since the previously reported AuRNAV is an acute-type RNA virus that lyses host cells and enters into new cells, the present study is the first to report a persistent-type virus in Bigyra.
Chiba, Y., Yabuki, A., Takaki, Y., Nunoura, T., Urayama, S., and Hagiwara, D. (2023) The First Identification of a Narnavirus in Bigyra, a Marine Protist. Microbes Environ 38: ME22077.
https://doi.org/10.1264/jsme2.ME22077
We are grateful to Miho Hirai and Fumie Kondo for their excellent technical support in sequencing. This research was supported by a grant from the Institute for Fermentation, Osaka, by a Grant-in-Aid for Scientific Research (18H05368) from the Ministry of Education, Culture, Sports, Science and Technology (MEXT) of Japan, by Grants‐in‐Aid for Scientific Research on Innovative Areas from the MEXT of Japan (Nos. 16H06429, 16K21723, and 16H06437), and by a Grant-in-Aid for JSPS Fellows (21J10873) from the Japan Society for the Promotion of Science (JSPS).