2022 Volume 97 Issue 3 Pages 153-166
Understanding the processes and consequences of the morphological diversity of organisms is one of the major goals of evolutionary biology. Studies on the evolution of developmental mechanisms of morphologies, or evo-devo, have been extensively conducted in many taxa and have revealed many interesting phenomena at the molecular level. However, many other taxa exhibiting intriguing morphological diversity remain unexplored in the field of evo-devo. Although the annelid family Syllidae shows spectacular diversity in morphological development associated with reproduction, its evo-devo study, especially on molecular development, has progressed slowly. In this study, we focused on Megasyllis nipponica as a new model species for evo-devo in syllids and performed transcriptome sequencing to develop a massive genetic resource, which will be useful for future molecular studies. From the transcriptome data, we identified candidate genes that are likely involved in morphogenesis, including genes involved in hormone regulation, sex determination and appendage development. Furthermore, a computational analysis of the transcriptome sequence data indicated the occurrence of DNA methylation in coding regions of the M. nipponica genome. In addition, flow cytometry analysis showed that the genome size of M. nipponica was approximately 524 megabases. These results facilitate the study of morphogenesis in molecular terms and contribute to our understanding of the morphological diversity in syllids.
Organisms exhibit enormous morphological diversity, which is a consequence of a long history of biological evolution. Thus, understanding evolutionary mechanisms of morphology is one of the major goals of evolutionary biology, and is being studied under the discipline of evolutionary developmental biology (evo-devo). In recent years, the central focus of evo-devo has been elucidating the molecular mechanisms of morphological development and its evolutionary processes. Although extensive studies have been conducted in many taxa, which have revealed intriguing consequences and processes of morphological evolution, especially molecular mechanisms (Carroll et al., 2005), several taxa still remain unexplored. To obtain a more general understanding, further studies should focus on the unexplored taxa.
The family Syllidae is one of the most speciose taxa in Annelida, containing over 1,100 described species (Martin et al., 2021). They exhibit a spectacular morphological diversity in association with postembryonic transformation of body segments during reproductive maturation, including posterior and collateral budding (Franke, 1999; Aguado et al., 2015). The striking change in morphology, behavior and physiology as a consequence of reproductive maturation is called epitoky, which can be broadly divided into two categories: epigamy and schizogamy. In epigamy, the entire worm body transforms into a reproductive morph, usually with enlarged sensory organs, and developed gonads, gametes and swimming apparatus (Wissocq, 1970a, 1970b; Ponz-Segrelles et al., 2020). In schizogamy, only a part of the worm body develops into an epitokous reproductive morph. The epitokous part of the body, called the stolon, is filled with eggs or sperms, and develops its own head with eye spots and antennae. The stolon detaches from the rest of the body, swims away, and spawns. A variety of patterns of stolon formation, or stolonization, are found among schizogamous species, such as the formation of a single stolon in posterior segments, tandem formation of multiple stolons, collateral or sequential budding of multiple stolons, and lateral branching of stolons (Malaquin, 1893; Potts, 1911; Franke, 1999; Aguado et al., 2015; Schroeder et al., 2017). Previous phylogenetic studies suggested that schizogamy is a derived condition in syllids (Nygren, 1999; Aguado et al., 2012, 2015). The diversity and evolution of drastic morphological modifications associated with reproductive maturation in syllids is clearly an interesting topic for evo-devo studies.
Although various species of Annelida have been studied to illuminate molecular developmental mechanisms (Irvine and Martindale, 2000; Prud’homme et al., 2003; Cho et al., 2010; de Jong and Seaver, 2016; Endo et al., 2016; Kulakova et al., 2017), syllids have not yet been well explored. Recently, transcriptomic studies using high-throughput sequencing technology have been conducted in some syllid species (Brugler et al., 2018; Ponz-Segrelles et al., 2018, 2020; Álvarez-Campos et al., 2019; Ribeiro et al., 2019, 2020). However, these studies still cover only a few species. Therefore, we need to conduct such studies in other syllid species to establish additional model species for comparative analysis to enable a comprehensive understanding of the evolution of syllid morphogenesis.
Here, we propose the Japanese green syllid Megasyllis nipponica as a new model species for syllid evo-devo study. M. nipponica is widely distributed in intertidal and subtidal zones from Hokkaido to Shikoku, Japan, and is a common annelid species in these areas (Imajima, 1966); sample collection from the field is therefore easy for this species. In addition, rearing and breeding methods for this species have been established recently (Miura et al., 2019), allowing us to observe morphological development and to carry out experiments in the laboratory. However, the genetic information on this species is still insufficient.
In this study, we conducted transcriptome sequencing (RNA-seq) and generated a de novo transcriptome assembly to obtain a massive genetic resource for M. nipponica. To elucidate morphological innovations in terms of molecular genetics, analysis of candidate genes that are likely involved in morphogenesis can be a successful approach. For this, we first need to obtain sequence information for the candidate genes. Therefore, in this study, we searched the de novo transcriptome assembly for genes that are likely to be involved in syllid morphogenesis, especially those linked to stolonization. Stolonization is suggested to begin with hormonal changes, followed by sex determination, gonad development and morphological development (Franke, 1999; Álvarez-Campos et al., 2019). Therefore, genes related to hormonal regulation, sex determination and morphological development, especially appendage patterning, are predicted to play major roles in stolon formation. In addition, methylation of cytosines in DNA is also proposed to be involved in the regulation of gene expression during development in annelids (Planques et al., 2021). Genes involved in DNA methylation can also be important candidates as regulators of stolonization. Thus, we identified such genes from the transcriptome assembly.
In addition to gene identification, we estimated the DNA methylation level of coding regions in the M. nipponica genome by a computational method to infer the possibility of DNA methylation and epigenetic regulation of M. nipponica development. Finally, we estimated the genome size of this species, which provides supplementary information for future genome sequencing and associated analyses.
We extracted RNA from eggs, metatrochophore larvae, juveniles (relatively young individuals), male and female adults with/without stolons, and male and female stolons, and then pooled the preparations. We conducted RNA-seq using a cDNA library generated from the pooled RNA sample. Using Illumina paired-end sequencing, 76,263,633 DNA fragments were sequenced. The Illumina sequencing reads were deposited in the Sequence Read Archive of the DNA Data Bank of Japan (DDBJ) under the accession number DRA013174. After removal of adapter sequences, low-quality bases, short reads less than 25 bp and unpaired single reads, 73,748,736 (96.7%) read pairs were retained for subsequent analyses. De novo transcriptome assembly using Trinity (Grabherr et al., 2011) generated 159,970 contigs with an N50 value of 2,407 bp. The transcriptome assembly generated in this study showed a higher N50 value than those in previous studies on syllid species (Ponz-Segrelles et al., 2018, 2020; Ribeiro et al., 2019, 2020). The transcriptome assembly was deposited in the Transcriptome Shotgun Assembly of DDBJ (accession numbers: ICSJ01000001–ICSJ01159970).
Assessment of transcriptome completeness using Benchmarking Universal Single-Copy Orthologs (BUSCO) analysis showed that 949 of 954 metazoan single-copy orthologs were present in the transcriptome assembly, and that 941 orthologs were classified as “complete” in gene length. These results suggest that transcriptome sequencing captured most of the protein-coding genes in the M. nipponica genome, and that short-read assembly by Trinity reconstructed the original transcripts adequately. On the other hand, the BUSCO analysis result indicated a high level of duplication in the assembly (543 of 941 “complete” ortholog groups were duplicated). This may reflect the existence of isoforms. The Trinity assembler reconstructs isoforms from RNA-seq data, and we found isoforms in 24,200 of 85,697 genes in the assembly generated in this study. The duplication in the assembly may also be due to allelic variation, because the RNA sample used in this study was a mixture of RNA derived from multiple individuals. If different alleles were categorized into different genes or isoforms in the assembly, the sequences with high similarity would result in duplication.
Gene identificationAs mentioned above, we focused on genes related to hormonal regulation, sex determination, appendage patterning and DNA methylation as candidates for the developmental regulation of M. nipponica. To identify these genes from the transcriptome assembly, we carried out BLAST sequence similarity searches using amino acid sequences of orthologous genes identified in other species.
Hormone regulationIn syllids, the reproductive life cycle is considered to be under hormonal control (Franke, 1980, 1999; Weidhase et al., 2016). Consequently, the morphological modifications associated with reproductive maturation should also be a result of hormonal changes. Thus, genes related to hormonal regulation are candidates as important factors for morphological modifications. In this study, we focused on five hormone-related signaling pathways that play key roles in physiological regulations of a wide range of animal taxa, i.e., juvenile hormone (JH) (Jindra et al., 2013), ecdysteroids (Bonneton et al., 2008; Hansen et al., 2014; Miyakawa et al., 2018), insulin/insulin-like growth factor signaling (IIS) (Garofalo, 2002; Pertseva and Shpakov, 2002; Piñero González et al., 2009), Janus kinase/signal transducers and activators of transcription (JAK/STAT) (Arbouzova and Zeidler, 2006) and Hippo pathways (Zhao et al., 2011).
We found most of the JH-related genes (10/12) in the transcriptome assembly of M. nipponica (Table 1). This indicates that the JH pathway is conserved in M. nipponica. JHs are a group of sesquiterpenoids that regulate development and reproduction in some invertebrate groups, and are especially well known in arthropods such as crustaceans and insects. In the polychaete species Platynereis dumerilii, a hormone named nereidin, which corresponds to the sesquiterpenoid hormone methyl farnesoate (MF), represses sexual maturation and therefore has a similar role to JH (Schenk et al., 2016). Additionally, in crustaceans, MF is a key regulator of development, such as molting, metamorphosis and reproduction (Miyakawa et al., 2013). MF is the immediate precursor of juvenile hormone III, which functions as a developmental regulator in insects. Thus, sesquiterpenoid hormones are thought to have conserved functions as developmental regulators in a broad range of animals. A transcriptomic analysis of S. magdalena suggested that MF is responsible for the regulation of stolonization (Álvarez-Campos et al., 2019). The function of MF may be conserved widely in syllids. On the other hand, in P. dumerilii, MF influenced posterior regeneration (Álvarez-Campos et al., 2022), thus indicating the pleiotropic roles of this hormone in annelids.
Pathway | Gene | FlyBase accession | Top hit contig | E value |
---|---|---|---|---|
JH | Allatostatin | FBgn0015591 | No hits found | – |
JH | Allatostatin A receptor 1 | FBgn0266429 | MnipOSR_DN18310_c0_g1_i1 | 3.01E-95 |
JH | Broad | FBgn0283451 | MnipOSR_DN29801_c0_g1_i1 | 1.28E-09 |
JH | Chd64 | FBgn0035499 | MnipOSR_DN29987_c5_g2_i15 | 2.27E-59 |
JH | FK506-binding protein 39kD | FBgn0013269 | MnipOSR_DN19184_c0_g1_i1 | 1.89E-30 |
JH | Juvenile hormone acid methyltransferase | FBgn0028841 | No hits found | – |
JH | Juvenile hormone epoxide hydrolase 1 | FBgn0010053 | MnipOSR_DN30796_c1_g1_i1 | 1.01E-117 |
JH | Juvenile hormone epoxide hydrolase 2 | FBgn0034405 | MnipOSR_DN30796_c1_g1_i1 | 2.38E-118 |
JH | Juvenile hormone epoxide hydrolase 3 | FBgn0034406 | MnipOSR_DN30796_c1_g1_i1 | 4.56E-100 |
JH | Juvenile hormone esterase-binding protein | FBgn0035088 | MnipOSR_DN22814_c0_g1_i3 | 3.80E-30 |
JH | Methoprene-tolerant | FBgn0002723 | MnipOSR_DN25898_c2_g1_i10 | 3.20E-16 |
JH | Retinoid X receptor | FBgn0003964 | MnipOSR_DN27563_c1_g1_i5 | 4.73E-126 |
Ecdysone | Broad | FBgn0283451 | MnipOSR_DN29801_c0_g1_i1 | 1.28E-09 |
Ecdysone | Bursicon | FBgn0038901 | MnipOSR_DN33620_c0_g1_i3 | 1.06E-15 |
Ecdysone | Crustacean cardioactive peptide | FBgn0039007 | No hits found | – |
Ecdysone | Cytochrome P450-18a1 | FBgn0010383 | MnipOSR_DN23908_c0_g1_i4 | 2.68E-101 |
Ecdysone | Cyp6t3 | FBgn0033697 | MnipOSR_DN26540_c0_g1_i5 | 1.35E-49 |
Ecdysone | Death-associated APAF1-related killer | FBgn0263864 | No hits found | – |
Ecdysone | Death caspase-1 | FBgn0010501 | MnipOSR_DN27159_c4_g1_i2 | 2.21E-65 |
Ecdysone | Dopa decarboxylase | FBgn0000422 | MnipOSR_DN25135_c1_g4_i3 | 0 |
Ecdysone | Disembodied | FBgn0000449 | MnipOSR_DN31941_c3_g1_i1 | 1.66E-54 |
Ecdysone | Early gene at 23 | FBgn0020445 | MnipOSR_DN30948_c0_g1_i5 | 2.48E-91 |
Ecdysone | Ecdysone receptor | FBgn0000546 | MnipOSR_DN33364_c1_g1_i7 | 9.87E-100 |
Ecdysone | Eclosion hormone | FBgn0000564 | No hits found | – |
Ecdysone | Ecdysone-induced protein 63E | FBgn0005640 | MnipOSR_DN30318_c3_g1_i4 | 7.53E-165 |
Ecdysone | Ecdysone-induced protein 74EF | FBgn0000567 | MnipOSR_DN28564_c2_g1_i14 | 3.01E-53 |
Ecdysone | Ecdysone-induced protein 75B | FBgn0000568 | MnipOSR_DN27580_c0_g1_i3 | 2.14E-65 |
Ecdysone | Ecdysone-induced protein 78C | FBgn0004865 | MnipOSR_DN27580_c0_g1_i3 | 3.79E-61 |
Ecdysone | Ecdysone-induced protein 93F | FBgn0264490 | MnipOSR_DN17956_c0_g1_i1 | 1.08E-23 |
Ecdysone | Ftz transcription factor 1 | FBgn0001078 | MnipOSR_DN27388_c0_g1_i1 | 5.74E-53 |
Ecdysone | Glutathione S transferase E14 | FBgn0033817 | MnipOSR_DN29196_c4_g1_i7 | 1.16E-18 |
Ecdysone | Hormone receptor 3 | FBgn0000448 | MnipOSR_DN31717_c0_g1_i1 | 1.98E-103 |
Ecdysone | Hormone receptor-like in 38 | FBgn0014859 | MnipOSR_DN30598_c3_g1_i1 | 1.31E-130 |
Ecdysone | Hormone receptor-like in 39 | FBgn0261239 | MnipOSR_DN23764_c0_g1_i1 | 3.24E-44 |
Ecdysone | Hormone receptor 4 | FBgn0264562 | MnipOSR_DN23997_c0_g2_i1 | 6.31E-44 |
Ecdysone | Imitation SWI | FBgn0011604 | MnipOSR_DN25509_c3_g2_i1 | 0 |
Ecdysone | Kruppel homolog 1 | FBgn0266450 | MnipOSR_DN25719_c0_g1_i1 | 7.64E-46 |
Ecdysone | Myoinhibiting peptide precursor | FBgn0036713 | No hits found | – |
Ecdysone | Neverland | FBgn0287185 | MnipOSR_DN33232_c2_g2_i2 | 1.01E-94 |
Ecdysone | Phantom | FBgn0004959 | MnipOSR_DN30915_c5_g1_i1 | 7.76E-28 |
Ecdysone | Prothoracicotropic hormone | FBgn0013323 | No hits found | – |
Ecdysone | Shadow | FBgn0003312 | MnipOSR_DN29918_c0_g1_i6 | 3.44E-25 |
Ecdysone | Shade | FBgn0003388 | MnipOSR_DN29918_c0_g1_i6 | 4.21E-44 |
Ecdysone | Sox box protein 14 | FBgn0005612 | MnipOSR_DN20184_c0_g1_i1 | 2.26E-38 |
Ecdysone | Spook | FBgn0003486 | MnipOSR_DN32111_c1_g1_i9 | 7.27E-36 |
Ecdysone | Spookier | FBgn0086917 | MnipOSR_DN32111_c1_g1_i8 | 2.10E-39 |
Ecdysone | Shroud | FBgn0262112 | MnipOSR_DN34346_c2_g3_i3 | 2.74E-26 |
Ecdysone | Ultraspiracle | FBgn0003964 | MnipOSR_DN27563_c1_g1_i6 | 4.50E-61 |
Ecdysone | Trunk | FBgn0003751 | No hits found | – |
IIS | Insulin-like peptide 1 | FBgn0044051 | No hits found | – |
IIS | Insulin-like peptide 2 | FBgn0036046 | No hits found | – |
IIS | Insulin-like peptide 3 | FBgn0044050 | No hits found | – |
IIS | Insulin-like peptide 4 | FBgn0044049 | No hits found | – |
IIS | Insulin-like peptide 5 | FBgn0044048 | No hits found | – |
IIS | Insulin-like peptide 6 | FBgn0044047 | No hits found | – |
IIS | Insulin-like peptide 7 | FBgn0044046 | No hits found | – |
IIS | Insulin-like receptor | FBgn0283499 | MnipOSR_DN30454_c1_g1_i11 | 2.55E-138 |
IIS | Chico | FBgn0024248 | MnipOSR_DN33268_c1_g1_i4 | 4.98E-37 |
IIS | Pi3K21B | FBgn0020622 | MnipOSR_DN31811_c2_g3_i1 | 3.92E-73 |
IIS | Pi3K92E | FBgn0015279 | MnipOSR_DN31584_c0_g1_i1 | 0 |
IIS | Phosphatase and tensin homolog | FBgn0026379 | MnipOSR_DN25021_c0_g1_i3 | 2.47E-79 |
IIS | Phosphoinositide-dependent kinase 1 | FBgn0020386 | MnipOSR_DN22185_c0_g1_i2 | 5.57E-158 |
IIS | Akt1 | FBgn0010379 | MnipOSR_DN30080_c0_g2_i5 | 0 |
IIS | Target of rapamycin | FBgn0021796 | MnipOSR_DN29047_c0_g1_i3 | 0 |
IIS | Ribosomal protein S6 kinase | FBgn0283472 | MnipOSR_DN23760_c0_g1_i1 | 0 |
IIS | Shaggy | FBgn0003371 | MnipOSR_DN32045_c2_g1_i17 | 0 |
IIS | Forkhead box, sub-group O | FBgn0038197 | MnipOSR_DN28353_c1_g1_i1 | 3.58E-84 |
JAK/STAT | Unpaired 1 | FBgn0004956 | No hits found | – |
JAK/STAT | Unpaired 2 | FBgn0030904 | No hits found | – |
JAK/STAT | Unpaired 3 | FBgn0053542 | No hits found | – |
JAK/STAT | Domeless | FBgn0043903 | MnipOSR_DN31006_c1_g1_i19 | 1.15E-18 |
JAK/STAT | Hopscotch | FBgn0004864 | MnipOSR_DN27877_c2_g1_i7 | 6.15E-60 |
JAK/STAT | Signal-transducer and activator of transcription protein at 92E | FBgn0016917 | MnipOSR_DN31823_c0_g1_i3 | 3.40E-113 |
JAK/STAT | BRWD3 | FBgn0011785 | MnipOSR_DN31865_c0_g1_i6 | 0 |
JAK/STAT | Suppressor of cytokine signaling at 36E | FBgn0041184 | MnipOSR_DN30668_c2_g1_i8 | 1.17E-80 |
JAK/STAT | Suppressor of cytokine signaling at 44A | FBgn0033266 | MnipOSR_DN34579_c0_g1_i1 | 4.04E-23 |
JAK/STAT | Suppressor of cytokine signaling at 16D | FBgn0030869 | MnipOSR_DN34579_c0_g1_i4 | 5.40E-48 |
JAK/STAT | Suppressor of variegation 2-10 | FBgn0003612 | MnipOSR_DN33121_c6_g1_i10 | 5.44E-127 |
JAK/STAT | PTP61FC | FBgn0267487 | MnipOSR_DN32046_c1_g1_i2 | 3.56E-100 |
JAK/STAT | Ken and barbie | FBgn0011236 | MnipOSR_DN31172_c0_g1_i3 | 6.09E-13 |
Hippo | Crumbs | FBgn0259685 | MnipOSR_DN34606_c2_g1_i1 | 0 |
Hippo | Stardust | FBgn0261873 | MnipOSR_DN31950_c0_g1_i5 | 0 |
Hippo | Patj | FBgn0067864 | MnipOSR_DN33377_c0_g1_i5 | 1.86E-72 |
Hippo | Par-6 | FBgn0026192 | MnipOSR_DN20635_c0_g1_i1 | 3.89E-116 |
Hippo | Atypical protein kinase C | FBgn0261854 | MnipOSR_DN29998_c0_g2_i13 | 0 |
Hippo | Scribbled | FBgn0263289 | MnipOSR_DN24488_c0_g1_i1 | 0 |
Hippo | Lethal (2) giant larvae | FBgn0002121 | MnipOSR_DN33748_c0_g1_i3 | 0 |
Hippo | Discs large 1 | FBgn0001624 | MnipOSR_DN23955_c0_g2_i10 | 0 |
Hippo | Expanded | FBgn0004583 | MnipOSR_DN34532_c0_g1_i1 | 4.86E-16 |
Hippo | Merlin | FBgn0086384 | MnipOSR_DN23900_c0_g1_i4 | 0 |
Hippo | Kibra | FBgn0262127 | MnipOSR_DN30693_c0_g2_i2 | 3.19E-89 |
Hippo | Hippo | FBgn0261456 | MnipOSR_DN26477_c0_g1_i6 | 2.40E-178 |
Hippo | Salvador | FBgn0053193 | MnipOSR_DN29268_c0_g1_i1 | 2.22E-43 |
Hippo | Mob as tumor suppressor | FBgn0038965 | MnipOSR_DN24211_c0_g1_i1 | 2.37E-146 |
Hippo | Warts | FBgn0011739 | MnipOSR_DN25071_c2_g1_i1 | 0 |
Hippo | Yorkie | FBgn0034970 | MnipOSR_DN29599_c0_g1_i2 | 1.32E-23 |
Hippo | Homothorax | FBgn0001235 | MnipOSR_DN30021_c0_g1_i10 | 1.72E-137 |
Hippo | Teashirt | FBgn0003866 | MnipOSR_DN30289_c0_g1_i1 | 3.07E-22 |
Hippo | Mothers against dpp | FBgn0011648 | MnipOSR_DN30939_c3_g1_i1 | 0 |
Hippo | Ajuba LIM protein | FBgn0030530 | MnipOSR_DN28203_c0_g1_i1 | 1.39E-142 |
Hippo | Ras association family member | FBgn0039055 | MnipOSR_DN28119_c0_g1_i3 | 1.58E-54 |
Hippo | Discs overgrown | FBgn0002413 | MnipOSR_DN33067_c1_g2_i1 | 0 |
Hippo | Fat | FBgn0001075 | MnipOSR_DN34583_c0_g1_i3 | 0 |
Hippo | Lowfat | FBgn0032230 | MnipOSR_DN21681_c0_g1_i1 | 1.51E-132 |
Hippo | Four-jointed | FBgn0000658 | MnipOSR_DN26264_c0_g1_i1 | 2.66E-73 |
Hippo | Approximated | FBgn0260941 | MnipOSR_DN32852_c1_g1_i2 | 1.29E-148 |
Hippo | Dachs | FBgn0262029 | MnipOSR_DN32859_c0_g1_i2 | 0 |
Hippo | Microtubule star | FBgn0004177 | MnipOSR_DN30580_c0_g1_i3 | 0 |
Hippo | Protein phosphatase 2A at 29B | FBgn0260439 | MnipOSR_DN28913_c0_g1_i2 | 0 |
Hippo | Twins | FBgn0004889 | MnipOSR_DN33116_c1_g2_i8 | 0 |
Hippo | Widerborst | FBgn0027492 | MnipOSR_DN23647_c0_g1_i2 | 0 |
Hippo | Well-rounded | FBgn0042693 | MnipOSR_DN27048_c0_g1_i3 | 0 |
Hippo | Cerebral cavernous malformation 3 | FBgn0038331 | MnipOSR_DN24828_c0_g4_i1 | 2.55E-65 |
Hippo | CG10915 | FBgn0034308 | MnipOSR_DN33474_c1_g1_i3 | 9.93E-38 |
Hippo | Connector of kinase to AP-1 | FBgn0044323 | MnipOSR_DN24423_c0_g1_i2 | 1.59E-180 |
Hippo | Fibroblast growth factor receptor 1 oncogene partner 2 | FBgn0031871 | MnipOSR_DN31038_c0_g1_i3 | 1.48E-28 |
Hippo | MOB kinase activator 4 | FBgn0259483 | MnipOSR_DN34278_c2_g2_i1 | 2.03E-134 |
Hippo | Sarcolemma associated protein | FBgn0040011 | MnipOSR_DN32548_c1_g3_i6 | 3.06E-71 |
Hippo | Striatin interacting protein | FBgn0035437 | MnipOSR_DN27380_c1_g2_i2 | 0 |
BLAST searches were performed for the protein sequences of the D. melanogaster genes (shown with FlyBase Gene IDs) against the transcriptome assembly of Megasyllis nipponica.
We detected most of the ecdysone-related genes (31/37) in the transcriptome assembly (Table 1). Ecdysones are well known as ecdysis (molting) hormones, particularly in arthropods. It has been reported that some key genes involved in arthropod ecdysis are also conserved widely in Bilateria (de Oliveira et al., 2019), and that these genes have an ancestral role in major life stage transitions such as metamorphosis (Zieger et al., 2021). In this study, some genes that are generally conserved in Bilateria, such as the crustacean cardioactive peptide, eclosion hormone and myoinhibiting peptide precursor genes, were not found in our assembly. We need to further investigate whether these undetected genes have indeed been lost from the genome and elucidate the roles of the ecdysone pathway in metamorphosis and stolonization in M. nipponica.
We also detected almost all genes of the IIS (11/18), JAK/STAT (10/13) and Hippo (39/39) signaling pathways in the transcriptome assembly (Table 1). These signaling pathways play key roles in information transmission from the external environment of the cell, not only in development but also in various other biological phenomena (Garofalo, 2002; Pertseva and Shpakov, 2002; Arbouzova and Zeidler, 2006; Piñero González et al., 2009; Zhao et al., 2011). Identification of these genes provides the basis for a variety of molecular genetic studies in syllids.
Sex determinationIn this study, we explored the transcriptome assembly for the known sex determination genes from Caenorhabditis elegans, Crassostrea gigas, Drosophila melanogaster and Mus musculus (Zhang et al., 2014). Although half of the sex determination genes of C. elegans were not detected in the transcriptome assembly, most of the evolutionarily conserved genes showed BLAST hits with low E-values (Table 2). In addition, most of the sex determination genes of C. gigas, D. melanogaster and M. musculus were identified in the assembly (Table 2). These results indicate that M. nipponica possesses the majority of evolutionarily conserved sex determination genes in its genome. In some species of syllids, an individual can change its sex through multiple stolonization events. This may be regulated by hormones from the prostomium and/or proventricle, as suggested by classical experiments of removal and transplantation in some syllid species (Franke, 1980; Heacox and Schroeder, 1982). Based on the results of transcriptomic analysis, it has recently been proposed that different hormones are involved in sexual maturation and stolon formation in different sexes in S. magdalena (Álvarez-Campos et al., 2019; Ponz-Segrelles et al., 2020). Understanding the mechanism of sexual differentiation is thus important for a comprehensive understanding of the molecular mechanism of stolon formation. In some syllid species, gene expression was compared between sexually differentiated individuals (Álvarez-Campos et al., 2019; Ponz-Segrelles et al., 2020). However, gene expression and functional analyses for individuals during sex differentiation have not yet been undertaken. Genes identified in this study should be important candidates for such analyses to understand the mechanisms of sex differentiation and the influences of these genes on stolonization.
Species | Gene | NCBI accession | Top hit contig | E value |
---|---|---|---|---|
Caenorhabditis elegans | XO lethal protein 1 | NP_001024419.1 | No hits found | – |
Caenorhabditis elegans | Zinc finger protein sdc-1 | NP_510650.2 | No hits found | – |
Caenorhabditis elegans | Sex determination and dosage compensation protein sdc-2 | NP_509924.1 | No hits found | – |
Caenorhabditis elegans | Zinc finger protein sdc-3 | NP_506703.1 | No hits found | – |
Caenorhabditis elegans | Protein her-1 | NP_001024310.1 | No hits found | – |
Caenorhabditis elegans | Sex-determining transformer protein 1 | NP_001022880.2 | MnipOSR_DN33322_c0_g1_i2 | 7.53E-64 |
Caenorhabditis elegans | Sex-determining transformer protein 2 | NP_495426.1 | No hits found | – |
Caenorhabditis elegans | Calpain-5 | NP_502751.1 | MnipOSR_DN29560_c3_g1_i1 | 5.58E-139 |
Caenorhabditis elegans | Sex-determining protein fem-1 | NP_500824.1 | MnipOSR_DN29616_c0_g2_i1 | 5.86E-113 |
Caenorhabditis elegans | Protein phosphatase fem-2 | NP_497224.1 | MnipOSR_DN25594_c1_g2_i1 | 3.38E-32 |
Caenorhabditis elegans | Sex-determination protein fem-3 | NP_001255365.1 | No hits found | – |
Caenorhabditis elegans | Feminization of germline | NP_001021792.1 | MnipOSR_DN31635_c0_g2_i1 | 3.25E-32 |
Caenorhabditis elegans | Feminization of germline | NP_001041187.1 | No hits found | – |
Caenorhabditis elegans | Feminization of germline | NP_492646.1 | MnipOSR_DN29492_c0_g1_i1 | 2.13E-07 |
Caenorhabditis elegans | Runt | NP_491679.1 | MnipOSR_DN24155_c0_g1_i1 | 1.07E-35 |
Caenorhabditis elegans | Protein male abnormal 3 | NP_001022464.1 | MnipOSR_DN28556_c2_g1_i4 | 6.84E-13 |
Crassostrea gigas | Feminizer | XP_011450380.1 | MnipOSR_DN19626_c0_g1_i1 | 9.91E-98 |
Crassostrea gigas | Feminizer | XP_011449960.1 | MnipOSR_DN29616_c0_g2_i1 | 0 |
Crassostrea gigas | Feminizer | XP_011418266.1 | MnipOSR_DN32321_c2_g2_i2 | 8.76E-163 |
Crassostrea gigas | Runt | XP_011439752.1 | MnipOSR_DN24155_c0_g1_i2 | 2.43E-104 |
Crassostrea gigas | GATA binding protein 4 | XP_011433554.1 | MnipOSR_DN33455_c0_g1_i3 | 2.30E-39 |
Crassostrea gigas | SoxH | XP_011415857.2 | MnipOSR_DN20184_c0_g1_i1 | 3.80E-13 |
Crassostrea gigas | Sox100B | XP_011444919.1 | MnipOSR_DN31181_c0_g1_i3 | 1.16E-29 |
Crassostrea gigas | Doublesex | XP_011441049.1 | MnipOSR_DN26755_c7_g2_i1 | 2.47E-09 |
Crassostrea gigas | Forkhead box L2 | NP_001295827.1 | MnipOSR_DN29113_c4_g1_i1 | 2.64E-67 |
Crassostrea gigas | Wnt oncogene analog 4 | XP_011448637.1 | MnipOSR_DN23780_c0_g1_i1 | 6.53E-154 |
Crassostrea gigas | Armadillo | XP_011417098.2 | MnipOSR_DN29807_c0_g2_i2 | 0 |
Drosophila melanogaster | Hermaphrodite | NP_001260506.1 | MnipOSR_DN21298_c0_g2_i1 | 1.82E-09 |
Drosophila melanogaster | Transformer | NP_524114.1 | No hits found | – |
Drosophila melanogaster | Transformer | NP_599107.1 | MnipOSR_DN30456_c1_g1_i5 | 1.70E-39 |
Drosophila melanogaster | Feminization of germline | NP_732851.2 | MnipOSR_DN32591_c0_g1_i4 | 1.59E-137 |
Drosophila melanogaster | Feminization of germline | NP_729428.1 | MnipOSR_DN31635_c0_g2_i2 | 0 |
Drosophila melanogaster | Fruitless | NP_996237.1 | MnipOSR_DN29801_c0_g1_i1 | 4.34E-11 |
Drosophila melanogaster | Sisterless A | NP_511116.2 | No hits found | – |
Drosophila melanogaster | Runt | NP_001285501.1 | MnipOSR_DN24155_c0_g1_i2 | 1.53E-61 |
Drosophila melanogaster | Sex lethal | NP_001162689.1 | MnipOSR_DN24543_c1_g4_i1 | 3.37E-51 |
Drosophila melanogaster | Darkener of apricot | NP_001138122.1 | MnipOSR_DN33961_c0_g1_i1 | 5.76E-85 |
Drosophila melanogaster | Sox100B | NP_651839.1 | MnipOSR_DN20184_c0_g1_i1 | 9.36E-26 |
Drosophila melanogaster | Doublesex | NP_001287220.1 | MnipOSR_DN8903_c0_g1_i1 | 2.02E-22 |
Drosophila melanogaster | Wnt oncogene analog 4 | NP_001260187.1 | MnipOSR_DN25119_c0_g1_i1 | 1.38E-56 |
Drosophila melanogaster | Armadillo | NP_001259149.1 | MnipOSR_DN29807_c0_g2_i2 | 0 |
Mus musculus | Transformer | NP_932770.2 | MnipOSR_DN30456_c1_g1_i5 | 3.76E-54 |
Mus musculus | Transformer | NP_033212.1 | MnipOSR_DN30456_c1_g1_i5 | 2.99E-55 |
Mus musculus | Feminizer | NP_034322.3 | MnipOSR_DN29616_c0_g2_i1 | 0 |
Mus musculus | Feminizer | NP_789799.1 | MnipOSR_DN29616_c0_g2_i1 | 0 |
Mus musculus | Feminizer | NP_034323.1 | MnipOSR_DN19626_c0_g1_i1 | 0 |
Mus musculus | Feminizer | NP_775599.1 | MnipOSR_DN29616_c0_g2_i1 | 0 |
Mus musculus | Runt | NP_001104491.1 | MnipOSR_DN24155_c0_g1_i1 | 8.29E-67 |
Mus musculus | Runt | NP_033950.2 | MnipOSR_DN24155_c0_g1_i1 | 1.52E-63 |
Mus musculus | Runt | NP_062706.2 | MnipOSR_DN24155_c0_g1_i3 | 1.51E-64 |
Mus musculus | GATA binding protein 4 | NP_001297539.1 | MnipOSR_DN33455_c0_g1_i2 | 2.16E-75 |
Mus musculus | Wilms tumor 1 homolog | NP_659032.3 | MnipOSR_DN28343_c1_g3_i2 | 5.04E-30 |
Mus musculus | Cbx2, chromobox 2 | NP_031649.2 | MnipOSR_DN25232_c0_g2_i2 | 1.16E-24 |
Mus musculus | Steroidogenic factor 1 | NP_001303616.1 | MnipOSR_DN27388_c0_g1_i1 | 2.97E-54 |
Mus musculus | Amh, anti-Mullerian hormone | NP_031471.2 | MnipOSR_DN16848_c0_g1_i1 | 4.56E-07 |
Mus musculus | Nuclear receptor subfamily 0, group B, member 1 | NP_031456.1 | MnipOSR_DN24935_c0_g1_i3 | 1.99E-19 |
Mus musculus | Sex-determining region Y protein | NP_035694.1 | MnipOSR_DN25343_c0_g1_i1 | 2.96E-31 |
Mus musculus | Sox100B | NP_035578.3 | MnipOSR_DN31181_c0_g1_i1 | 4.65E-29 |
Mus musculus | Doublesex | NP_056641.2 | MnipOSR_DN8903_c0_g1_i1 | 4.34E-32 |
Mus musculus | Forkhead box L2 | NP_036150.1 | MnipOSR_DN29113_c4_g1_i1 | 3.84E-57 |
Mus musculus | R-spondin 1 | NP_619624.2 | MnipOSR_DN20115_c0_g1_i1 | 1.65E-45 |
Mus musculus | Wnt oncogene analog 4 | NP_033549.1 | MnipOSR_DN23780_c0_g1_i1 | 4.92E-164 |
Mus musculus | Catenin beta-1 | NP_001159374.1 | MnipOSR_DN29807_c0_g2_i2 | 0 |
BLAST searches were performed for the protein sequences of C. elegans, C. gigas, D. melanogaster and M. musculus (shown with NCBI accessions) against the transcriptome assembly of Megasyllis nipponica.
We detected almost all the genes involved in appendage development in D. melanogaster (Kojima, 2004; Angelini and Kaufman, 2005) in the transcriptome assembly (Table 3). Identification of these genes should facilitate further functional analyses on appendage development. Syllids have a body plan with a serial repetition of segments, with appendages developing in each segment. Additionally, in the stolon head segment, appendages, i.e., antennae, are formed (Malaquin, 1893). Thus, to obtain a comprehensive understanding of syllid morphogenesis, we need to study appendage development. It has been suggested that some of the genes involved in appendage development, whose functions are conserved in mammals and arthropods, also have similar functions in the polychaetes P. dumerilii (Grimmel et al., 2016) and Neanthes arenaceodentata (Winchell et al., 2010; Winchell and Jacobs, 2013). Therefore, the genes identified in this study are likely also involved in appendage development in syllids.
Gene | FlyBase Gene ID | Top hit contig | E value |
---|---|---|---|
Abdominal A | FBgn0000014 | MnipOSR_DN31520_c0_g1_i1 | 7.25E-30 |
Aristaless | FBgn0000061 | MnipOSR_DN40758_c0_g1_i1 | 4.10E-33 |
Armadillo | FBgn0000117 | MnipOSR_DN29807_c0_g2_i2 | 0 |
Distal-less | FBgn0000157 | MnipOSR_DN29014_c0_g1_i1 | 5.55E-29 |
Buttonhead | FBgn0000233 | MnipOSR_DN30591_c0_g1_i11 | 1.12E-33 |
Deformed | FBgn0000439 | MnipOSR_DN30959_c0_g1_i2 | 6.97E-38 |
Delta | FBgn0000463 | MnipOSR_DN27920_c0_g1_i4 | 7.69E-180 |
Decapentaplegic | FBgn0000490 | MnipOSR_DN24844_c0_g1_i4 | 1.25E-70 |
Engrailed | FBgn0000577 | MnipOSR_DN32985_c1_g1_i3 | 8.96E-52 |
Extradenticle | FBgn0000611 | MnipOSR_DN23624_c0_g1_i6 | 2.54E-160 |
Homothorax | FBgn0001235 | MnipOSR_DN30021_c0_g1_i1 | 2.74E-126 |
Odd skipped | FBgn0002985 | MnipOSR_DN30809_c5_g3_i2 | 2.72E-55 |
Sex combs reduced | FBgn0003339 | MnipOSR_DN30312_c2_g2_i2 | 3.31E-38 |
Sex combs reduced | FBgn0003339 | MnipOSR_DN30312_c2_g2_i2 | 7.48E-39 |
Spineless | FBgn0003513 | MnipOSR_DN30646_c0_g1_i2 | 1.00E-132 |
Epidermal growth factor receptor | FBgn0003731 | MnipOSR_DN33334_c0_g1_i4 | 0 |
Teashirt | FBgn0003866 | MnipOSR_DN30289_c0_g1_i1 | 6.38E-20 |
Ultrabithorax | FBgn0003944 | MnipOSR_DN31520_c0_g1_i1 | 1.55E-26 |
Serrate | FBgn0004197 | MnipOSR_DN27920_c0_g1_i4 | 3.43E-116 |
Hedgehog | FBgn0004644 | MnipOSR_DN27472_c0_g1_i3 | 1.37E-118 |
Notch | FBgn0004647 | MnipOSR_DN32940_c0_g1_i2 | 0 |
BarH2 | FBgn0004854 | MnipOSR_DN27445_c1_g1_i1 | 1.26E-40 |
Bric a brac 1 | FBgn0004870 | MnipOSR_DN30773_c0_g1_i1 | 3.37E-10 |
Sister of odd and bowl | FBgn0004892 | MnipOSR_DN30809_c5_g3_i2 | 8.74E-56 |
Brother of odd with entrails limited | FBgn0004893 | MnipOSR_DN30809_c5_g3_i2 | 1.51E-55 |
Spitz | FBgn0005672 | No hits found | – |
Dachshund | FBgn0005677 | MnipOSR_DN27390_c0_g1_i2 | 4.72E-51 |
Fringe | FBgn0011591 | MnipOSR_DN30588_c0_g1_i6 | 1.96E-92 |
BarH1 | FBgn0011758 | MnipOSR_DN27445_c1_g1_i1 | 1.17E-40 |
Sp1 | FBgn0020378 | MnipOSR_DN30591_c0_g1_i11 | 5.32E-70 |
Drumstick | FBgn0024244 | MnipOSR_DN30809_c5_g3_i1 | 1.93E-18 |
Bric a brac 2 | FBgn0025525 | MnipOSR_DN29452_c0_g1_i2 | 2.98E-10 |
LIM homeobox 1 | FBgn0026411 | MnipOSR_DN28921_c0_g1_i2 | 1.88E-56 |
Distal antenna-related | FBgn0039283 | No hits found | – |
Distal antenna | FBgn0039286 | No hits found | – |
Proboscipedia | FBgn0051481 | MnipOSR_DN19818_c0_g3_i1 | 3.44E-34 |
Pangolin | FBgn0085432 | MnipOSR_DN25617_c0_g1_i12 | 1.82E-86 |
Antennapedia | FBgn0260642 | MnipOSR_DN27734_c0_g1_i2 | 1.14E-35 |
Spalt major | FBgn0261648 | MnipOSR_DN31132_c0_g3_i1 | 1.22E-47 |
Transcription factor AP-2 | FBgn0261953 | MnipOSR_DN23429_c0_g1_i1 | 3.03E-93 |
Rotund | FBgn0267337 | MnipOSR_DN33428_c0_g1_i10 | 1.24E-78 |
Apterous | FBgn0267978 | MnipOSR_DN20053_c0_g1_i1 | 9.50E-42 |
Wingless | FBgn0284084 | MnipOSR_DN31782_c0_g1_i4 | 3.92E-93 |
Escargot | FBgn0287768 | MnipOSR_DN24592_c0_g1_i4 | 3.01E-76 |
BLAST searches were performed for the protein sequences of the D. melanogaster genes (shown with FlyBase Gene IDs) against the transcriptome assembly of Megasyllis nipponica.
We searched the transcriptome assembly for DNA methylation toolkit genes that were examined in a previous study of P. dumerilii, i.e., genes involved in cytosine methylation and in the nucleosome remodeling and deacetylase (NuRD) complex (Planques et al., 2021). All of these genes were detected by BLAST searches (Table 4). Although the HDAC8 gene was not found in Helobdella robusta, the conservation of full sets of DNA methylation and NuRD complex toolkit genes was confirmed in P. dumerilii and Capitella teleta. Planques et al. (2021) suggested that the existence of full sets of those genes is the ancestral state of annelids. This is supported by the identification of orthologs of these genes in M. nipponica. Conservation of those genes also suggests that the epigenetic modification system through cytosine methylation, reported in P. dumerilii (Planques et al., 2021), also occurs in M. nipponica.
Gene | NCBI accession | Top hit contig | E value |
---|---|---|---|
DNA methylation | |||
Dnmt1 | EKC23761.1 | MnipOSR_DN31062_c0_g1_i2 | 0 |
Dnmt2 | XP_011450725.1 | MnipOSR_DN23347_c0_g2_i3 | 2.69E-110 |
Dnmt3 | XP_011435619.1 | MnipOSR_DN33436_c0_g1_i3 | 3.32E-141 |
Tet | XP_011419030.1 | MnipOSR_DN32744_c0_g1_i3 | 5.09E-106 |
Tdg | EKC23562.1 | MnipOSR_DN30645_c2_g1_i1 | 6.80E-95 |
Uhrf | EKC40754.1 | MnipOSR_DN29835_c2_g1_i2 | 0 |
NuRD complex | |||
Mbd1/2/3 | XP_011453189.1 | MnipOSR_DN23949_c0_g1_i1 | 3.82E-112 |
Mbd4 | XP_011447845.2 | MnipOSR_DN26665_c0_g1_i2 | 2.09E-71 |
Chd1/2 | EKC39961.1 | MnipOSR_DN32072_c1_g2_i2 | 0 |
Chd3/4/5 | EKC37026.1 | MnipOSR_DN25200_c0_g3_i7 | 0 |
Chd6/7/8/9 | EKC30422.1 | MnipOSR_DN26472_c0_g1_i2 | 0 |
Hdac1/2 | EKC29198.1 | MnipOSR_DN32154_c2_g1_i6 | 2.66E-32 |
Hdac3 | EKC42750.1 | MnipOSR_DN32154_c2_g1_i3 | 0 |
Hdac8 | EKC18363.1 | MnipOSR_DN26804_c0_g2_i1 | 1.18E-143 |
Rbbp4/7 | EKC19423.1 | MnipOSR_DN21214_c0_g1_i1 | 0 |
Mta1/2/3 | EKC18877.1 | MnipOSR_DN27758_c0_g1_i1 | 0 |
Gatad2 | EKC41807.1 | MnipOSR_DN22006_c0_g1_i1 | 5.61E-69 |
BLAST searches were performed for the protein sequences of the Crassostrea gigas genes (shown with NCBI accessions) against the transcriptome assembly of Megasyllis nipponica.
In animals, DNA methylation mostly occurs in cytosines of cytosine-phosphate-guanine dinucleotides (CpG) (Suzuki and Bird, 2008; Yi and Goodisman, 2009). Because CpG methylation in invertebrates mostly occurs in gene bodies (transcribed regions), we estimated DNA methylation levels using a computational method in the coding regions of M. nipponica, which were predicted from the transcriptome assembly. In this study, we calculated CpG O/E, which is the ratio of observed (O) to expected (E) frequencies of CpGs, to estimate DNA methylation levels in the predicted coding regions. Because methylated cytosines have a hyper-mutability to thymines, CpG is depleted in highly methylated genomic regions over evolutionary time and, thus, CpG O/E decreases according to CpG depletion (Bird, 1980). Therefore, CpG O/E can be used as an indicator for DNA methylation levels (Yi and Goodisman, 2009). We also calculated the guanine-phosphate-cytosine (GpC) O/E for each of the coding sequences (CDSs). Generally, GpC dinucleotides are not the target of DNA methylation in animals, and are thus not under the mutation pressure resulting from cytosine methylation. Therefore, GpC O/E can be a null model against CpG O/E for the estimation of DNA methylation level (Glastad et al., 2013).
We calculated CpG O/E and GpC O/E ratios for the CDSs extracted from the transcriptome assembly. Analyses of the kernel densities of the frequency distributions revealed a single mode in each of the distributions. The mode positions (95% confidence interval) were 0.748 (0.732–0.761) and 1.040 (1.027–1.055) in CpG O/E and GpC O/E frequency distributions, respectively, indicating an obvious depletion of the CpG frequency (Fig. 1). Because CpG depletion is a typical feature of methylated genomes (Bird, 1980; Suzuki et al., 2007), this result indicates the occurrence of DNA methylation in coding regions of the M. nipponica genome. As described above, DNA methylation toolkit genes were detected in M. nipponica (Table 4). These results strongly suggest that DNA methylation occurs and influences gene expression in M. nipponica. The association between DNA methylation and development has been reported in some invertebrates. In the oyster C. gigas, DNA methylation plays important roles in early development, possibly through the regulation of expression of some genes, including some homeobox genes (Riviere et al., 2013). Recently, in the polychaete P. dumerilii, the influences of DNA methylation on larval development and regeneration (Planques et al., 2021) were reported. In M. nipponica, further studies are required to understand proximate mechanisms of developmental regulation. Illuminating the influence of DNA methylation on the development of M. nipponica should contribute to a deeper understanding of invertebrate epigenetics.
Frequency distribution histogram and kernel density estimation (solid curves) of the CpG O/E and GpC O/E values (upper and lower graph, respectively) of the coding sequences extracted from the transcriptome assembly in Megasyllis nipponica. The vertical solid line indicates the mode value. The dotted lines indicate bootstrap confidence intervals at the level of 95%.
The C-value of M. nipponica was 0.536 ± 0.025 pg (mean ± SD, n = 3), suggesting that the genome size of M. nipponica is about 524 megabase pairs. This size estimation will be useful in determining the methodological strategy for future genome sequencing in this species, thereby facilitating genomic studies. With the genome sequence data, we will be able to conduct functional analyses on cis-regulatory elements of the genes identified in this study.
A previous study on genome sizes of syllids showed variation in C-values among species, ranging from 0.12–0.8 pg (Gambi et al., 1997). Thus, based on our results, M. nipponica can be said to have a moderate genome size among that of syllids. Future genomic sequencing in multiple species of syllids should reveal the genomic components that contribute to their genome size.
In our previous study, we established a method for culturing M. nipponica, thereby facilitating experiments and observation of development processes in the laboratory. In addition to the culturing method, in the present study, we obtained massive genetic data of expressed genes. These studies make it possible to conduct gene expression and functional analyses during various developmental stages in this species. In addition to M. nipponica, genetic resources have been gradually accumulating in syllids (Brugler et al., 2018; Ponz-Segrelles et al., 2018, 2020; Álvarez-Campos et al., 2019; Ribeiro et al., 2019, 2020). By advancing molecular developmental studies in multiple species of syllids, we can achieve a comprehensive understanding of the evolution from epigamy to schizogamy, and of the morphological evolution among schizogamous species in syllids.
Individuals of M. nipponica were collected from the subtidal zone at a beach in Oshoro Bay, opposite the Oshoro Marine Station, which belongs to Hokkaido University and is located near Otaru, Hokkaido. At low tide, red algae (Rhodophyta), which often harbor the focal syllid species, were collected from shallow rocky reefs. We subsequently washed the algae in plastic trays filled with sea water to release the syllids from the algae into the sea water. The syllid species were identified according to previous studies (San Martín and Worsfold, 2015; Miura et al., 2019). The collected individuals of the focal species were maintained in small plastic containers (84 × 57 × 44 mm) placed in an incubator (21 ± 1 ℃, 16-h light and 8-h dark cycle). The detailed rearing conditions were based on Miura et al. (2019).
RNA-seqFor transcriptome analyses, total RNA was extracted at various developmental stages of the focal species, including eggs, metatrochophore larvae, juveniles, male and female adults with/without stolons, and male and female stolons. RNA was extracted using ISOGEN (Nippon Gene, Tokyo, Japan), treated with DNase I (Thermo Fisher Scientific, Waltham, MA, USA) and purified using the SV Total RNA Isolation System (Promega, Madison, WI, USA). Total RNAs extracted from the samples were then pooled. Library preparation and sequencing were conducted by Eurofins Genomics (Tokyo, Japan). Briefly, a strand-specific mRNA sequencing library was constructed from the pooled RNA. Paired-end sequencing on the Illumina HiSeq 2500 platform (Illumina, San Diego, CA, USA) was conducted to produce 100-bp reads from both ends of each DNA fragment.
De novo transcriptome assemblyPrior to assembling the sequencing reads, sequences of the adapters used for Illumina sequencing and low-quality bases [leading and trailing bases less than a phred score (Ewing and Green, 1998), denoted as “Q”, of 20 and the rest of the sequence less than Q20 with average quality in a 4-bp sliding window] were trimmed from the sequencing reads using Trimmomatic v0.36 (Bolger et al., 2014). Reads shorter than 25 bp after the trimming, as well as single reads that had lost one of the pairs by the trimming, were eliminated from further analysis. The adapter-trimmed and quality-filtered paired reads were then subjected to de novo transcriptome assembly using Trinity v2.6.6 (Grabherr et al., 2011) with default options.
The quality of the transcriptome assembly, or the assembled contig set, was evaluated in terms of transcriptome completeness. Transcriptome completeness of the assembly was assessed using BUSCO v5 (Simão et al., 2015) with the Metazoan single copy ortholog set of OrthoDB9 (Zdobnov et al., 2017).
Prediction of coding sequencesCDSs were predicted from the assembly using a stand-alone version of OrfPredictor (Min et al., 2005). While OrfPredictor basically performs ab initio prediction for CDSs, it can optionally use the translation reading frame identified in BLASTX alignment (Altschul et al., 1997) for ORF identification when query nucleotide sequences have a BLASTX hit. Thus, prior to CDS prediction, we carried out BLASTX searches for the contigs against protein sequences derived from Annelida in the UniProtKB/TrEMBL database (74,427 sequences; downloaded on 13 April, 2018). Subsequently, we performed CDS prediction by OrfPredictor using the BLASTX alignments with E-values less than 1E-10. The protein sequences were deduced from the CDSs. The CDSs and translated amino acid sequences are available at the Figshare repository under the identifiers https://doi.org/10.6084/m9.figshare.19416803.v1 and https://doi.org/10.6084/m9.figshare.19416800.v1, respectively.
Gene identificationWe performed sequence homology searches against the protein set predicted from our transcriptome assembly to identify genes involved in hormone regulation, namely JH (Jindra et al., 2013), ecdysteroids (Bonneton et al., 2008; Hansen et al., 2014; Miyakawa et al., 2018), IIS (Garofalo, 2002; Pertseva and Shpakov, 2002; Piñero González et al., 2009), JAK/STAT signaling (Arbouzova and Zeidler, 2006) and Hippo signaling pathways (Zhao et al., 2011). In addition, we also searched genes related to sex determination (Zhang et al., 2014), appendage patterning (Kojima, 2004; Angelini and Kaufman, 2005) and DNA methylation (Planques et al., 2021). Prior to the homology search, we retrieved protein sequence information from the NCBI database or FlyBase (https://flybase.org/) for genes that had been identified in other species (Tables 1, 2, 3, 4). We then searched our transcriptome assembly for the protein sequences of other species using the TBLASTN algorithm with an E-value threshold of 1E-6.
Estimation of DNA methylation levelWe inferred the DNA methylation level of CDSs with CpG O/E. In this analysis, we used only one contig from a group of contigs that are regarded as possible splice variants of the same gene, generated using the Trinity assembling process. The grouped contigs share the same sequences to some extent and, thus, they are redundant for the calculation of CpG O/E. Short CDSs (< 100 bp) were also removed from the analysis. After filtering, 76,310 CDSs were retained for further analysis. For each of the filtered CDSs, the CpG O/E was calculated using the following formula: CpG O/E = PCpG/PCPG, where PCpG, PC and PG are the frequencies of CpG dinucleotides, C nucleotides and G nucleotides, respectively. We also calculated the ratio of the observed to the expected frequencies of guanine-phosphate-cytosine (GpC) dinucleotides, GpC O/E. Next, we examined the modality of the frequency distributions of CpG O/E and GpC O/E values. Using the Notos algorithm (Bulla et al., 2018), we removed the outlier and zero values (2,796 and 693 CDSs were removed for CpG O/E and GpC O/E analyses, respectively), estimated kernel densities, and determined the numbers and positions of modes, or peaks, by detecting local maxima in the CpG O/E and GpC O/E distributions.
Genome size estimationAn anterior part (2–3 mm including the head) of M. nipponica and a single head of a female D. melanogaster were homogenized using a Wheaton Dounce tissue grinder (DWK Life Sciences, Mainz, Germany) in 2 ml of Galbraith buffer (Galbraith et al., 1983), and then filtered through a CellTrics filter (mesh diameter 30 μm, Sysmex, Kobe, Japan). The nuclear DNA of dissociated cells was stained with propidium iodide (50 μg/ml final concentration) (Invitrogen, Waltham, MA, USA) overnight at 4 ℃. The fluorescence was measured using a FACSCanto flow cytometer and FACSDiva software (BD Biosciences, San Jose, CA, USA) (Nakabachi et al., 2010). The genome size (haploid C-value) was calculated from the relative strength of the fluorescence of M. nipponica and D. melanogaster, Oregon-R (0.18 pg or 176 megabase pairs; Bennett et al., 2003).
We are grateful to the staff of Oshoro Marine Station, Hokkaido University, for permitting us to work in their field and laboratory. Computational resources were provided by the Data Integration and Analysis Facility of the National Institute for Basic Biology and Super Computer Facilities of the National Institute of Genetics. We thank the Open Facility, Global Facility Center, Creative Research Institution, Hokkaido University for use of the FACSCanto flow cytometer. This study was supported by a Grant-in Aid for Scientific Research A (No. 18H04006) from the Ministry of Education, Culture, Sports, Science and Technology of Japan and grants from the Research Institute of Marine Invertebrates, Tokyo, Japan, and the Japan Science Society, Tokyo, Japan.