Transcriptome sequencing and DEG analysis in different developmental stages of floral buds induced by potassium chlorate in Dimocarpus longan

Potassium chlorate can promote off-season flowering in longan, but the molecular mechanisms are poorly understood. In this study, four-year-old ‘Shixia’ longan trees were injected in the trunk with potassium chlorate, and terminal buds were sampled and analyzed using transcriptomics and bioinformatics tools. To generate a reference longan transcriptome, we obtained 207,734 paired-end reads covering a total of 58,514,149 bp, which we assembled into 114,445 unigenes. Using this resource, we identified 3,265 differentially expressed genes (DEGs) that were regulated in longan terminal buds in response to potassium chlorate treatment for 2, 6 or 30 days, including 179 transcription factor genes. By reference to the Arabidopsis literature, we then defined 38 longan genes involved in flowering, from which we constructed the longan flowering pathway. According to RNA-seq data, at least 24 of these genes, which participate in multiple signaling pathways, are involved in potassium chlorate-stimulated floral induction, and the differential regulation in terminal buds of ten floral pathway genes ( GI , CO , GID1 , GA4 , GA5 , FLC , AP1 , LFY , FT and SOC1 ) was confirmed by qRT-PCR. These data will contribute to an improved understanding of the functions of key genes involved in longan floral induction by potassium chlorate.


Introduction
In annual and biennial plants, various genes have been shown to regulate floral induction (FI). For example, in Arabidopsis thaliana, a light signal activates FLOWERING LOCUS T (FT) transcription in leaf vascular tissue (phloem), resulting in the transfer of FT protein to the shoot apical meristem, where it initiates expression of further floral genes leading to flowering (Corbesier et al. 2007). In this way, most or all aboveground meristems are converted into floral meristems. A number of factors modulate this process, including environmental conditions and plant hormones (Bernier and Périlleux 2005;Johansson and Staiger 2015).
The genetic and physiological regulation of FI in perennial fruit trees is more complex than in annual plants, because it is a quantitative process in which only some of the meristems are converted into floral meristems; many remain vegetative and give the tree its typical perennial nature. This trait is thought to depend on physiological processes that protect particular meristems from FI by floral promoting substances (Bangerth 2006;Wilkie et al. 2008). For a few subtropical fruit tree species, it has become possible to induce flowering artificially by the application of chemicals, although the regulatory mechanisms behind this process may be rather different (Bonhomme et al. 2000;Huang et al. 2021;Li et al. 2016). However, these techniques are still not fully developed and require further research and refinement.
Longan (Dimocarpus longan Lour.) is an important tropical/subtropical fruit in southern China and Thailand. Farmers have focused on how to improve the quality and quantity of longan fruit produced, but the most important factor influencing yields is the irregular timing and degree of flowering; this depends on chilling, which is difficult to control (Suttitanawat et al. 2012;Thunyarpar 1998). Potassium chlorate (KClO 3 ) treatment of the root or foliage can promote FI in longan at almost any time of the year (Manochai et al. 2005;Subhadrabandhu and Yapwattanaphun 2000). Consequently, potassium chlorate treatment is very widely applied in longan production in China and Thailand, and has resulted in a remarkable improvement in yields . To further improve longan production, researchers have focused increasing attention on the mechanism of FI by potassium chlorate.
Potassium chlorate treatment is surprisingly efficient in that it partially overcomes not only the requirement for low temperature, but also other environmental "essentials" for FI in longan (Blanchard and Runkle 2008). This unique situation makes longan a useful "model tree" and provides an excellent opportunity for scientists to study the physiological mechanisms of transition from vegetative to floral bud development. This should improve our knowledge of the mode of action of KClO 3 in longan and the control of FI in perennial fruit trees in general.
Many researchers have attempted to uncover the relationship between FI and hormonal changes, both temporal and spatial, in the tissues of longan trees following KClO 3 application (Hegele et al. 2008(Hegele et al. , 2004Potchanasin et al. 2009aPotchanasin et al. , 2009b. Thus, the concentrations of gibberellins (GAs), auxin (IAA), and cytokinins (CKs) in the terminal buds (shoot tips) and leaves treated with KClO 3 have been examined and compared with untreated control trees (Susawaengsup et al. 2011). One important result obtained was that increased CK levels and decreased GA levels are needed for the transition from vegetative to floral bud development (Hegele et al. 2008(Hegele et al. , 2004. The role of auxin remains unclear, however, although the relationship between CKs and auxin is likely to be important (Hegele et al. 2004). KClO 3 treatment enhanced the expression of DlFT1 in mature leaves with an increasement of CK content (Winterhagen et al. 2020).
Some experiments on the mechanism of potassium chlorate induction of flower initiation have been conducted, but mostly at the physiological level, with unconvincing outcomes (Hegele et al. 2008(Hegele et al. , 2004Huang et al. 2006;Potchanasin et al. 2009aPotchanasin et al. , 2009bSritontip et al. 2013). Thus, more research is required at the molecular level. Next-generation sequencing techniques have dramatically improved the efficiency of gene discovery, and these have been applied to longan in a small number of studies. In particular, RNA-seq technology allows for the determination of genome-wide expression levels as well as the identification of novel transcripts and isoforms. Lai and Lin (2013) generated a global transcriptome from longan embryogenic callus by high-throughput Illumina RNA sequencing, and analyzed functions, classification and metabolic pathways of the resulting unigenes using bioinformatics. Jia et al. (2014) investigated genes associated with continuous flowering of the 'Sijimi' longan cultivar, which flowers throughout the year, using RNA-seq, and identified 539 potentially flowering-related unigenes, including SHORT VEGETATIVE PHASE (SVP), GIGATEA (GI), F-BOX 1 (FKF1) and EARLY FLOWERING 4 (ELF4).  identified a large number of genes related to the four known flowering pathways and floral integrator genes. Using an earlier approach, Matsumoto described 65 unique longan genes identified by suppression subtractive hybridization (SSH) and confirmed some of these to be differentially expressed in vegetative buds or in floral buds induced by KClO 3 (Matsumoto 2006). Despite this research, data on the molecular mechanisms of FI by potassium chlorate are still lacking, for which further sequencing of the longan transcriptome and identification of the genes that respond to potassium chlorate are needed.
In this study, we used four-year-old 'Shixia' longan trees, which were treated by injection of potassium chlorate into their trunks. Following treatment, terminal buds were sampled and analyzed by transcriptomics and bioinformatics to identify key candidate genes involved in FI by potassium chlorate. Some of these genes were analyzed for their spatio-temporal expression patterns.

Longan material and treatments
Experiments were conducted in September 2014 (the flowering off-season) with four-year-old 'Shixia' cultivar longan plants, as 'Shixia' was widely cultivated in China. One hundred trees were grown in 20 l pots in a 50 : 50 sand and loess-loam mixture; each pot was watered with an automatically regulated trip-irrigation system, which provided a constant water supply. All longan plants were randomly divided into two groups: treatment and controls. Plants were randomly allocated to three sample areas for each group, denoted K-2d, K-6d and K-30d for the treatment group, and C-2d, C-6d, C-30d for the control group. In the treatment group, flowering was induced by injection into the trunk of 8 g potassium chlorate per plant. The same volume of clean water was similarly administered to the control group. Flowering comes after about 30 days after KClO 3 treatment. In order to explore its response to potassium chlorate in the early, middle and late stages, three key timepoints after the commencement of treatment: day 2, day 6 and day 30 were selected ( Figure 1). Terminal bud tissues were randomly collected from five plants (five buds per plant) of each group. All materials from each sample area were divided into three parts, frozen in liquid nitrogen immediately and stored at −80°C for later use.
Total RNA was extracted from 100 mg terminal bud tissue using Trizol Reagent (Invitrogen, USA), and RNA integrity was checked by gel electrophoresis. The quality and quantity of all RNA samples were evaluated using an Agilent 2100 Bioanalyzer (Agilent Technologies, Santa Clara, CA) and a Nanodrop ND-1000 (Thermo Scientific, Waltham, MA). All RNAs from the same sample area were mixed (equivalent quantities) and divided into 3 parts (equivalent quantities) for later use.
We first constructed a cDNA transcriptome sequencing library as a reference library. Total RNA from each sample was divided into two portions. The first portions of each RNA sample were mixed (equivalent quantities) and then pooled for cDNA sequence library construction. The remaining portions of each sample were used to construct six DEG sequencing libraries.

Library preparation for transcriptome sequencing and DEG sequencing
Sequencing libraries were generated using the NEBNext ® Ultra ™ RNA Library Prep Kit for Illumina ® (NEB, USA) following manufacturer's recommendations; index codes were added to attribute sequences to each sample. Briefly, mRNA was purified from total RNA using oligo(dT) magnetic beads. First strand cDNA was synthesized using random hexamer primers and M-MuLV Reverse Transcriptase (RNase H). Second strand cDNA synthesis was subsequently performed using DNA Polymerase I and RNase H. Remaining overhangs were converted into blunt ends using exonuclease/polymerase activities. After adenylation of 3′ ends of DNA fragments, NEBNext Adaptors, which have a hairpin loop structure, were ligated to prepare for hybridization. To preferentially select cDNA fragments 150-200 bp in length, the library fragments were purified with the AMPure XP system (Beckman Coulter, Danvers, MA). Then 3 µl USER Enzyme (NEB, USA) was used with size-selected, adaptor-ligated cDNA at 37°C for 15 min followed by 5 min at 95°C before PCR. PCR was performed with Phusion High-Fidelity DNA polymerase, Universal PCR primers and Index (X) Primer. Finally, PCR products were purified (AMPure XP system) and library quality was assessed using an Agilent Bioanalyzer 2100.
Clustering of index-coded samples was performed on a cBot Cluster Generation System using the TruSeq PE Cluster Kit v3-cBot-HS (Illumina) according to the manufacturer's instructions. After cluster generation, the library preparations were sequenced on an Illumina Hiseq 2000 platform and paired-end reads were generated.

Data analysis and quality control
Raw data (raw reads) in fastq format were first processed through in-house perl scripts. In this step, clean reads were obtained by removing reads containing adapter sequences, reads containing poly(N) and low quality reads (Supplementary Figure S1), and Q20, Q30 and GC-content were calculated (Supplementary Table S1). All downstream analyses were based on clean data of high quality.

Transcriptome assembly and annotation of gene function
The high-quality clean reads were assembled using Trinity (Grabherr et al. 2011). The filtered transcripts were annotated using BLASTx alignment (E-value≤1.0e-5) to the protein databases Nr (http://www.ncbi.nlm.nih.gov), Nt (http://www. ncbi.nlm.nih.gov), COG (http://www.ncbi.nlm.nih.gov/COG), Swiss-Prot (http://www.expasy.ch/sprot) and KO (KEGG Ortholog database). Protein function was predicted based on features of the best BLASTx hits from the alignments. Sequence orientations were determined according to the best hit in the database. The Blast2GO program was used for functional annotation of the transcripts using the Gene Ontology GO (http://www.geneontology.org) (Conesa et al. 2005). After obtaining a GO annotation for every transcripts WEGO software was used to classify GO function (Ye et al. 2018). The HMMER3.0 program was used to annotate the transcripts using the Pfam (Protein family) database (http://pfam.sanger. ac.uk/).

GO enrichment and KEGG pathway enrichment analyses
GO enrichment analysis of DEGs was implemented using the GO-seq R packages based on a Wallenius non-central hypergeometric distribution (Young et al. 2010), which can adjust for gene length bias in DEGs. KEGG is a database resource for understanding the high-level functions and utilities of a biological system (Kanehisa et al. 2016), such as the cell, the organism and the ecosystem, from molecular-level information, especially large-scale molecular datasets generated by genome sequencing and other high-throughput experimental technologies (http://www.genome.jp/kegg/). We used KOBAS software to test the statistical significance of the enrichment of DEGs in KEGG pathways (Mao et al. 2005).

Quantification of gene expression levels and differential expression analysis of six DEG libraries
The clean sequencing reads were mapped back onto the assembled transcriptome using SOAPaligner/soap2 (Li et al. 2009). Reads that could be uniquely mapped to a gene were used to calculate expression levels. Gene expression levels for six DEG libraries were estimated according to the RPKM method (reads per kb per million mapped reads) using the following equation: RPKM= 10 6 C/(NL/10 3 ) (Mortazavi et al. 2008). In the equation, C is the number of reads uniquely mapped to a given gene, N is the number of reads uniquely mapped to all genes, and L is the total length of the exons in the given gene. When the gene had more than one alternative transcript, the longest transcript was selected to calculate the RPKM. RPKM values can be used for comparing the difference of gene expression among samples.
DEGs were identified by comparison of control and treatment samples at the same timepoints (C-2d vs. K-2d, C-6d vs. K-6d, and C-30d vs. K-30d) using the DEGseq R package ) based on the method described by Audic and Claverie (1997). In this study, we used FDR≤0.001 and an absolute value of log2(ratio)≥1 (The ratio is the expression levels of potassium chlorate/control) as the threshold for differential expression (Storey and Tibshirani 2003).

Construction of Arabidopsis and longan floral pathways
The Arabidopsis flowering pathway was constructed by integrating multiple Arabidopsis flowering pathway genes reported in publications and databases. In total, 42 flowering pathway genes (Supplementary Table S2) were collected from publications (Amasino 2004;Amasino and Michaels 2010;Boss et al. 2004;Mutasa-Göttgens and Hedden 2009;Shen et al. 2014) and two databases (http://www.phytosystems.ulg. ac.be/florid/, and http://www.wikipathways.org/index.php/ Pathway:WP2312). Most pathway genes were supported by two or more reports. These genes were classified into six categories (photoperiod, autonomous, vernalization, GA, FLC clade and floral integration) according to their reported functions. A manually drawn map of the Arabidopsis flowering pathway was then developed (Supplementary Figure S2).
We identified 38 longan genes corresponding to most of the 42 Arabidopsis floral pathway genes (Supplementary Table S3) by searching Arabidopsis gene names, descriptions of function and important domain names in longan unigene annotations (including information in the Nt, Nr, KO, Swiss-Prot, PFAM and KOG databases). These 38 genes were then used to build the longan flowering pathway.

Real-time RT-PCR analysis
To validate the RNA-seq data, quantitative real-time PCR (qRT-PCR) was performed with total RNA samples from relevant timepoints. Two-step qRT-PCR was performed with a SYBR PremixEx Taq II kit (Takara, Dalian, China) and a Light Cycler 480 instrument (Roche) equipped with Light-Cycler software v. 1.5 (Roche) based on the manufacturer's instructions. Primers were designed using Primer v. 5.0 (Premier Biosoft International, Palo Alto, CA) based on the original unigene sequences used to design probes. Amplification reactions were conducted in a total volume of 20 µl. Cycling parameters were as follows: 95°C for 30 s, then 40 cycles of 95°C for 5 s and 60°C for 34 s. The actin gene was used as an internal reference to calculate relative transcript abundance using the 2 −∆∆Ct method. All real-time PCR reactions were technically repeated three times. Specific primers for these genes were designed using DNAMAN 6.0 and their sequences are listed in Supplementary  Table S4.

Assembly and annotation of reference and KClO 3treatment transcriptomes from longan terminal buds
Four-year old 'Shixia' longan (Dimocarpus longan) trees were treated with potassium chlorate and terminal buds were harvested after 2, 6 and 30 d; equivalent samples were also taken from an untreated control group at the same timepoints. We first generated a reference transcriptome after pooling equal proportions of the total RNA from all samples, both experimental and control, and preparing a sequencing library. The Illumina Hiseq 2000 platform was used to generate 58,514,149 high-quality paired-end reads (Supplementary Table  S1), which were assembled into 207,734 transcripts (Supplementary Table S5). From these, gap filling was used to generate 114,445 unigenes in various size categories, and these were annotated using the NR, GO, SwissProt, PFAM, NT, KOG and KO databases (Supplementary Table S6). The annotated unigenes were used as the reference transcriptome in the following analysis.
In parallel, sequencing libraries were also prepared from each timepoint of both the treatment and control groups (Supplementary Table S1). These were used subsequently to identify differentially expressed genes (DEGs) that are likely regulated by potassium chlorate treatment (see below).

Variations in gene expression modulated by potassium chlorate
Longan genes regulated by potassium chlorate (DEGs) were identified by comparison with mock-treated plants for three different time points, i.e. 2 d, 6 d and 30 d of treatment, as described in Materials and Methods. At 2 d, 1,004 genes were upregulated by potassium chlorate treatment, while 241 genes were downregulated; at 6 d, 971 genes were upregulated, while 1,275 genes were downregulated; at 30 d, 284 genes were upregulated, while 213 genes were downregulated. A total of 3,265 longan genes responded to potassium chlorate treatment, of which 538 were regulated at both the 2 d and 6 d timepoints; 114 genes were regulated at both the 2 d and 30 d timepoints; 137 genes were regulated at both the 6 d and 30 d timepoints; and 67 genes were regulated at all three timepoints (Figure 3).

GO and KEGG classifications of genes modulated by potassium chlorate
To evaluate their potential function, DEGs were grouped into GOseq functional categories (Figure 4). For those DEGs apparent by day 2 of potassium chlorate treatment, the predominant categories were metabolic process (744 DEGs) and cellular process (737 DEGs) of the biological process ontology, and cell (512 DEGs) and cell part (512 DEGs) of the cellular component ontology. For day 6, the main categories were again metabolic process (1,359 DEGs) and cellular process (1,287 DEGs) of the biological process ontology, and cell (873 DEGs) and cell part (873 DEGs) of the cellular component ontology. By day 30, a different pattern was observed involving fewer DEGS: the predominant categories were regulation of metabolic process (74 DEGs) and electron transport (24 DEGs) of the biological process ontology, transcription factor complex (45 DEGs) and cell wall (19 DEGs) of the cellular component ontology, and nucleic acid binding transcription factor activity (42 DEGs) and sequencespecific DNA binding transcription factor activity (42 DEGs) of the molecular function ontology.

Construction of longan flowering pathway
To better explore the molecular mechanisms of FI by potassium chlorate, we constructed the longan flowering pathway after identifying 38 longan genes with homologs in the Arabidopsis flowering pathway (Supplementary  Table S3). These genes were chosen by manual filtering steps based on the following factors: i) whether genes had a length greater than 500 bp; ii) whether the gene contained the key domains of the matched Arabidopsis gene; and iii) whether the genes were differentially expressed. We further attempted to reveal the main factors involved by analyzing the change in expression of genes in the deduced flowering pathway. Figure 5 shows the change in expression (according to the RNA-seq data) of 38 longan flowering pathway genes after potassium chlorate treatment for 2 d, 6 d and 30 d (red: upregulated, fold change≥2; green: downregulated, fold change≤0.5; blue: no change, 0.5<fold change<2; white: no unigene matched). We found that multiple genes in the flowering pathway were differentially regulated in apical bud after potassium chlorate treatment: 24 of the 38 genes in the flowering pathway were upregulated or downregulated at one or more timepoints (Supplementary Table S3). These  , GI and CO) and seven genes (CRYPTOCHROME 2 (CRY2), CIRCADIAN CLOCK ASSOCIATED 1 (CCA1), ZEITLUPE (ZTL), LHY, FKF1, GI and CO) were regulated differentially after 2 d and 6 d of treatment, respectively, but only GI was differentially expressed on day 30. Only one DEG was found in the autonomous pathway: the FY gene was downregulated at the 6 d timepoint. In the vernalization pathway, VERNALIZATION INSENSITIVE 3 (VIN3) was upregulated at the 2 d and 6 d timepoints, while FRIGIDA (FRI) was downregulated after 30 d. The FLC gene, which is the key negative regulatory factor of the autonomous and vernalization pathways, was downregulated at the 2 d and 30 d timepoints, but upregulated at day 6. Unlike the previous three pathways, the expression of almost half the GA pathway genes were modulated at all three timepoints. Thus, seven genes (GA1, GA4, GA5, ABSCISIC ACID-INSENSITIVE 5 (ABI5), REPRESSOR OF GA1-3-LIKE 1 (RGL1), SCFSL Y1 (SLY1) and GAI), four genes (GA5, ABI5, RGL1, GID1) and six genes (GA4, GA5, RGL1, GID1, SLY1 and GAI) were differentially expressed on day 2, 6 and 30, respectively. Interestingly, four floral pathway integrators, FT, AP1, SOC1 and LFY, showed significant differential expression after 30 d of KClO 3 treatment, but only one DEG in this category appeared at each of the two earlier timepoints (FT at 2 d and LFY at 6 d). These results indicate that genes in the photoperiod, vernalization and GA pathways may all contribute to the process of KClO 3 -induced longan flowering in the off-season.
To test the reliability of the RNA-seq data, we selected 10 floral pathway candidate genes for qRT-PCR validation ( Figure 6). We found that the expression profiles of all selected genes were consistent with the RNA-seq data.

Discussion
Potassium chlorate treatment can promote FI of longan trees and thus can be used to induce flowering in the offseason. Such flower bud induction and differentiation happens in the terminal bud. Earlier studies of the mechanism of potassium chlorate-induced longan flowering usually focused on hormonal changes (e.g. of auxin, cytokinins and gibberellin) in the apical bud, in the leaves around the apical bud, and in bark or wood. Hegele et al. (2008) and Potchanasin et al. (2009a) both used a radioimmunoassay approach to determine that cytokinins play a significant role in KClO 3 -induced off-season FI in longan. Using liquid chromatographyelectrospray ionization-mass spectrometry, Susawaengsup et al. (2011) showed that high GA1, GA3, GA19 and Figure 5. Longan flowering pathway constructed by reference to Arabidopsis flowering pathway. Change in expression of flowering pathway genes after potassium chlorate treatment. Gene names are given within small boxes whose color indicates the change in gene expression (red: upregulated, fold change≥2; green: downregulated, fold change≤0.5; blue: no change, 0.5<fold change<2; white: no unigene matched). Arrows (promotion) and T-bars (repression) indicate regulation. A, B and C show the gene expression patterns on day 2, 6 and 30 after KClO 3 treatment, respectively. GA20 levels in longan shoot tips contribute to flower bud induction. Qiang et al. (2015) determined the optimum level of cytokinins (zeatin) required to promote apical bud flowering in longan, and also found that low levels of auxin (IAA), gibberellins (GA3) and abscisic acid (ABA) facilitate longan flowering (Qiang et al. 2015). In contrast, very few researchers have characterized the molecular mechanisms of KClO 3 -induced flowering in longan by studying gene expression. The only report was published over ten years ago, when Matsumoto used a cDNA array to screen more than 2,000 clones and identified 65 unique genes that are differentially expressed in potassium chlorate-induced vegetative and floral buds (Matsumoto 2006); many of the identified genes had been previously demonstrated to be associated with shoot and floral meristem development in Arabidopsis.
To facilitate investigations of the mechanism of potassium chlorate-induced FI, we performed a de novo assembly and annotation of the longan transcriptome. We also used transcriptome sequencing technology to determine the expression profiles of key transcription factors in longan terminal buds at three timepoints following potassium chlorate treatment. As far as we are aware, this is the first investigation of the molecular mechanisms of KClO 3 -induced longan off-season flowering using whole genome gene expression profiles.

Illumina sequencing and annotation of longan terminal bud reference transcriptome
Illumina sequencing technology is an effective tool for gene discovery in non-model organisms; it has been used in various plant species including longan embryogenic callus (Lai and Lin 2013), Momordica cochinchinensis (Hyun et al. 2012), Hevea brasiliensis (Xia et al. 2011), sweet potato , Salvia miltiorrhiza (Hua et al. 2011) and Chinese bayberry (Feng et al. 2012), among others. In our study, a total of 58,514,149 bp were obtained by Illumina HiSeq 2000 sequencing technology and assembled into a reference transcriptome, generating a total of 114,445 unigenes (≥200 bp) of average length 758 bp. The numbers of reads and assembled unigenes surpassed the previous study in longan embryogenic tissue (Lai and Lin 2013). Of the 114,445 unigenes obtained, 25,973 unigenes were longer than 1,000 bp and 75,599 unigenes were annotated in at least one database (NR, NT, KO, SwissProt, PFAM, GO, KOG), while 10,468 unigenes were annotated in all databases. However, there were many unassembled reads among the longan sequence data, probably due to a number of factors including the difficulty of assembling short sequence fragments, the assembly options chosen, the low levels of expression of some genes, repetitive sequences, alternative splicing.
Almost half (48.87%) of the annotated unigenes were assigned to GO categories, with biological process being the dominant category, followed by cellular component and molecular function. KOG functional annotation of the unigenes revealed that the category "general function prediction associated with basic physiological and metabolic functions" was predominant among the 26 functional categories identified. Similar results were obtained in study of genes associated with continuous flowering of the 'Sijimi' longan cultivar (Jia et al. 2014;Zhang et al. 2016). Using the KEGG database, 28,491 of our unigenes could be annotated and classified. With 45.1% and 23.0% falling into the categories of genetic information processing and metabolism, respectively. We found that many unigenes are involved in starch and sucrose metabolism pathways, amino sugar and nucleotide sugar metabolism pathways, nitrogen metabolism pathway, photosynthesis pathway and plant hormone signal transduction pathway, all of which are active in the terminal bud. These results are consistent with the findings of previous studies (Jia et al. 2014;Zhang et al. 2016). In summary, the longan terminal bud reference transcriptome assembled in this study is comprehensive, accurate and useful for future genetic research, and provides a foundation for further studies of the genetics of FI in longan. In this study, 3,265 longan DEGs, including 179 transcription factors, were identified across three different time points (2 d, 6 d and 30 d). The largest number of DEGs were identified at the 6 d timepoint, with fewer observed at 2 d, and a much smaller number at 30 d. Thus, the maximal physiological response to potassium chlorate occurred after six days.
GO analysis revealed that the DEGs identified at 2 d and 6 d predominantly fall into the metabolic process, cellular process, cell and cell part subcategories. At 30 d, the majority of DEGs can be assigned to the regulation of metabolic process and electron transport, transcription factor complex and cell wall subcategories. Similar results have been obtained in studies of differential gene expression in another Sapindaceae species undergoing floral development (Jia et al. 2014).
KEGG pathway analysis revealed that members of the Photosynthesis-antenna proteins pathway (ko00196) were significantly enriched at all three treatment timepoints. Thus, 9 DEGs and 12 DEGs that encode light-harvesting complex II chlorophyll b binding protein (LHCB) were downregulated at 2 d and 6 d, respectively. In contrast, at 30 d, 5 DEGs encoding LHCB protein were upregulated. These results indicate that potassium chlorate affects photosynthetic rate, which is consistent with previous results at the physiological level (Huang et al. 2006). Nine DEGs mapping to the Plant hormone signal transduction pathway (ko04075), which are involved in the auxin, cytokinin, gibberellin, brassinosteroid and jasmonic acid signal transduction pathways, were significantly enriched at 30 d.
We also looked for transcription factor genes that are regulated by KClO 3 treatment and found seven AUX/IAA genes, three AP2-EREBP-like genes, which belong to the family of ethylene-responsive element binding factors, and one GRAS-like gene, which is a gibberellin signal transduction-related protein, among our set of DEGs. These results align with those of previous studies, which showed that hormonal changes (e.g. involving auxin, cytokinins and gibberellin) are key factors in potassium chlorate-induced longan flowering (Hegele et al. 2008;Huang et al. 2021;Potchanasin et al. 2009a;Qiang et al. 2015;Susawaengsup et al. 2011).

Genes implicated in potassium chlorate-stimulated FI in longan
GA is an important factor in floral regulatory networks and is required for flowering in A. thaliana under short day conditions (Wilson et al. 1992), when this pathway assumes a major role; its function becomes obligatory in the absence of the photoperiod flowering pathway (Mutasa-Göttgens and Hedden 2009). GAs promote the expression of floral integrators CONSTANS 1 (SOC1) (Bonhomme et al. 2000;Moon et al. 2003) and LEAFY (LFY) (Blázquez et al. 1998) by an independent DELLA (GAI/RGA)-mediated pathway either directly, or indirectly via GAMYB (Gocal et al. 1999(Gocal et al. , 2001. Furthermore, GAs might directly regulate FT to induce floral transition at the shoot apex under short day conditions (Hisamatsu and King 2008). Our new results show that the expression patterns of many GA pathway-associated genes, such as GA4, GA5, GID1, SLY and GAI, are modified by KClO 3 treatment. Thus, the GA4 gene (annotated in A. thaliana as gibberellin 3-beta-dioxygenase 1, GA3OX1) is upregulated after 2 and 30 d of KClO 3 treatment. The GA5 gene (annotated in A. thaliana as gibberellin 20 oxidase 2, GA20OX2) is upregulated at the 2 and 6 d treatment timepoints, but downregulated after 30 d. These expression profiles were confirmed by qRT-PCR. Their upregulation might accelerate the biosynthesis of gibberellins and increase GA concentration in the apical bud, which would promote flowering. The GA4 and GA5 genes, both of which are involved in gibberellin biosynthesis, are activated at early timepoints, and GAs are known to induce the floral integrator genes SOC1 and LFY. This inference is consistent with the findings of Susawaengsup et al. (2011), who found significantly higher levels of gibberellin A1 (GA1), gibberellic acid (GA3), gibberellin A19 (GA19) and gibberellin A20 (GA20) in the shoot tips of potassium chlorate-treated longan plants than in controls. In addition, we found that the FT and LFY genes, operating downstream of the GA pathway, were also upregulated on 30 d following treatment (Figure 6). These results suggest that the GA pathway, especially GA4 and GA5, is involved in potassium chlorate-induced longan flowering.
The photoperiod pathway controls the flowering of A. thaliana primarily through the activation of FT, which is regulated by CO Putterill et al. 1995;Suárez-López et al. 2001). Our RNA-seq data show that many genes in the photoperiod pathway, such as PHYB, LHY, GI and CO, are dysregulated after KClO 3 treatment. This was confirmed by qRT-PCR experiments, which showed that CO expression was almost four times hat of controls on day 2 (3.7-fold change; Figure 6). In addition, we found that the gene encoding phytochrome B (PHYB), which delays flowering by suppressing FT expression (Endo et al. 2005;Lazaro et al. 2015), is downregulated 2 d after potassium chlorate treatment, while FT itself is upregulated (1.7-fold change; Figure  6). These results suggest that genes in the photoperiod pathway are also involved in potassium chloratestimulated FI.
Both the autonomous and vernalization pathways promote flowering by downregulating the expression of FLC in Arabidopsis (Simpson 2004;Sung and Amasino 2004). Our RNA-seq data and PCR validation revealed that FLC expression was significantly lower than in controls after 30 d potassium chlorate treatment. Further analysis found that almost no genes in the autonomous pathway showed expression changes, but the expression of several genes in the vernalization pathway, such as FRI and VIN, did change significantly at different timepoints. It is well known that vernalization requires low temperature conditions. Nevertheless, potassium chlorate treatment is able to inhibit FLC expression under natural conditions of high temperature (September in Guangzhou, China). We therefore conclude that potassium chlorate treatment may activate vernalization pathway genes by a route other than low temperature conditions. This is consistent with the results of Potchanasin et al. (2009b), who found that hormonal changes, specifically changes in gibberellin and auxin levels, induced in apical buds by low temperature were different from those induced by high temperature plus KClO 3 .

Conclusions
Potassium chlorate is very widely used to promote FI in longan production. In this study, we analyzed and identified key candidate genes involved in FI after potassium chlorate treatment using transcriptomics and bioinformatics tools: 207,734 paired-end reads were obtained and assembled into 114,445 unigenes, and 3,265 differentially expressed genes, including 179 differentially expressed transcription factor genes, were identified. We constructed the longan flowering pathway using 38 candidate genes and investigated which of these genes were differentially regulated in terminal buds following potassium chlorate treatment. These data will help researchers to identify and understand the functions of key genes involved in longan flower induction by potassium chlorate, and to develop more efficient ways of regulating longan flowering.
There are several limitations in the approach used to investigate the mechanism of flowering in longan. First, because there were no better options, we had to select the A. thaliana flowering pathway for comparison, although Arabidopsis is not closely related to longan. Accordingly, the flowering pathway we constructed for longan only contains highly conserved genes. It is currently unknown whether any longan-specific genes are involved in the flowering process. Second, the Arabidopsis flowering pathway we constructed is probably an oversimplification, because it was produced by a simple integration of key genes and their relationships as reported in the literature. There are likely other genes involved that have not been considered. Third, flowering is an integrated systemic process involving the cooperation of different plant organs and tissues; we only considered the apical bud, because it is the tissue most closely involved in flowering. Finally, when constructing the longan flowering pathway, we tended to choose a differentially expressed longan gene to look for a matching Arabidopsis gene, which may introduce bias. Many of the above problems are expected to be resolved by improved annotation of the longan genome.

Data availability statement
The datasets are deposited in the NCBI SRA database with the accession number SRA403714.