Phylogenetic analysis of the Si7PPO gene in foxtail millet, Setaria italica, provides further evidence for multiple origins of the negative phenol color reaction phenotype

To elucidate the diversity and evolution of the Si7PPO gene, which controls phenol color reaction (Phr) in foxtail millet, Setaria italica, we analyzed sequence polymorphisms of the gene in 39 accessions consisting of foxtail millet landraces (32 accessions) and their wild ancestor ssp. viridis (seven accessions) collected from various regions in Europe and Asia. The accessions included the wild type (positive Phr) and three different types of loss-of-function phenotype (negative Phr) found in our previous study: the “stop codon type”, the “TE1-insertion type” and the “6-bp duplication type”. We constructed a phylogenetic tree of the gene and found that accessions with positive Phr showed higher genetic diversity at the nucleotide sequence level. We also found that the three different loss-of-function types formed different clusters, suggesting that landraces with negative Phr have multiple origins from three different lineages, each of which includes both landrace and ssp. viridis accessions with positive Phr.


INTRODUCTION
Foxtail millet, Setaria italica (L.) P. Beauv., is one of the oldest cereals in the Old World and is thought to have played an important role in ancient civilization as a staple crop (Fukunaga, 2017). This species of millet has been broadly cultivated from East Asian countries including Japan, Korea and China to Western Europe and parts of Africa. The wild ancestor of foxtail millet is thought to be ssp. viridis according to the results of cytological studies (Kihara and Kishimoto, 1942; Li et al., 1945), but the geographical origin of foxtail millet is still a controversial issue because the wild ancestor is broadly distributed in the temperate zone of Eurasia. Some researchers claim that China is the geographical origin of this species, whereas others insist on multiple independent origins including China and other regions (Fukunaga, 2017). Studies on domestication and crop evolution of foxtail millet have been carried out using biochemical and genetic markers (Fukunaga, 2017), as well as some genes involved in domestication and diversification such as the Waxy gene (Fukunaga et al., 2002a; Kawase et al., 2005; Hachiken et al., 2013), the Heading date 1 (HD1) gene and the SiDreb2 gene (Suehiro et al., 2018). Recently, foxtail millet has become a model crop for panicoid grasses with C4 photosynthesis because of its small diploid genome (ca. 500 Mb), small number of chromosomes (2n = 2x = 18) and inbreeding nature (Doust et al., 2009; Li and Brutnell, 2011). Its genome sequences have been determined (Bennetzen et al., 2012; Zhang et al., 2012), thus facilitating genetic and genomic studies on this millet.
Phenol color reaction (Phr) is a coloration of the hulls/lemma and palea (grains) of cereals after soaking in phenol solution, as reported for rice (Oka, 1953; Takahashi and Alterfah, 1983) and barley (Takeda and Chang, 1996), and the molecular basis of Phr in these two crops has been investigated in detail (Yu et al., 2008; Taketa et al., 2010). The positive Phr type shows black coloration after soaking in phenol solution, whereas the negative Phr type does not show coloration. Variation of Phr and the geographical distribution of Phr phenotypes for foxtail millet have been reported (Kawase and Sakamoto, 1982). That study showed that Phr in foxtail millet is controlled by a single gene (positive Phr being dominant and negative Phr being recessive) and that the negative Phr type is predominant in Eurasia, whereas the positive Phr type generally has a skewed distribution toward subtropical and tropical regions including the Nansei Islands of Japan, Taiwan, the Philippines, Nepal and India (21-100%). Positive Phr is also sporadically distributed in East Asia and Europe at a low frequency. In our previous work, we isolated the gene responsible for Phr, Si7PPO, using genome sequence information, and we also investigated the molecular basis of phenotypic change of this trait (Inoue et al., 2015). As a result, we found three major negative Phr genotypes: a "stop codon type" that arose by a single nucleotide substitution in exon 1 resulting in a premature stop codon, a "TE1-insertion type" that has a transposable element insertion in intron 2, and a "6-bp duplication type" that has a 6-bp duplication in exon 3 resulting in a two-amino acid duplication. Of the negative Phr types, 72.8% are classified into the stop codon type, which is distributed broadly in Eurasia, and 25.2% are the TE1-insertion type, which is distributed in the temperate zone. The 6-bp duplication type is very rare (2%) and has only a limited distribution in the Nansei Islands of Japan (Inoue et al., 2015). We concluded that negative Phr originated three times independently. However, the geographical and phylogenetic origins of these three types have not been fully clarified. In this work, we sequenced the Si7PPO gene in a total of 39 accessions, including the wild ancestor and landraces with positive Phr and the three different types of negative Phr, from various locations, and we constructed a phylogenetic tree of the gene. Here we report evidence of multiple origins of the three different negative Phr types.

MATERIALS AND METHODS
We sequenced and compared the entire coding sequences of the Si7PPO gene including three exons and two introns (Fig. 1) of 32 accessions of foxtail millet (nine landraces of foxtail millet with positive Phr and 23 landraces with negative Phr) and seven accessions of S. italica ssp. viridis. Of the 23 landraces with negative Phr, 17 were stop codon type, three were TE1-insertion type and three were 6-bp duplication type (Table 1). Of the 39 accessions, the Yugu1 sequence was obtained from the Setaria genome database (SiGDB, http://www.plantgdb.org/SiGDB/; Bennetzen et al., 2012) and 24 were newly sequenced in the present study. Fourteen had already been sequenced in a previous study (Inoue et al., 2015). Seeds of all of the foxtail millet landraces were provided by the NARO Genebank, Japan (https://www.gene.affrc.go.jp/index_en.php) and all of the ssp. viridis accessions except PU67 were provided by a USDA genebank (https://data.nal.usda.gov/dataset/grin-globalproject). PU67 was collected directly in the field by the first author.
The primers used in this study are shown in Fig. 1 and Supplementary Table S1. For wild type, stop codon type and 6-bp duplication type, the primer pair PPONewF3 and PPONewR3 was used, and for the TE1-insertion type the primer pair PPONewF3 and transR1 and the primer pair transF5 and PPOnewR3 were used for amplification. PCR was carried out using Toyobo KOD FX according to the supplier's instructions, and PCR products were purified using the Wizard SV Gel and PCR Clean-Up System (Promega) or NucleoSpin Gel and PCR Clean-up (Takara). Sequencing reactions were carried out using the BigDye Terminator v3.1 Cycle Sequencing Kit (Applied Biosystems). The TE1 sequence in the TE1-insertion type was removed before alignment with other types. Alignment of the Si7PPO gene sequences from the start codon in exon 1 to the stop codon in exon 3 and construction of a phylogenetic tree were carried out using CLUSTAL W (https://clustalw.ddbj.nig.ac.jp/) and MEGA X (Kumar et al., 2018). The neighbor-joining (NJ) method was used for tree construction with 1,000 bootstrap replicates. Nucleotide diversity (π) (Nei and Li, 1979) was also calculated using MEGA X.
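As an illustration of the nucleotide diversity statistic, the following minimal Python sketch computes π as the mean proportion of differing sites over all pairs of aligned sequences. It is not the MEGA X implementation: the function names, the toy alignment and the decision to skip gap positions in each pairwise comparison are assumptions made here for illustration only.

```python
from itertools import combinations


def pairwise_difference(seq_a, seq_b):
    """Proportion of differing sites between two aligned sequences.

    Assumption: positions with a gap ('-') in either sequence are skipped.
    """
    compared = 0
    differences = 0
    for base_a, base_b in zip(seq_a, seq_b):
        if base_a == "-" or base_b == "-":
            continue
        compared += 1
        if base_a != base_b:
            differences += 1
    return differences / compared if compared else 0.0


def nucleotide_diversity(sequences):
    """Mean pairwise difference per site over all pairs of sampled sequences."""
    pairs = list(combinations(sequences, 2))
    if not pairs:
        return 0.0
    return sum(pairwise_difference(a, b) for a, b in pairs) / len(pairs)


# Hypothetical toy alignment (not data from this study).
toy_alignment = [
    "ACGTACGTAC",
    "ACGTACGTAC",
    "ACGAACGTAC",
    "ACGAACGTTC",
]
pi = nucleotide_diversity(toy_alignment)
print(round(pi, 4))
```

A group of identical sequences gives π = 0 under this definition, which is the situation reported below for the TE1-insertion and 6-bp duplication types.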

RESULTS AND DISCUSSION
Polymorphism and genetic diversity of the Si7PPO gene
A total of 1,961-1,967 bp for the Si7PPO gene were sequenced and registered in DDBJ (LC008429-LC008442, LC522914-LC522934, LC536026-LC536031). The alignment is shown in Supplementary Fig. S1 and summarized in Table 2. In addition to a stop codon mutation in exon 1 and a 6-bp duplication in exon 3, which are responsible for negative Phr (Inoue et al., 2015), we found 21 single-nucleotide polymorphisms (SNPs) in the gene. Four SNPs were in introns (two in intron 1 and two in intron 2) and 17 in exons. Eight mutations in the exons were non-synonymous and nine were synonymous. For positive Phr, we found 21 SNPs in the genes of seven accessions of ssp. viridis and nine landraces of foxtail millet (11 SNPs in ssp. viridis and 12 in the nine landraces) (Tables 2 and 3). On the other hand, negative Phr types were less polymorphic. Although a total of 11 SNPs were found in 23 accessions of negative Phr, each genotype of Phr was almost monomorphic (Tables 2 and 3). We used 17 accessions of the stop codon type with negative Phr from various parts of Eurasia (Japan 7, China 2, Taiwan 1, India 1, Pakistan 1, Afghanistan 1, Turkey 2, Kyrgyzstan 1, Ukraine 1), but we found only one SNP within this type. Three accessions of the TE1-insertion type with negative Phr from geographically remote localities (Japan, Kyrgyzstan, ex-Czechoslovakia) were used for analysis, but their Si7PPO sequences were completely identical. Three accessions of the 6-bp duplication type from the Nansei Islands of Japan were used in this study, but these three were also completely identical (Tables 2 and 3).
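The kind of per-group SNP tally described above can be sketched as a scan for alignment columns that contain more than one base. The sketch below is a hedged illustration in Python, not the procedure used in this study: the function name, the group labels and the toy sequences are hypothetical, and gap characters are ignored so that indels are not counted as SNPs.

```python
def variable_sites(alignment):
    """Return 0-based column indices at which more than one base is observed.

    Assumption: gap characters ('-') are ignored, so indels are not counted.
    """
    sites = []
    for position, column in enumerate(zip(*alignment)):
        bases = {base for base in column if base != "-"}
        if len(bases) > 1:
            sites.append(position)
    return sites


# Hypothetical toy groups (not data from this study).
toy_groups = {
    "positive": ["ACGTACGTAC", "ACGAACGTAC", "ACGAACGTTC"],
    "negative": ["ACGTACGTAC", "ACGTACGTAC", "ACGTACGTAC"],
}

# Number of polymorphic sites within each group, and sites variable overall.
snp_counts = {name: len(variable_sites(seqs)) for name, seqs in toy_groups.items()}
all_sites = variable_sites([seq for seqs in toy_groups.values() for seq in seqs])
print(snp_counts, all_sites)
```

Counting within groups and across the pooled sample separately mirrors the distinction drawn in the text between the total number of SNPs and the number segregating within each Phr type.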
Nucleotide diversity (π) for all of the 39 accessions was 0.002. Nucleotide diversity was 0.003 for positive Phr landraces, 0.002 for the wild ancestor, almost 0.000 for the stop codon type and 0.000 for both the TE1-insertion type and the 6-bp duplication type (Table 3). Because the negative Phr types carry almost no sequence variation whereas positive Phr accessions are diverse, these results indicate that the positive Phr type is older than the negative Phr types and that negative Phr originated after domestication.
Phylogeny of the three different negative Phr types
A phylogenetic tree constructed on the basis of the Si7PPO gene is shown in Fig. 2. We divided the tree into three clades, clades Ia, Ib and II (Fig. 2). As expected from genetic diversity data, positive Phr accessions were basal to negative Phr accessions in each clade, also indicating that negative Phr originated after domestication. Wild accessions (ssp. viridis) and landraces with positive Phr were included in each clade of the tree and were basal to each of the three different negative Phr genotypes.
In clade Ia, an accession of ssp. viridis from China (PI408810) was basal to a Korean accession with positive Phr (JP222746), and these two accessions were basal to 17 landraces of the stop codon type.
In clade Ib, three Indian accessions (JP222980, JP222981 and JP222925) and a ssp. viridis accession from Japan (PU67) with positive Phr were basal to all three accessions of the TE1-insertion type with negative Phr, as were two European accessions with positive Phr (JP222998 and JP222999); a ssp. viridis accession from Chile (PI202407) was also included in this latter subclade.
In clade II, three ssp. viridis accessions (one each from Turkey, Iran and China) and a Nepalese accession with positive Phr (JP225339) were basal to two Taiwanese positive Phr accessions (JP222588 and JP222567) and to all three accessions of the 6-bp duplication type from the Nansei Islands of Japan (JP71641, JP222652 and JP222668).
Multiple origins of negative Phr type
Several hypotheses have been proposed for the origin of foxtail millet including a monophyletic hypothesis in China, a monophyletic hypothesis in Central Asia-Afghanistan-India and polyphyletic hypotheses (Fukunaga, 2017). Based on the Si7PPO gene, it is more likely that landraces of foxtail millet with the positive Phr type were domesticated independently from different genotypes of its wild ancestor, S. italica ssp. viridis, and that the negative Phr type originated from different genotypes of foxtail millet with positive Phr (Figs. 2 and 3). As shown in clade Ia in Fig. 2, and in Fig. 3, one accession with positive Phr from Korea was basal to all of the accessions of the stop codon type with negative Phr, suggesting that foxtail millet with positive Phr was domesticated and that the stop codon type then originated in East Asia and spread all over Eurasia including East Asia, India, Central Asia and Western Asia. Interestingly, even in Taiwan and India, where the positive Phr type is distributed at relatively high frequencies (Kawase and Sakamoto, 1982; Inoue et al., 2015), accessions with the negative Phr type are phylogenetically distinct from those with the positive Phr type based on Si7PPO gene sequences.
As shown in clade Ib in Fig. 2, it seems that accessions with positive Phr such as those in India and Europe were domesticated somewhere in Eurasia; the TE1-insertion type then arose and spread to the temperate zone in Eurasia (Fig. 3).
Regarding clade II, Nepalese and Taiwanese landraces form a cluster with ssp. viridis from Turkey, Iran and China (Fig. 2), and the 6-bp duplication type with negative Phr subsequently originated in the Nansei Islands of Japan (Fig. 3). A close relationship between Nepalese and Taiwanese accessions has been supported by phylogenetic analysis using TD (transposon display) markers (Hirano et al., 2011). A very close relationship between accessions from Taiwan and those from the Nansei Islands has also been supported by analyses using TD markers (Hirano et al., 2011), hybrid pollen sterility (Kawase and Sakamoto, 1987), rDNA (Eda et al., 2013) and RFLP (Fukunaga et al., 2002b). Together with these findings, the results of this study strongly suggest that the 6-bp duplication type originated in the Nansei Islands after introduction of the Taiwanese positive Phr type.
Positive Phr is commonly found in wild species, but negative Phr is sometimes predominant in cultivated species such as foxtail millet and japonica rice. This is probably because polyphenol oxidase (PPO) activity is not advantageous under cultivation (Yu et al., 2008; Inoue et al., 2015). PPO may be necessary for wild species during seed dormancy in the soil to protect the seeds from microorganisms, but it is no longer required for cultivated species (Inoue et al., 2015).
Although it seems that foxtail millet was domesticated polyphyletically based on the results of Si7PPO gene sequencing, genome-wide TD analysis (Hirano et al., 2011) indicated that domestication was monophyletic, and intraspecific hybrid sterility (Kawase and Sakamoto, 1987) indicated a clear reproductive barrier between geographically separated foxtail millet landraces. Introgression of the Si7PPO gene from S. italica ssp. viridis into foxtail millet landraces at an early stage after domestication is also a possible explanation of the results, as it has been reported that introgression played an important role in rice domestication and diversification (Huang et al., 2012). For rice (Oryza sativa), Ichitani et al. (2016) reported that phylogeny based on DNA markers linked to the ortholog of Si7PPO (the Phenol staining (Ph) locus in rice) differs from that based on markers covering the whole genome. One possible reason is that the Ph gene was linked to resistance genes against pests and diseases such as blast, bacterial blight and gall midge, and such genes may also have been selected in the long history of foxtail millet cultivation. Analysis of other domestication-related genes and genome-wide phylogenetic studies are needed to further clarify the evolution and domestication of foxtail millet.