Genetic variation and structure of Ubame oak , Quercus phillyraeoides , in Japan revealed by chloroplast DNA and nuclear microsatellite markers

Ko Harada*, Fifi Gus Dwiyanti, Huan-Zhen Liu, Yasunori Takeichi, Narumi Nakatani and Koichi Kamiya Faculty of Agriculture, Ehime University, 3-5-7 Tarumi, Matsuyama, Ehime 790-8566, Japan United Graduate School of Agriculture, Ehime University, 3-5-7 Tarumi, Matsuyama, Ehime 790-8566, Japan Department of Silviculture, Faculty of Forestry, Bogor Agricultural University (IPB), IPB Dramaga Campus, Bogor 16680, Indonesia State Key Laboratory of Tree Genetics and Breeding, Northeast Forestry University, 26 Hexing Road, Xiangfang District, Harbin 150040, China Graduate School of Agriculture, Ehime University, 3-5-7 Tarumi, Matsuyama, Ehime 790-8566, Japan


INTRODUCTION
Population genetic analysis using molecular mark-ers and assisted by recent advances in computationally intensive statistical methods has provided us with a deeper understanding of the phylogeographic pattern and demographic history of natural populations for a wide range of taxa.This has given rise to new research fields, such as molecular ecology, and has become the theoretical background of practical fields, such as con-servation genetics and landscape genetics (Holderegger et al., 2006).Population genetic techniques have also been applied to forest tree populations in various climate classes in the world.Among them, microsatellite markers are the most commonly used technique because of their codominant nature and the large number of available loci with abundant variation (Balloux and Goudet, 2002).Microsatellites are essentially neutral and the allele frequency changes are based on purely stochastic processes (Li, 1997).Thus, the markers have great potential to detect genetic structures shaped by stochastic demographic processes, such as population expansion, colonization, isolation and reunion, that are associated with historical climate changes.This can often be useful for predicting potential natural vegetation (Tüxen, 1956), as well as setting outlines for seed and seedling transfer zones to define conservation units and management units (Manel et al., 2003).
The Japanese archipelago extends approximately 3,000 km from the north to the south in a narrow arc with a maximum width of 250 km.Present-day vegetation in Japan, like other temperate climate forests worldwide, has been largely influenced by past climate changes, especially the repeated ice ages in the late Quaternary period (the last one million years) (Hewitt, 2000).At the last glacial maximum (LGM) at about 20,000 years ago, annual temperatures decreased 6-8 °C and the sea level dropped 80-140 m below the present level in the temperate zones in East Asia (Yonekura et al., 2001).This has had a profound effect on present-day vegetation: vegetation shifted toward the south or lower altitudes and often became isolated into small refugia (Comes and Kadereit, 1998).In the course of Earth's warming since the LGM, northward and higher altitudinal vegetation range shifts have occurred (Comes and Kadereit, 1998).These processes must have been accompanied by population expansion and founding events during a series of colonizations, consequently altering genetic variation and structure.
Forest in Japan is largely divided into two types, the temperate broad-leaved deciduous forests in the north and the warm-temperate broad-leaved evergreen forests in the south.The transition zone occurs in eastern Honshu, with an isotherm of 85 °C on the warmth index (Kira, 1948).The broad-leaved evergreen forests contain many plant elements common to eastern China and Taiwan, which constitute the Sino-Japan floristic region (Kitamura, 1957).These forests are sometimes called lucidophyllous forests -forests with shiny leaves (Kira, 1977).Ubame oak, Quercus phillyraeoides A. Gray, is a short shrubby evergreen oak that grows on the Pacific coasts of Honshu from the Boso Peninsula in the northeast to the south on Izena Island, Okinawa.It is also abundant along the coasts of the Seto Inland Sea in western Japan, which separates Kyushu and Shikoku from Honshu (Fig. 1) (Horikawa, 1972), as well as in southern and eastern China below 34 °N (Xie et al., 2011).It has small hard leaves with slight serration and grows on dry, rocky, windy hills facing oceans.Quercus phillyraeoides often forms pure stands in its most favorable habitats, but commonly grows with other shrubby oceanic tree species, such as Pittosporum tobira, Rhaphiolepis indica and Eurya emarginata, and forms one of the typographic forest communities of lucidophyllous forests called, in socio-botanical terms, "Pittosporeto-Quercetum phillyraeoidis Suz.-Tok.et Hatiya 1951" (Miyawaki et al., 1983).Quercus phillyraeoides is the only species that belongs to the Ilex group (evergreen oaks in the subgenus Quercus) in Japan, whereas in China, at least 21 species, mostly endemic, are known in this group (Yang et al., 2017).The Chinese oaks in Ilex are further divided into three morphology-based sections (i.e., Heterobalanus, Englariana and Echinoilepides) (Pu et al., 2002;Denk and Grimm, 2009).Quercus phillyraeoides belongs to the section Englariana with six other species.A molecular genetic study revealed that Q. phillyraeoides is the sister species of Q. acrodonta in the section Englariana (Yang et al., 2017).These findings strengthen the inference that Q. phillyraeoides originated in mainland China, probably on the Yunnan-Guizhou Plateau, during the Miocene and migrated to Japan afterward (Xie et al., 2011).
The wood of Q. phillyraeoides is one of the hardest amongst Japanese trees, and is used for the production of a high-quality charcoal called Bincho-tan.Annual production of Bincho-tan charcoal has been maintained at 3,000 tonnes over the last two decades, mostly shared by Wakayama and Kochi prefectures (Forest and Forestry Statistics Handbook, 2016).Although there is not a large supply of wood for making Bincho-tan charcoal, it is most commonly harvested from natural stands and often depletes available trees.Raising next-generation forests for a sustainable supply has been a serious issue for local people involved in the industry (Consortium of Japan Bincho-tan meeting, 2011).Quercus phillyraeoides is also commonly used for gardens and hedges, and as roadside trees in western Japan because of its luxuriant foliage and vigorous branching, even in poor soils.Large parts of original Japanese coastal areas have been utilized for civilian purposes, such as living, industrial, and agricultural lands that include large areas of artificial forests to protect coastlines, which have generally been made on flat ground including sandy beaches.Many areas of coastal vegetation on rocky cliffs, however, still remain because of limited human access and are expected to have retained their original composition of natural vegetation.
The purpose of our study was to examine the genetic variation and structure of Q. phillyraeoides as one of the key species of unique seaside ecosystems, and to elucidate how the species has established itself along the coastal areas of southwestern Japan.This will enable Phylogeography of Quercus phillyraeoides in Japan us to outline genetically differentiated groups across the entire range of habitats and will provide important information for designating seed and seedling transfer zones, which may be the first step in the conservation and management of economically and environmentally important trees such as Q. phillyraeoides.We examined genetic variation using nuclear microsatellite markers and chloroplast DNA (cpDNA) sequences in non-coding and coding regions.Microsatellites are biparentally transmitted and, with a higher mutation rate, tend to show changes in genetic structure that occur in a relatively short time.As the chloroplast is maternally inherited in angiosperm plant species, the genetic structure revealed by cpDNA variation will reflect the expansion process mediated by seed migration.By combining the results obtained from markers with different modes of inheritance, we expect to obtain a comprehensive view of the demographic history of Q. phillyraeoides in Japan.

MATERIALS AND METHODS
Plant materials and DNA extraction Tree leaves were collected from 743 individuals in 41 populations across the entire distribution range of Q. phillyraeoides in Japan, including 19 populations used in a previous study (Liu et al., 2013).Collection sites are listed in Table 1 and shown geographically in Fig. 1.Collected leaves were kept dry in plastic bags with silica gel, and 50 mg of leaf tissue was crushed into powder using a TissueLyser (Qiagen, Hilden, Germany).DNA was extracted using the DNeasy Plant Mini Kit (Qiagen).The quality and concentration of DNA was confirmed by electrophoresis on a 0.8% agarose mini-gel along with a λ-HindIII digest as a DNA size standard.Extracted DNA was stored at − 30 °C until used.
Microsatellite genotyping Thirty-two candidate primer pairs developed for other Quercus species were examined for amplification of the loci in Q. phillyraeoides (Supplementary Table S1).We selected the following eleven polymorphic loci: bcqm42, bcqm76 and bcqm94 (Mishima et al., 2006); QM58TGT and QM69-2M1 (Isagi and Suhandono, 1997); QpZAG16 and QpZAG36 (Steinkellner et al., 1997); QrZAG7, QrZAG11 and QrZAG112 (Kampfer et al., 1998); and MSQ13 (Dow et al., 1995).Others were not successfully amplified or did not show polymorphism and were not used.The primer pairs for amplifying the loci bcqm42, bcqm76, bcqm94, QM58TGT and QM69-2M1 were redesigned and customized for Q. phillyraeoides.The primer sequences used for this study are listed in Supplementary Table S2.PCR amplifica-tion was performed using the Type-it Microsatellite PCR Kit (Qiagen K. K., Tokyo).Fluorescent dye-labeled primers were prepared by custom order from Thermo Fisher Scientific Japan (Yokohama, Japan).The 5'-end of the forward primers of bcqm76-Qp, QM69-2M1-Qp, QrZAG112 and QpZAG16 was labeled with FAM, bcqm42-Qp, bcqm94-Qp and QpZAG36 was labeled with HEX, QM58TGT-Qp, MSQ13 and QrZAG11 was labeled with NED, and QrZAG7 was labeled with PET.Two to four different dye-labeled primer pairs were combined to make primer mix sets containing 0.2 μM of each in a total volume of 50 μl.The multiple PCR reaction mixture contained 3 μl of RNase-free water, 5 μl of 2x Type-it Multiplex PCR Master Mix, 1 μl of the primer mix set, and 1 μl of template DNA (200 ng for each PCR reaction).The PCR conditions included an initial denaturation at 95 °C for 5 min followed by 28-31 cycles of denaturation at 95 °C for 30 s, annealing at 49 − 60 °C for 90 s, and extension at 72 °C for 30 s with a final extension at 60 °C for 30 min.PCR was conducted with a GeneAMP PCR System 9700 Thermal Cycler (Applied Biosystems) or a Veriti 96-Well Thermal Cycler (Applied Biosystems).The concentration of the amplified DNA was determined by electrophoresis on a 0.8% agarose mini-gel.PCR products were diluted 200-fold and applied to an ABI PRISM 310 Genetic Analyzer.Fragment size was determined using GeneScan (Applied Biosystems) and genotyped using Gene Mapper v.3.0 (Applied Biosystems).
Analysis of cpDNA sequence variation Nucleotide diversity (π) (Nei and Li, 1979) and scaled mutation rate (θ w ) (Watterson, 1975) were calculated for the five cpDNA regions using DnaSP v.5.10 ( Librado and Rozas, 2009).Gaps were excluded from the calculation.Tajima's D neutrality test (Tajima, 1989) was performed using DnaSP software.Statistical significance was determined by coalescent simulation implemented in the software.Sequences were manually assembled into haplotypes based primarily on nucleotide substitutions.We also corrected and redefined the haplotypes previously reported by Liu et al. (2013) because two haplotypes were misidentified by reading errors (i.e., haplotype B in Muroto-misaki and haplotype E in Ryugadake).To clarify the genetic relationships among the haplotypes, a haplotype network analysis was conducted using TCS v.1.21(Clement et al., 2000) including the corresponding sequences of Q. mongolica var.crispula (Okaura et al., 2007) and Q. variabilis (Chen et al., 2012) as outgroups.Haplotype diversity and genetic differentiation were also calculated using DnaSP.
Analysis of genetic variation in microsatellite markers Microsatellite variation was analyzed for 28 populations sampled across the distribution range of Q. phillyraeoides.The total number of alleles (Na), the number of private alleles (AP), and the observed and expected heterozygosities within populations (H O and H E ) for each locus, each population, and overall populations were calculated using GenAlEx v.6.4.1 (Peakall and Smouse, 2006).Inbreeding coefficients (F) for each population and their statistical significance were determined using FSTAT v.2.9.3.2 software (Goudet, 2002).The allelic richness (AR) (El Mousadik and Petit, 1996;Petit et al., 1998) was also determined using FSTAT.Fixation indices in the total populations (F IT ), among populations (F ST ) and within populations (F IS ) (Weir and Cockerham, 1984), as well as G ST and G' ST as defined by Nei (1987), were also determined for each locus and overall loci using FSTAT.AMOVA (analysis of molecular variance) (Excoffier et al., 1992) was performed using GenAlEx v.6.4.1 to partition genetic variance into components attributable to the variation among groups, among populations, within populations and within individuals.Pairwise genetic distance matrices for the 28 populations were made for several distance measures using GenAlEx and were used for PCoA (principal coordinate analysis) (Gower, 1966) to visualize the pattern of genetic relationships among populations.A phylogenetic tree was also constructed based on Nei's standardized genetic distance (D A ) (Nei et al., 1983) using POPTREE2 (Takezaki et al., 2010).The relationship between genetic distance and geographic distance was tested under the isolation by distance (IBD) hypothesis (Rousset, 1997) using the Mantel test.The possibility of a recent bottleneck was examined using BOTTLENECK v.1.2(Piry et al., 1999).When rapid population size reduction (bottleneck) events occur, rare alleles are easily lost (Nei et al., 1975); this causes an excess of heterozygosity computed from a sample of genes (H 0 ) compared to expected heterozygosity at mutation-drift equilibrium (H 1 ), and hence provides criteria for testing a recent bottleneck (Cornuet and Luikart, 1996).Bottleneck significance was determined by the Wilcoxon test under the two-phase model (TPM) in which the default setting [30% SSM (stepping stone model) and 70% IAM (infinite allele model)] was used.
The model-based clustering method implemented in STRUCTURE v.2.3 (http://pritch.bsd.uchicago.edu/)was applied to infer population structure.The program identifies distinct genetic populations (clusters) and assigns individuals to the clusters.To estimate the optimum number of clusters K, ΔK, the rate of change in the log probability of data [LnP(D)] between successive K values, was calculated according to the method of Evanno et al. (2005) using Structure Harvester (Earl and vonHoldt, 2012).A computational geometrical approach implemented in BARRIER was conducted to determine the locations of genetic barriers, which prevent gene flow, and to construct boundaries of genetic change that occurred at an abrupt rate (Manni et al., 2004).The significance of barriers was evaluated by bootstrap analysis imple-mented in the program.

RESULTS
CpDNA variation A total of 58 individual samples from 21 populations covering the entire distribution range of Q. phillyraeoides were sequenced for the five cpDNA regions, trnL-trnF spacer (444 bp), atpB-rbcL spacer (765 bp), matK gene (1,454 bp), trnH-psbA spacer (940 bp) and trnQ-trnS spacer (572 bp), covering a total aligned length of 4,175 bp (Table 1).A total of four nucleotide substitutions (two in the matK gene and two in the trnQ-trnS spacer) and two mononucleotide-repeat length variations (one in the trnH-psbA spacer and one in the trnQ-trnS spacer) were found (Table 2).No variation in the trnL-trnF and atpB-rbcL spacers was detected.π and θ w across the five regions were 0.00016 and 0.00023, respectively.Tajima's D was − 0.6837 and was not significantly different from zero.
Three haplotypes (A, B and C) were identified from the sequence variation detected in the five cpDNA regions (Table 2).The results showed that the matK gene sequence is diagnostic for defining haplotypes.Next, whole or partial sequences of the matK gene were sequenced for a total of 280 individual samples from all 41 populations to determine the haplotype.The haplotype frequency in each population is summarized in Table 1.Haplotype A was the most common (76.8%) and prevalent on the Pacific coast from the northeast population in the Boso Peninsula (IW) to Shikoku, and was also found in some populations in the Seto Inland Sea and Kyushu.Haplotype B was locally restricted (6.8%) and only found in three populations on the Kii Peninsula (TB, KS and SS).Haplotype C was the second most common (16.4%), but was found only in the western part of the distribution, namely the Seto Inland Sea, Kyushu and Okinawa.The global haplotype diversity (H d ) was 0.303 ± 0.00671 and the coefficient of genetic differentiation (G ST ) was 0.894.Geographic haplotype distribu- The polymorphic site positions are based on the standard sequences from the 5' end of accessions AB060064 (matK gene), AB650455 (trnH-psbA spacer) and AB095323 (trnQ-trnS spacer).
tion is shown in Fig. 1.A haplotype network of the five cpDNA regions was reconstructed, including corresponding Qm1 sequences in Q. mongolica var.crispula (Okaura et al., 2007) and H4, H5 and H11 in Q. variabilis (Chen et al., 2012), and showed that haplotype A was ancestral to haplotypes B and C. Haplotype B was derived from A with only one nucleotide substitution; haplotype C was derived from haplotype A with three nucleotide substitutions and two mononucleotide-repeat length changes (Fig. 2).
Microsatellite variation Microsatellite genotypes were determined for 536 individual samples from 28 populations.In total, 104 alleles were detected for the 11 microsatellite loci (Table 3).Population genetic parameters were estimated using GenAlEx v6.4.1 (Peakall and Smouse, 2006) for 24 populations, in which IG, OM and OO populations were combined because of their close geographic proximity.NT and YK populations were eliminated from the calculation because of small sample sizes.The population genetic parameters estimated in each locus and in each population are listed in Table 3 and Table 4, respectively.None of the inbreeding coefficients were significantly larger than zero (P < 0.05) after the application of Bonferroni correction for type I errors (Rice, 1989), although some populations (OG, HZ and RY) were very close to the significance threshold (Table 4).The Weir and Cockerham (1984) estimator of F IS for overall loci was significantly larger than zero at the 5%, but not at the 1%, level (Table 5).
The expected heterozygosity, which Nei (1987) called gene diversity, as well as allelic richness, are commonly used to evaluate genetic diversity in a population as they are altered in response to changes in population size.Relative values of these two estimates across the 24 populations are illustrated in Fig. 3.Both values decreased in the southern KK and IZ populations.In the northeastern populations, both values were relatively high in HZ, TB and KS, but a sharp decline of allelic richness occurred toward the northeastern population of IW.Lowered allelic richness relative to heterozygosity was also observed in some of the populations in the Seto Inland Sea, such as TM, YS and NG.To infer a recent population size change in Q. phillyraeoides populations, BOTTLENECK software was applied for the microsatellite data.Under the TPM, three populations showed significant excess in observed heterozygosity (H 0 ) compared to expected heterozygosity (H 1 ) (IW, P < 0.05; NG, P < 0.01 and AS, P < 0.05) by the Wilcoxon test (Table 4).The NK population showed a nearly significant heterozygosity excess (P < 0.051).None of the populations showed significant deficiency in observed heterozygosity (Table 4).
Genetic structure revealed by microsatellite markers The IBD hypothesis was examined to determine the correlation between genetic distance and geographic distance (Fig. 4).A significant correlation (R 2 = 0.3079, P < 0.001) was observed between pairwise geographic distance and genetic differentiation expressed as F ST /(1 − F ST ) (Rousset, 1997) by the Mantel test.Global genetic differentiation was examined using F-statistics (Weir and Cockerham, 1984).Genetic differentiation among populations, F ST , was 0.097 and significantly larger than zero (P < 0.01) (Table 5).The G ST (Nei, 1987)  Fig. 2. The chloroplast DNA haplotype network constructed using TCS 1.21 (Clement et al., 2000).Haplotype Qm1 is from Quercus mongolica var.crispula (Okaura et al., 2007).Haplotypes H4, H5 and H11 are from Q. variabilis (Chen et al., 2012).Small circles indicate putative intervening haplotypes.Nei (1987).Tests are based on 10,000 randomizations using the exact G-test (Goudet et al., 1996).large at 0.092 and significantly larger than zero (P < 0.001; Table 3).AMOVA showed that 10% of the total genetic variance was attributed to the variation among populations, 83% was attributed to the variation within individuals, and 7% to that among individuals.Cluster analysis was performed by constructing a NJ dendrogram based on D A distance (Nei et al., 1983) (Fig. 5).Three groups were recognized: one includes populations on the Pacific coast in Kanto-Tokai, the Kii Peninsula and the Muroto-misaki Cape, another includes populations in the Bungo-suido Channel (the channel between Kyushu and Shikoku) area, and the third includes the populations in southern Kyushu and Okinawa.Populations in the Seto Inland Sea belonged either to the first or to the second groups.Close genetic relationships supported by high bootstrap value were observed between KS and RY, and between KR and SG, in the dendrogram.Similar grouping was revealed by PCoA based on a pairwise genetic distance matrix using Shannon's index, in which some populations in the Seto Inland Sea (YS, OO and OM) were situated between the groups in the northeastern Pacific coast to the Muroto-misaki Cape and those in the Bungo-suido Channel area (Supplementary Fig. S1).
Population structure at an individual level was examined using the STRUCTURE program.The ΔK plot (Supplementary Fig. S2) suggests that the most likely  number of genetic clusters was K = 2, and the second was K = 4. Bar plots of individuals with an estimated membership fraction for K = 2 and K = 4 are shown in Fig. 6.At K = 2, populations were divided into two groups according to the dominating clusters: one was dominated by cluster I and included populations in the northeastern Pacific coast from IW to MR in Shikoku, and the other was dominated by cluster II and included populations in the southwestern Pacific coast from AS in Shikoku to IZ in Okinawa (Fig. 6).In the Seto Inland Sea, populations dominated by clusters I and II were intermingled (Fig. 6).The populations YS in the Seto Inland Sea and NM in Kyushu were a mixture of the two clusters.A significant genetic differentiation between the two groups of populations dominated by cluster I and cluster II was revealed by AMOVA and accounted for 3% of the total genetic variance (Supplementary Table S3).At K = 4, clusters I and II were further divided into Ia and Ib, and IIa and IIb, respectively (Fig. 6).Three regional groups of populations were recognized based on the dominating clusters.The first one (group 1), which is the same as the group with cluster I at K = 2, included populations on the northeastern Pacific coast to the Muroto-misaki Cape and mostly admixtures of cluster Ia and Ib.The second one (group 2) is dominated by cluster IIa and included the populations in the Bungo-suido Channel area.The third one (group 3) is dominated by cluster IIb and included populations in southern Kyushu and Okinawa.These groups mostly correspond to the three groups revealed by the NJ dendrogram (Fig. 5), and also by the scatter plot of PCoA (Supplementary Fig. S1).A significant genetic differentiation among the three groups of populations was revealed by AMOVA, accounting for 6% of the total genetic variance (Supplementary Table S4).
To infer genetic barriers on a geographic map, BARRIER analysis (Manni et al., 2004) was conducted on the microsatellite data (Supplementary Fig. S3).One barrier with a high bootstrap value appeared between IZ in Okinawa and the other populations.Secondary barriers were observed between island populations in the Seto Inland Sea.KK in southern Kyushu is also isolated from the others.No barriers were observed among the populations on the Pacific coast.

DISCUSSION
Genetic variation revealed by cpDNA sequences and microsatellite markers The π value across the five cpDNA regions of Q. phillyraeoides [0.00016 ± 0.00005 (SE)] was relatively low compared to other oak species in Japan.The value for Japanese oak, Q. mongolica var.crispula, was 0.00055 ± 0.00001 (SE) in four cpDNA regions (Okaura et al., 2007), and that of Japanese beech, Fagus crenata, was 0.00220 ± 0.00026 (SE) for three cpDNA regions (Okaura and Harada, 2002).Although the examined loci were different, the genetic diversity of microsatellite markers was also lower in Q. phillyraeoides.Gene diversity (H E ) averaged over the 24 populations for 11 microsatellite markers in this study was 0.535 ± 0.0104 (SE); in contrast, gene diversity was 0.724 ± 0.0056 (SE) in Q. mongolica var.crispula (Ohsawa et al., 2011) and 0.839 ± 0.0050 (SE) in F. crenata (Hiraoka and Tomaru, 2009).The high genetic diversity in Japanese oak and Japanese beech reflects their endemicity to the Japanese Kanto-Tokai and Kii Peninsula Seto Inland Sea Kyushu and Okinawa archipelago, which has enabled them to accumulate more genetic variation.Like Q. phillyraeoides, the oriental oak Q. variabilis found in Japan originated in mainland China.The oriental oak populations in Japan were found to be fixed to one cpDNA haplotype, while 26 haplotypes were detected in mainland China (Chen et al., 2012).The migration process must have been accompanied by a series of founding events and caused considerable loss of genetic variation in Q. phillyraeoides, especially regarding cpDNA variation.The Japanese archipelago belonged to the same ancient landmass at the eastern margin of the Eurasian continent and became separated by the opening of the Japan Sea, commencing 15 Ma (million years ago) (Taira, 1990;Osozawa et al., 2012).However, the western part of Japan was still connected with mainland China until the opening of the East China Sea in the early Middle Pleistocene (Minato and Ijiri, 1976).The connected area could have been the main migration route of Q. phillyraeoides (Xie et al., 2011).Fossils of Stegodon orientalis, which represents the Middle Pleistocene fauna of Southern China, have been found in strata aged between 0.5 and 0.3 Ma in a broad area of Japan (Kamei et al., 1988).The latest time probably corresponds to the opening of the East China Sea up to the Korean Peninsula.The Ryukyu Arc, which extends from Kyushu to Taiwan in a span of 1,200 km, is another possible migration route of Q. phillyraeoides to Japan.This route, however, was blocked by the development of the Tokara gap north of the Amami islands and the Kerama gap south of the Okinawa islands around 1.5 Ma, associated with the backarc rifting of the Okinawa Trough (Osozawa et al., 2012).After the opening of the Japan Sea and the East China Sea, immigration of animals and plants from the Eurasian continent was limited to land bridges formed during the glacial ages in the late Pleistocene in the north through Sakhalin and in the south through the Korean Peninsula (Taira, 1990).A migration of Q. phillyraeoides via a land bridge from the Korean Peninsula is unlikely, however, because the oak occurs neither in the Korean Peninsula nor in northern Kyushu.These considerations suggest that migration of Q. phillyraeoides from mainland China came to an end when the East China Sea opened about 0.3 Ma.Both allelic richness and heterozygosity in microsatellite markers were remarkably reduced in the southernmost island populations of IZ and KK (Fig. 3).There was no evidence of a recent bottleneck in these populations (Table 4).Together with the observed monotypic pattern of the STRUCTURE diagram, this suggests that these populations have been isolated for a long time, reaching a genetic equilibrium.The water depth between KK and the west coast of Kyushu is at most 100 m, and, hence, these regions must have been connected at the LGM approximately 20,000 years ago.Isolation could have occurred through separation from Kyushu by rising seawater levels.The IZ population could be a remnant population that migrated from mainland China across the land bridge of the Ryukyu Arc.Because we have no record of Q. phillyraeoides in the Ryukyu Islands except in Okinawa-jima, including Izena-jima (Hatsushima and Amano, 1994), further study to investigate the origin of IZ, including a possible ancient human transplantation, is needed.
Genetic structure revealed by cpDNA haplotype distribution and microsatellite STRUCTURE analysis cpDNA haplotypes mostly co-occurred with certain clusters revealed by STRUCTURE analysis of microsatellite markers.Haplotypes A and C associated with clusters I and II, respectively, at K = 2 (Fig. 6).This is obvious in the mixed populations of YS and NM.In these populations, individuals dominated by cluster I had haplotype A, while individuals dominated by cluster II had haplotype C (data not shown).This implies that haplotype A is essentially fixed in cluster I, and haplotype C is fixed in cluster II.The larger F ST value in cluster II than in cluster I revealed by STRUCTURE analysis (Fig. 6 legend) implies that cluster II experienced more extensive genetic drift.The differentiated pattern in the geographical distribution of cluster I with haplotype A and cluster II with haplotype C suggests that they independently migrated to Japan as two different lineages.Populations AS and KK are exceptional as they are dominated by cluster II but fixed with haplotype A. This probably occurred by organelle genome capture when the two types grew together in southern Kyushu.A recent study has shown that introgression is likely to take place preferentially from the resident species toward the invading species.It was also demonstrated by simulation that introgression is stronger for genes experiencing little gene flow, such as organelle genes (Currat et al., 2008).This suggests that cluster I with haplotype A was resident in this area and was invaded by cluster II with haplotype C. Haplotype B may have arisen from a de novo mutation in the group dominated by cluster I in the southern part of the Kii Peninsula.
At K = 4, STRUCTURE analysis revealed three regional groups with differentiated genetic backgrounds.Many populations in group 1 are admixtures of clusters Ia and Ib, indicating the occurrence of frequent gene flow between them as shown by the absence of barriers on the northeastern Pacific coast (Supplementary Fig. S3).The F ST values revealed by the STRUCTURE analysis (Fig. 6 legend) are similar for clusters Ia, IIa and IIb, but much smaller in cluster Ib.This suggests that the population size of cluster Ib remained fairly large even during the last glacial age.Cluster Ib dominates the populations in the Kii Peninsula (TB, KS and SR) including MR in the Muroto-misaki Cape.Cluster Ib probably existed as a refugial population on the coast of the Kii Peninsula and expanded its distribution both eastward and westward after the LGM.Another refugium with cluster Ia probably existed in the Boso Peninsula and expanded westward.The existence of multiple refugia of lucidophyllous forests in the LGM has been suggested at the southern ends of major peninsulas or promontories on the Pacific side of Honshu, Shikoku and Kyushu (Kamei and Research Group for the Biogeography from Würm Glacial, 1981;Hattori, 2002;Aoki et al., 2004).
In the Seto Inland Sea, populations SY, TM, YS, NG, IG, OM and OO shared a common genetic background of cluster Ib, whereas KR, SG, NT, SD and TS had the genetic background of cluster IIa.At the LGM, the Seto Inland Sea basin was completely dried (Yonekura et al., 2001).The present ecological preference of Q. phillyraeoides suggests the absence of this species in the dried basin.Two waterways had formed on both sides of the Bisan-seto south of the Kojima Peninsula as a divide (Fig. 1, g).One flowed eastward and into the Pacific Ocean through the present-day Kii-suido Channel, while the other flowed westward and into the Pacific Ocean through the present-day Bungo-suido Channel (Yonekura et al., 2001).The populations in the Seto Inland Sea with a genetic background of cluster Ib might have been derived from refugia in the Kii Peninsula following the opening of the Kii-suido after the LGM as a result of rising sea levels.Similarly, the populations with a genetic background of cluster IIa are likely to have been derived from refugia located in southern Kyushu when the Bungo-suido Channel opened.These two channels were connected at the position of Bisan-seto 8,500 years ago (Ohta et al., 2004).The barriers observed between the coastal and island populations in this area (Supplementary Fig. S3) suggest that the populations derived via the Bungo-suido and Kii-suido Channels reached Bisan-seto around this time and have since been isolated by rising sea levels.Some populations with a genetic background of cluster Ib (NG, IG, OM, OO) and those with cluster IIa (KR and SG) might have migrated further to the west and east of Bisan-seto, respectively, after the two channels connected.
Both cluster IIa and IIb occurred in southern Kyushu (Fig. 6).Based on palynological studies, Tsukada (1983) suggested that a major refugium of lucidophyllous forest was located in the paleo-Yaku Peninsula at the LGM, which included the present-day Osumi Peninsula, Satsuma Peninsula, Yakushima, and Tanegashima.A refugium of cluster II was probably located in the paleo-Yaku Peninsula at the LGM and split into clusters IIa and IIb afterward.Mixing of cluster IIa and IIb was observed in NM in the Osumi Peninsula and YK in Yakushima (Fig. 6).This suggests that splitting occurred before Yakushima (and Tanegashima) became separated from the southern part of Kyushu.
A close phylogenetic similarity was observed between RY and KS in the NJ dendrogram (Fig. 5).The RY population was most likely transplanted by humans.An example of human transplantation of Q. phillyraeoides by fishermen in the Kii Peninsula in the 17 th century was recorded on the Goto islands in Nagasaki prefecture near the Amakusa islands, where the RY population is located (http://shinkamigoto.nagasaki-tabinet.com/guide/52263).
Implications of the study for conservation of Q. phillyraeoides Along the Pacific coast, Q. phillyraeoides populations are divided into several genetically differentiated groups, and may have been derived from multiple independent refugia at the LGM.These regional groups with genetically differentiated backgrounds should be considered as conservation units.Two very specialized populations, KK and IZ, are likely to have considerable conservation value.KK is fairly monomorphic due to long-term isolation, and forms pure dense stands along a 4 km-long sand reef (Nagamenohama), constructing a very specific landscape.IZ has been retained in very small numbers and is seriously endangered (http:// www.pref.okinawa.jp/site/kankyo/shizen/hogo/).Hence, this population is valuable both for conservation and for studying the migration process of Q. phillyraeoides from mainland China through the Ryukyu Arc.Transplantation or replantation of Q. phillyraeoides to foster coastal protection forests and next-generation charcoal production forests should be done based on regional groups as management units, which share a common genetic background considering their genetic variation and differentiation.

Fig. 1 .
Fig.1.Collection sites of Quercus phillyraeoides and chloroplast DNA haplotype distribution.The Kii Peninsula area is enlarged in the inset on the lower right.Characteristic locations are shown as: a, the Boso Peninsula; b, the Izu Peninsula; c, the Kii Peninsula; d, the Kii-suido Channel; e, the Bungo-suido Channel; f, the Seto Inland Sea; and g, Bisan-seto.

Fig. 3 .
Fig.3.The geographic distribution of two population genetics estimates, expected heterozygosity (H E ) and allelic richness (AR).Standard deviation of each estimate is used as the graphed unit and scaled on the y-axis, adjusting the grand means to zero.

Fig. 6 .
Fig. 6.Bar plots of STRUCTURE analysis results.Each bar shows an individual with an estimated membership fraction in each of the inferred number of clusters, K. Upper: bar plot for K = 2. Lower: bar plot for K = 4. Population codes are shown at the top.Letters at the bottom show the chloroplast DNA haplotypes observed in the population.At K = 2, the cluster of dark green is designated as cluster I, and that of pink as cluster II.The F ST values for clusters I and II were 0.0709 ± 0.0004 (SE) and 0.0966 ± 0.0005 (SE), respectively.At K = 4, the clusters of dark green and pale green are designated as clusters Ia and Ib, respectively.Similarly, the clusters of pink and red are designated as cluster IIa and IIb, respectively.The F ST values for clusters Ia, Ib, IIa and IIb were 0.1843 ± 0.0010 (SE), 0.1023 ± 0.0007 (SE), 0.1797 ± 0.0005 (SE) and 0.1910 ± 0.0005 (SE), respectively.

Table 1 .
Sampling sites and cpDNA haplotype frequency in Quercus phillyraeoides a Numbers in parentheses indicate the number of samples in which all five cpDNA regions were sequenced.
. Each reaction mixture contained 2x GoTaq Hot Start Colorless Master Mix (Promega, Madison, WI, USA) with 400 μM dNTPs, 4 mM MgCl 2 , 10 ng of DNA, and 5 pmol of primers in a total volume of 25 μl.PCR conditions included initial denaturation at 94 °C for 2 min, followed by 35 cycles of denaturation at 94 °C for 1 min, annealing at 56 °C (51 °C for trnT-trnL spacer and 60 °C for trnQ-trnS spacer) for 1 min, extension at 72 °C for 2 min, and a final extension at 72 °C for 7 min using a GeneAmp 9700 Thermal Cycler (Applied Biosystems, Foster City, CA, USA) or a Veriti 96-Well Thermal Cycler (Applied Biosystems).PCR products were subsequently held at 4 °C until electrophoresis on a 0.8% agarose mini-gel to determine the concentration of the amplified DNA.Successfully amplified PCR products were purified using ExoSAP-IT (Affymetrix, Santa Clara, CA, USA).

Table 2 .
Polymorphic sites in three cpDNA regions in Quercus phillyraeoides

Table 3
Na, observed number of alleles; H O , observed heterozygosity; H S , average heterozygosity; H T , total heterozygosity; G IS , inbreeding coefficient; G ST , coefficient of genetic differentiation; G' ST , coefficient of unbiased genetic differentiation; P, probability of G ST being larger than zero.G IS , G ST and G' ST are from

Table 4 .
Summary of genetic variation estimated using the 11 microsatellite markers in each population and the result of Bottleneck test n, sample size; Na, observed number of alleles; AP, number of private alleles; AR, allelic richness; H O , observed heterozygosity; H E , expected heterozygosity; uH E , unbiased heterozygosity; F, inbreeding coefficient.a Bottleneck was examined by Wilcoxon test: A, probability from one-tailed test for heterozygosity deficiency; B, probability from one-tailed test for heterozygosity excess; C, probability from two-tailed test for heterozygosity deficiency or excess.

Table 5 .
F-statistics estimated for overall loci based on the genetic variation revealed by microsatellite markers