Genetic diversity and population structure of the long-tailed hamster Cricetulus longicaudatus in Shanxi Province, China

The long-tailed hamster Cricetulus longicaudatus is a dominant rodent in farmland of Shanxi Province, China, but little is known about its genetic diversity and population structure. In this study, the genomic DNAs of individuals from 13 populations captured in different fields of Shanxi were extracted and amplified by six pairs of microsatellite primers and by universal primers for mtDNA COI gene sequences. Our data revealed significant departure from Hardy–Weinberg equilibrium in four of the 13 populations. In all 13 populations, the mean observed heterozygosity was significantly lower than the expected heterozygosity. Mean-while, the mean inbreeding coefficient was statistically significant, which indicated non-random mating within populations. The pairwise genetic distance and natural logarithm of linear geographical distance were not significantly correlated for any C. longicaudatus populations. However, the correlation between genetic distance and resistance distance based on mountain landscape was significant, suggesting that the mountain landscape is an important factor affecting gene flow of C. longicaudatus. Pairwise F ST analysis of population structure showed moderate to high genetic differentiation among populations, and all individuals could be divided into two gene clusters. Phylogenetic analysis based on COI sequences also showed that many individuals originated from a single haplotype, suggesting the existence of gene exchange among these populations at some time in the past. Our research should provide a scientific basis for the analysis of genetic differentiation and gene flow among populations of C. longicaudatus .


INTRODUCTION
With the continuous innovation and development of biotechnology, more and more molecular markers have been developed and utilized in recent decades. Among them, microsatellite DNA (simple sequence repeats, SSRs) and the cytochrome c oxidase subunit I (COI) gene Genetic diversity and population structure of the long-tailed hamster Cricetulus longicaudatus in Shanxi Province, China have been widely used in the genetic differentiation of various groups due to their rich polymorphism and intraspecific (crossing) variation (Park et al., 2010;Wei et al., 2015;Foley et al., 2016;Chu et al., 2019). In 1991, Moore et al. proposed the concept of SSRs, and pointed out that using heterologous PCR primers would facilitate the use of DNA microsatellites in gene mapping studies in closely related species such as cattle and sheep, rat and mouse, or primates (Moore et al., 1991). Therefore, with the help of these technologies, researchers have constructed genetic linkage maps by analyzing the genetic diversity and population structure of species, and determined the precise path and taxonomic status of species evolution (Yang et al., 2016;Shimma and Tadano, 2019;Kaddumukasa et al., 2020;Liu et al., 2020a;Sugiura et al., 2020). As a special functional group, rodents are closely asso-ciated with human beings, and play an important role in the maintenance of ecosystems and ecological balance. Their behavior, such as feeding, storing, transporting and digging, is conducive to the dispersal of plant seeds, and can accelerate the flow and circulation of materials, nutrients and energy in the ecosystem. However, when rodent populations break out, they can do great harm to agriculture, industry, health and other fields. Therefore, it is of great theoretical and practical importance to study the genetic diversity and population structure of rodents in different areas to reveal their evolutionary history and adaptive potential. At present, an ongoing research hotspot in the field of life science is to clarify the relationship, evolutionary status and population dynamics of rodents from the perspective of molecular biology. For example, the occurrence of hybridization between white-footed mice Peromyscus leucopus and deer mice P. maniculatus in southern Quebec, Canada, which shared similar morphology, ecology and life history traits, was investigated by genotyping wild caught specimens using microsatellite markers. The results suggested that hybridization occurs at extremely low frequency between the two species (Leo and Millien, 2017). In a study of the genetic diversity, migration characteristics and evolution of the Asian house rat Rattus tanezumi in China, using nine microsatellite loci and the COI sequence, Guo et al. (2019) proposed that the main way for R. tanezumi to spread to the north of China is via transportation lines such as railways and highways and the developed logistics network. On the other hand, the wild nutria (Myocastor coypus), an introduced rodent species in South Korea, showed a low level of diversity and no signature of genetic structuring based on microsatellite loci, indicating that Korean nutria individuals originated from a single population or a highly inbred reared herd (Kim et al., 2019). The long-tailed hamster Cricetulus longicaudatus (Rodentia: Cricetidae), one of the important farmland rodents in North China, is mainly distributed in Shanxi, Hebei, Inner Mongolia, Shaanxi, Gansu, Qinghai, Xinjiang, Tibet, Sichuan and other areas (Zheng et al., 2008). The species generally inhabits the broad-leaved forest zone at higher altitude, dry wasteland, shrub grass and farmland. Previous studies on C. longicaudatus mainly focused on population ecology, physiology and biochemistry, and made some important observations (Chang et al., 2011;Li et al., 2014;Poplavskaya et al., 2018). However, there have been few studies on its genetic diversity and population structure from the perspective of molecular biology. According to previous research, C. longicaudatus, as an important rodent affecting the agricultural ecosystem, is the dominant rodent species in most of the farmland in Shanxi, accounting for more than 70% in some areas (Ji et al., 2008;Yang et al., 2014Yang et al., , 2019. However, the complex topography and geomorphology lead to a diverse agroecosystem in Shanxi, which makes it difficult for us to comprehend the genetic diversity and population structure of C. longicaudatus. In this study, the genomic DNAs of 131 individuals of C. longicaudatus belonging to 13 populations captured in different fields in Shanxi Province ( Fig. 1) were extracted. Microsatellite markers and mtDNA COI gene sequences were used to study their genetic diversity and population structure. We aimed to provide a theoretical basis for revealing the evolutionary history of this species, analyzing its genetic differentiation and positioning in the ecosystem.
Factors affecting isolation The genetic distance and linear geographical distance were calculated (Table   4). The minimum value of genetic distance was 0.3024 for XIX-YUX and the maximum was 1.8248 for LOF-JNL. Values less than 0.5 were observed for LNC-XIX, LNC-YNH, XIX-YNH, XIX-LOF, XIX-YUX, QNX-YAQ and QNX-YUX, suggesting their close relationship (Table  4). The results of simple Mantel tests showed that the association of pairwise genetic distance and natural logarithm (Ln) of linear geographical distance was not significant (P > 0.05, r = 0.014). Pairwise resistance distances were also calculated in the Circuitscape 4.0 program. The association between genetic distance and resistance distance was significant (P < 0.05, r = 0.314), revealing an influence of mountain landscape on gene flow among C. longicaudatus populations in this study.    (Table 5), while the nucleotide diversity of LIS from Lishi was the lowest (π = 0.00038, H p D = 0.250, K = 0.250) ( Table 5). Tajima's D was statistically significant only for the QNX population from Qinxian (D = 1.993*), indicating a decrease in population size and/or balancing selection of C. longicaudatus in Qinxian (Table 5).
Phylogenetic analysis based on COI sequences The median-joining network based on COI gene sequences of C. longicaudatus showed that the 22 haplotypes mainly formed two distinct haplogroups (Fig. 3). Haplotype H1 contained 33.6% of individuals and included individuals from ZHY, QNX, LOF, LIS, XIX, YNH, ZOQ, LNC and YUX among the 13 populations (Fig. 3). This indicates that H1 is probably the ancestral haplotype. The individuals of H6, another main haplotype, which included 14 individuals, were mainly from the YAQ and YUX populations.

DISCUSSION
The genetic diversity of a species is important for maintaining its adaptability and evolutionary potential (Frankham et al., 2002). Therefore, the study of genetic diversity is helpful to reveal a species' origin and dispersal history. Rodents, as the most differentiated and widely distributed group of mammals, have attracted much attention in this field. Previous research using D-loop and COI sequences of mtDNA as well as 14 microsatellite loci demonstrated the existence of two demes, which were not completely isolated but were probably reinforced by a geographical barrier (Lopes and de Freitas, 2012). A  (Table 5). The H p per population ranged from two to six. The highest nucleotide diversity was found in the JNL population from Jingle (π = 0.00912, H p D = 0.786, K = 5.964) ( Table  5). The HNY population from Hunyuan also showed study about the dispersal and population structure of Ctenomys australis indicated that geographic distance between populations was the main cause of population genetic differentiation (Mora et al., 2010). In Argentina, the genetic diversity of Lagostomus maximus is at low levels, and only a single genetically distinct population exists, indicating that metapopulation processes and changes in human population dynamics during the late Holocene were important factors shaping the population genetic structure in Argentina (Gariboldi et al., 2019). In this study, we collected the genomic DNAs of 131 C. longicaudatus samples of 13 populations in Shanxi (Fig. 1), and successfully amplified six microsatellite loci and COI sequences. The study of genetic diversity and population structure provides a scientific basis for the analysis of genetic differentiation and gene flow among populations of C. longicaudatus in Shanxi. Our data about genetic diversity of C. longicaudatus revealed significant departure from HWE in four of the 13 populations. Drug control of all rodents has been carried out in four of these populations-QNX, YUX, LOF and YAQ-in recent years. Moreover, the samples were collected from farmland with frequent human activities or from recently planted landscape woodland, environments in which some genetic drift usually occurs. In all 13 populations, the mean H O was significantly lower than the H E . Meanwhile, the mean F IS was statistically significant, which revealed non-random mating within populations. The exact pressures underlying these conditions are unclear, but we speculate that the fragmentation of the ecological environment in Shanxi hinders these animals' dispersal behavior. The correlation between genetic distance and resistance distance based on the mountain landscape supported this view. Mantel tests between pairs of populations showed significant association, which suggested that the mountain landscape was an important factor affecting gene flow of C. longicaudatus. In other words, the frequency of gene flow between populations with short resistance distance was higher than that between populations with long resistance distance. Another possibility is that the more recent divergence events occurred among populations with close resistance distance, and gene flow rarely occurred after divergence. However, it is inexplicable that the weakest support for genetic differentiation (F ST of LNC-JNL) and the minimum genetic distance (Nei's distance of XIX-YUX) existed in populations inhabiting different mountains. We need some evidence of historical events that have occurred in these populations. Additionally, we considered that the weaker self-migration ability of this species has also influenced its gene flow, as has also been observed in other rodent studies. The plateau zokor (Eospalax baileyi) mainly lives underground and has limited migration ability. Therefore, expected geographical distribution, dispersal pattern, different process of evolu-tion, and environmental factors were important causes of genetic differentiation between geographic populations (Liu et al., 2018). Previous studies have suggested that inbreeding is a contributory mechanism for species to survive, but it usually has adverse effects on species diversity and gene flow (Huenneke, 1991;Zou et al., 2015).
Unlike in other studies, pairwise genetic distance and natural logarithm of linear geographical distance were not significantly correlated for any C. longicaudatus populations in the present study. This conclusion differs from what is called the stepping-stone model of population structure (Hutchison and Templeton, 1999;Liu et al., 2020b). That is, the species (population) has occupied the distribution area for a long time if genetic distance and geographical distance show a positive correlation, and this time is sufficient to achieve a balance between gene flow and genetic drift.
Regarding population structure, pairwise F ST analysis indicated that almost all the values showed moderate to high genetic differentiation among populations, and all individuals could be divided into two gene clusters (Fig.  2). Although the proportion of the two clusters in some populations was quite different, they existed in every population. Phylogenetic analysis based on COI sequences also showed that many individuals originated from haplotype H1. This suggests the existence of gene exchange among these populations at some time in the past. In the future, we will need a larger sample size and more data analysis to reveal the molecular evolutionary history of the species.

MATERIALS AND METHODS
Sample collection and population distribution A total of 131 individuals of C. longicaudatus belonging to 13 populations (n = 8-12 hamsters per population) were collected from 13 distant locations of Shanxi Province, China, where no specific permits were required ( Fig. 1 and Table 1). The individuals were captured with living cages or clips, with peanuts, walnuts and apples as bait, in the field. The 13 populations were collected from Xixian (XIX), Yonghe (YNH), Zhongyang (ZHY), Lishi (LIS), Loufan (LOF), Yangqu (YAQ) and Jingle (JNL), located in the Lvliang Mountains, and from Lingchuan (LNC), Qinxian (QNX), Zuoquan (ZOQ), Yuxian (YUX), Wutai (WUT) and Hunyuan (HNY), located in the Taihang Mountains ( Fig. 1 and Table 1). The tail tips of individuals were collected and stored in absolute ethanol at -80 °C for DNA extraction. All procedures above were conducted in accordance with Chinese laws.
DNA extraction Genomic DNA extraction from tail tips used the Blood/Cell/Tissue Genomic DNA Extraction Kit (TIANGEN, DP304), and was performed according to the manufacturer's protocol. The DNA was eluted in TE buffer and stored at -20 °C.
Microsatellite amplification All samples were genotyped for six microsatellite loci obtained from previously published studies about C. barabensis (Xu et al., 2008), which belongs to the same genus as C. longicaudatus. DNA was amplified in a thermocycler (Bioer Technology, G1000) in 25-μl reactions containing 1 μl of total DNA (30-50 ng), 12.5 μl of 2 × Taq PCR MasterMix (TIANGEN, KT201), 9.5 μl of ddH 2 O, and 1 μl of each primer. Cycling conditions were as follows: 3 min at 94 °C, followed by 35 cycles of 30 s at 94 °C, 30 s at the annealing temperature provided by Xu et al. (2008) and 30 s at 72 °C, and then 5 min at 72 °C. The length of the PCR products was determined using GeneMapper v. 4.0 software (Applied Biosystems).
COI gene amplification and sequencing COI gene fragments of C. longicaudatus were amplified with universal primers BatL5310 (5′-CCTACTCRGCCATTT-TACCT ATG-3′) and R6036R (5′-ATCTCTGGGTGTC-CAAAGAATCA-3′) to explore sequence polymorphism and population genetics. The PCR reactions were performed in a G1000 thermocycler in a 25-μl mixture containing 1.5 μl of total DNA (30-50 ng), 12.5 μl of 2 × Taq PCR MasterMix and 1 μl of each primer. Cycling conditions were as follows: 5 min at 95 °C, followed by 35 cycles of 45 s at 95 °C, 60 s at the annealing temperature of 54 °C and 60 s at 72 °C, and then 10 min at 72 °C. The amplified products were identified by 1% agarose gel electrophoresis, and then sent to Shanghai Personal Biotechnology Company for sequencing.
Microsatellite DNA analysis Deviations from Hardy-Weinberg equilibrium (HWE), HWE P-values and linkage disequilibrium for each population were tested by GENEPOP version 4.0 (https://genepop.curtin.edu.au/) (Rousset, 2008). The average number of alleles (N A ), allelic richness (R S ), observed heterozygosity (H O ) and expected heterozygosity (H E ) were estimated using Microsatellite Analyser version 4.05 to characterize the genetic diversity per locus of every population (Dieringer and Schlötterer, 2003). The inbreeding coefficient (F IS ), pairwise fixation indices (F ST ) and P-values were calculated to estimate the statistical significance of population differentiation (Bonferroni correction was used to make significance level = 0.05/78 = 0.0006) with ARLEQUIN version 3.5.1.2 (Excoffier and Lischer, 2010).
The population genetic structures of C. longicaudatus were obtained using STRUCTURE version 2.3.4. Ten independent-repeated runs for each K-value from 2 to 10 were carried out to estimate the most likely number of clusters. Each replicate run consisted of a burn-in period of 50,000 steps, followed by a 1,000,000 Markov chain Monte Carlo replicate (Evanno et al., 2005;Yang et al., 2016;Wang et al., 2019).
Based on computing pairwise Nei's genetic distances using Popgen 32 (Saitou and Nei, 1987;Krawczak et al., 2006), simple Mantel tests between genetic distance and natural logarithm (Ln) of linear geographical distance were performed using R 4.0.5. Given that the sampled populations were located on two mountain ranges (Taihang Mountains and Lvliang Mountains), we hypothesized that the mountain landscape had an impact on gene flow of C. longicaudatus. Landscape resistance was modeled as a function of the mountain areas variable to estimate the effects of landscape features on gene flow (Cushman et al., 2006(Cushman et al., , 2013. The landscape resistance value of each grid cell was determined by the mountain area (the minimal resistance was 1, the maximal resistance was 50). The resistance distances were calculated in the Circuitscape 4.0 program (Zhao et al., 2020). The association between genetic distance and resistance distance was also analyzed by simple Mantel tests in R 4.0.5.
Sequence analysis In total, 119 COI gene sequences were aligned by the program Muscle within MEGA X (Edgar, 2004). The number of segregating sites (S), the number of haplotypes (H p ), haplotype diversity (H p D), nucleotide diversity (π) and the average number of nucleotide differences (K) were calculated with DnaSP version 5.10 (Librado and Rozas, 2009). Neutrality tests used the statistics from the Tajima's test in DnaSP version 5.10. The median-joining network of COI gene haplotypes was constructed by NETWORK 10.1.0.0 (Bandelt et al., 1999).