Allelic variation and genetic diversity of HMW glutenin subunits in Chinese wheat (Triticum aestivum L.) landraces and commercial cultivars

Wheat landraces harbor abundant genetic variation at the Glu-1 loci, making them desirable germplasm for the genetic enhancement of modern wheat varieties, especially for quality improvement. In the current study, we analyzed the allelic variation at the Glu-1 loci of 597 landraces and 926 commercial wheat varieties from the four major wheat-growing regions in China using SDS-PAGE. Alleles Null, 7+8, and 2+12 were the dominant HMW-GSs in wheat landraces. Compared with the landraces, the commercial varieties contained higher frequencies of high-quality alleles, including 1, 7+9, 14+15 and 5+10. The genetic diversity of the four commercial wheat populations (alleles per locus (A) = 7.33, percent polymorphic loci (P) = 1.00, effective number of alleles per locus (Ae) = 2.347 and expected heterozygosity (He) = 0.563) was significantly higher than that of the landrace population, with the highest genetic diversity found in the Southwestern Winter Wheat Region population. The genetic diversity of HMW-GS is mainly present within the landrace and commercial wheat populations rather than between populations. The landraces were rich in rare subunits or alleles and may provide germplasm resources for improving the quality of modern wheat.


Introduction
Wheat (Triticum aestivum L.) is one of the three main cereal crops in the world. It is a staple food and an important source of protein for humans (Payne 1987). The gluten protein in wheat grain is the chemical basis for making various end-products. Its processing quality is mainly determined by seed storage proteins that consist of polymeric glutenins and monomeric gliadins (Shewry et al. 1986). Glutenin plays a major role in dough elasticity, while gliadin mainly affects dough viscosity (Anjum et al. 2007, Ciaffi et al. 1996). According to their relative mobility in sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE), glutenin proteins are divided into two groups: high-molecular-weight glutenin subunits (HMW-GS) and low-molecular-weight glutenin subunits (LMW-GS) (Payne 1987). Although HMW-GS account for only 10% of the total gluten proteins, they are the main component of the gluten polymer, form the "network backbone" of gluten protein, and play a decisive role in gluten elasticity (Don et al. 2006, Payne et al. 1980, Shewry et al. 1992). HMW-GS are encoded by a multi-gene family located at the Glu-A1, Glu-B1 and Glu-D1 loci on the long arms of chromosomes 1A, 1B and 1D, respectively. Previous studies confirmed that each locus contains two tightly linked genes encoding the x- and y-type subunits: the larger x-type subunits of 80-88 kDa and the smaller y-type subunits of 67-73 kDa (Harberd et al. 1986, Liu et al. 2009, Shewry et al. 2003). Compared with y-type HMW-GS, x-type HMW-GS contribute more to gluten strength (Zhang et al. 2007). Most common bread wheat cultivars express only three to five subunits because some genes encoding the y subunits are silent (Roy et al. 2021, Shewry et al. 2003). In common bread wheat, the 1Ax and 1Ay subunits are encoded by Glu-A1, 1Bx and 1By by Glu-B1, and 1Dx and 1Dy by Glu-D1 (Payne 1987, Payne and Lawrence 1983).
Various studies have shown that the Glu-1 loci exhibit allelic variation and gene inactivation in bread wheat, both of which are closely related to end-use quality (Ma et al. 2005, Shewry et al. 1992). The quality and quantity of HMW-GS have a profound influence on the bread-making quality and dough properties of wheat flours (Payne 1987). The allelic composition and expression level of HMW-GS are strongly associated with dough properties and bread-making quality (Gupta et al. 1993, Payne 1987, Roy et al. 2018, Shewry et al. 1992). In particular, subunits such as 1Dx5+1Dy10, 1Bx14+1By15, and 1Ax1 have been associated with superior bread-making quality, while 1Dx2+1Dy12 and 1Bx20 have negative effects on dough strength (Altpeter et al. 1996, Redaelli et al. 1997). Subunits 1Ax2* and 1Dx17+1Dy18 also have a positive impact on bread-making quality (Ma et al. 2005). High expression levels of some subunits also have positive effects on wheat processing quality (Butow et al. 2003). In particular, two functional copies of the 1Bx7 OE subunit significantly improve dough strength (Ragupathy et al. 2008). The low frequency of subunit 5+10 and the high frequency of subunit 2+12 are partially responsible for the typical weak gluten characteristics of Chinese commercial wheat varieties (He et al. 1992). In recent years, the priority of Chinese wheat breeding has shifted from yield to quality, resulting in an increased demand for germplasm with a broad diversity of HMW-GS alleles (Hu and Wang 2016). A recent trend is the exploration of glutenin diversity in wheat relatives (Zhang et al. 2018). Nevertheless, the glutenin diversity in the vast number of Chinese wheat landraces, which are characterized by heterogeneity and rich genetic backgrounds, has not been systematically studied (Newton et al. 2010).
The specific distribution of subunit combinations in the main Chinese wheat-growing areas is not clear. This study focuses on the distribution of subunit or allele combinations in the main wheat-growing regions of China. A large number of new subunit combinations were found only in landraces. Narrow genetic diversity was found for the subunit combinations of the Chinese commercial wheat varieties, especially in the southwest winter wheat region. These results are potentially useful for Chinese wheat breeding.

Plant materials
A total of 1523 Chinese common bread wheat lines were analyzed (Supplemental Table 1), including 597 landraces and 926 commercial varieties. The 597 landraces are from the Huang-huai Winter Wheat Region (HWWR) and the Middle and Low Yangtze Winter Wheat Region (MLYWWR). The commercial varieties are from four major wheat-growing areas in China: 504 from HWWR, 131 from MLYWWR, 133 from the Southwestern Winter Wheat Region (SWWR), and 158 from the Northwest Spring Wheat Region (NSWR). Among them, the 635 commercial varieties from HWWR and MLYWWR were used for comparative analysis with the wheat landraces. Since HWWR is the main wheat-growing area in China, the number of commercial varieties collected from this region was rather large. Because of the small number of NSWR commercial varieties held in our laboratory, data for 71 NSWR commercial varieties were taken from another study (Zheng et al. 2020). Cultivars Zhongyou9507 and Chinese Spring were used as standard cultivars for allele identification.

Electrophoretic analysis
For protein extraction, three individual seeds were ground to a fine powder using a tissue grinder (TissueLyser II). For each sample, 40 mg of meal flour was extracted in 600 μl of SDS-PAGE sample buffer (62.5 mM Tris-HCl pH 6.8, 2% (w/v) SDS, 10% (v/v) glycerol, 5% (v/v) 2-mercaptoethanol, 0.002% (w/v) bromophenol blue) with continuous vortex mixing for 2 min, shaken for 30 min, and then placed in a 90°C boiling water bath for 5 min. The mixtures were immediately centrifuged for 5 min at 13,000 rpm, and 8 μl of supernatant was used for SDS-PAGE analysis. SDS-PAGE was conducted with 12% (w/v) running gels and 4% (w/v) stacking gels. The gels were run in 1× Tris-glycine electrode buffer at a constant current of 25 mA for about 5 h (Liuyi-Beijing mini-cell apparatus). After electrophoresis, the gel was stained in 0.05% (w/v) Coomassie Brilliant Blue R-250 solution for 8-10 h and then destained with water for 10-12 h. HMW-GS were classified using the nomenclature of Payne and Lawrence (1983).

Statistical methods
Microsoft Excel was used to calculate the number of allele combinations and the allelic frequencies. The method of Gao et al. (2020) was used for cluster analysis with some modifications. Each subunit band was scored as "1" when present and "0" when absent, and the scores were entered into a (0, 1) matrix. Cluster analysis of the glutenin subunit compositions was performed using R.
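The presence/absence scoring described above can be illustrated with a minimal Python sketch. The three lines and their band compositions are hypothetical, and the simple-matching distance is an assumption chosen for illustration; the source does not state which distance or linkage was used for the cluster analysis.

```python
from itertools import combinations

# Hypothetical lines and HMW-GS bands (illustrative only, not study data).
# A Null allele at Glu-A1 contributes no band, so it adds nothing to the set.
lines = {
    "line_A": {"7", "8", "2", "12"},        # Null, 7+8, 2+12
    "line_B": {"1", "7", "9", "5", "10"},   # 1, 7+9, 5+10
    "line_C": {"7", "8", "5", "10"},        # Null, 7+8, 5+10
}

# Columns of the (0, 1) matrix: every band observed in at least one line.
bands = sorted(set().union(*lines.values()))

# Score each band as 1 (present) or 0 (absent) for every line.
matrix = {name: [1 if band in comp else 0 for band in bands]
          for name, comp in lines.items()}


def simple_matching_distance(u, v):
    """Fraction of bands at which two 0/1 rows disagree (assumed metric)."""
    return sum(a != b for a, b in zip(u, v)) / len(u)


# Pairwise distances that a hierarchical clustering routine could consume.
distances = {(a, b): simple_matching_distance(matrix[a], matrix[b])
             for a, b in combinations(sorted(matrix), 2)}
```

In this toy example, line_A and line_C differ only at Glu-D1, so they are closer to each other than either is to line_B at Glu-A1 and Glu-B1 combined.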
Following Liu's calculation method, four genetic diversity indicators were calculated to measure the genetic diversity of each population: alleles per locus (A), percent polymorphic loci (P) (≤0.99 criterion), effective number of alleles per locus (Ae), and expected heterozygosity (He).
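The four indicators can be sketched in Python under the commonly used definitions, which are assumed here because the source does not spell out the formulas: Ae = 1/Σp², He = 1 − Σp², and a locus counted as polymorphic when its most common allele has a frequency of at most 0.99. The function names and the toy population are hypothetical.

```python
from collections import Counter


def locus_stats(alleles):
    """Return (allele count, polymorphic flag, Ae, He) for one locus."""
    counts = Counter(alleles)
    n = sum(counts.values())
    freqs = [c / n for c in counts.values()]
    sum_sq = sum(p * p for p in freqs)       # sum of squared frequencies
    polymorphic = max(freqs) <= 0.99         # assumed 0.99 criterion
    return len(freqs), polymorphic, 1.0 / sum_sq, 1.0 - sum_sq


def population_stats(loci):
    """Average the per-locus values over all loci of one population."""
    stats = [locus_stats(alleles) for alleles in loci.values()]
    k = len(stats)
    return {
        "A": sum(s[0] for s in stats) / k,
        "P": sum(s[1] for s in stats) / k,   # True counts as 1
        "Ae": sum(s[2] for s in stats) / k,
        "He": sum(s[3] for s in stats) / k,
    }


# Hypothetical toy population: four lines scored at the three Glu-1 loci.
toy = {
    "Glu-A1": ["null", "null", "1", "1"],
    "Glu-B1": ["7+8", "7+8", "7+8", "7+9"],
    "Glu-D1": ["2+12", "2+12", "5+10", "5+10"],
}
result = population_stats(toy)
```

Because Ae and He are averaged over loci, the mean Ae is generally not equal to 1/(1 − mean He), which is worth keeping in mind when comparing the two columns of a summary table.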
The method of Nei (1973) was used to calculate total genetic diversity (Ht), genetic diversity within populations (Hs), genetic diversity among populations (Dst), the proportion of genetic variation occurring among populations (Gst) and the interpopulational gene diversity relative to the intrapopulational gene diversity (Rst), in order to evaluate the degree of genetic differentiation among and within populations.
An example of HMW-GS patterns in some landraces and commercial varieties is shown in Fig. 1.
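The partition of gene diversity after Nei (1973) can be sketched for a single locus as follows. This is a minimal illustration that assumes equal weighting of populations, with Ht computed from the mean allele frequencies, Dst = Ht − Hs and Gst = Dst/Ht; Rst is left out, and the function names and frequencies are hypothetical.

```python
def gene_diversity(freqs):
    """Gene diversity 1 - sum(p^2) for a {allele: frequency} mapping."""
    return 1.0 - sum(p * p for p in freqs.values())


def nei_partition(populations):
    """Partition diversity at one locus across equally weighted populations.

    populations: list of {allele: frequency} dicts, one per population.
    """
    s = len(populations)
    alleles = set().union(*populations)
    # Mean frequency of each allele over populations (unweighted).
    mean_freqs = {a: sum(pop.get(a, 0.0) for pop in populations) / s
                  for a in alleles}
    ht = gene_diversity(mean_freqs)                        # total diversity
    hs = sum(gene_diversity(pop) for pop in populations) / s  # within
    dst = ht - hs                                          # among
    gst = dst / ht if ht > 0 else 0.0                      # relative share
    return {"Ht": ht, "Hs": hs, "Dst": dst, "Gst": gst}


# Hypothetical example: one polymorphic and one fixed population.
example = nei_partition([{"a": 0.5, "b": 0.5}, {"a": 1.0}])
```

In this toy case a third of the total diversity lies among populations; multi-locus values would be obtained by averaging over the three Glu-1 loci.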

Cluster analysis of HMW-GS compositions in different wheat planting regions
To determine the difference in the proportion of high-quality allele combinations between wheat planting regions, 73 allele combinations were used to perform a cluster analysis on all varieties. The rare combination "null, 20X+20Y, null" was not used for cluster analysis. Cluster analysis based on allelic similarity at the Glu-1 loci classified the varieties into six major categories (Fig. 2). The quality scores of the allelic combinations in category II ranged from 4 to 10. The quality scores of categories IV, V and VI ranged from 6 to 10, with most allelic combinations above 8. Because of rare allelic combinations, the quality score was not calculated for categories I and III. Category II comprised 88.15% of the landraces and 87.22% of the commercial varieties from the same regions. Only 6.92% of the landraces and 2.04% of the commercial varieties from the same regions fell into categories I+III. Categories IV+V+VI contained 5.38% of the landraces and 10.55% of the commercial varieties from the same regions, most of which had high quality scores. The landraces had lower genetic diversity and a lower proportion of high-quality subunit combinations than the commercial varieties from the same regions.
Cluster analysis of allelic combinations generated six categories, suggesting that significant variation exists in the HMW-GS compositions among the four wheat regions (Fig. 2). Most commercial cultivars from the four regions fell into category II, while categories IV+V+VI contained 11.53%, 6.86%, 26.3% and 5.68% of the HWWR, MLYWWR, SWWR and NSWR cultivars, respectively. Only 1.99%, 2.29%, 13.53% and 6.32% of the commercial varieties in HWWR, MLYWWR, SWWR and NSWR, respectively, belonged to categories I+III. These results indicate that the genetic diversity of the SWWR cultivars was the highest, while that of the MLYWWR cultivars was the lowest. The SWWR cultivars had the highest proportion of high-quality subunit combinations, while the NSWR cultivars had the lowest.

Genetic variation at the Glu-1 loci within and between populations
According to the sites of origin of the germplasm, the research materials were divided into six populations: four commercial cultivar populations from HWWR, MLYWWR, SWWR and NSWR, plus one landrace population and one commercial cultivar population that both come from HWWR and MLYWWR. Genetic differentiation was analyzed within and between populations. The values of the genetic diversity parameters at the HMW-GS loci of the six populations are presented in Table 4. When the four individual commercial variety populations were combined, the genetic diversity indicators were A (alleles per locus) = 7.33, P (percentage of polymorphic loci) = 1.00, Ae (effective number of alleles per locus) = 2.347 and He (mean diversity index) = 0.563, and the average values of the four regions were A = 5.09, P = 1.00, Ae = 2.349 and He = 0.552 (Table 4). The largest A value was 6.00 in the HWWR population. The P value for every population was 1.00. The Ae and He were 2.841 and 0.591 in the SWWR population, respectively. Among the four commercial region populations, the highest genetic diversity was found in the SWWR population (A = 4.00, P = 1.00, Ae = 2.841 and He = 0.591), while the MLYWWR population showed the lowest (A = 4.67, P = 1.00, Ae = 2.090 and He = 0.520). The genetic diversity of the landraces plus the commercial varieties from the same regions was A = 8.33, P = 1.00, Ae = 1.994 and He = 0.486, and the average values of the landrace population and the same-region commercial variety population were A = 7.00, P = 1.00, Ae = 1.892 and He = 0.442 (Table 4). Comparing the two, the same-region commercial variety population had A = 6.33, P = 1.00, Ae = 2.247 and He = 0.548, giving higher Ae and He values than the landrace population (A = 7.67, P = 1.00, Ae = 1.537 and He = 0.337).
This result is consistent with the cluster analysis, mainly because allele frequencies were relatively evenly distributed within the commercial variety population, whereas they were concentrated on a few alleles in the landrace population.
Genetic differentiation at the Glu-1 loci in the different wheat regions is presented in Table 5. When all commercial wheat varieties were analyzed together, the average total genetic diversity (Ht) for the three loci was 0.563 (Glu-A1 at 0.521, Glu-B1 at 0.656 and Glu-D1 at 0.511). The mean genetic diversity within populations (Hs = 0.552) was much higher than the mean genetic diversity among populations (Dst = 0.011), indicating that the genetic diversity between populations was lower than that within populations. The average relative differentiation among populations was Gst = 0.017 and ranged from 0.002 at Glu-A1 to 0.024 at Glu-B1, indicating that 1.7% of the gene diversity was among populations while 98.3% of the gene diversity was within populations.
The average total genetic diversity (Ht) of the Glu-1 loci was 0.486 when the landraces were combined with the commercial varieties from the same regions, and the mean genetic diversity within populations (Hs) and among populations (Dst) was 0.442 and 0.044, respectively. This shows that the genetic diversity between the landrace population and the same-region commercial variety population was lower than that within these two wheat populations. The average relative differentiation between populations was Gst = 0.091, indicating that 9.1% of the gene diversity was between the two populations while 90.9% of the gene diversity was within populations.
a Alleles are subunit combinations at the Glu-A1, Glu-B1 and Glu-D1 loci. For example, the "afa" allele composition denotes subunit 1 at Glu-A1 (first a), 13+16 at Glu-B1 (f) and 2+12 at Glu-D1 (second a), and the "afd" allele composition denotes subunit 1 at Glu-A1 (a), 13+16 at Glu-B1 (f) and 5+10 at Glu-D1 (d).
Allelic variation and genetic diversity of HMW-GS in wheat. Breeding Science Vol. 72 No. 2
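As a quick consistency check, the values reported for the landrace versus same-region commercial comparison can be recombined in a few lines of Python. This sketch assumes the standard relations Dst = Ht − Hs and Gst = Dst/Ht and uses only the rounded figures quoted in the text, so it is an arithmetic illustration rather than a reanalysis.

```python
# Rounded values quoted in the text for the two-population comparison.
ht = 0.486   # total gene diversity (Ht)
hs = 0.442   # mean gene diversity within populations (Hs)

# Assumed relations from Nei's partition of gene diversity.
dst = round(ht - hs, 3)                  # diversity between populations
gst = round(dst / ht, 3)                 # share of diversity between them
within_pct = round((1 - gst) * 100, 1)   # share within populations, in %
```

The recomputed values (Dst = 0.044, Gst = 0.091, 90.9% within populations) agree with the figures given in the text.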

Discussion
Utilization of gluten allelic variation in breeding programs is the key to improving wheat quality. Twenty-six different alleles at the Glu-1 loci were found among the 1523 wheat lines in the current study. Among them, 23 and 22 different HMW-GS alleles were detected in landraces and commercial varieties, respectively. In previous reports, Zheng et al. (2011) found 22 HMW-GS alleles in 485 landraces from the Yangtze River region. Liu et al. (2007) identified 16 HMW-GS alleles in 111 Hubei wheat landraces. Zhang et al. (2002) detected 28 HMW glutenin alleles in 3,459 Chinese landraces. Liu et al. (2005) reported 16 alleles in 251 Chinese commercial cultivars. More recently, Dai et al. (2020) detected 16 alleles at the Glu-1 loci in 300 Xinjiang wheat landraces. Gao et al. (2020) found 15 alleles at the Glu-1 loci in commercial cultivars from China. In comparison, Yasmeen et al. (2015) detected a much higher number of alleles at Glu-1 in indigenous landraces and commercial cultivars of Pakistan. Previous reports showed that subunits 1 or 2* at the Glu-A1 locus improve bread-making quality more than the null subunit (Luo et al. 2001). In durum wheat landraces, other Glu-A1 alleles have been reported at higher frequencies than the null allele (Ammar et al. 2000, Branlard et al. 1989, Magallanes-López et al. 2017). In some reports, the null allele was the most frequent at the Glu-A1 locus in Chinese wheat landraces (Dai et al. 2020, Wei et al. 2000, Zheng et al. 2011). In this study, the null allele was the most frequent at Glu-A1 in wheat landraces (77.72%) and in commercial cultivars from the same regions (55.91%), whereas 41.73% of the commercial varieties contained subunit 1, far more than the landraces (16.08%). The null allele was observed at a high frequency in three regions (HWWR, MLYWWR and SWWR).
In NSWR, the frequency order was 1 > null > 2*, which is consistent with the results of Liu et al. (2005). A high frequency of Glu-A1a has been observed in Spanish (86.5%) and European (84.81%) wheat varieties (Caballero et al. 2004). Yasmeen et al. (2015) showed that the 2* and null alleles were the most frequent in Pakistani wheat varieties and landraces. For the Glu-B1 locus, 14 and 11 alleles were detected in the landraces and commercial cultivars, respectively. The most frequent alleles were 7+8 and 7+9. The frequency of 7+8 was 74.04% in the landraces, which is consistent with results for other Chinese and Japanese landraces (Dai et al. 2020, Nakamura 2000, Zheng et al. 2011). In contrast, the major allele in Indian and Pakistani landraces was 17+18 (Goel et al. 2018, Niwa et al. 2008). Two alleles, Glu-B1b (7+8) and Glu-B1c (7+9), appeared most frequently in the four commercial wheat cultivation regions. Liu et al. (2005) identified 7+8 (Glu-B1b) and 7+9 (Glu-B1c) as the major alleles in Chinese commercial cultivars. Alleles 7+8 (Glu-B1b) and 7+9 (Glu-B1c) were predominant in varieties from France, Argentina and Pakistan (Branlard et al. 1989, Lerner et al. 2009, Tabasum et al. 2011), whereas allele 13+16 (Glu-B1f) was the most frequent in varieties from Spain (Caballero et al. 2004). Liu et al. (2007) found that the Glu-B1h (14+15) allele was a unique allele in Chinese wheat cultivars; this allele was reported to enhance dough quality parameters such as SDS sedimentation value and resistance breakdown value (Brites and Carrillo 2001). In the current study, the Glu-B1d (6+8) allele frequency was higher in SWWR than in the other Chinese wheat regions. Subunits 7 and 17+18 were both identified at low frequency in the landraces and commercial cultivars. Allele 17+18 (Glu-B1i) had a positive effect on sedimentation and mixograph parameters (Carrillo et al. 1990, Ram 2003).
Alleles Glu-B1ao (7+16) and Glu-B1g (13+19) occurred at low frequencies in the landraces and were not detected in the four commercial wheat-growing regions. Allelic variation at the Glu-D1 locus is important for dough quality (Kolster et al. 1991). Gupta et al. (1994) showed that allele Glu-D1d (5+10) is associated with superior bread-making quality, while allele Glu-D1a (2+12) reduces bread-making quality. In the current study, the frequency of the Glu-D1a (2+12) allele was far higher in the landraces than in the commercial cultivars. Previous reports also showed that allele Glu-D1a (2+12) was the most frequent in Chinese and Japanese wheat landraces (Dai et al. 2020, Nakamura 2000, Zhang et al. 2002, Zheng et al. 2011). In commercial cultivars, allele Glu-D1a (2+12) was also more frequent than allele Glu-D1d (5+10), although the latter was common as well, which is in agreement with previous results (Gao et al. 2018, Liu et al. 2005). According to Yasmeen et al. (2015), allele Glu-D1d (5+10) was the most frequent in Pakistani commercial wheat varieties. It is worth noting that we detected the rare subunits 2.2+12 and 2+11 among commercial wheat cultivars. Nakamura (1999) showed that subunit 2.2+12 was found frequently in Japanese varieties. Allele Glu-D1h (5+12) was also detected in our study; it was reported to confer better overall quality characteristics and bread loaf volume in synthetic hexaploids (Peña et al. 1995). Overall, the frequencies of subunits Glu-D1d (5+10) and Glu-D1h (5+12) were higher in the Chinese commercial wheat cultivars than in the wheat landraces in our study.
The allele combinations in the current study were classified into six categories, and the proportion of high-quality allele combinations in commercial wheat varieties was greater than that in the landraces. The average Glu-1 quality score of the Chinese landraces was similar to that of landraces from Japan, but lower than that of landraces from Pakistan (Nakamura 2000, Yasmeen et al. 2015). Other results showed that the genetic diversity and the proportion of high-quality subunit combinations were highest in SWWR among the four regions. A high value of genetic diversity implies a high proportion of rare alleles (Novoselskaya-Dragovich et al. 2011). High-quality alleles Glu-B1i (17+18), Glu-B1f (13+16) and Glu-B1h (14+15) were found in categories IV, V and VI, which occurred at a high frequency of 26.3% in SWWR. However, the highest proportions of allele combinations with a quality score of 10 were detected in HWWR and NSWR. HWWR and NSWR are major noodle-consuming areas, resulting in a relatively high frequency of the high-quality subunit combinations. The frequencies of allele combinations with a quality score of 10 were far higher in the commercial cultivars than in the landraces, indicating the impact of modern breeding programs in China. The average Glu-1 quality score of commercial wheat varieties in China was below the global average. For example, Novoselskaya-Dragovich et al. (2011) reported that the average Glu-1 quality score of Chinese commercial wheat varieties was similar to that of wheat varieties from Italy but lower than those from Russia and Canada. The genetic diversity index for HMW-GS in the four commercial wheat regions was He = 0.563, with the SWWR population showing the highest value (He = 0.591) and the MLYWWR population the lowest (He = 0.520).
Our results were similar to those reported by Zhang et al. (2002). Novoselskaya-Dragovich et al. (2011) found a similar genetic diversity index in Chinese wheat cultivars, which was higher than those of the Canadian and English lines but lower than those of the French, Italian and Australian bread wheats. The genetic diversity index of all Chinese commercial cultivars was higher than that of the Argentinean wheats (He = 0.458) (Lerner et al. 2009). The genetic diversity index of the wheat landraces was 0.337, lower than that of the commercial cultivars from the same regions (0.548) but higher than that of the Chinese core collection (0.232) (Zhang et al. 2002) and the Hubei landraces (0.238). Nakamura (2000) found a lower genetic diversity in the Japanese landraces (He = 0.265) (Ruiz et al. 2002). The number of alleles per locus (A) of the landrace population was 7.67, which was higher than that of the commercial cultivars from the same regions (6.33). These results indicate that the distribution of variation types was more balanced in the Chinese commercial varieties than in the landraces. Because of the self-pollination and natural isolation of wheat, gene exchange between populations was very limited, so that some alleles exist only within a relatively narrow range. In the commercial varieties, artificial hybridization promoted the exchange of genetic information between different countries and ecological regions, which significantly improved the genetic diversity of the bred varieties (Novoselskaya-Dragovich et al. 2011). Indeed, 1.7% of the gene diversity was among the four regions while 98.3% was within regions. This result indicates that the allele types selected by breeding in the four major regions are similar.
The increase in genetic diversity in commercial varieties from the four Chinese wheat regions may be due to the increased proportion of high-quality subunits in breeding programs. Because of their high allele diversity, the Chinese landraces have the potential to be used in wheat quality breeding in China.
A large number of new subunit combinations exist in landraces that do not exist in commercial varieties. Cluster analysis of the allelic combinations showed that the proportion of high-quality subunit combinations was higher in commercial varieties than in the landraces. The rare alleles found in landraces indicate that the Chinese landraces are potentially useful for wheat quality improvement. The genetic diversity analysis revealed limited subunit combinations in Chinese commercial wheat varieties, especially in the southwest winter wheat region.

Author Contribution Statement
X. W., R. S., Y. A., H. P. and S. G. conducted the experimental measurements. X. R. conceived this study and designed the experiments. X. W. performed the statistical analysis and wrote the manuscript. X. R. and D. S. coordinated the experiments and oversaw the data analysis. X. R. revised the manuscript. All authors have read and approved the final version of the manuscript.