Development of EMS-induced Mutagenized Groundnut Population and Discovery of Point Mutations in the ahFAD2 and Ara h 1 Genes by TILLING

Abstract: Reducing allergenicity and increasing oleic acid content are important goals in groundnut breeding. Ara h 1 is a major allergen gene, and Delta(12)-fatty-acid desaturase (FAD2) is responsible for converting oleic acid into linoleic acid. These genes have homoeologues with one copy in each subgenome, identified as Ara h 1.01, Ara h 1.02, ahFAD2A and ahFAD2B in tetraploid groundnut. To alter the functional properties of these genes, we generated an ethyl methane sulfonate (EMS)-induced mutant population for use in a Targeting Induced Local Lesions in Genomes (TILLING) approach. Seeds were exposed to two EMS concentrations, and the germination rates in the M1 generation were 90.1% (1353 plants) for 0.4% EMS and 60.4% (906 plants) for 1.2% EMS. Among the 1541 M2 mutants, 768 were analyzed by TILLING of the four homoeologous genes. Two heterozygous mutations were identified in the ahFAD2B and ahFAD2A gene regions from the 1.2% and 0.4% EMS-treated populations, respectively. The mutation in ahFAD2B resulted in a serine-to-threonine change, which was predicted to be tolerated according to SIFT analysis. The other mutation, a glycine-to-aspartic acid change in ahFAD2A, was predicted to affect protein function. No mutations were detected in Ara h 1.01 or Ara h 1.02 in either EMS treatment after sequencing. We estimated the overall mutation rate to be 1 mutation every 2139 kb. The mutation frequencies were 1/317 kb for ahFAD2A in the 0.4% EMS treatment and 1/466 kb for ahFAD2B in the 1.2% EMS treatment. The results demonstrate that TILLING is a powerful tool for altering gene function in crops and that the mutagenized population developed in this study can be used as an efficient reverse genetics tool for groundnut improvement and functional genomics.

... conversion of oleic acid to linoleic acid with the addition of a second double bond into the hydrocarbon chain of the former 17. Mutations in ahFAD2A and ahFAD2B cause loss of function of the oleoyl-PC desaturase activity, which results in high oleic acid content 15,18-20. Thanks to functional mutations in these genes, different high-oleic varieties were developed 21-23 and later registered 24-26.
Eliminating allergenicity is another important breeding target in groundnut, because this crop is responsible for most of the food allergies associated with fatal food-induced anaphylaxis 27. Among the allergenic proteins detected in groundnut, the vicilin Ara h 1 makes up 12-16% of total seed protein 28 and is recognized by serum IgE from 90% of groundnut-sensitive patients 29. Two homologues of Ara h 1 have been identified as Ara h 1.01 and Ara h 1.02 in tetraploid groundnut, and different studies have been conducted to knock out these genes to reduce the allergenic potential of this crop 30-32.
Reverse genetics is an important approach for identifying novel mutations in a gene of interest. TILLING, one of the reverse genetics methods, can be applied to any plant species regardless of its genomic structure and ploidy level 33. Unlike other reverse genetics methods such as RNAi technology 34 and T-DNA insertion mutagenesis 35, this non-transgenic method does not require transformation. TILLING aims to find nucleotide changes induced by chemical mutagenesis in target genes 36,37, which makes it possible to modify protein function. EMS (ethyl methane sulfonate) is a widely preferred mutagen in this strategy because it induces single-nucleotide alterations by alkylation of specific nucleotides, causing a wide spectrum of mutations 38,39. These can be silent, nonsense, missense and splicing mutations in gene coding regions 40,41. TILLING has been successfully applied to different plant species such as arabidopsis 42, oat 40, canola 43, soybean 44-46, rice 47, wheat 48, groundnut 30,31, tomato 49, sunflower 50 and tobacco 51, showing that this method is an important alternative for functional analysis in plants. From this perspective, we report the development of an EMS-induced groundnut mutant population and TILLING analysis in the M2 generation using four homologous genes (Ara h 1.01, Ara h 1.02, FAD2A and FAD2B) to discover point mutations.

Plant material
The groundnut cultivar NC-7 (Virginia market type) belongs to subsp. hypogaea var. hypogaea. It has good agronomic traits such as large pods, high yield and high shelling percentage 52. Seeds of this cultivar were obtained from the West Mediterranean Agricultural Research Institute of Turkey. Before the mutagen applications, the original seeds were grown for two generations to ensure homogeneity.

Mutagenesis
Approximately 3000 seeds were imbibed in tap water for 10 hours and then transferred to an aqueous solution of the mutagen. Two EMS concentrations, 0.4% and 1.2%, were used for TILLING. In the first application, 1500 seeds were soaked in 0.4% (v/v) EMS solution at a ratio of 75 seeds/100 mL and shaken gently for 10 h at room temperature. In the other application, the remaining 1500 seeds were soaked with agitation on a shaker for 2.5 hours in 2000 mL of distilled water containing 1.2% (v/v) EMS. After these treatments, the seeds were washed thoroughly three times with deionized water and rinsed extensively in running water overnight. The EMS-treated seeds (M1) were sown in the experimental field with an inter-row spacing of 70 cm and an intra-row spacing of 20 cm. M2 seeds were collected from individual M1 plants, and one pod of M2 seeds from each M1 plant was planted in the greenhouse and later thinned to a single plant per pot. M3 seeds from each M2 plant were packed and stored for further studies.
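As a rough check on the quantities involved, the volume of EMS needed for each treatment can be sketched as below. This is only an illustration: the function name is hypothetical, and the 2000 mL volume for the 0.4% treatment is an assumption derived from the stated ratio of 75 seeds/100 mL applied to all 1500 seeds.

```python
def ems_volume_ml(total_volume_ml: float, percent_vv: float) -> float:
    """Volume of EMS (mL) contained in a solution of the given % v/v strength."""
    return total_volume_ml * percent_vv / 100.0


# Assumption: all 1500 seeds of the 0.4% treatment were soaked at 75 seeds per 100 mL.
seeds_per_treatment = 1500
solution_low_ml = seeds_per_treatment / 75 * 100  # 2000 mL
solution_high_ml = 2000.0  # volume stated for the 1.2% treatment

ems_low_ml = ems_volume_ml(solution_low_ml, 0.4)
ems_high_ml = ems_volume_ml(solution_high_ml, 1.2)

print(f"0.4% v/v: {ems_low_ml:.1f} mL EMS in {solution_low_ml:.0f} mL")
print(f"1.2% v/v: {ems_high_ml:.1f} mL EMS in {solution_high_ml:.0f} mL")
```

Under these assumptions, the two treatments would use about 8 mL and 24 mL of EMS, respectively.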

DNA isolation and pooling
Total DNA was isolated from leaf tissue of individual M2 plants with a modified CTAB method 53. All genomic DNAs were quantified on a 1.0% agarose gel using lambda DNA as a standard and then normalized. Equal amounts of DNA from individual M2 plants were pooled four-fold in 96-well format.
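The four-fold pooling scheme can be sketched as follows. This is a minimal illustration, not the layout actually used: the plant numbering (1 to 768), the row-major well order and the function name are all assumptions.

```python
from typing import Dict, List, Tuple


def pool_plants(plant_ids: List[int], pool_size: int = 4,
                wells_per_plate: int = 96) -> Dict[Tuple[int, str], List[int]]:
    """Assign consecutive plants to pools and map each pool to a (plate, well)."""
    rows = "ABCDEFGH"  # 8 rows x 12 columns = 96 wells
    pools: Dict[Tuple[int, str], List[int]] = {}
    for start in range(0, len(plant_ids), pool_size):
        pool_index = start // pool_size
        plate = pool_index // wells_per_plate + 1
        well = pool_index % wells_per_plate
        well_name = f"{rows[well // 12]}{well % 12 + 1}"
        pools[(plate, well_name)] = plant_ids[start:start + pool_size]
    return pools


# Hypothetical numbering for the 768 M2 plants screened in this study.
pools = pool_plants(list(range(1, 769)))
print(len(pools))          # number of pooled wells
print(pools[(1, "A1")])    # plants in the first pool
```

With 768 plants and four plants per pool, this gives 192 pools, which fill two 96-well plates. A positive well then points to four candidate plants that have to be sequenced individually.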

Primers and amplification of studied genes
The homologues of the Ara h 1 and FAD2 genes were selected for the TILLING study. Primer sets for these genes (Ara h 1.01, Ara h 1.02, FAD2B and FAD2A) were previously designed by Chu et al. 54 and Guo et al. 31. The product sizes and the primers used for amplification are listed in Table 1.
The target regions of the FAD2 genes were amplified on a Thermo Scientific Arktik Thermal Cycler (Vantaa, Finland) using 96-well plates. Reactions were carried out in a 20 μL volume consisting of 1.5 μL of 10X PCR buffer, 2.5 mM MgCl2, 0.2 mM dNTP mix, 0.4 μL each of forward and reverse primers, 5 units of Taq DNA polymerase (Thermo Fisher Scientific), 1.5 μL of genomic DNA template and Milli-Q water to make up the final volume. The thermocycling conditions were 95 °C for five minutes for initial denaturation, followed by 30

Mutation discovery by TILLING
The TILLING process was conducted with a mutation discovery kit (DNF-910-K1000T; AATI, Ames, USA), which offers easy, high-throughput and efficient mutational screening. In this protocol, a working solution of the dsDNA cleavage enzyme was prepared by diluting the enzyme with the T-Digest buffer at a ratio of 1:125, and 2 μL of this working solution was added to each sample well after the heteroduplex formation step. The 96-well plates were then incubated at 45 °C for 45 minutes, and 20 μL of the dilution buffer was added to each well. The samples were analyzed on the Fragment Analyzer™, an automated capillary electrophoresis system (Advanced Analytical Technologies GmbH, Heidelberg, Germany). In this capillary system, the solutions of the mutation discovery kit (DNF-910-K1000T) were used to visualize digested products ranging from 35 to 5000 bp. Raw data were analyzed using PROSize™ software version 1.2.1.1 (Advanced Analytical Technologies, Ames, IA, USA). To confirm a potential mutant from a pool, the individual genomic DNAs were PCR-amplified with specific primers for the target gene. Before sequencing, PCR products were purified using the GeneJET Gel Extraction Kit (Thermo) according to the manufacturer's instructions. The amplicons were sequenced by Macrogen (China), and sequence analysis was performed in MEGA software version 5 55. Sequences carrying mutations were translated to protein sequences with the online tool at https://web.expasy.org/translate/. The online software SIFT (Sorting Intolerant from Tolerant; https://sift.bii.a-star.edu.sg/www/SIFT_seq_submit2.html) was used to predict whether an amino acid substitution affects protein function 56. SIFT scores range from 0 to 1; an amino acid substitution is predicted to be damaging if the score is ≤ 0.05 and tolerated if the score is > 0.05.
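The SIFT decision rule can be written as a small helper. This is a sketch that assumes the usual SIFT cutoff of 0.05, with scores at or below the cutoff called damaging; the function and constant names are illustrative, and the two example scores are those reported for the substitutions found in this study.

```python
SIFT_THRESHOLD = 0.05  # assumed cutoff: scores <= 0.05 are called damaging


def sift_call(score: float) -> str:
    """Classify an amino acid substitution from its SIFT score (0 to 1)."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("SIFT scores range from 0 to 1")
    return "damaging" if score <= SIFT_THRESHOLD else "tolerated"


# Scores reported in this study for the two confirmed substitutions.
print(sift_call(0.28))  # serine to threonine in FAD2B
print(sift_call(0.01))  # glycine to aspartic acid in FAD2A
```

The first call returns "tolerated" and the second "damaging", matching the predictions reported for the two substitutions.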

Development of TILLING population
A large-seeded cultivar, NC-7, was used to generate the TILLING population. Two different EMS mutagenesis treatments, 1.

Identification of mutations
The single-copy genes Ara h 1.01, Ara h 1.02, FAD2B and FAD2A were selected to identify mutations in the TILLING study. The total length of the amplified regions of these genes was about 5571 bp (Table 1), and each EMS-treated M2 plant was screened across that region. After heteroduplex PCR and enzyme digestion, the Fragment Analyzer™ detected potential mutations in eight pooled wells across the plates for the four genes, and the corresponding 32 individuals were sequenced (data not shown). Among them, two induced mutations were confirmed, one each in the FAD2B and FAD2A gene regions. The FAD2B mutation (Table S1) resulted in an amino acid change, serine to threonine, compared with the wild-type sequence (Table S2). This substitution was predicted to be tolerated, with a score of 0.28 according to SIFT analysis. Similarly, the other mutation was a heterozygous SNP (G→A) at position 861 in the FAD2A gene sequence (Table S1), causing an amino acid change from glycine (G) to aspartic acid (D) (Table S2) in DNA from the M2 plant coded 547. SIFT analysis predicted that this G-to-D substitution affects protein function, with a score of 0.01. These identified point mutations are shown in Figs. 3 and 4. The cleaved bands corresponding to the induced mutation were located at about 470 bp and 900 bp of the amplified product for the FAD2B gene (Fig. 3). The cleaved fragments formed by digestion at the mutation point in the amplified product of the FAD2A gene were 340 and 530 bp long (Fig. 4). No mutations were detected in Ara h 1.01 or Ara h 1.02 in either EMS treatment after sequencing. Mutation frequency was calculated as the total number of confirmed mutations divided by the total number of base pairs screened 31. The average mutation frequency was estimated to be one mutation per 1566 kb for the FAD genes considering the two EMS applications together. The mutation rates were 1/317 kb for FAD2A in 0.4% EMS (12 h) and 1/466 kb for FAD2B in 1.2% EMS (2 h), respectively.
Based on the mutation frequency in the four targeted genes (Table 1), we estimated the overall mutation rate to be 1 mutation every 2139 kb (2 mutations in 5571 bp of DNA from each of the 768 M2 plants screened).
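The overall estimate can be reproduced from the figures given above. The sketch below assumes the simplest form of the calculation (total amplified length times number of plants, divided by the number of confirmed mutations) with no correction for primer regions; the function name is illustrative.

```python
def kb_per_mutation(n_mutations: int, amplicon_bp: int, n_plants: int) -> float:
    """Kilobases of DNA screened per confirmed mutation."""
    total_screened_kb = amplicon_bp * n_plants / 1000
    return total_screened_kb / n_mutations


# Values from this study: 2 mutations, 5571 bp amplified per plant, 768 M2 plants.
overall = kb_per_mutation(n_mutations=2, amplicon_bp=5571, n_plants=768)
print(round(overall))  # 2139
```

Here 5571 bp x 768 plants is about 4279 kb screened in total, which for two mutations gives roughly one mutation per 2139 kb.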

Discussion
TILLING is a flexible method for modifying gene function through mutations that may affect protein function, which might cause a partial phenotypic change or an expression difference in the target gene 57, without involving transgenic modification. Optimization of mutagenesis is the key factor for success in this method because of the toxicity of the mutagen and the sterility it causes in germinal tissue 58. In groundnut, two EMS mutagenesis treatments (1.2% and 0.4%) were applied to seeds by Knoll et al. 30, who observed high germination frequencies and detected many SNP mutations in fatty acid desaturase and major allergen genes. Guo et al. 31 also captured nucleotide changes with the application of 0.4% EMS to this crop. These tested EMS doses were therefore selected in the present investigation, without preliminary experiments, to produce mutagenized populations. Although treatment with 1.2% EMS is detrimental to the survival of different plant species such as arabidopsis 38 and rice 47, we obtained a germination frequency of about 60% at that mutagen dose. Cultivated groundnut has a polyploid genome, and this ploidy level can provide tolerance to higher mutagen doses because of genetic buffering 59, similar to that observed in polyploid mutant populations of wheat 57,60. The polyploid genomic structure might also affect the observed phenotypic mutation rate, which was 1% in our study; a similarly low frequency was previously described in groundnut 30 and in other polyploid crops such as wheat 57.
Fig. 1 (a, b, c) EMS-induced mutations leading to deformation and retardation in the developmental stages of groundnut; (d) wild type.
Based on the mutation frequency in all the studied genes (Ara h 1.01, Ara h 1.02, FAD2A and FAD2B), we estimated the overall mutation rate as 1/2139 kb across the EMS applications. This value is higher than that of a TILLING population of barley (Hordeum vulgare L.; 1/2500 kb) 61 and lower than those of other populations in sorghum (1/526 kb) 62, rice (1/294 kb) 47, sunflower (1/475 kb) 50, rice (1/1000 kb) 63 and Aegilops tauschii (1/77 kb) 64. The difference in mutation density might arise from various factors such as species, type of mutagen, treatment procedure, mutagen concentration and treated organs 62,65. On a single-gene basis, mutation densities were 1/317 kb for FAD2A and 1/466 kb for FAD2B in the 0.4% and 1.2% EMS applications, respectively. Although groundnut seeds were exposed to the same concentrations of EMS, different mutation rates were observed in the study conducted by Knoll et al. 30, who reported a lower mutation frequency of 1/501 kb for FAD2A in the 0.4% EMS application and did not capture any mutations in FAD2A or FAD2B at the 1.2% EMS dose. Knoll et al. 30 and Guo et al. 31 also recorded higher overall mutation frequencies in groundnut than our result for the 0.4% EMS application. The mutation detection method might be one of the reasons for the variable mutation rates, because our TILLING process was conducted with a mutation discovery kit (DNF-910-K1000T; AATI, Ames, USA), whereas Knoll et al. 30 and Guo et al. 31 used the CEL I/LI-COR and TILLING-by-sequencing approaches, respectively. The effect of environmental conditions on the plant response 44 and differences in the selected genotype 62 might be other reasons. The point mutations found in our TILLING study were one transversion (T→A) and one transition (G→A), and both types were observed in TILLING populations of rice 47, tomato 49 and cucumber 41.
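The distinction between the two mutation types can be made explicit with a short classifier. This is a generic sketch based on the standard definition (a transition stays within purines or within pyrimidines, a transversion switches class); the function name is illustrative.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(ref: str, alt: str) -> str:
    """Return 'transition' or 'transversion' for a single-base substitution."""
    ref, alt = ref.upper(), alt.upper()
    valid = PURINES | PYRIMIDINES
    if ref not in valid or alt not in valid or ref == alt:
        raise ValueError("expected two different bases from A, C, G, T")
    same_class = (ref in PURINES) == (alt in PURINES)
    return "transition" if same_class else "transversion"


# The two point mutations reported in this study.
print(classify_substitution("T", "A"))  # transversion
print(classify_substitution("G", "A"))  # transition
```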
In groundnut, delta-12-desaturase (oleoyl-PC desaturase) converts oleic acid into linoleic acid and is encoded by two homologous genes, ahFAD2A and ahFAD2B 15,18. Repression of these two genes makes higher oleic acid content possible through a G-to-A transition at position 448 of the coding region of ahFAD2A 66 and an "A" insertion at position 441_442 of the coding region of ahFAD2B 15,20. This missense mutation in ahFAD2A causes an aspartic acid-to-asparagine substitution at position 150 20. The cultivar NC-7 (subsp. hypogaea var. hypogaea) used in this study carries this spontaneous mutation 52, which is frequent among subspecies hypogaea accessions 67. The mutation identified at position 861 in the present investigation, causing a glycine-to-aspartic acid substitution, is therefore new. Protein function was predicted to be affected by this amino acid change according to SIFT analysis. Similarly, Knoll et al. 30 pointed out that most of the nucleotide changes in the ahFAD2A gene caused aspartic acid substitutions in a groundnut TILLING population treated with 0.4% EMS.
In the homologous gene ahFAD2B, insertions cause premature stop codons, which lead to a high oleic acid ratio 15,20,68. Unlike this type of mutation, the nucleotide change captured at position 993 in the coding region of ahFAD2B resulted in a serine-to-threonine change, which was rare in that gene compared with other groundnut studies conducted by Knoll et al. 30, Fang et al. 21 and Guo et al. 31. Differently from common mutations in ahFAD2B, Nadaf et al. 23 30 identified five mutations among 3420 plants at similar EMS doses. This shows that higher mutation frequencies and greater allelic variation can be obtained by increasing the size of the TILLING population 69. On the other hand, the number of mutations detected in the same gene might be affected by enzyme activity and by the probability of mutations occurring in the effective region 60,70. Not only the population size and the mutation frequency but also the method used to identify specific mutations in the population is important for successful TILLING applications 40 73. The mutation discovery kit (AATI, Ames, USA) and the Fragment Analyzer™ detection platform were used to detect SNP changes in the present TILLING approach. With this technology, mutation detection times are cut in half compared with gel systems because of the detection system's speed and the streamlined methods used with it. Identification of multiple cuts in one gene, the absence of a clean-up step and the elimination of labeled primer sets are also important features of this setup. It also examines fragments up to 10,000 base pairs and has user-friendly software to screen mutations. These are significant advantages compared with the CEL I/LI-COR heteroduplex detection method, which requires labeled primers in PCR reactions 58 and for which analysis of TILLING gel images is difficult 71.
In addition, the Fragment Analyzer™ has faster electrophoresis run times than another mutation detection method, matrix-assisted laser desorption/ionization time-of-flight (MALDI-TOF) mass spectrometry 40. Despite these features and benefits, we faced some drawbacks with the mutation discovery kit used in this study. The first problem was undesirable peaks and fragments (Fig. 3) that appeared in the software and gave rise to false SNPs. We observed cleaved fragments and peaks in eight wells; however, true SNPs were confirmed in only two of them after sequencing. Residues of the main and digested fragments might generate these unexpected results. The other issue was peak size. The main fragment and the cleaved products were observed together in the output graph, which lowered the peak size of the cleaved fragments and created uncertainty in identifying mutations (for example, Fig. 4). To overcome these detection problems, we sequenced potential mutant DNAs to confirm the nucleotide changes; similarly, mutations identified with the mutation discovery kit were later checked by Sanger sequencing in the study conducted by Mascher et al. 73. From an economic point of view, the mutation discovery kit is costly and requires an initial investment in a detection platform, like the denaturing high-performance liquid chromatography (DHPLC) method 74. The sequencing-based TILLING method seems to be a better approach for detecting mutations because it does not require a specific mutation-detection platform. It can be used for direct mutation determination without any pre-screening 33, and it would be cost-effective when large numbers of samples are pooled 45, with high sensitivity compared with CEL I/LI-COR 31 and the mutation discovery kit/Fragment Analyzer. It also eliminates a disadvantage of the CEL I enzyme, whose preferential recognition of certain mismatches might decrease sensitivity 75.

Author Contributions
Engin Yol performed the research, analyzed the data and wrote the original manuscript. Merve Başak, Sibel Kizil and Kürşat Karaman conducted the field trials and laboratory studies. Bülent Uzun supervised the research design and reviewed the article. All authors read and approved the article.