Evaluation of Wild Almond Genotypes Grown as a Rain-fed Crop in Sulaimani Governorate using RAPD and ISSR Markers

Almond is considered the most common and essential nut crop grown in rain-fed areas. Many of its species grow wild in the Iraqi Kurdistan region, where local people call them Chaqala. To understand and improve the characteristics of cultivated almonds, and to transfer new traits into domesticated almonds, a comprehensive study of the population structure and genetic diversity of wild almonds is needed. Therefore, twelve genotypes were collected and analyzed using random amplified polymorphic DNA (RAPD) and inter-simple sequence repeat (ISSR) markers. Both marker types revealed polymorphic bands, with a mean of 5.8 for RAPD and 7.6 for ISSR. The polymorphic information content (PIC) ranged from 0.64 to 0.84 for the RAPD primers and from 0.52 to 0.91 for the ISSR primers, which shows the discriminatory power of these markers. For the ten RAPD markers, Jaccard similarity coefficients ranged from 0.34 (G11 vs. G12) to 0.77 (G1 vs. G10), and the genotypes clustered into four groups with a mean similarity of 0.35. For the ten ISSR markers, the coefficients ranged from 0.35 (G8 vs. G10) to 0.79 (G6 vs. G11), and the genotypes again clustered into four groups, with a mean similarity of 0.39. Overall, these findings show diversity among the studied genotypes and among groups, which is highly important for future almond breeding and conservation programs.


Introduction
Almond, a member of the large family Rosaceae, is considered the most common and essential nut crop and can survive in arid and semi-arid conditions (18). In Iraq, several species have been recorded under the subgenera Prunus and Amygdalus. The wild almond (locally called Chaqala) belongs to the subgenus Amygdalus (L.) Fock. (24 species) (29 and 30).
The domesticated almond is cultivated commercially all over the world and is believed to have first grown in the arid mountainous areas of Central Asia (6). Local people use wild almonds for different purposes, for instance direct consumption, livestock grazing, and oil extraction. Environmentally, the plant helps prevent soil erosion (31). Furthermore, these related wild species have been used as germplasm and plant genetic resources for introducing traits into cultivated almonds (3 and 15).
Taxonomic classifications of wild almonds based on morphological characters have revealed no obvious relationships between species (17). Genetic markers, which are widely used to identify varieties and to assess genetic variability among germplasm, can provide detailed and accurate information (17). Accuracy, simplicity, and affordability are considered the main advantages of random amplified polymorphic DNA (RAPD) and inter-simple sequence repeat (ISSR) markers over other methods (9). RAPD markers have further reported advantages, such as a higher frequency of polymorphisms, quick and straightforward results, no need for detailed DNA sequence information, and the requirement for only a small amount of DNA. According to (14), ISSR is one of the most popular and simple PCR-based marker procedures; it amplifies DNA segments between two identical microsatellite repeat regions. ISSR markers have higher reproducibility and repeatability than RAPD markers and, as dominant markers, cost less than AFLP (amplified fragment length polymorphism) markers, while avoiding the complexity of SSR markers (21).
Gardziel et al. (3), Hend et al. (7), and Tahan et al. (27) stated that quantifying genetic diversity and identifying the relationships between cultivated almonds and their associated wild species are crucial for developing more effective methods of identifying and conserving genetic resources in nut breeding programs and for determining gene pools. A successful breeding program therefore depends on identifying, estimating, and understanding genetic diversity. Developing agriculture through plant breeding likewise requires knowledge of genetic diversity and germplasm for conservation, detection, and sustainable use in plants (25). Wild almond species can serve as a valuable source of genetic diversity and can be exploited to improve existing cultivars. Their wide adaptation suggests that they have the potential to yield plants that are resistant to both biotic and abiotic stressors. In addition, high-quality nuts can be obtained by crossing different cultivars with wild types (6). Wild almond species show late blooming, self-fertility, and resistance to abiotic stress. As a result, they can be used as good genetic resources, preserved as valuable germplasm, and used in breeding programs (13). Many spontaneous interspecific hybrids between almond and related species have been reported. Under non-irrigated conditions, several wild almond species, such as Prunus scoparia, are used as rootstocks (10). Additionally, wild species may provide resistance owing to their ability to tolerate difficult conditions. Prunus lycioides is considered a valuable genetic resource because of its significant role in almond breeding programs (6).
In the Kurdistan region of Iraq, no molecular studies have assessed the genetic diversity of wild almonds or the relationships between cultivated and wild almond genotypes. Knowledge of the genetic diversity of wild almonds is crucial for encouraging further research on developing new cultivars with desired traits. In this study, the population structure of wild almond (Chaqala) in Sulaimani is examined for the first time using ISSR and RAPD markers. Our goal is to determine and characterize genetic polymorphism in Chaqala genotypes grown as a rain-fed crop. The knowledge generated can be used for germplasm conservation and for involving wild almond types in breeding programs.

Plant material and location:
Seeds of twelve wild almond genotypes were collected from different locations in Sulaimani governorate and used as experimental material. To break dormancy, all seeds were stratified in a refrigerator for about three months. The seeds were then sown in nursery polybags filled with peat moss. After two months, the leaves were harvested for DNA extraction.
Total DNA Isolation:
Total DNA was extracted from fresh leaves according to the protocol of Ahmed et al. (1), with some modifications. The quantity and quality of the extracted genomic DNA were determined using a NanoDrop spectrophotometer, and the DNA was stored at -20°C.

RAPD Analysis:
Ten RAPD primers were used (Table 1). They were designed based on the information obtained from (26). A kit from Sigma (USA) was used for PCR-RAPD amplification. The 1X reaction was prepared following the guide in Table 2, and the PCR program was set up as shown in Table 3. To check the PCR product, gel electrophoresis was performed using 1.5% agarose. Ethidium bromide (0.25 μg ml⁻¹) was added after the gel had been cooled using tap water. In addition, 2 μl of HyperLadder was added for size calibration, and the gel was run at a constant voltage of 100 V for one hour. After electrophoresis, the gel was visualized on a UV transilluminator, and images were captured with a gel documentation system.

ISSR Marker Analysis
Ten ISSR primers were used (Table 4), all of which were available in our laboratory. The master mix was prepared following the same guide used for the RAPD markers (Table 2), and the same PCR program was used for amplification (Table 3), except that the annealing temperature was changed from 50°C to 36°C.
Scorable bands were coded manually as 1 (present) or 0 (absent). These data were used to construct a dendrogram based on the Jaccard method in XLSTAT 2017 (28). Polymorphism information content (PIC), gene diversity, and major allele frequency (MAF) were calculated using PowerMarker version 3.25 (12). Population structure was analyzed with the model-based software STRUCTURE (version 2.3.4) to infer the genetic structure and the number of subpopulations (19).
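The presence/absence scoring and the Jaccard coefficient described above can be illustrated with a minimal sketch. It is not the XLSTAT procedure itself; the band profiles and the function names (`jaccard_similarity`, `similarity_matrix`) are hypothetical and serve only to show how the coefficient is obtained from 1/0 scores: bands shared by both genotypes divided by bands present in at least one of them.

```python
def jaccard_similarity(a, b):
    """Jaccard coefficient between two 0/1 band profiles of equal length."""
    if len(a) != len(b):
        raise ValueError("profiles must score the same band positions")
    # Bands present in both genotypes
    shared = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    # Bands present in at least one genotype (joint absences are ignored)
    either = sum(1 for x, y in zip(a, b) if x == 1 or y == 1)
    return shared / either if either else 0.0


def similarity_matrix(profiles):
    """Pairwise Jaccard coefficients for a dict of genotype -> 0/1 profile."""
    names = list(profiles)
    return {(p, q): jaccard_similarity(profiles[p], profiles[q])
            for p in names for q in names}


# Hypothetical band scores (1 = band present, 0 = band absent), not study data
bands = {
    "G1": [1, 1, 0, 1, 0, 1],
    "G2": [1, 0, 0, 1, 1, 1],
    "G3": [0, 1, 1, 0, 0, 1],
}
sim = similarity_matrix(bands)
print(round(sim[("G1", "G2")], 2))  # 3 shared bands out of 5 present in either
```

Because joint absences are excluded from the denominator, two genotypes are not counted as similar merely because both lack a band, which suits dominant markers such as RAPD and ISSR.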

Polymorphism Parameters of RAPD and ISSR Markers
Molecular markers were used to analyze the genetic diversity among the collected accessions. These markers are highly informative for detecting genetic variation among and within populations (8 and 19). In addition, because they are highly variable and require less time and money than other marker systems, molecular markers are widely used in population genetic studies (5 and 16). In the current study, all RAPD and ISSR primers gave good, scorable amplification products with high polymorphism among the twelve wild almond accessions analyzed. There is a positive relationship between allelic and genetic diversity, which can be exploited with molecular markers to identify appropriate populations for breeding and conservation. The number of markers used in this field has increased considerably, which can improve the quality of the data obtained. Ten RAPD and ten ISSR primers, chosen for reproducibility and high polymorphism, were used to amplify fragments (Tables 1 and 4). The diversity parameters recorded for both marker types were the number of polymorphic bands, gene diversity, major allele frequency, and polymorphic information content (Tables 5 and 6; Figures 1 and 2). The number of polymorphic bands ranged from 4 to 7 for the RAPD markers and from 3 to 11 for the ISSR markers, with mean values of 5.7 for RAPD and 7.4 for ISSR. This is consistent with the report that ISSR is more informative than RAPD (23) and that ISSR is highly powerful for studying closely related species (32). These variations may be due to biological processes such as hybridization, gene flow, sexual propagation, and artificial selection (24). To evaluate marker efficiency, the PIC values were analyzed. The efficiency of a molecular marker system in discriminating genotypes depends largely on the variation it can detect (4).
A high PIC value is a significant indicator of marker informativeness. The PIC values ranged from 0.6437 to 0.8468 for the RAPD primers and from 0.5295 to 0.9103 for the ISSR primers, which agrees with the report of (20). For RAPD, MAF ranged from 0.25 (RAPD12) to 0.41 (RAPD13, RAPD14, RAPD15, and RAPD17), with an average of 0.34 per marker. For ISSR, MAF ranged from 0.08 (ISSR 11) to 0.58 (UBC-888), with an average of 0.27 per marker. Gene diversity ranged from 0.69 to 0.86, with an average of 0.78, for the RAPD markers and from 0.58 to 0.91, with an average of 0.81, for the ISSR markers (Tables 5 and 6).
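To make the three diversity parameters concrete, the sketch below computes them from a set of allele frequencies at one marker, assuming the commonly used definitions: MAF is the largest allele frequency, gene diversity is 1 − Σpᵢ², and PIC is gene diversity minus Σ over allele pairs of 2pᵢ²pⱼ². This is an illustration under those assumptions, not a reproduction of the PowerMarker run; the function name `marker_stats` and the example frequencies are hypothetical.

```python
from itertools import combinations


def marker_stats(freqs):
    """Return (MAF, gene diversity, PIC) for allele frequencies at one marker."""
    if abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    maf = max(freqs)                          # major allele frequency
    gene_diversity = 1.0 - sum(p * p for p in freqs)
    # PIC subtracts the share of matings that are uninformative
    pic = gene_diversity - sum(2 * p * p * q * q
                               for p, q in combinations(freqs, 2))
    return maf, gene_diversity, pic


# Hypothetical marker with four equally frequent alleles (not study data)
maf, h, pic = marker_stats([0.25, 0.25, 0.25, 0.25])
print(maf, h, pic)
```

Under these definitions PIC never exceeds gene diversity, and both rise as alleles become more numerous and more evenly distributed, which is why a low MAF tends to accompany high PIC values.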

Clustering and Population Structure Analysis of Wild Almond Accessions
Multivariate statistical methods play a crucial role in understanding the results of genetic diversity studies. One such method is cluster analysis, which partitions individuals into groups based on the distances between them. Cluster analysis was used to interpret the relationships among the wild almond accessions, based on Jaccard similarity coefficients and the unweighted pair group method with arithmetic mean (UPGMA) (11). With the ten RAPD markers, Jaccard similarity coefficients ranged from 0.34 (G11 vs. G12) to 0.77 (G1 vs. G10), and the twelve genotypes clustered into four groups (A, B, C, and D) with a mean similarity of 0.35. Cluster A contains G1 and G2; cluster B includes G10, G11, and G12; genotypes G3 and G9 fall in cluster C; and the remaining genotypes fall in cluster D (Figure 3).
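The UPGMA procedure mentioned above can be sketched in a few lines. The version below is a minimal, unoptimized illustration that assumes distances are taken as 1 minus the Jaccard similarity; the function name `upgma` and the four-genotype distance table are hypothetical and do not reproduce the dendrograms in Figures 3 and 4.

```python
from itertools import combinations


def upgma(names, dist):
    """Average-linkage clustering.

    names: list of labels; dist: dict mapping frozenset({a, b}) -> distance.
    Returns the merges in order as (cluster_a, cluster_b, merge_distance).
    """
    sizes = {name: 1 for name in names}   # current clusters and their sizes
    d = dict(dist)                        # working copy of the distance table
    merges = []
    while len(sizes) > 1:
        # Pick the closest pair of current clusters
        pair = min((frozenset(p) for p in combinations(sizes, 2)),
                   key=lambda p: d[p])
        a, b = sorted(pair)
        height = d[pair]
        na, nb = sizes.pop(a), sizes.pop(b)
        merged = f"({a},{b})"
        # Size-weighted average keeps every original genotype equally weighted
        for c in sizes:
            d[frozenset((merged, c))] = (
                na * d[frozenset((a, c))] + nb * d[frozenset((b, c))]
            ) / (na + nb)
        sizes[merged] = na + nb
        merges.append((a, b, height))
    return merges


# Hypothetical distances (1 - Jaccard similarity), not study data
pairs = {("G1", "G2"): 0.2, ("G3", "G4"): 0.3, ("G1", "G3"): 0.6,
         ("G1", "G4"): 0.7, ("G2", "G3"): 0.8, ("G2", "G4"): 0.9}
dist = {frozenset(k): v for k, v in pairs.items()}
merges = upgma(["G1", "G2", "G3", "G4"], dist)
for a, b, height in merges:
    print(a, b, round(height, 2))
```

Each merge height corresponds to a branching point of the dendrogram, so cutting the tree at a chosen height yields the groups.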
With the ten ISSR markers, Jaccard similarity coefficients ranged from 0.35 (G8 vs. G10) to 0.79 (G6 vs. G11), and the genotypes clustered into four groups (A, B, C, and D) with a mean similarity of 0.39. Cluster A comprises G6; cluster B comprises G2, G3, G4, G5, G7, G8, G9, and G10; cluster C includes G11; and cluster D contains the remaining genotypes, G1 and G12 (Figure 4). The results of both the ISSR and RAPD markers indicate clear variation within the wild almond population and among individuals, which makes these accessions valuable for introduction into new and active breeding programs for cultivated almonds in our region.

Identification of Genetic Structure Using RAPD and ISSR Markers for All Genotypes:
Allele frequencies from the structure analysis were used to determine the population structure of the twelve almond genotypes, following (2). For the ten RAPD markers, the search for the most likely number of clusters (K) showed that the peak started at 2 and that the highest value occurred at K = 2 (Figure 5). Based on ΔK, the genotypes were divided into two groups (sub-populations), shown in color as group one (red) and group two (green) (Figure 6). In addition, several genotypes (2, 4, 6, and 10) showed more than one genetic background, whereas the rest showed a single background, a pattern that may reflect a complicated historical background. For the ten ISSR markers, the search for the most likely number of clusters (K) showed that the peak started at 2 and that the highest value occurred at K = 8 (Figure 7).
For ISSR, ΔK indicated that the genotypes were split into eight subpopulations, shown in color as group 1 (red), group 2 (green), group 3 (blue), group 4 (yellow), group 5 (pink), group 6 (cyan), group 7 (brown), and group 8 (purple) (Figure 8). In addition, several genotypes (1, 4, 5, 6, 7, 8, and 10) showed more than one genetic background, whereas the rest showed a single background. This pattern perhaps reflects ancient, complicated relationships and crossing between different wild types as a result of gene flow between genotypes.

Figure 7. Using ΔK to identify the ideal value of K based on ISSR data.
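As an illustration of how a ΔK curve of this kind is typically obtained, the sketch below assumes the widely used formulation in which ΔK is the absolute second-order change in the mean log-likelihood across successive K values, divided by the standard deviation of the log-likelihood among replicate runs at K. Whether the study used exactly this formulation is an assumption; the function name `delta_k` and the log-likelihood values are hypothetical.

```python
from statistics import mean, stdev


def delta_k(lnp):
    """lnp: dict K -> list of ln P(D) values from replicate runs at that K.

    Returns dict K -> delta K for every K that has both K-1 and K+1 available.
    """
    result = {}
    for k in sorted(lnp):
        if k - 1 not in lnp or k + 1 not in lnp:
            continue  # delta K is undefined at the ends of the tested range
        # Absolute second-order difference of the mean log-likelihoods
        second = abs(mean(lnp[k + 1]) - 2 * mean(lnp[k]) + mean(lnp[k - 1]))
        sd = stdev(lnp[k])
        result[k] = second / sd if sd > 0 else float("inf")
    return result


# Hypothetical replicate log-likelihoods (three runs per K), not study data
runs = {
    1: [-500, -502, -498],
    2: [-400, -401, -399],
    3: [-390, -392, -388],
    4: [-385, -389, -381],
}
dk = delta_k(runs)
best_k = max(dk, key=dk.get)
print(dk, best_k)
```

Because ΔK needs the likelihoods at both K − 1 and K + 1, it cannot be evaluated at the smallest or largest K tested, which is why such curves begin at K = 2.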

Figure 8. Clustering of the twelve wild almond genotypes into subpopulations based on ISSR data using STRUCTURE analysis. Genotypes are arranged according to their estimated membership coefficients (q) at K = 8. Numbers (1-12) indicate almond genotypes.
The presence of admixed genotypes was strongly associated with the clustering of genetic diversity. This admixture of ancestry is thought to result from past exchange and hybridization of plant germplasm. Genetic drift may also have a significant impact on populations of this species, and the genetic variation observed here can be attributed partly to gene flow.

Conclusion
The genetic diversity of twelve wild almond accessions was evaluated using RAPD and ISSR markers. This can be considered a first attempt to characterize this plant at the genotypic level with molecular markers. The results obtained can be analyzed further to determine the genetic differences among all genotypes. The differences between genotypes may be caused by the interaction between genes and the environment, which can be interpreted as a mechanism by which the plants adapt to their environment in the wild. These results can therefore be applied in future breeding and conservation programs. We highly recommend including quantitative trait loci (QTL) and genome-wide association analysis in future research to improve the quantity and quality of plant genotypes. In addition, more studies can be conducted in different locations, the number of accessions can be increased, and other markers such as SRAP, SNP, SCoT, AFLP, and SSR can be used.