Anthropological Science
Online ISSN : 1348-8570
Print ISSN : 0918-7960
ISSN-L : 0918-7960
Original Articles
Distinctive genetic signatures of Alu/STR compound systems revealed by analyses of Mediterranean and Middle East populations
Raoudha Bahri, Esther Esteban, Abir Ben Halima, Pedro Moral, Hassen Chaabani

2014 Volume 122 Issue 2 Pages 81-88

Abstract

The Middle East and the Mediterranean represent one of the most ancient and largest areas of human civilization. Although several genetic studies have been carried out on certain regions of this area, broader studies covering larger numbers of regions would be informative. In this paper we aim to expand previous genetic studies on populations from this area by investigating new populations and providing a global view based on the distribution of CD4 and FXIIIB Alu/STR (short tandem-repeat) compound systems. Haplotype frequencies of these two systems were determined in 352 DNA samples from Libya, Bahrain, and southern Iran. Comparative analyses and MDS plot representation show an evident genetic differentiation among the three population groups studied, i.e. North and South Mediterranean and Middle East. In addition, they confirm the genetic richness of Libya and its differentiation from other North African populations. Three haplotypes, CD4 90(+), FXIIIB 180(−) and CD4 110(−), were determined to be specific to Middle Eastern populations. The distribution of the two Alu/STR system haplotypes suggests that population movements between the North Mediterranean and Middle East are relatively less important than those involving the South Mediterranean with both the North Mediterranean and Middle East. However, the impact of these population movements is minimal compared to the long-standing settlement of the three population groups that have retained their genetic identity. The anthropological data obtained with the CD4 and FXIIIB Alu/STR compound systems reflect advantages peculiar to these two systems. The determination of their haplotypes in more populations from the Middle East (particularly from the Arabian Peninsula) and from East Asia will provide more details on human evolutionary history.

Introduction

The Middle East, which represents the heart of the ancient world, has had a complex population history involving remarkable relationships with the Mediterranean region. Around 10000 BC the Neolithic culture first developed in the area known as the Fertile Crescent (the area of land arching from the Persian Gulf over the watersheds of the Tigris and Euphrates rivers in Iraq, and through to the eastern coast of the Mediterranean into Egypt). The Mediterranean Sea has represented the central crossroads of contacts, trade, and cultural exchange among the peoples of the Fertile Crescent and those of the Mediterranean, as witnessed by the succession of well-known empires that have existed in the region: Egyptian (3000–1000 BC); Assyrian and Babylonian (1000–500 BC); Persian (550–330 BC); Greek (330–60 BC); and Roman, particularly during its greatest period of expansion (60 BC–140 AD). In modern times, the Ottoman Empire (1300–1923 AD) became the largest political entity in South Europe, the Middle East and North Africa (for review, see Abulafia, 2011).

Several population genetic studies on the Middle East and the Mediterranean region have been carried out using classic and molecular markers. Among the latter, both Alu and short tandem-repeat (STR) polymorphisms, used separately, have given useful information (e.g. Terreros et al., 2009; Bahri et al., 2012, 2013; Triki-Fendri et al., 2013). However, the combined use of haplotypic variation consisting of both a fast-evolving STR and a slow-evolving Alu marker (Alu/STR compound system) has provided more anthropological details. The first Alu/STR compound system was determined at the CD4 locus on chromosome 12 and was used by Tishkoff et al. (1996), then two others, FXIIIB Alu/STR on chromosome 1 and DM Alu/STR on chromosome 19, were determined and studied particularly in Mediterranean human populations (González-Pérez et al., 2010; El Moncer et al., 2010).

In this paper we describe for the first time the genetic variation of CD4 and FXIIIB Alu/STR compound systems in a sample from Libya and two samples from the Middle East (Bahrain and southern Iran). Thus, our first aim is to enlarge the number of populations in which these two Alu/STR compound systems have been studied. By studying Libya, we extend the North African genetic background observed in the Moroccan, Algerian and Tunisian populations to a more easterly region. The study of Bahrain and southern Iran allows us to include the first Middle Eastern populations in a comprehensive analysis of the whole Mediterranean and Middle East based on these data. Our second objective is to check for new population-specific Alu/STR haplotypes and to analyze their distribution among the other haplotypes. We then assess the potential of the CD4 and FXIIIB Alu/STR compound systems through comparison with a set of nine individual Alu polymorphisms.

Materials and Methods

A total of 352 DNA samples from three populations were analyzed in this study. All three samples are from healthy individuals of both sexes. Two population samples, already described in Bahri et al. (2013), are composed of 97 unrelated natives of Bahrain and 65 unrelated natives of southern Iran. The third sample is composed of 190 Libyans who were selected on the basis that they were unrelated and natives of Libya for at least three generations. The Libyan sample, representative of the general Libyan population, is mainly composed of individuals from the cities of Benghazi, Tripoli, and Sabha (see geographic localizations in Figure 1). All participants gave written informed consent in accordance with the Ethical Committee of Monastir University and the Tunisia Association of Anthropology.

Figure 1

Geographical location of regions from which the Libyan sample was collected.

Genomic DNA was extracted from blood by standard phenol–chloroform techniques. Alu genotyping was carried out by polymerase chain reaction (PCR) followed by electrophoresis separation on 2% agarose gels. Two STRs, a pentanucleotide repeat from the CD4 locus, and a tetranucleotide repeat from the FXIIIB gene, were determined by PCR amplification with fluorescently labeled primers. PCR products were electrophoresed on an ABI PRISM 3700 DNA sequencer (Applied Biosystems, Foster City, CA, USA). GeneScan and GeneMapper 3.0 programs (ABI PRISM; Applied Biosystems) were used to genotype individuals. Technical details of the PCR and electrophoresis are extensively explained in our previous works (González-Pérez et al., 2007, 2010) for both Alu and Alu/STR combinations.

Allele frequencies were calculated by direct counting, and Hardy–Weinberg equilibrium was checked through an exact test (Guo and Thompson, 1992). Heterozygosity by population and by locus was estimated according to Nei’s formula (Nei, 1987). The nomenclature of Alu/STR combinations consists of a number indicating the size in base pairs of the corresponding STR allele, followed by a symbol + (presence of the Alu element) or – (absence of Alu). Maximum likelihood haplotype frequencies were computed using the EM algorithm. The geographical structure of the allele frequency variance was tested by hierarchical analyses of molecular variance using Wright’s F-statistics from populations clustered according to geographical criteria. These calculations were performed using the GenePop 3.3 (GenePop, Montpellier, France) and Arlequin packages (Arlequin, Berne, Switzerland) (Raymond and Rousset, 1995; Excoffier et al., 2005). Population genetic relationships for the Alu and Alu/STR data were also assessed by pairwise FST-related genetic distances (Reynolds et al., 1983), and represented by a multidimensional scaling (MDS) plot from the distance matrix with STATISTICA software, version 6 (StatSoft, Inc., 2001). The MDS plot provides a visual representation of the FST-related genetic distance matrix in a two-dimensional pattern (x and y axes).
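As an illustration of the EM step mentioned above, the following Python sketch estimates haplotype frequencies for a single Alu/STR system from unphased genotypes. It is a minimal sketch under stated assumptions and not the Arlequin implementation: the function name em_haplotype_frequencies, the genotype encoding, and the toy data are hypothetical. The key point it shows is that phase is ambiguous only for individuals heterozygous at both the STR and the Alu site, and that the E step splits each such individual between its two possible haplotype pairs in proportion to the current frequency estimates.

```python
def em_haplotype_frequencies(genotypes, max_iter=200, tol=1e-9):
    """Estimate Alu/STR haplotype frequencies from unphased genotypes by EM.

    genotypes: list of ((str_a, str_b), (alu_a, alu_b)); STR alleles are sizes
    in base pairs, Alu alleles are '+' or '-'. This encoding is hypothetical.
    """
    # List every haplotype pair compatible with each individual's genotype.
    # Only double heterozygotes yield two distinct pairs.
    compatible = []
    for (s1, s2), (a1, a2) in genotypes:
        pairs = {
            tuple(sorted([(s1, a1), (s2, a2)])),
            tuple(sorted([(s1, a2), (s2, a1)])),
        }
        compatible.append(sorted(pairs))

    haplotypes = sorted({h for pairs in compatible for pair in pairs for h in pair})
    freq = {h: 1.0 / len(haplotypes) for h in haplotypes}  # uniform start
    n_chromosomes = 2 * len(genotypes)

    for _ in range(max_iter):
        expected = dict.fromkeys(haplotypes, 0.0)
        for pairs in compatible:
            # E step: weight each compatible pair by the product of the
            # current frequencies of its two haplotypes.
            weights = [freq[h1] * freq[h2] for h1, h2 in pairs]
            total = sum(weights)
            for (h1, h2), w in zip(pairs, weights):
                expected[h1] += w / total
                expected[h2] += w / total
        # M step: expected haplotype counts divided by chromosomes sampled.
        updated = {h: expected[h] / n_chromosomes for h in haplotypes}
        change = max(abs(updated[h] - freq[h]) for h in haplotypes)
        freq = updated
        if change < tol:
            break
    return freq


# Toy data (not from the study): three 85(+) homozygotes, three 90(-)
# homozygotes, and two double heterozygotes of unknown phase.
toy_genotypes = (
    [((85, 85), ("+", "+"))] * 3
    + [((90, 90), ("-", "-"))] * 3
    + [((85, 90), ("+", "-"))] * 2
)
toy_freq = em_haplotype_frequencies(toy_genotypes)
for hap, f in sorted(toy_freq.items()):
    print(f"{hap[0]}({hap[1]}): {f:.4f}")
```

In this toy case the double heterozygotes are resolved almost entirely as 85(+)/90(−), because those two haplotypes are already well supported by the homozygous individuals.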

In order to compare the CD4 and FXIIIB Alu/STR haplotype frequencies, 20 Mediterranean samples previously tested for the same genetic markers were considered (González-Pérez et al., 2007, 2010; El Moncer et al., 2010). The geographical location of the Mediterranean populations used in this study is indicated in Figure 2. These samples included eight continental northern Mediterranean groups spanning from the Iberian Peninsula to Turkey; five samples from the western Mediterranean islands of Majorca, Corsica (center and west center) and Sicily (west and east); and, finally, a set of seven North African populations composed of four Moroccan Berber group samples (from the north-east and middle Atlas, and two groups from the high Atlas), an Algerian Berber sample, and two samples from Tunisia (south and north-center). The published frequencies obtained in these 20 Mediterranean samples are presented as supplemental data (see Appendix). To carry out an MDS based on nine Alu insertions in the same populations, we used previously published data (references are indicated in the legend of Figure 3B).

Figure 2

Geographical location of population samples studied in the present paper and those used for comparison. 1, Bahrain; 2, southern Iran; 3, Libya (present study); 4, Tunisia (north-center); 5, Tunisia (south); 6, Algeria (Mozabite Berbers); 7, Morocco (north-east Atlas Berbers); 8, Morocco (middle Atlas Berbers); 9, Morocco (high Atlas Berbers from Asni); 10, Morocco (high Atlas Berbers from Amizmiz); 11, northern Spain; 12, Pas Valley; 13, Basque Country; 14, central Spain; 15, southern Spain; 16, Majorca; 17, southern France; 18, Corsica (center); 19, Corsica (west); 20, Sicily (west); 21, Sicily (east); 22, Greece; 23, Turkey.

Figure 3

Multidimensional scaling plot of FST-related genetic distance matrix in 23 populations: (A) based on data from CD4 and FXIIIB Alu/STR haplotypes (stress of 0.086); (B) based on data from nine Alu insertion polymorphisms (stress of 0.122). Open triangles, southern Europeans; open circles, North Africans; open stars, Middle Eastern populations. Sources: (A) Libya, Bahrain and southern Iran (present study); Tunisia (El Moncer et al., 2010); all other populations (González-Pérez et al., 2007, 2010). Corresponding frequencies are indicated in Table 1 and in the Appendix. (B) Bahrain and southern Iran (Bahri et al., 2013); Libya (Ben Halima et al., 2014); Tunisia (Bahri et al., 2008; El Moncer et al., 2010); Algeria (Mozabite Berbers), Morocco (north-east Atlas Berbers), Morocco (middle Atlas Berbers), Morocco (high Atlas Berbers from Asni), Morocco (high Atlas Berbers from Amizmiz), northern Spain, Pas Valley, Basque Country, central Spain, southern Spain, southern France, Greece, and Turkey (González-Pérez et al., 2010); Majorca, Corsica (center), Corsica (west), Sicily (west) and Sicily (east) (González-Pérez et al., 2007).

Results

Alu/STR genetic variability in the samples from Libya, Bahrain and southern Iran

Alu/STR haplotype frequencies for the CD4 and FXIIIB loci are shown in Table 1. Haplotype diversity values in Bahrain for CD4 (0.853) and FXIIIB (0.855) are slightly higher than those in Libya (0.842 and 0.847, respectively), whereas southern Iran shows the lowest diversity values (0.755 and 0.815, respectively). Considering only haplotypes with frequencies >1%, Libya and Bahrain have about twice as many different CD4 haplotypes (11 and 10, respectively) as southern Iran (five haplotypes). For the FXIIIB haplotypes, the three samples show a similar number of different haplotypes (nine in Libya, and eight in Bahrain and southern Iran).

Table 1 CD4 and FXIIIB Alu/STR haplotype frequencies in Libya (n = 190), Bahrain (n = 97) and southern Iran (n = 65). Haplotypes are represented as STR alleles expressed in base pairs together with the presence (+) or absence (−) of the Alu insertion.
CD4                                        FXIIIB
Haplotype       Libya   Bahrain  Iran      Haplotype       Libya   Bahrain  Iran
85 (+)          0.2895  0.1830   0.2500    172 (+)         0.0669  0.0372   0.0711
90 (+)          0.0824  0.1890   0.3530    176 (+)         0.0111  0.0000   0.0000
95 (+)          0.0343  0.0000   0.0000    180 (+)         0.0738  0.0830   0.0704
100 (+)         0.0327  0.0000   0.0000    184 (+)         0.0000  0.1030   0.0181
105 (+)         0.0000  0.0106   0.0000    188 (+)         0.0500  0.1150   0.0960
110 (+)         0.1019  0.1680   0.1100    172 (−)         0.1462  0.0920   0.1150
115 (+)         0.0659  0.0133   0.0000    176 (−)         0.0392  0.0000   0.0000
120 (+)         0.0136  0.0213   0.0000    180 (−)         0.1510  0.2270   0.3130
85 (−)          0.1283  0.0939   0.0000    184 (−)         0.1942  0.1150   0.0748
90 (−)          0.1812  0.1940   0.2350    188 (−)         0.2212  0.2210   0.2410
95 (−)          0.0135  0.0000   0.0000
110 (−)         0.0247  0.0870   0.0502
115 (−)         0.0000  0.0398   0.0000
Gene diversity  0.842   0.853    0.755     Gene diversity  0.847   0.855    0.815
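
The gene diversity row of Table 1 can be related to the haplotype frequencies through Nei's formula, H = n/(n − 1) × (1 − Σ p_i²), where p_i are the haplotype frequencies and n is the number of chromosomes sampled. The Python sketch below applies it to the Libyan CD4 column as a worked example. The function name gene_diversity is hypothetical, and taking n as twice the number of individuals is an assumption. Because the listed frequencies sum to slightly less than 1, the result should only be expected to approximate the tabulated value.

```python
def gene_diversity(freqs, n_chromosomes=None):
    """Nei's gene diversity from haplotype frequencies.

    Without n_chromosomes, returns 1 - sum(p_i^2); with it, applies the
    n / (n - 1) small-sample correction.
    """
    h = 1.0 - sum(p * p for p in freqs)
    if n_chromosomes is not None:
        h *= n_chromosomes / (n_chromosomes - 1)
    return h


# CD4 haplotype frequencies for Libya, copied from Table 1 (zero entries omitted).
libya_cd4 = [0.2895, 0.0824, 0.0343, 0.0327, 0.1019, 0.0659, 0.0136,
             0.1283, 0.1812, 0.0135, 0.0247]

uncorrected = gene_diversity(libya_cd4)
# Assumption: 190 diploid individuals, hence 380 chromosomes.
corrected = gene_diversity(libya_cd4, n_chromosomes=2 * 190)
print(f"uncorrected: {uncorrected:.3f}, corrected: {corrected:.3f}")
```

With a sample of this size the correction factor is close to 1, so both versions fall near the tabulated Libyan CD4 value.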

Haplotype frequency comparisons yield no significant differences between Bahrain and southern Iran (P = 0.0801 for FXIIIB; P = 0.0501 for CD4). However, when we compared these samples with Libya we observed significant differences for both compound systems (FXIIIB: P < 0.001 for Bahrain, and P = 0.0025 for southern Iran; CD4: P = 0.0017 and P < 0.001, respectively). When we extended the comparison of Libya to other North African groups, we found significant differences (in all cases with P < 0.001) with the total samples of Morocco, Algeria and Tunisia.

Population relationships among Mediterranean and Middle Eastern populations

Comparative analyses of haplotype frequencies have been conducted grouping samples into North Mediterranean, South Mediterranean (North Africa) and Middle East. We observed significant differences among the three groups: Middle East vs. both North (P < 0.0001) and South (P < 0.0001) Mediterranean; North vs. South Mediterranean (P = 0.008).

The MDS representation of the FST-related genetic distances calculated through the two Alu/STR compound systems (Figure 3A) underlines the genetic differentiation of these three groups. Genetic distance averages within groups give a similar value for North (0.0143 ± 0.0077) and South (0.0154 ± 0.0099) Mediterranean populations, whereas the genetic distance within the Middle East group, namely between Bahrain and southern Iran, is slightly higher (0.0205). Comparisons among groups reveal that North and South Mediterranean populations are the closest groups (0.0378 ± 0.0122), whereas Middle East samples show a higher genetic distance to North (0.0778 ± 0.0234) than to South (0.0569 ± 0.0239) Mediterranean populations. Libya appears in an extreme position among the North African populations and shows a genetic distance with Bahrain (0.0207) similar to the lowest one obtained with other North African samples (0.0222, south Tunisia).
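
To make the link between a distance matrix and an MDS plot concrete, the following Python sketch embeds the three between-group average distances quoted above in two dimensions. Several assumptions apply: the source does not state which MDS variant STATISTICA used, so classical (Torgerson) MDS is swapped in here for illustration; the function name classical_mds is hypothetical; and the input is the 3 × 3 matrix of group averages, not the 23-population matrix behind Figure 3A.

```python
import numpy as np


def classical_mds(dist, n_components=2):
    """Classical (Torgerson) MDS: coordinates from a symmetric distance matrix."""
    d2 = np.asarray(dist, dtype=float) ** 2
    n = d2.shape[0]
    # Double-center the squared distances to obtain a Gram matrix.
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ d2 @ centering
    eigvals, eigvecs = np.linalg.eigh(gram)
    # Keep the largest eigenvalues; clip tiny negatives from rounding.
    order = np.argsort(eigvals)[::-1][:n_components]
    top_vals = np.clip(eigvals[order], 0.0, None)
    return eigvecs[:, order] * np.sqrt(top_vals)


groups = ["North Mediterranean", "South Mediterranean", "Middle East"]
# Average between-group distances as reported in the text.
group_dist = np.array([
    [0.0,    0.0378, 0.0778],
    [0.0378, 0.0,    0.0569],
    [0.0778, 0.0569, 0.0],
])

coords = classical_mds(group_dist)
# Distances recomputed from the 2-D coordinates, for comparison with the input.
recovered = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
for name, (x, y) in zip(groups, coords):
    print(f"{name}: x = {x:+.4f}, y = {y:+.4f}")
```

Because three distances that satisfy the triangle inequality always fit exactly in a plane, the recovered distances match the input here; with 23 populations the fit is only approximate, which is what the stress values in the Figure 3 legend quantify.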

The analyses of molecular variance emphasize the genetic distinctiveness of Libya. The non-hierarchical FST value within North Africa (FST = 0.0100, P < 0.001) dramatically decreases to 0.002 (P = 0.079) when Libya is excluded from the calculations. In the comparison between North and South Mediterranean populations, the genetic variance among groups (FCT = 0.021, P < 0.001) is considerably higher than that found among populations within groups (FSC = 0.007, P < 0.001), underlining the genetic differentiation of both Mediterranean shores in terms of these two compound systems. From our analysis of the genetic variance value for each compound system, we noted that the values of genetic variance among groups and among populations within groups determined for the FXIIIB locus (FCT = 0.029, P < 0.001 and FSC = 0.008, P < 0.001) were higher than those determined for the CD4 locus (FCT = 0.014, P < 0.001 and FSC = 0.006, P < 0.001).
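
In a two-level hierarchy the fixation indices are linked by the standard identity (1 − FST) = (1 − FSC)(1 − FCT). The overall FST for the North vs. South Mediterranean comparison is not reported above; the short Python sketch below derives the value implied by the reported FCT and FSC. It is an illustrative calculation under that identity, not a result of the study, and the function name overall_fst is hypothetical.

```python
def overall_fst(fct, fsc):
    """Overall FST implied by (1 - FST) = (1 - FSC) * (1 - FCT)."""
    return 1.0 - (1.0 - fsc) * (1.0 - fct)


# Reported values for the North vs. South Mediterranean comparison.
both_systems = overall_fst(fct=0.021, fsc=0.007)
cd4_only = overall_fst(fct=0.014, fsc=0.006)
fxiiib_only = overall_fst(fct=0.029, fsc=0.008)

for label, value in [("both systems", both_systems),
                     ("CD4", cd4_only),
                     ("FXIIIB", fxiiib_only)]:
    print(f"{label}: implied FST = {value:.4f}")
```

Under this identity the among-group component accounts for most of the implied overall differentiation in each case, in line with FCT exceeding FSC.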

The high number of different haplotypes found in our samples (13 for the CD4 locus and 10 for the FXIIIB locus, see Table 1) and in previously studied samples (e.g. González-Pérez et al., 2007, 2010; El Moncer et al., 2010) is the source of the great ability of the two systems to discriminate microdifferentiation processes. In Table 2 we list the frequency distribution of four haplotypes selected because of their specificity to one of the three population groups for which we have available data. Two haplotypes, CD4 90(+) and FXIIIB 180(−), were found at higher frequencies in our two Middle Eastern samples, whereas the two other haplotypes cited in Table 2 were already described (Flores et al., 2000; González-Pérez et al., 2007, 2010; El Moncer et al., 2010) as relatively more frequent in Berbers (CD4 110(−)) or only frequent in Sub-Saharan Africans (CD4 100(+)).

Table 2 Frequency distribution of four Alu/STR haplotypes from the CD4 and FXIIIB loci selected by their differential distribution among Mediterranean, Middle East and Sub-Saharan African groups.
                    Mediterranean
Alu/STR haplotype   North Africa   South Europe   Middle East (West Asia)   Sub-Saharan Africa
                    (n = 1019)     (n = 1304)     (n = 162)                 (n = 235)
CD4 90(+)           0.000–0.125    0.000–0.059    0.189–0.353               0.000–0.083
FXIIIB 180(−)       0.004–0.151    0.000–0.177    0.227–0.313               0.007
CD4 110(−)          0.000–0.069    0.000–0.040    0.051–0.087               0.000
CD4 100(+)          0.007–0.038    0.000–0.023    0.000                     0.111–0.180
References. North Africa: present study; El Moncer et al., 2010; González-Pérez et al., 2010. South Europe: González-Pérez et al., 2007, 2010. Middle East: present study. Sub-Saharan Africa: Flores et al., 2000; González-Pérez et al., 2010.

Haplotype CD4 110(−) was found in our three samples, with a relatively high frequency in Bahrain (0.087) as compared with southern Iran (0.050) and Libya (0.0247). This combination is absent in Sub-Saharan African and European samples except in some Mediterranean groups from Spain, Sardinia and Sicily. Concerning our three samples, the CD4 100(+) haplotype, common in Sub-Saharan African groups, is found only at low frequency in Libya.

Population relationships were also assessed through the information provided by nine Alu insertions (ACE, APO-A-I, PV92, B65, D1, A25, TPA25, FXIIIB and CD4) in the same populations (Figure 3B). The frequency of the CD4 Alu insertion in Bahrain and southern Iran populations was determined in the present study (0.620 and 0.728, respectively). The MDS representation is, on the whole, similar to that obtained using the two Alu/STR compound systems (Figure 3A). However, (i) in the MDS based on nine Alu insertions, the genetic distances between Middle Eastern and North (0.043 ± 0.016) and South Mediterranean (0.032 ± 0.015) populations have lower values; and (ii) within the North African group, populations are plotted more in accordance with geography in the MDS based on Alu/STR compound systems. In addition, the study of each Alu/STR compound system could give information on population movements from the distribution of its population-specific haplotypes that cannot be provided by individual markers. All these findings show the higher anthropological potential of using two Alu/STR compound systems compared with that of nine individual markers.

Discussion

This paper contributes to the study of the evolutionary relationships within and between human populations of the Mediterranean and the Middle East using CD4 and FXIIIB Alu/STR compound systems. We have expanded previous studies conducted in Mediterranean populations (González-Pérez et al., 2007, 2010; El Moncer et al., 2010) by describing a new North African sample from Libya. In addition, we present for the first time new data on Bahrain and southern Iran, providing a larger global view on the genetic differentiation of populations ranging from the Mediterranean region to the Middle East.

The first question is focused on the genetic distinctiveness of North African populations. As can be seen both in the MDS plot (Figure 3A) and in the genetic variance analyses, North African samples constitute a relatively homogeneous group when we consider only Moroccan, Algerian and Tunisian samples (FST = 0.2%). However, when Libya is added the genetic variance of this group increases to 1% (P < 0.001). In fact, this sample is halfway between south Tunisia (genetic distance 0.0222) and Bahrain (genetic distance 0.0207). Libya is genetically diverse in both CD4 and FXIIIB compound systems: gene diversity values of CD4 (0.842) and FXIIIB (0.847) are higher than the averaged values observed in North Africans (0.811 and 0.829, respectively). To illustrate this, the CD4 100(+), a common Sub-Saharan African haplotype, has been found in the Libyan sample with a frequency of 0.0327. This genetic richness of the Libyan population, also stated in previous work (Ben Halima et al., 2013; Cherni et al., 2011; Fadhlaoui-Zid et al., 2011), could be due partly to the fact that Libya extends further south (in continuity with the south of Tunisia), nearer to central Sub-Saharan countries, while at the same time having a long Mediterranean coastline to the north.

Regarding the genetic relationships among Middle East and North African samples, if we exclude the Libyan sample, we do not observe any particular genetic closeness among these groups. We have found significant differences (P < 0.0001) between Middle Eastern and South Mediterranean groups. This does not necessarily indicate a lack of genetic closeness between North African populations and those from the Arabian Peninsula, because Bahrain natives, according to our previous study, are mainly a mixture of Arabs and Iranians (Bahri et al., 2013), and therefore cannot be considered fully representative of an Arab background. However, there were no other available data for CD4 and FXIIIB Alu/STR haplotypes to be used as representative of the Arabian Peninsula.

The Mediterranean region (including North and South shores) and the Middle East are clearly differentiated. However, a detailed analysis of the more than 23 haplotypes of the two Alu/STR combinations allows us to draw some inferences about population movements between these populations. In fact, genetic distance is highest (0.0778 ± 0.0234) between the Middle East group and that of the North Mediterranean, suggesting that migrations between these two regions were less frequent than those between the South Mediterranean and each of the North Mediterranean and Middle Eastern groups, which exhibit lower genetic distances (0.0378 ± 0.0122 and 0.0569 ± 0.0239, respectively).

The two haplotypes CD4 90(+) and FXIIIB 180(−) are specific to Middle Eastern populations. The haplotype CD4 110(−), which is rare and was found at a relatively higher frequency in Bahrain (0.087) than in southern Iran (0.050) and Libya (0.0247), deserves particular comment. This combination is absent in Sub-Saharan African and European samples except in some Mediterranean groups from Spain, Sardinia and Sicily. It was proposed in previous works (Flores et al., 2000; González-Pérez et al., 2010) to be representative of an ancient Berber background because it was observed throughout North Africa. However, its presence in Bahrain and southern Iran makes a new interpretation necessary: this haplotype could be a common haplotype of those human groups from the Middle East that colonized the southern shore of the Mediterranean in the Upper Paleolithic, if we take into account that its estimated age is around 36000 BP (95% CI: 3200–186000; González-Pérez et al., 2010). However, we cannot exclude the possibility that population movements from the Middle East (particularly from Arabia) to the South Mediterranean, agreeing with the Neolithic spread or, in more recent times, with the substantial expansion of Arabs in the seventh century AD, could also be responsible for the presence of this haplotype. Accordingly, several genetic studies (e.g. Chaabani, 2002; Cerný et al., 2011; Bahri et al., 2012) have suggested ancient demographic movements from Arabia (particularly Yemen) to ancient Mesopotamia and North Africa, as proposed by some historians (e.g. Goodspeed, 2007). In addition, these ancient movements are supported by recent mtDNA data that indicate a strong affinity between Yemenite people and those of Egypt and North and East Africa (Badro et al., 2013).

The results and explanations presented above, as well as the comparison between the obtained MDS (Figure 3A) and that based on nine Alu insertions (Figure 3B), show the great power of the CD4 and FXIIIB Alu/STR compound systems to elucidate human evolutionary relationships. This anthropological potential is due to three principal advantages: (i) within each system the Alu and STR are physically linked, generating several haplotypes; (ii) Alu and STR have very different mutation rates, hence inferences regarding both molecular events and processes of population history can be drawn from such a combination, and additionally for Alu insertions the ancestral allele can be identified; and (iii) the two systems, located on different chromosomes, evolve independently.

Conclusion

Our results have shown evident genetic differentiation among three groups comprising North and South Mediterranean and Middle Eastern populations. In spite of the population movements suggested between the two shores of the Mediterranean, and between the Middle East and the South Mediterranean, these regions have retained their genetic identity. Three haplotypes, CD4 90(+), FXIIIB 180(−) and CD4 110(−), were determined as specific to Middle Eastern populations. Our results and comparative analyses show that CD4 and FXIIIB Alu/STR compound systems offer exceptional advantages as powerful tools for inferring the history of populations. The determination of their haplotypes in more populations from the Middle East (particularly from the Arabian Peninsula) and from East Asia will provide more details on human evolutionary history.

Acknowledgments

We thank all the anonymous subjects for their participation in this study. This research has been supported in part by the Tunisian Ministry of Higher Education within research unit UR12ES11, by the Spanish Ministry of Educación y Ciencia grants CGL2008-03955 and CGL2011-27866, the Generalitat de Catalunya 2009SGR1408 grant, and by the Universitat de Barcelona grants of the Oficina de Mobilitat i Programes Internacionals and the Vicerrectorat de Relacions Internacionals i Institucionals.

References
 
© 2014 The Anthropological Society of Nippon