Friday, February 21, 2014

YDNA E-M123; A closer look

E-M123 (as well as E-M34) was first discovered by Underhill (2000). It is found at low to medium frequencies in East Africa and the Middle East, and at low frequencies in North Africa and Europe.

Figure 1 - Current and previous E-M215 phylogenetic structure 

Figure 1 compares the basic phylogeny of E-M215/M35 as it was known before 2011 (a) and after (b), with a 'who and when' key for the discovery of the UEPs. Notice the impact the rearrangement has on the phylogenetic placement of E-M123: it is now shown to share a more recent common ancestor with the East and Southern African variants of E-M35, i.e. E-V42 and E-M293, than with any of the other variants of E-M35.

Previous publications:

Unfortunately, all of the research previously published on E-M123 was done under the older (and rather out-of-date) configuration of the basic structure of E-M35. It is still worthwhile to look at articles that have tried to untangle the origins and history of this lineage. Of these, three come to mind:

(1) Semino et al., in which the following paragraph was said with respect to E-M123:
"The very low frequency of E-M123 in Ethiopia does not allow any inferences about the origin of this clade. The network of E-M78 and that of E-M123 are in agreement with the hypothesis of their ancient presence in the Near East and their subsequent expansion into the southern Balkans. The divergence time (TD) (Zhivotovsky 2001) between the Near East and European lineages has been estimated to a range of 7–14 thousand years (ky) ago. Cinnioğlu et al. (2004) found a high degree of variance of E-M123 in Turkey, which has been interpreted as being due to multiple founders rather than a single early dispersal event that has remained geographically circumscribed."

(2) Cruciani et al., in which this was said about the lineage:
"In our data set, all the E-M123 chromosomes also carry the M34 mutation (E-M34), with the exception of one E-M123* subject from Bulgaria. This paragroup has been previously reported only in one individual from Central Asia (Underhill et al. 2000). Although the frequency distribution of E-M34 could suggest that eastern Africa was the place in which the haplogroup arose, two observations point to a Near Eastern origin: (1) Within eastern Africa, the haplogroup appears to be restricted to Ethiopia, since it has not been observed in either neighboring Somalia or Kenya (present study) or Sudan (Underhill et al. 2000). By contrast, E-M34 chromosomes have been found in a large majority of the populations from the Near East so far analyzed (Underhill et al. 2000; Cinnioğlu et al. 2004; Semino et al. 2004 [in this issue]; present study). (2) E-M34 chromosomes from Ethiopia show lower variances than those from the Near East and appear closely related in the E-M34 network (fig. 2D). If our interpretation is correct, E-M34 chromosomes could have been introduced into Ethiopia from the Near East. The high frequency of E-M34 observed for some of the Ethiopian populations could be the consequence of subsequent genetic drift, which can also explain the lower frequencies (2.3% [Underhill et al. 2000] and 4.0% [Semino et al. 2002]) reported for two large independent samples of Ethiopians."

(3) The last comes from a thesis, commonly referred to in this blog as the 'Hirbo thesis' (2011), where it said:

"The E3b3 haplotype, defined by the M123 mutation, was previously suggested to have originated in the Middle East and is found at low frequencies both in the Middle Eastern and East African populations [153, 434] (Appendix 6a). This conclusion was based on the fact that the E3b3 haplotype was observed only in Ethiopian samples from among the nine East African populations analyzed in a previous study [153], and lower STR variances in the Ethiopian E3b3 samples than in the Middle East [153]. However, extensive analysis of East African populations in the current study shows that this haplotype is found in Kenyan and Tanzanian Cushitic speakers as well, albeit, at low frequency (Figure 3.3.2, Appendix 6a). Its frequency maximum is centered in northeastern Africa (Table A9.1.1, Figure A9.1.7). Considering that the highest frequency is observed among the Ethiopian Jews [164] (Appendix 6a) a population that has been shown to be paternally [505, 565] and maternally [219] distinct from other Jewish populations, and genetically most similar to Sub-Saharan Africans [219, 505, 565], and the highest variance is observed in African populations (Table A9.1.2), the origin of E3b3 will most probably be among Cushitic/Omotic speaking populations of Southwest and Central Ethiopia."


Below are some of the more significant (>5%) frequencies of E-M123 found in published papers. Note that almost all E-M123 haplotypes also belong to E-M34, but since some papers directly test for E-M34 and others test just for E-M123, I have noted all the UEPs as E-M123 for the sake of uniformity:

Semino et al. found:
5% of E-M123 in Ethiopian Oromo

Cruciani et al. found:
14% of E-M123 in the Beta Israel of Ethiopia

Cinnioglu et al. found:
5-9% of E-M123 in all regions of Turkey except regions 1, 2 & 8

Cruciani et al. found:
24% of E-M123 in Ethiopian_Amhara
8% of E-M123 in Ethiopian_Wolayta
8% of E-M123 in Erzurum Turkish
8% of E-M123 in Ethiopian_Oromo
8% of E-M123 in Omanite
7% of E-M123 in Bedouins
7% of E-M123 in Sicilians
5% of E-M123 in Sephardi Turks
5% of E-M123 in Northern Egyptians

Semino et al. found:
13% of E-M123 in an Albanian community of the Cosenza province in Italy
12% of E-M123 in Ashkenazi Jews
10% of E-M123 in Sephardi Jews
5% of E-M123 in Tunisians
5% of E-M123 in Lebanese

Arredi et al. found:
10% of E-M123 in Algerian Berbers
9% of E-M123 in Northern Egyptians
7% of E-M123 in Southern Egyptians

Shen et al. found:
20% of E-M123 in Libyan Jews
12% of E-M123 in Ethiopian Jews
10% of E-M123 in Ashkenazi Jews
10% of E-M123 in Yemeni Jews

Moran et al. found:
11% of E-M123 in Ethiopian Track and Field
10% of E-M123 in Ethiopian Marathon
6% of E-M123 in Ethiopian General Control
5% of E-M123 in Ethiopian Arsi Control

Luis et al. found:
10% of E-M123 in Arabs from Oman
5% of E-M123 in Arabs from Egypt

Flores et al. found:
31% of E-M123 in Jordanians from the Dead Sea

Beleza et al. found:
12% of E-M123 in Beja, Portugal
5% of E-M123 in Coimbra, Portugal

Cadenas et al. found:
8% of E-M123 in Yemenis

Contu et al. found:
5% of E-M123 in Tempio, Sardinia

Hammer et al. found:
10% of E-M123 in Israelite Jews
5% of E-M123 in Cohanim Jews

Di Gaetano et al. found:
11% of E-M123 in Mazara del Vallo, Sicily
11% of E-M123 in Piazza Armerina, Sicily
10% of E-M123 in Troina, Sicily

The supplemental data of the Plaster thesis found:
25% of E-M123 in Ethiopian_Maale
13% of E-M123 in Ethiopian_Amhara
10% of E-M123 in Ethiopian_Oromo

The Hirbo thesis found:
9% of E-M123 in Ethiopian_Burji
8% of E-M123 in Kenyan_Yaku
5% of E-M123 in Kenyan_Boni

Bekada et al. found:
11% of E-M123 in Sahara + Mauritania
7% of E-M123 in Egypt
6% of E-M123 in Turkey

Figure 2 - E-M123 Contour Map from the Hirbo Thesis

The contour map shown above is taken from Figure A9.1.7 of the Hirbo thesis and shows the general spatial frequency distribution of E-M123 in Africa, the Near East and Southern Europe. The map obviously does not include data from sources published after 2011, such as Bekada (2013); note that it also does not include the E-M123 data from the Plaster thesis.


Cruciani (2004) used microsatellite networks (Figure 2D in the publication) to infer that Ethiopian E-M34 variance is lower than that found in the Near East, since the Ethiopian M34 haplotypes appeared to be more closely related than the Near Eastern ones in the network. Hirbo, on the other hand, inferred higher variance in Africa than outside of Africa (Table A9.1.2).
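To make the idea of 'STR variance' concrete, here is a minimal Python sketch of one common way to summarise it: the sample variance of repeat counts at each marker, averaged over markers. This is only an illustration and not necessarily the procedure used by Cruciani or Hirbo. The function name, marker choice and repeat counts are made-up placeholders, not data from either source.

```python
from statistics import variance

def mean_str_variance(haplotypes):
    """Average, over markers, of the sample variance in repeat counts."""
    markers = sorted(haplotypes[0])
    per_marker = [variance([h[m] for h in haplotypes]) for m in markers]
    return sum(per_marker) / len(per_marker)

# Hypothetical toy haplotypes (marker -> repeat count), NOT real data.
group_a = [
    {"DYS19": 13, "DYS390": 24},
    {"DYS19": 13, "DYS390": 24},
    {"DYS19": 14, "DYS390": 24},
]
group_b = [
    {"DYS19": 13, "DYS390": 23},
    {"DYS19": 15, "DYS390": 24},
    {"DYS19": 14, "DYS390": 25},
]

print("group A mean variance:", mean_str_variance(group_a))
print("group B mean variance:", mean_str_variance(group_b))
```

In this toy case group B is the more diverse of the two. With real data the outcome depends heavily on sample size and on which markers are included, which is one reason two studies can reach opposite conclusions.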

Below, I have used the ASD approach to compute and compare TMRCAs of several E-M123/M34 datasets:

  1. The Plaster E-M34 dataset, N = 34, representing haplotypes from Ethiopia, filed under Ethiopian_EM34.csv.
  2. The Plaster Ethiopian Amhara dataset, N = 9, a subset of (1).
  3. The Plaster Ethiopian Maale dataset, N = 16, a subset of (1).
  4. The global E-M123 dataset from publicly available FTDNA haplotypes, N = 129, filed under FTDNA_EM123.csv. Note that the vast majority of these are not of African origin, although a few could be.
  5. The global E-M84 dataset from publicly available FTDNA haplotypes, N = 69, filed under FTDNA_EM84.csv. Note that E-M84 is a variant of E-M34.

The TMRCAs for these haplotypes were computed using the germline mutation (pedigree) rates and the effective (Zhivotovsky) rates separately (see Figures 3-4 below). In addition, two different sets of markers were analyzed. The first set is the intersection of all markers available in the calculator with all the markers available in the Plaster thesis, yielding 14 markers. The second set is the intersection of the markers from the thesis with the recommended Zhivotovsky markers, yielding 9 markers.
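For readers unfamiliar with the method, the following is a rough Python sketch of an ASD (average squared distance) style TMRCA estimate, together with the marker intersection step. It is not the calculator used for the figures in this post. The assumptions are: the modal haplotype stands in for the ancestral one, a single average mutation rate is applied to all markers, and a generation is 25 years. The marker names, haplotypes and the pedigree rate are hypothetical placeholders; 0.00069 per marker per generation is the commonly quoted Zhivotovsky effective rate, but the exact values used by the calculator are not stated here.

```python
from collections import Counter

def modal_haplotype(haplotypes, markers):
    """Most common repeat count per marker (assumed stand-in for the ancestor)."""
    return {m: Counter(h[m] for h in haplotypes).most_common(1)[0][0]
            for m in markers}

def asd_tmrca_years(haplotypes, markers, rate_per_gen, years_per_gen=25):
    """ASD averaged over markers, divided by the rate, converted to years."""
    root = modal_haplotype(haplotypes, markers)
    n = len(haplotypes)
    asd = sum(
        sum((h[m] - root[m]) ** 2 for h in haplotypes) / n
        for m in markers
    ) / len(markers)
    return asd / rate_per_gen * years_per_gen

# Hypothetical marker lists; only markers present in both sources are used.
calculator_markers = {"DYS19", "DYS390", "DYS391", "DYS392"}
thesis_markers = {"DYS19", "DYS390", "DYS391", "DYS393"}
shared = sorted(calculator_markers & thesis_markers)

# Hypothetical toy haplotypes (marker -> repeat count), NOT real data.
toy = [
    {"DYS19": 13, "DYS390": 24, "DYS391": 10},
    {"DYS19": 13, "DYS390": 24, "DYS391": 10},
    {"DYS19": 14, "DYS390": 24, "DYS391": 10},
    {"DYS19": 13, "DYS390": 25, "DYS391": 11},
]

PEDIGREE_RATE = 0.0025      # placeholder value, an assumption
ZHIVOTOVSKY_RATE = 0.00069  # commonly quoted effective rate, an assumption here

pedigree_years = asd_tmrca_years(toy, shared, PEDIGREE_RATE)
effective_years = asd_tmrca_years(toy, shared, ZHIVOTOVSKY_RATE)
print(shared)
print(round(pedigree_years), round(effective_years))
```

Because the same ASD is divided by a smaller rate, the effective-rate estimate is always older than the pedigree-rate estimate for the same haplotypes. Changing the marker set changes the ASD itself, which is why the 9-marker and 14-marker results can rank the datasets differently.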

Figure 3 - Zhivotovsky Central TMRCA Estimates

Figure 4a - Pedigree Central TMRCA Estimate Ranges - 9 Markers

Figure 4b - Pedigree Central TMRCA Estimate Ranges - 14 Markers

The results of the comparative TMRCA calculations are by no means unequivocal, but nevertheless allow for several observations to be made:

  • The overall Ethiopian E-M34 haplotypes have comparable or greater central TMRCA estimates than the global E-M123 samples, with the exception of the 14-marker scenario using pedigree mutation rates (Figure 4b).
  • The Ethiopian Maale dataset consistently shows the lowest central TMRCA estimate of all the datasets. Conversely, the Ethiopian Amhara dataset consistently shows the highest central TMRCA estimate of all the datasets.
  • The remaining Ethiopian haplotypes, mostly belonging to the Oromo, have a TMRCA estimate intermediate between the Amhara and Maale samples, although a lot closer to the Amhara estimates.

  • The current phylogenetic positioning of E-M123 (Figure 1b) reduces the probability of the lineage originating in the Near East (vs. Eastern Africa), relative to the independent positioning that the lineage used to have within E-M35 (Figure 1a). This is because the new position reveals that E-M123 shares a more recent common ancestor with lineages of E-M35 that are specific to either East Africa (E-V42) or South/East Africa (E-M293), rather than either (a) showing a closer relationship to the main variant of E-M35 that is found outside of Africa, i.e. E-M78, or (b) maintaining its old independent position within E-M35.
  • Of the two arguments brought forth by Cruciani (2004) in support of a Near Eastern origin of E-M123, the first, "restriction of the haplogroup within Ethiopia", has been invalidated by sampling done since the report, namely (a) the finds of E-M123 in the Yaaku, Boni and Turkana of Kenya in the Hirbo thesis, (b) one of the highest ever recorded frequencies of E-M123, found in southernmost Ethiopia (Semien Omo Zone) among the Omotic-speaking Maale in the Plaster thesis, and (c) the E-M123 find in a Somali dataset in Sanchez (2005), albeit at quite a low frequency. The second argument, "lower variance of E-M34 chromosomes in Ethiopia", proves to be at best inconclusive and at worst wrong after further analysis of E-M34 haplotypes from Ethiopia relative to large global samples of publicly available E-M123 haplotypes (see Figures 3-4), in addition to the analysis carried out in the Hirbo thesis (Table A9.1.2).
  • The previous understanding of a decreasing frequency of E-M34 haplotypes from Northern to Southern Ethiopia has been upset by the samples that appeared with the Plaster thesis, since the highest frequency of the lineage was found in the Omotic-speaking Maale. However, while the frequency appears to be higher in the South and lower in the North, the diversity of the lineage appears to follow the opposite pattern, i.e. higher in the North and lower in the South.
UPDATE (04/04/2014)
Median Joining Networks (created using the Fluxus Network Software)



  1. Very interesting and thorough. The only thing I miss is a neighbor-joining graph of haplotypes and maybe, being overly ambitious, a further contextualization within E-M35, whose expansion E-M123 most likely took part in.

    Based on the frequency map, it would seem that, even if the lineage effectively originated in Ethiopia (not too sure if that's your thesis but it would seem plausible at least), it probably had a secondary center in Palestine (maybe with Egyptian help, as is the case with other E subclades with presence in the Eastern Mediterranean). By comparison with its scatter around the Mediterranean/Red Sea, the penetration in Africa beyond The Horn seems sharply interrupted in NW Kenya. Not sure what to think but it looks like the Ethiopian center of this lineage was not so dynamic/expansive as the Palestinian one. I guess that the same happens with other lineages, right? In any case it does suggest that Ethiopian-plus (Horn, Nile) genetics had more impact to the North than to the South and West - any thoughts on this asymmetry?

    Off topic: in case you did not notice yet, you may be interested in this new study on Bantu genetics of Southern (Southeastern) Africa by Chiara Barbieri: (not too surprising but informative anyhow).

    1. "The only thing I miss is a neighbor-joining graph of haplotypes"

      Indeed, a network graph would have been interesting; unfortunately, I haven't done any of those yet.

      " In any case it does suggest that Ethiopian-plus (Horn, Nile) genetics had more impact to the North than to the South and West - any thoughts on this asymmetry?"

      If you look at it from E-M215/M35 and below, then yes, with the exception of E-M293, it looks like most lineages were headed North. But if you look at it from the level of E-P2 and below, then no, the impact was throughout the continent, not on any one side in particular.

      "Off topic: in case you did not notice yet, you may be interested in this new study on Bantu genetics of Southern (Southeastern) Africa by Chiara Barbieri: "
      Thanks, looks interesting.

  2. This is a great study. I happen to be Ashkenazi Jewish, E1b1b1c1a, M-123 positive, and feel that its presence in all Jewish communities at 10% or more, including Ethiopian, Ashkenazi, Sephardi, and Yemeni, is remarkable. This shows that our forefathers were a big part of these ancient Semitic communities in biblical times, although the information I've studied through the years indicates that M123 is not Semitic but belongs to Hamitic Meditid peoples. We were the Canaanites who got taken into the Jewish and Arab religions, or something like that.

    As for my E-M123 community, the studies seem to show that my Ashkenazi forefather was likely absorbed into the Jewish communities when Canaan was conquered, and then eventually left during the exile, being taken through Italy, or north through Turkey, or north across the Black Sea to southeastern Europe (where my family was from).

    Thanks for the post

  3. Interesting study. However, E-M123 has also been found in 35% of a sample of Dead Sea Bedouins. I couldn't find the 'New phylogeny study' from 2011 that supposedly puts M-123 as much more closely related to a 'Recent South-African common ancestor' and as a more 'downstream' descendant of M-35. Neither could I find links to the aforementioned thesis that supposedly finds 'higher haplogroup diversity' for M-123 in East Africa, as opposed to the Levant and outside of Africa. In contrast to other M-35 descendants, M-123 hasn't been as widely and thoroughly investigated, so further study seems appropriate. Populations such as the Bedouins from the Negev desert, the Sinai Peninsula, and Petra and the Dead Sea in Jordan could reveal deeper insights about the origins and diversity of M-123. Another curiosity of mine would be to discover the Y-DNA haplogroup of the so-called 'Natufians of Palestine', who inhabited the Levant around 11,000 BCE and, although belonging to the Epipaleolithic, were apparently a semi-sedentary people and are thought of as the immediate ancestors of the very first Neolithic cultures in the world.

    1. “ However, E-M123 has also been found in 35% of a sample of Dead Sea Bedouins”

      I know, hence why Flores (2005) is referenced in the blog post. Please read the post well before you attempt to criticize.

      “I couldn't find the 'New phylogeny study' from 2011 that supposedly puts M-123 as much more closely related to a 'Recent South-African common ancestor' and as a more 'downstream' descendant of M-35.”

      Again, please read the post carefully before you attempt to criticize. What I said is that E-M123 is more closely related to the East African variants of E-M35 (E-V42 & E-M293) than it is to the other variants of E-M35 that are found outside East Africa (E-V257 & E-M78); this is because of the E-Z830 mutation found at the end of 2011. I wrote this blog post before Trombetta (2015) came out; however, it does not change any of my observations above, as the new mutation found in Trombetta (2015), i.e. E-V1515, which unites all the East/South African variants that are found under E-Z830, is still the closest living lineage to E-M123.

  4. Thanks a lot for your great blog, and allow me to challenge your conclusion that E-M123 originated in East Africa (excuse my poor English).

    1- E-M123 could have originated in the same area as the other E-M35 lines, and its more recent common ancestor E-V1515 could have originated in the same area and then moved south to East/South Africa.
    2- In the Kenyan samples, E-M123 is found either in small samples or together with J, which supports the idea that it may have come from outside.
    3- E-M123 among the Maale (25%) could have come from outside, similar to the J in the same sample (6%), especially since it is younger than in the northern samples.
    4- The one sample in Somalia in the Sanchez study most probably came from the Arabian Peninsula (again, like the J in the same sample).
    5- E-M34 samples in Ethiopia are younger than E-M34 samples outside of Africa, as per your post of Jan 2013, which supports the idea that it originated outside this area.

    1. 1 - The latest Trombetta paper really has not changed anything in terms of the postulated origin of E-M35, i.e. East Africa. Sure, it found a unifying SNP, E-V1515, for the many East African variants of E-M35 that were previously known to us, but there is no indication that this unifying mutation had any origin other than East Africa.

      2. The point was to show that the distribution of E-M123 reaches further than Ethiopia, or specifically just northern Ethiopia, as was being implied. Obviously it could have come from outside, but the point is not to disprove that it came from 'outside', but rather to question the validity of having that initial assumption in the first place.

      3. Again, it depends on what you mean by 'outside', but see above.

      4. Possible, I have no proof for that though.

      5. No, look at the STR analysis of this post, which is the latest; the previous ones had some issues. The combined Ethiopian E-M34 haplotypes have, at a minimum, TMRCA estimates comparable with those of non-Ethiopian samples, and in most cases exceed them in time depth.

    2. Thanks for your reply. My point is that there is still no valid argument in your conclusion that supports the East African origin of E-M123. The only argument that could support your conclusion is the TMRCA calculation. However, if I read Figure 4b correctly, which should be more accurate than 4a since it is done on 14 markers, the combined Ethiopian E-M34 haplotypes are younger than the non-Ethiopian samples.

  5. Great post, cheers. E-M136 here, paternally Ashkenazi if that's relevant. Would you suppose E-M123 spread with Afro-Asiatic languages into the Middle East, or would you guess it spread there after?

    1. Yes IMO, likely with the Afro-Asiatic... including the Neolithic cultures.