Saturday, June 6, 2015

More Ethiopian Uniparental Data (More resolution.. less clarity)

A new paper attempting to decipher the out of Africa exit route by focusing on Ethiopian and Egyptian autosomal genetics was published a couple of weeks ago. Putting aside the 'hocus pocus' autosomal analysis for a moment, I was quite intrigued by the more concrete uniparental relative frequency images published in the supplemental material. These images do not offer much clarity, however, as the actual numbers are not given.


Note that the phylogeny they reference for the results here is from Phylotree Y.

Below I have attempted to translate some of the colors from the image into numerical approximations. Note that these are only approximations and not a substitute for the real data, to which I am not privy.


Haplogroup  Amhara  Eth Somali  Gumuz  Oromo  Wolayta
A-M13          27%          0%    55%    19%      48%
B-M150          0%          0%     4%     0%       0%
B-M8495         0%          0%    35%     0%       0%
E-M96           3%          4%     0%     6%      12%
E-M215          3%          0%     0%     0%       0%
E-V22           9%          0%     0%     5%       3%
E-Z1902         8%         80%     4%    20%       0%
E-Z830          0%          0%     0%     0%       3%
E-M34           3%          0%     0%     5%      13%
E-M4145        17%          0%     0%    25%      20%
J              25%         11%     0%    19%       0%
T               3%          4%     0%     0%       0%
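
Since these numbers were read off colors rather than taken from published counts, one quick sanity check is to see how close each population's column comes to 100%. The sketch below is only an illustration: the values are the approximations from the table above, and the function and variable names are my own, not anything from the paper.

```python
# Approximate Y-DNA frequencies (%) as estimated from the image; not the real data.
POPULATIONS = ["Amhara", "Eth Somali", "Gumuz", "Oromo", "Wolayta"]
YDNA = {
    "A-M13":   [27, 0, 55, 19, 48],
    "B-M150":  [0, 0, 4, 0, 0],
    "B-M8495": [0, 0, 35, 0, 0],
    "E-M96":   [3, 4, 0, 6, 12],
    "E-M215":  [3, 0, 0, 0, 0],
    "E-V22":   [9, 0, 0, 5, 3],
    "E-Z1902": [8, 80, 4, 20, 0],
    "E-Z830":  [0, 0, 0, 0, 3],
    "E-M34":   [3, 0, 0, 5, 13],
    "E-M4145": [17, 0, 0, 25, 20],
    "J":       [25, 11, 0, 19, 0],
    "T":       [3, 4, 0, 0, 0],
}


def column_totals(table, populations):
    """Sum the estimated percentages for each population (one total per column)."""
    totals = {pop: 0 for pop in populations}
    for values in table.values():
        for pop, value in zip(populations, values):
            totals[pop] += value
    return totals


YDNA_TOTALS = column_totals(YDNA, POPULATIONS)
for pop, total in YDNA_TOTALS.items():
    # Anything short of 100% is either rounding or a lineage I could not read off.
    print(f"{pop}: {total}% accounted for, {100 - total}% unassigned")
```

Every column lands within a couple of points of 100%, which is about as good as can be expected from eyeballing colors.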

A-M13 :

The prevalence of this haplogroup in Ethiopia has long been known, but the extremely high frequency in the Wolayta is quite a surprise. This could be due to the relatively small sample size, however, as the much larger Wolayta sample in the Plaster thesis showed only 13% A-M13.
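
To get a feel for how much a small sample can move a frequency like this, here is a rough sketch of a Wilson score interval for a proportion. The sample size used is purely hypothetical, since the per-population sample sizes are not given in the material I am working from; the function name is my own, and only the 48% figure comes from the table above.

```python
import math


def wilson_interval(p_hat, n, z=1.96):
    """Wilson score interval for an observed proportion p_hat out of n samples.

    z=1.96 corresponds to roughly 95% confidence.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    z2 = z * z
    denom = 1 + z2 / n
    center = (p_hat + z2 / (2 * n)) / denom
    half = z * math.sqrt(p_hat * (1 - p_hat) / n + z2 / (4 * n * n)) / denom
    return center - half, center + half


# HYPOTHETICAL sample size, chosen only for illustration; the real n is not known here.
HYPOTHETICAL_N = 25
low, high = wilson_interval(0.48, HYPOTHETICAL_N)
print(f"48% observed in n={HYPOTHETICAL_N}: roughly {low:.0%} to {high:.0%}")
```

With a sample that small the plausible range is wide, so a large gap between two studies is less alarming than it first looks, although whether it is wide enough to reach the Plaster figure depends on the real sample sizes.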

B-M150 and B-M8495 :

Only found in the Gumuz. We have known for a while that B is not prevalent in the wider Ethiopian population; rather, its presence is a continuation of the much larger B frequencies found in Nilotic Sudan. Still, it is good to see a finer resolution of B, and that the majority of B clades in Ethiopia belong to the small B-M8495 branch.

E-M96:

This could potentially be a wide variety of things, but my money would be on E-M329, sister clade to E-M2 and child clade of E-V38, which in turn is a sister clade to E-M215, the most prevalent YDNA lineage in Ethiopia.

E-M215

As this shows up only in Northern Ethiopia, I would think it may be E-V92; it could, however, still be a basal "E3b" lineage.

E-V22

A variant of E-M78, this lineage has always been found in low amounts in Ethiopia, with moderate amounts in Sudan and Egypt.

E-Z1902

This lineage is found downstream of E-M78 and unites E-V12 with E-V65, which means the results would include E-V32, a sublineage of E-V12 and the most frequent YDNA lineage in Somalis. I would wager that all of the E-Z1902 is actually E-V32, since E-V65 has never been found in Ethiopia thus far. There is a chance that some E-V12* could be in the mix as well.

E-Z830

This lineage has been discussed before; it unites many lineages in Ethiopia, including E-M34, E-M293 and E-V42. Judging from the image, it looks like they did not test for E-V42, so this could be E-V42.

E-M34

The prevalence of this lineage in southern Ethiopia in the image above could be further confirmation of the high frequency of E-M34 found in the Omotic-speaking Maale in the Plaster thesis.

E-M4145

This is a tricky one and I am not sure what it is. I have searched for SNPs named as such and came back empty-handed. To complicate things further, it is shaded a similar color to E-M293, but I discounted that lineage because the lineage reported here is found at relatively high frequency in Ethiopia, whereas previous data show E-M293 only at low to moderate frequencies there. My best guess for this SNP would be something equivalent to E-V6 or, failing that, E-P2(x E-M215). I have less confidence in the latter, since in that case I would expect them to have given it a more basal position in the hierarchy of YDNA lineages in the image above.

J and T

Both of these F-derived lineages look to be in line with what we already know in terms of frequency distribution throughout Ethiopia.

refs:
http://ethiohelix.blogspot.com/2010_12_01_archive.html
http://ethiohelix.blogspot.com/2012/01/e1b1b-update.html 
http://ethiohelix.blogspot.com/2012/11/extensive-doctoral-thesis-on-ethiopian.html
http://ethiohelix.blogspot.com/2013/05/another-extensive-thesis-on-east.html

Update 06/07/2015 - MTDNA



Find below approximations for the frequencies of the lineages shown in the image above.



Haplogroup  Amhara  Eth Somali  Gumuz  Oromo  Wolayta
L0              8%          4%    11%     4%      17%
L1              7%          0%     0%     0%       0%
L2             16%         27%     0%     4%      12%
L3(x M,N)       8%         41%    52%    32%      28%
L4              0%          0%    22%     0%       8%
L5              0%          0%    13%     4%       6%
L6              0%          0%     0%     4%       4%
M              24%          3%     0%    12%       4%
N(x H-X)        7%         20%     0%     4%       0%
R0              9%          3%     0%    18%      10%
U               4%          0%     0%     4%       0%
Other          15%          0%     0%    12%       9%
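
A simple way to summarize this table is to add up, per population, the L lineages that are not M or N (L0 through L6) and, separately, the M and N derived lineages (M, N, R0, U, Other). The sketch below does just that with the approximations above; the grouping and all names in the code are my own shorthand, and the outputs inherit every bit of the imprecision in the estimates.

```python
# Approximate mtDNA frequencies (%) as estimated from the image; not the real data.
POPULATIONS = ["Amhara", "Eth Somali", "Gumuz", "Oromo", "Wolayta"]
MTDNA = {
    "L0":        [8, 4, 11, 4, 17],
    "L1":        [7, 0, 0, 0, 0],
    "L2":        [16, 27, 0, 4, 12],
    "L3(x M,N)": [8, 41, 52, 32, 28],
    "L4":        [0, 0, 22, 0, 8],
    "L5":        [0, 0, 13, 4, 6],
    "L6":        [0, 0, 0, 4, 4],
    "M":         [24, 3, 0, 12, 4],
    "N(x H-X)":  [7, 20, 0, 4, 0],
    "R0":        [9, 3, 0, 18, 10],
    "U":         [4, 0, 0, 4, 0],
    "Other":     [15, 0, 0, 12, 9],
}

# Grouping is my own shorthand: L lineages excluding M and N, versus M/N-derived lineages.
L_LINEAGES = ["L0", "L1", "L2", "L3(x M,N)", "L4", "L5", "L6"]
MN_LINEAGES = ["M", "N(x H-X)", "R0", "U", "Other"]


def group_share(table, lineages, populations):
    """Sum the estimated percentages of the given lineages for each population."""
    return {
        pop: sum(table[lineage][i] for lineage in lineages)
        for i, pop in enumerate(populations)
    }


L_SHARE = group_share(MTDNA, L_LINEAGES, POPULATIONS)
MN_SHARE = group_share(MTDNA, MN_LINEAGES, POPULATIONS)
for pop in POPULATIONS:
    print(f"{pop}: L(x M,N) ~{L_SHARE[pop]}%, M/N-derived ~{MN_SHARE[pop]}%")
```

On these estimates the Gumuz come out as essentially all L(x M,N), while the Amhara and Oromo are split much more evenly between the two groups.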


L0
L0 has been found readily in Ethiopia before, mostly of the L0a type; this finding is therefore in line with what was known before.

L1
A low presence of L1 has also been found in Ethiopia before, typically of the L1b variety. Here it was only found in 1 out of the 5 populations sampled from Ethiopia.

L2
Significant frequencies of this lineage have been documented before, mostly of the L2a variety but to a lesser extent of the L2b variety as well.

L3(x M,N)
L3 encompasses all maternal lineages outside Africa, and many inside Africa. The results shown here are L3 lineages that do not include the signature lineages of the out of Africa migration, i.e. M and N. These L3 lineages have an ample variety of sub-lineages found in Ethiopia; see the refs below for a more detailed accounting of this lineage.

L4
This data shows L4 only in the Gumuz and Wolayta, but in fact it has been found throughout Ethiopia in moderate frequencies, albeit much higher frequencies have been found in hunter gatherers further south of Ethiopia (Hadza).

L5
Similar to L4, L5 has also been found throughout Ethiopia, but in slightly lower frequencies.

L6
This is quite a rare lineage outside of Ethiopia, and within it has only been found in low frequencies.

M
The first of L3's sub lineages found outside Africa; the frequencies shown here are consistent with previous findings. The origin of this lineage is obscure. It has a TMRCA of 60 KYA, close to when the out of Africa migrations are thought to have occurred and only 10 KYA younger than its predecessor L3. Almost all M lineages found in Ethiopia are of the M1a variety, however, with an estimated TMRCA of 20-30 KYA.

N(x H-X)
The other L3 sub lineage found outside of Africa, N encompasses all non-African maternal lineages that do not belong to M. It is also estimated to have a TMRCA of about 60 KYA. Most N designated lineages in Ethiopia are further categorized as N1a with a TMRCA of ~ 20 KYA.

R0
A sub-lineage of N that is found throughout East Africa but mostly in Ethiopia; the significant variant of this lineage is R0a1. Note that in the image provided it is hard to distinguish between R0a, T and K, all of which have been found in Ethiopia before, R0a at much larger frequencies than the others.

U
Another sub-lineage of N with an ancient presence in Ethiopia, it should mostly be of the U6a1 variety.

Other
Other refers to lineages that belong to N, but not to R0 or U. These 'other' lineages are for the most part limited to HV, I and T.

Refs:
http://ethiohelix.blogspot.com/2012/01/mother-of-mothers.html#uds-search-results
http://ethiohelix.blogspot.com/2012/04/copernican-reassessment-of-human.html
http://ethiohelix.blogspot.com/2013/01/east-african-mtdna-variation-has.html#uds-search-results 
http://ethiohelix.blogspot.com/2013/12/more-east-african-mtdna-charts.html#uds-search-results 

17 comments:

  1. 1. I am surprised that Y-DNA T is so scarce in Ethiopian Somalis given its high frequency in Somalis generally.

    2. I am surprised that Y-DNA J in the Oromo (linguistically Cushitic) is almost as high as in the Amhara (linguistically Ethio-Semitic).

    3. The ratio of typically Eurasian Y-DNA probably of recent origins to typically Eurasian mtDNA probably of recent origins (excluding, for example, M1) is quite a bit lower than I would have expected, suggesting that late Holocene migration to Ethiopia was more gender balanced than is commonly assumed.

  2. "I am surprised that Y-DNA J in the Oromo (linguistically Cushitic) is almost as high as in the Amhara (linguistically Ethio-Semitic)."

    The Plaster thesis I linked to above noted pretty much the same thing over two years ago with a much bigger sample size, N = 396 and 149 for the Amhara and Oromo respectively, with J @ 26% and 24% respectively. So we have actually known for quite a while that the introduction of haplogroup J into Ethiopia has nothing to do with the contemporary 'ethnic' stratification of its people. High levels of J probably hold only for samples of Oromo speakers from Central Ethiopia, however; far southern Oromos, i.e. Borana, are likely to have much lower amounts of J. We have also known for a while that the difference between Cushitic and Semitic speakers in Ethiopia has very little to do with the genetic makeup of the associated speakers, which means that either Semitic speakers shifted from Cushitic (or some now extinct AA type of language) with very little external gene flow, or that Semitic itself arose in Ethiopia. While the latter proposition is the less likely one from a linguistic variance standpoint, it is still a possibility that deserves to be studied closely.

    "The ratio of typically Eurasian Y-DNA probably of recent origins to typically Eurasian mtDNA probably of recent origins (excluding, for example, M1)"

    Almost none of those lineages that putatively arose outside of Africa have been shown to be of 'recent' origin, unless by recent you mean 10,000-20,000 years ago.

    Replies
    1. Thanks. By "recent" I was thinking about the arrival of the Neolithic revolution locally and/or the Bronze Age, so more recent than 10,000 to 20,000 years ago; probably more like the last 5,000-6,000 years or so.

    2. This is entirely off-topic but:

      http://anthromadness.blogspot.ae/2015/06/sudanese-arabs-beni-ameri-beja-and.html

      ^ Do you think you could check if Beni-Amer Bejas have any Omotic admixture?

  3. Please cover this new Sudanese autosomal study: http://www.nature.com/srep/2015/150528/srep09996/full/srep09996.html

    PLINK data of said study: https://drive.google.com/file/d/0B9o3EYTdM8lQb0plX1AwYS1iV2M/view

    Good luck!

    Replies
    1. Could you please contact me: Awaleking@gmail.com? I have some questions for you. And did you get that PLINK data from the Sudanese thread on ABF where I posted it or my blog? Just curious as to whether or not you ever go onto ABF.

  4. Btw, Ethio-Helix-> I can tell Uniparental data intrigues you perhaps more than autosomal DNA data so you might be very pleased to know that one of the three main authors of the autosomal study on Sudan also headed a Y-DNA & mtDNA study on Sudan:


    https://www.dropbox.com/s/1nmaqtc59wabowt/Genetic%20Patterns%20of%20Y-chromosome%20and%20Mitochondrial.pdf?dl=0


    ^ I've been told that you're an old member of ABF so you can thank old Beyoku for sharing that if you remember him:


    http://www.forumbiodiversity.com/showthread.php/44360-Y-chromosome-and-mtDNA-Variation-with-Implications-to-the-Peopling-of-the-Sudan?highlight=

  5. Do you mind if I inquire what Haplogroup you belong to?

    Replies
    1. I don't know I'm waiting on my result. Thanks for the quick reply.

  6. @ Andrew
    I am not aware of any uni-parental lineages of significant frequency in Ethiopia that also have a putative non-African origin and a TMRCA in the 5-6 KYA range; perhaps you can enlighten me.

    @ Awale
    Re: Sudan Uniparental data; the underlying YDNA data was published seven years ago and I have blogged about it under "Sudan YDNA". However, the mtDNA data (both ancient and contemporary) have not been published before and are interesting.

    Re: Autosomal data; I haven't run ADMIXTURE, or other GWA for that matter, in over 3 years, as new African samples from regions that are not covered in my Intra-Africa V2 analysis have not appeared. However, these samples you point out may fit the bill, as I don't have any samples from North / West Sudan. If my old dataset intersects well with this one, I will run Intra-Africa V3, but it will take me a while to get up and going as I have to locate all my old files, scripts and what not.

    @ AWI
    see my response above

    Replies
    1. "Sudan Uniparental data; the underlying YDNA data was published seven years ago and I have blogged about it under "Sudan YDNA". However the mtDNA (both ancient and contemporary) have not been publicly published before and are interesting."

      I remember your Sudan Y-DNA post, I've linked people to it in the past. Hadn't realized this was the same data. Well, I'll be waiting on your future post about the mtDNA data. ;)

      "Re:Autosomal data; I haven't run ADMIXTURE, or other GWA for that matter, in over 3 years, as new African samples , from regions that are not covered in my Intra-Africa V2 analysis have not appeared, however these samples you point out may fit the bill as I don't have any samples from North / West Sudan, if my old dataset intersects well with this one, I will run Intra-Africa V3, but it will take me a while to get up and going as I have to locate all my old files, scripts and what not."

      Take your time, but do me a favor and email me your results once you're done if you don't end up posting it on your blog for some reason: Awaleking@gmail.com


      Take care, mate,

  7. My post was lost. I think the most interesting thing about these results is the Gumuz and their high frequency of B lineages, which are in fact B2b, usually ascribed to Hunter Gatherers. This is likely the most northern concentration of B2b. Perhaps the B2b in Gumuz is mated with their L4 and is associated somehow indirectly with the presence of their Omotic autosomal signature.

  8. M4145 is a sub-clade of E-M123 found in "Arab Christians". https://sites.google.com/site/compositeytree/e1b1-2

    Replies
    1. yes, I had a reader email me that link with the tentative placement of E-M4145 several days ago, here below was my response to him,

      ....Very Interesting, many thanks for the heads up.
      According to that tree of E-Z827 that you linked to, the M4145 SNP would be a variant of the East African / Levant branch of E-Z827, rather than the typically Northwest African variant, i.e. E-V257. Specifically, it is saying that it is parallel to M34 while being downstream of M123, so again more related to the East African / Levant variant of E-Z830, rather than the more restricted South/East African variant of E-Z830, i.e. E-M293. There have indeed been significant (10-20%) unclassified variants of E-M35 found in Ethiopia in the past 10-15 years of testing. E-M34 would have been ruled out, as it is an old SNP and has been tested for since these tests began, but a moderate number of those samples were not tested for the parental node of the E-M34 SNP, i.e. M123. So it is a possibility that a large number of these unclassified E-M35 variants in Ethiopia could belong to this new bifurcating branch of E-M123, if indeed this tentative placement of E-M4145 is correct.

      However, several things leave me wanting. First of all, we don't know where E-V6 belongs within E-M35's latest phylogeny: does it belong to E-Z827? And if so, is it related to E-Z830 or E-V257, or is it independent? It is important to place E-V6 correctly because it has been found in significant frequencies in Ethiopia before; for example, it was found @ 30% in a sample of 66 Afar in the Plaster thesis that I have linked to in my blog, and like E-M4145, E-V6 is restricted to Ethiopia and hardly found in Egypt (save for some Siwa Berber samples). In addition, the E-Z827 tree that you linked to does not place E-V42, which is another known variant of E-Z830 that has been found not only in Ethiopia (Trombetta '11) but also in the Near East from private FTDNA tests. Granted, current knowledge has it parallel to both M123 and M293 type variants, so this tentative placement of M4145 would be irrelevant to it, but current knowledge may still be incorrect. Finally, there is E-V92, another loner lineage within Ethiopia that has been poorly studied.

      As far as the populations associated with the tree, like the Christian Arabs, I would largely disregard them, especially for African centered lineages. The reason is that most private DNA testing is skewed towards non-African samples; specifically, it has a majority of European samples and does not reflect the true geographic spread of the frequencies of African centered lineages. As an example, E-V32 is for the most part found in Somalis specifically and the Horn generally, but if we go by FTDNA results it is mostly found in the (Arabian) peninsula. There are many other examples.

      Again, thanks for this very interesting bit of information...........

    2. The follow-up email also made the reasonable suggestion that what was found in the Ethiopians above was likely E-M293 and not E-M4145, which was likely the one found in the Egyptians, to which I responded

      ........ I did consider the possibility of 'it' being E-M293 in the Ethiopians, and consequently what was found in the Egyptians of similar color to be the only E-M4145 found @ ~3%, since there is a slight difference in color. But if that is the case, it would be quite a surprise, since the discovery of the E-M293 mutation in 2008 it has never been found in Northern Ethiopians in the past. That SNP is characteristic of areas South of Ethiopia like Kenya, Tanzania, etc.. and to a much lesser extent Southern Ethiopia itself. To my knowledge, in Ethiopia that lineage had only in the past been found in the Wolayta (~8%) (Trombetta '11) and the Konso (~5%) (Hirbo Thesis) both Southern Ethiopian populations, even an N=58 sample of the Borana, who reside in the far southern Ethiopian border with Kenya only had it at ~7% according to the Hirbo thesis that you can find in my blog. And of course, from the publication that first described the E-M293 mutation itself (i.e. Henn 2008) only 2% of the 88 Ethiopian samples had that lineage. So if it indeed is E-M293 at such significant frequency of 15-25% in Ethiopia, that would be something new, or inconsistent with previous data, hence why I discounted it........
