Wednesday, May 1, 2013

Analyzing YDNA J lineages in Ethiopian linguistic groups

The extensive YDNA dataset in the Plaster paper contains a total of 691 YDNA lineages that belong to haplogroup J. Although no more detailed SNP resolution is reported for most of these lineages, it is safe to assume, from previous data on Ethiopia, that the vast majority of them belong to J1-M267. A limited set of STR data accompanies these lineages as well, covering only the markers DYS19, DYS388, DYS390, DYS391, DYS392 and DYS393.

According to the report, J lineages are found at the highest proportion among Semitic speakers in Ethiopia (~21%), followed by Omotic speakers (~12%) and Cushitic speakers (~8%). Of the 691 YDNA J lineages reported, 259 came from Semitic speakers, 266 from speakers of some type of Omotic language, and most of the remainder from speakers of Cushitic languages.
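
The two sets of numbers above answer different questions: the percentages are the frequency of J within each linguistic group, while the counts describe how the 691 J lineages themselves are split. A minimal sketch of the second calculation follows, using only the counts quoted above. The remainder is treated as an upper bound for Cushitic speakers, since only "most" of it is Cushitic.

```python
# Counts quoted in the post; everything else is derived from them.
J_TOTAL = 691
j_counts = {"Semitic": 259, "Omotic": 266}

# Lineages not accounted for by Semitic or Omotic speakers.
# Most, but not all, of these are Cushitic speakers, so this is an upper bound.
remainder = J_TOTAL - sum(j_counts.values())
j_counts["Remainder (mostly Cushitic)"] = remainder

# Share of the whole J sample contributed by each group, in percent.
shares = {group: 100 * n / J_TOTAL for group, n in j_counts.items()}

for group, pct in shares.items():
    print(f"{group}: {j_counts[group]} lineages, {pct:.1f}% of all J lineages")
```

So although J is most frequent among Semitic speakers, Omotic speakers contribute slightly more J lineages to the dataset in absolute terms.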

Using the STR data provided, along with the linguistic information, I have estimated below the respective TMRCAs with the previously outlined ASD method (and calculator) for the major linguistic groups, as well as for selected populations within those linguistic groups that were found with a high frequency of haplogroup J.

  • Generally, the Semitic-speaking groups harbour the oldest J lineages, followed by the Omotic- and then the Cushitic-speaking groups
  • It is very rare for TMRCA estimates based on the Zhivotovsky rate to be similar to, or even lower than, those based on the pedigree/familial rates, as can be seen above (especially for Chandler); this could, however, be due to the small number of markers used
  • Within the specific groups tested, the Omotic-speaking Shekecho appear to have the oldest J lineages, followed by the Semitic-speaking Gurage and the Cushitic-speaking Kembata, while the Yem and Afar seem to harbour the youngest J lineages
  • Note that different samples of Agews are found in both the Semitic and Cushitic groups. Those labeled Agew_Cush were classified as Cushitic speakers and self-identified as Agew, whereas those labeled Agew_Sem were classified as Semitic speakers and also self-identified as Agew
  • Similarly, those labeled 'Amhara' in the Semitic group are only those who identified as Amhara, not everyone who spoke Amharic as a first language. Almost all (239/259) of the samples classified as Semitic speakers spoke Amharic as their first language, but not all of them identified as Amhara; others identified as Gurage, Tigray or Agew (and also with other identities traditionally held among non-Semitic speakers)
  • Details of the analysis can be seen here
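
For readers who have not seen the earlier post, here is a rough sketch of how an ASD (average squared distance) estimate can be computed from six-marker haplotypes. It is not the calculator used for this analysis. It assumes the modal haplotype stands in for the founder, applies a single average mutation rate across all markers, and uses a 25-year generation. The haplotypes in the example are made up for illustration, and the 0.002 "pedigree-style" rate is a placeholder rather than Chandler's published per-marker values. The 0.00069 figure is the commonly quoted Zhivotovsky evolutionary rate per locus per 25-year generation.

```python
from collections import Counter

MARKERS = ("DYS19", "DYS388", "DYS390", "DYS391", "DYS392", "DYS393")


def modal_haplotype(haplotypes):
    """Most common repeat count at each marker (assumed founder haplotype)."""
    return tuple(Counter(column).most_common(1)[0][0] for column in zip(*haplotypes))


def average_squared_distance(haplotypes, founder):
    """Mean squared repeat difference from the founder, over samples and markers."""
    total = sum(
        (allele - ancestral) ** 2
        for haplotype in haplotypes
        for allele, ancestral in zip(haplotype, founder)
    )
    return total / (len(haplotypes) * len(founder))


def tmrca_years(haplotypes, rate_per_generation, years_per_generation=25):
    """ASD divided by the average per-marker mutation rate, converted to years."""
    founder = modal_haplotype(haplotypes)
    generations = average_squared_distance(haplotypes, founder) / rate_per_generation
    return generations * years_per_generation


# Hypothetical haplotypes (repeat counts in MARKERS order), NOT data from the paper.
example = [
    (14, 16, 23, 11, 11, 12),
    (14, 16, 23, 10, 11, 12),
    (14, 17, 23, 11, 11, 12),
    (15, 16, 24, 11, 11, 12),
]

ZHIVOTOVSKY_RATE = 0.00069    # evolutionary rate per marker per generation
PLACEHOLDER_PEDIGREE = 0.002  # illustrative pedigree-style rate (assumption)

print("Evolutionary-rate TMRCA (years):", round(tmrca_years(example, ZHIVOTOVSKY_RATE)))
print("Pedigree-style TMRCA (years):", round(tmrca_years(example, PLACEHOLDER_PEDIGREE)))
```

With one shared rate for every marker, the slower evolutionary rate always gives the older estimate. Per-marker pedigree rates weighted over only six markers can behave differently, which is one way the unusual pattern noted in the list above could arise.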


  1. Check out this new study by Hirbo, if you haven't done so already. It's mainly focused on small Kenyan and Tanzanian ethnic groups, but also has some interesting theories on the Afro-Asiatic language family:

    1. Page 86 of the pdf for the new Y-DNA samples, page 103 for the new mtDNA samples.

    2. Thanks for the heads up, that looks like it has a wealth of information. Appendices 1 & 2 (pp. 176-183) list all the new samples typed for Y-DNA and mtDNA in the dissertation. I am currently processing the haplogroup A data from the Plaster thesis, and I will make a post about this paper once I am done with that.