Tuesday, May 21, 2013

Development of Middle Stone Age innovation linked to rapid climate change

The development of modernity in early human populations has been linked to pulsed phases of technological and behavioural innovation within the Middle Stone Age of South Africa. However, the trigger for these intermittent pulses of technological innovation is an enigma. Here we show that, contrary to some previous studies, the occurrence of innovation was tightly linked to abrupt climate change. Major innovational pulses occurred at times when South African climate changed rapidly towards more humid conditions, while northern sub-Saharan Africa experienced widespread droughts, as the Northern Hemisphere entered phases of extreme cooling. These millennial-scale teleconnections resulted from the bipolar seesaw behaviour of the Atlantic Ocean related to changes in the ocean circulation. These conditions led to humid pulses in South Africa and potentially to the creation of favourable environmental conditions. This strongly implies that innovational pulses of early modern human behaviour were climatically influenced and linked to the adoption of refugia.

http://www.nature.com/ncomms/journal/v4/n5/full/ncomms2897.html

From the Press:

Rapid climate change during the Middle Stone Age, between 80,000 and 40,000 years ago, sparked surges in cultural innovation in early modern human populations, according to new research.


Professor Ian Hall, Cardiff University School of Earth and Ocean Sciences, said: "When the timing of these rapidly occurring wet pulses was compared with the archaeological datasets, we found remarkable coincidences.
"The occurrence of several major Middle Stone Age industries fell tightly together with the onset of periods with increased rainfall."
"Similarly, the disappearance of the industries appears to coincide with the transition to drier climatic conditions."
Professor Chris Stringer of London's Natural History Museum commented: "The correspondence between climatic ameliorations and cultural innovations supports the view that population growth fuelled cultural changes, through increased human interactions."
The South African archaeological record is so important because it shows some of the oldest evidence for modern behavior in early humans. This includes the use of symbols, which has been linked to the development of complex language, and personal adornments made of seashells.


Read more at: http://phys.org/news/2013-05-human-culture-linked-rapid-climate.html#jCp

The climate of South Africa was once much wetter than it is today, and those lush times may have spurred human populations through especially innovative periods, new research shows.


Evidence from these ancient periods suggests humans produced new tools, and used symbolism in wall engravings. The findings suggest a tight link between abrupt climate changes and the emergence of modern human traits, researchers say.

"We provide for the first time really good evidence that the occurrence and disappearance of these first finds of human innovation are linked to climate change," said study author Martin Ziegler, an earth science researcher at Cardiff University in Wales.

Before these periods of innovation, humans were quite primitive, with the most impressive technology being hand axes, Ziegler said. But during these wet periods, more advanced stone and bone tools appear in the fossil record, as well as painted symbols on cave walls that suggest the development of language.

Archaeologists have also found some of the first evidence of constructed plant beds during these periods, and shells thought to be worn as adornments or jewelry, Ziegler said. Among the most important periods analyzed in the study date to 71,000, and a period between 64,000 and 59,000 years ago.

Read more at: http://www.livescience.com/34546-climate-stone-age-innovation.html



Wednesday, May 8, 2013

Another Extensive thesis on East African DNA


It was brought to my attention last week, thanks to a comment on this blog made by the user 'Umi', that another thesis on East African DNA variation was publicly available online:

Complex Genetic History of East African Human Populations

This is also an extensive thesis with a wealth of information, akin to Plaster's thesis. The primary differences are that this one focuses more on parts of East Africa to the south of Ethiopia and that, in addition to uni-parental analysis, it also includes some autosomal model-based inference, albeit of quite low resolution by today's standards: 848 microsatellites and 479 indels (refer to Tishkoff et al. 2009 for marker details).

Due to the extensive nature of the report I haven't had a chance to cover its entire scope. Instead, for starters, I have focused on the YDNA data by creating a relative frequency chart from the results reported in Fig. 3.3.2.

Several things to point out initially:

  • The report outlines the discovery of 4 new SNPs, TL1-4. The first two were found in haplogroup B, downstream from B-M150 and B-M112 respectively. The last two, TL3 and TL4, were found in haplogroup E, downstream from E-U174 and E-V32 respectively. Incidentally, the fourth SNP, TL4, which is under E-V32, could potentially be the same as Z808/Z809 as identified recently by the genealogical community. However, as the report does not give the Y-Chromosome location of the SNP in an NCBI Build 36/37 format, this cannot be verified, at least by me, at the moment.
  • A couple of the frequency results in Fig. 3.3.2 do not add up, in particular those for the Boni and the Baggara, but also, to a lesser extent, those for the Kanuri and Teita. I have labeled the missing frequency results with a “?” in the relative charts for those specific populations.
  • The Burji and Konso are labeled as being only from Kenya throughout the report. However, most Burji are from Ethiopia, and the Konso are exclusively found in Ethiopia. I have reflected this in the charts.
  • STR data is not readily available to perform TMRCA estimates on. Some TMRCA results are reported using Zhivotovsky's rates in Table 3.3.1, but these are estimates for the different lineages across all the samples in the dataset and do not necessarily compare TMRCAs between the different populations under study.
  • J-M62, while a subclade of J-M267, is not the main subclade of J-M267 found in East Africa; that would be J-P58. Therefore, the results reported for J-12f2.1 (x M62, M172) may after all be, or largely include, J-P58 lineages. Of course, those results could also include variants of J-M267 other than J-P58 and J-M62, since the SNP was not directly tested.
  • E-P2* lineages are abundantly found (> 30%) in the Konso, Burji and Mbugwe. However, on closer examination and correlation with current data, these could be E-M329, E-V38* or even E-M215*, as none of these SNPs were directly tested. Genuine E-P2* lineages would be positive for E-P2 and negative for V38 and M215 (see Trombetta et al. 2011).
  • Similarly, the E-M35* lineages reported could be members of the relatively newly discovered E-Z830* lineages (see this post for details), or of some of the untested variants of E-M35, i.e. E-V42, E-V92 and maybe even E-V68 (x M78).
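
The "?" entries mentioned in the second point above can be derived mechanically. Here is a minimal Python sketch of that check; the function name, the tolerance and the example numbers are my own placeholders for illustration and are not taken from Fig. 3.3.2.

```python
def missing_share(freqs, total=100.0, tol=0.5):
    """Return the percentage not accounted for by the reported haplogroups.

    freqs: dict mapping haplogroup label -> reported frequency in percent.
    Gaps smaller than `tol` are treated as rounding noise and returned as 0.0.
    A negative result means the reported frequencies exceed `total`.
    """
    gap = total - sum(freqs.values())
    return 0.0 if abs(gap) <= tol else round(gap, 2)


# Hypothetical population, for illustration only.
example = {"E-M78": 40.0, "A-M13": 25.0, "B-M150": 20.0}

gap = missing_share(example)
chart = dict(example)
if gap > 0:
    # Label the unexplained remainder the same way as in the charts.
    chart["?"] = gap
```

A population whose reported frequencies sum to 100% (within the tolerance) gets no "?" slice at all.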


Tuesday, May 7, 2013

Analyzing YDNA A-M13 lineages in Ethiopian linguistic groups

Similar to the previous analysis of J lineages found in Ethiopia from the Plaster paper, the other prevalent lineage in Ethiopia, A-M13 (formerly known also as A3b2), is also analyzed below. A total of 616 A-M13 lineages were reported in the study, of which ~32% were classified as Semitic speakers, ~40% as Cushitic speakers, ~17% as Omotic speakers and the remainder within the Nilo-Saharan speaking macro-phylum.

The prevalence of Haplogroup A lineages in Ethiopia according to the paper ranges from ~20% in Nilo-Saharan speakers to about 5% in Omotic speakers, with an intermediate prevalence in Semitic and Cushitic speakers of 16% and 12% respectively.


Wednesday, May 1, 2013

Analyzing YDNA J lineages in Ethiopian linguistic groups

The extensive YDNA dataset found in the Plaster paper has a total of 691 YDNA lineages that belong to haplogroup J. Although no more detailed SNP resolution is reported for most of these lineages, it is safe to assume, from previous data on Ethiopia, that the vast majority of them belong to J1-M267. A limited set of STR data accompanies these lineages as well, covering only the markers DYS19, 388, 390, 391, 392 and 393.

According to the report, J lineages are found at proportionally higher frequencies in Semitic speakers in Ethiopia (~21%), followed by Omotic speakers at ~12% and Cushitic speakers at ~8%. Out of the 691 YDNA J lineages reported, 259 were from Semitic speakers, 266 were from speakers of some type of Omotic language, and most of the remainder were from Cushitic speakers.

Using the STR data provided, along with linguistic information, below I have estimated the respective TMRCAs using the previously outlined ASD method (and calculator) for the major linguistic groups, in addition to selected populations within those linguistic groups that were found with a high frequency of Haplogroup J.
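
For readers unfamiliar with the ASD approach, here is a minimal Python sketch of the general idea. It is not the code of my Octave calculator: the choice of the per-marker median as the presumed founder, the averaging of per-marker estimates, the 25-year generation time and any mutation rates passed in are all assumptions made for illustration.

```python
from statistics import median


def asd_tmrca(haplotypes, mutation_rates, years_per_generation=25):
    """Estimate a TMRCA in years from the average squared distance (ASD).

    haplotypes: list of dicts mapping marker name -> repeat count.
    mutation_rates: dict mapping marker name -> mutation rate per generation.
    """
    markers = sorted(mutation_rates)
    # Assumption: the founder haplotype is approximated by the per-marker median.
    founder = {m: median(h[m] for h in haplotypes) for m in markers}

    generations = []
    for m in markers:
        # ASD for one marker: mean squared difference from the founder allele.
        asd = sum((h[m] - founder[m]) ** 2 for h in haplotypes) / len(haplotypes)
        # Under a stepwise mutation model, ASD grows as (rate x generations).
        generations.append(asd / mutation_rates[m])

    mean_generations = sum(generations) / len(generations)
    return mean_generations * years_per_generation
```

Calling `asd_tmrca(haplotypes, rates)` with one dict per sample and one rate per marker returns a single age in years; it gives no confidence interval, and the result is only as good as the mutation rates supplied.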




Sunday, April 21, 2013

Source code for the ASD based TMRCA calculator (Octave)

The code for the TMRCA calculator of YDNA STR haplotypes that I use can be downloaded from here : https://dl.dropboxusercontent.com/u/42082352/TMRCA_ASD.zip

See also here for instances of where I have used the calculator in the past:
http://ethiohelix.blogspot.com/2012/06/finding-tmrca-of-ethiopian-ydna.html
http://ethiohelix.blogspot.com/2012/11/extensive-doctoral-thesis-on-ethiopian.html
http://ethiohelix.blogspot.com/2013/01/tmrca-calculations-from-plaster-nry.html
http://ethiohelix.blogspot.com/2013/02/the-zhivotovsky-multiplier.html
http://ethiohelix.blogspot.com/2013/03/african-sahel-ydna.html

The code is written for Octave and is also Matlab compatible. The folder linked above also contains an instruction file that explains how to run the calculator; its contents are reproduced below:
---------------------------------------------------------------------------------------------------------


To check if the TMRCA program is correctly working on your system, first run it with the dataset provided here before trying different datasets. To do so:

(1) Make sure you have Octave loaded on your system (either Windows or Linux will work) and start octave in the command line.
(2) In the command line, change your working directory to the directory where you saved the unzipped  folder by using: cd ~PATH/TMRCA_ASD/
If you are unsure of your current working directory, type the command: pwd()
(3) Type: fcompositeTMRCA("Buckova_EM78","all")
(4) If this produces results, then the program and functions are correctly installed and you can proceed to reading and analysing different datasets.


Reading and analysing new Data

After correctly executing the above steps, read and analyse new data by using the following steps:
(1) Open the example STR data file in the "TMRCA_ASD/Loaded_Data/" folder entitled "EM35_STR.xls".
(2) Any STR data file to be analysed should first be put in the same format as the "EM35_STR.xls" file, specifically:
(a) DYS names in the first row should have the exact same nomenclature (the order can be different, however).
(b) Each row (except the first) should represent one sample.
(c) Each column (except the first) should represent repeats for one marker/DYS#.
(d) The first column should represent sample identifiers, e.g. Kit#, sample ID, ...
(e) The cell in the first row and first column should contain the dataset's name; this will be the name used throughout the analysis.
(f) No cell should be empty, and cells should not contain text with spaces in it.
(3) In Excel or OpenOffice, convert the "EM35_STR.xls" workbook to a ".csv" file by saving it as "YSTR.csv" in the same "TMRCA_ASD/Loaded_Data/" folder. The program will only look for a file entitled "YSTR.csv", so make sure that this name is used for your file.
(4) Start octave and, in the command line, change the working directory to "~PATH/TMRCA_ASD/Loaded_Data/"
(5) Type on the octave prompt: readdata
(6) Octave will start reading the dataset and create the file "EM35-Balanced" in the folder "/TMRCA_ASD/Loaded_Data/" when it is finished.
(7) If you want to analyse a specific set of markers from your dataset go to step 8, otherwise go to step 9.
(8) Go to the file "/TMRCA_ASD/Markerlist/49markerlist.txt" and pick the markers you want to use for the analysis. Then save your chosen markers into a new txt file in the same "/TMRCA_ASD/Markerlist/" folder. Take a look at the file "8_Chiaronimarkerlist.txt" for an example of how the marker list should look.
(9) In octave, change your working directory back up one level by typing: cd ..
(10) If you are specifying a set of markers to use in the analysis, then run the program by typing: fcompositeTMRCA("EM35-Balanced","8_Chiaronimarkerlist.txt"), otherwise, just type: fcompositeTMRCA("EM35-Balanced","all").
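
Before running readdata, it can save time to check a "YSTR.csv" file against formatting rules (b) to (f) above. The following Python sketch is a separate, hypothetical helper and is not part of the TMRCA_ASD package; it only checks that every row has the same number of cells, that no cell is empty and that no cell contains a space. It does not verify the DYS nomenclature of rule (a).

```python
import csv
import io


def check_ystr_csv(text):
    """Return a list of formatting problems found in YSTR.csv-style text.

    An empty list means none of the checked rules were violated.
    """
    rows = list(csv.reader(io.StringIO(text)))
    if len(rows) < 2:
        return ["need a header row plus at least one sample row"]

    problems = []
    width = len(rows[0])  # dataset name + one column per marker
    for line_no, row in enumerate(rows, start=1):
        if len(row) != width:
            problems.append(f"row {line_no}: expected {width} cells, found {len(row)}")
        for col_no, cell in enumerate(row, start=1):
            if cell.strip() == "":
                problems.append(f"row {line_no}, column {col_no}: empty cell")
            elif " " in cell.strip():
                problems.append(f"row {line_no}, column {col_no}: contains a space")
    return problems
```

To use it on a real file, read the file's contents into a string first, for example with `open("YSTR.csv", newline="").read()`, and pass that string to `check_ystr_csv`.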
----------------------------------------------------------------------------------------------------------
Update : Version2 -  *.CSV read, + Auto path detect. (fcompositeTMRCA.m, fmarkerextract.m, readdata.m)
Update(04/25/13) : Version3 - Add option for using all available markers, print used/unused markers. (fcompositeTMRCA.m, fmarkerextract.m, fAssignmutation.m)

Saturday, April 13, 2013

Next-generation sequencing on Egyptian mummies

Nature has a news article out on a paper, reportedly published last week in the Journal of Applied Genetics by Khairat, R. et al., which carried out next-generation sequencing on five Egyptian mummified heads. The paper is not accessible, so here are some excerpts from the news article:

The ancient Egyptians could soon be getting their genomes sequenced as a matter of routine. That’s the view, at least, of the first researchers to use next-generation techniques to analyse DNA from Egyptian mummies.....

....Now, Pusch and his colleagues, including Rabab Khairat, have carried out next-generation sequencing on five Egyptian mummified heads held at the University of Tübingen. The heads date from relatively late in ancient Egyptian history — between 806 bc and 124 ad....

....they show that human DNA survives in the mummies and that it is amenable to sequencing...

....The researchers determined that one of the mummified individuals belongs to an ancestral group, or haplogroup, called I2, believed to have originated in Western Asia. They also retrieved genetic material from the pathogens that cause malaria and toxoplasmosis, and from a range of plants that includes fir and pine — both thought to be components of embalming resins — as well as castor, linseed, olive, almond and lotus....

....In mummies, “DNA preservation appears to be independent of temperature,” he says.....

.....Now that Pusch and his colleagues have demonstrated next-generation sequencing in Egyptian mummies, however, moving on to entire genomes “isn’t rocket science”, Gilbert says. “What limits you is the size of a sample. For Denisova Man they had just a finger bone. Here they have the whole mummy.”....

....“entire-genome sequencing of ancient Egyptian individuals is likely to become standard in the not-too-distant future”.....

http://www.nature.com/news/egyptian-mummies-yield-genetic-secrets-1.12793#/b1

Edit: Found link to the Abstract here: Khairat, R. et al.

We applied, for the first time, next-generation sequencing (NGS) technology on Egyptian mummies. Seven NGS datasets obtained from five randomly selected Third Intermediate to Graeco-Roman Egyptian mummies (806 BC–124AD) and two unearthed pre-contact Bolivian lowland skeletons were generated and characterised. The datasets were contrasted to three recently published NGS datasets obtained from cold-climate regions, i.e. the Saqqaq, the Denisova hominid and the Alpine Iceman. Analysis was done using one million reads of each newly generated or published dataset. Blastn and megablast results were analysed using MEGAN software. Distinct NGS results were replicated by specific and sensitive polymerase chain reaction (PCR) protocols in ancient DNA dedicated laboratories. Here, we provide unambiguous identification of authentic DNA in Egyptian mummies. The NGS datasets showed variable contents of endogenous DNA harboured in tissues. Three of five mummies displayed a human DNA proportion comparable to the human read count of the Saqqaq permafrost-preserved specimen. Furthermore, a metagenomic signature unique to mummies was displayed. By applying a “bacterial fingerprint”, discrimination among mummies and other remains from warm areas outside Egypt was possible. Due to the absence of an adequate environment monitoring, a bacterial bloom was identified when analysing different biopsies from the same mummies taken after a lapse of time of 1.5 years. Plant kingdom representation in all mummy datasets was unique and could be partially associated with their use in embalming materials. Finally, NGS data showed the presence of Plasmodium falciparum and Toxoplasma gondii DNA sequences, indicating malaria and toxoplasmosis in these mummies. We demonstrate that endogenous ancient DNA can be extracted from mummies and serve as a proper template for the NGS technique, thus, opening new pathways of investigation for future genome sequencing of ancient Egyptian individuals.

Thursday, March 28, 2013

Global Contour Map for the Dual ADMIXTURE Components.

Below is a contour map representing the African ADMIXTURE component at K=2 for the Global data set (V2) which  can be downloaded here, and population specific percentages that can be seen here

Contour map generated using Mapviewer7, Kriging method was used for gridding. ADMIXTURE outputs for all New World, Jewish, Singapore-Chinese and Singapore-Indian populations were removed before the generation of the map.

African cline from ADMIXTURE, K=2 . Black dots represent locations of sampled populations


Some things to note:

  • Since this is a K=2 run, the OOA or 'other' component has a distribution that is the complete mirror image of the African component's distribution shown above.
  • The regions where the brown color dominates (20-35% African) are the same regions that are later absorbed by the new component that arises at K=3, which finds its peaks in West Eurasians and has an FST that is intermediate between those of the African and East Asian/Amerindian components.
  • It is notable to observe the congruence of the above with the distribution of global genetic as well as phenotypic diversity (below) [1].


Global phenotypic and genetic Diversity 
[1] The effect of ancient population bottlenecks on human phenotypic variation

Thursday, March 7, 2013

African Sahel YDNA


Multiple and differentiated contributions to the male gene pool of pastoral and farmer populations of the African Sahel


ABSTRACT

The African Sahel is conducive to studies of divergence/admixture genetic events as a result of its population history being so closely related with past climatic changes. Today, it is a place of the co-existence of two differing food-producing subsistence systems, i.e., that of sedentary farmers and nomadic pastoralists, whose populations have likely been formed from several dispersed indigenous hunter-gatherer groups. Using new methodology, we show here that the male gene pool of the extant populations of the African Sahel harbors signatures of multiple and differentiated contributions from different genetic sources. We also show that even if the Fulani pastoralists and their neighboring farmers share high frequencies of four Y chromosome subhaplogroups of E, they have drawn on molecularly differentiated subgroups at different times. These findings, based on combinations of SNP and STR polymorphisms, add to our previous knowledge and highlight the role of differences in the demographic history and displacements of the Sahelian populations as a major factor in the segregation of the Y chromosome lineages in Africa. Interestingly, within the Fulani pastoralist population as a whole, a differentiation of the groups from Niger is characterized by their high presence of R1b-M343 and E1b1b1-M35. Moreover, the R1b-M343 is represented in our dataset exclusively in the Fulani group and our analyses infer a north-to-south African migration route during a recent past.

Closed Access



Y (x CF) Phylogeny, Red = SNPs Tested, Blue = Presumed Tested
CF Phylogeny, Red = SNPs Tested, Blue = Presumed Tested

UPDATE: TMRCA estimates from STR haplotypes of E-M35 (x M78, M81, M123), E-M2, E-M33 and R1b respectively. Farmer and Pastoralist haplotypes were also combined. Markers used for estimates were the following: DYS 19, 388, 389-1, 389-2, 390, 391, 392, 393 and 439.

Monday, March 4, 2013

Geno 2.0 YDNA SNP Pathways.


The Geno 2.0 chip tests some 13,000 SNPs on the Y-Chromosome, by far the largest number among commercial DNA companies. In addition, many of these SNPs do not yet have an assigned place in the YDNA phylogeny, and no official phylogeny has been published either.

However, the customers of this project get the option to transfer the SNPs to FTDNA and thereby join the numerous grouped projects under the FTDNA umbrella, which then displays the results of which SNPs they tested positive for.

Although we don't know where most of these SNPs belong on the YDNA tree, we do know where some of them belong, and by utilizing the most rudimentary operations of set mathematics (union, intersection and set difference), in addition to the positions of the known SNPs in the current YDNA phylogeny tree (ISOGG 2013) it is possible to segregate these SNPs that appear on the project pages into phylogenetic pathways.

This posting will change frequently as more and more kits appear in the FTDNA project pages.

The first thing to realize is that the following 101 SNPs are either erroneous or erroneously reported, and need to be discarded if they appear in any of the results until FTDNA, NATGEO or whoever else is responsible fixes them:

CTS1034+ CTS10436+ CTS10713+ CTS10738+ CTS11085+ CTS11454+ CTS11844+ CTS12173+ CTS2080+ CTS2223+ CTS230+ CTS2447+ CTS295+ CTS3234+ CTS335+ CTS3647+ CTS3763+ CTS3914+ CTS4276+ CTS4623+ CTS4714+ CTS477+ CTS5458+ CTS5580+ CTS6010+ CTS6353+ CTS6384+ CTS6891+ CTS7453+ CTS7492+ CTS7859+ CTS7951+ CTS8133+ CTS8178+ CTS8244+ CTS9096+ CTS947+ CTS9512+ CTS9548+ F1173+ F1221+ F1300+ F1327+ F1369+ F1707+ F1754+ F1831+ F1833+ F1842+ F1870+ F1882+ F2000+ F2137+ F2150+ F2177+ F2223+ F2494+ F2503+ F2546+ F2631+ F2845+ F2887+ F2932+ F3035+ F3039+ F317+ F3187+ F3225+ F3394+ F3397+ F3455+ F375+ F3948+ F3965+ F4131+ F4277+ F830+ F842+ F869+ F889+ F910+ F942+ F943+ F969+ L366+ L477+ L493+ L515+ L516+ L517+ L552+ L594+ M263+ PF4208+ PF4330+ PF5061+ PF6868+ PF7392+ Z148+ Z191+ Z365+  



Notes: Unions will be listed without a symbol, e.g. Set ABC = Set ((A ∪ B) ∪ C)
            Known SNP identification is based on ISOGG 2013 only.
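
To make the set operations concrete, here is a minimal Python sketch of the approach. The two kits are hypothetical and heavily shortened, the discard set contains only two of the SNPs from the list above, and the function names are mine.

```python
# Two entries from the discard list above, used here as a stand-in for all 101.
ERRONEOUS = {"CTS1034", "M263"}


def clean(kit_positives):
    """Set difference: drop SNPs known to be erroneously reported."""
    return set(kit_positives) - ERRONEOUS


def shared_pathway(kits):
    """Intersection: SNPs positive in every kit, i.e. candidates for the
    pathway leading down to the kits' most recent common node."""
    return set.intersection(*(clean(k) for k in kits))


def downstream_only(kit, pathway):
    """Set difference: SNPs in one kit that lie below the shared pathway."""
    return clean(kit) - pathway


# Hypothetical kits from two different haplogroups (E and R), for illustration.
kit_e = {"M168", "M42", "M94", "M96", "CTS1034"}
kit_r = {"M168", "M42", "M94", "M207", "M263"}

root_to_ct = shared_pathway([kit_e, kit_r])
below_ct_in_e = downstream_only(kit_e, root_to_ct)
```

With real project data each kit would hold thousands of positives, and the known SNPs from ISOGG 2013 would be used to decide which node a given intersection corresponds to.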


Pathway from root to CT-M168 (=Set # A)



Binary Operation: Set1 Set2

Number of SNPs: 77


CTS10362+ CTS109+ CTS11358+ CTS11575+ CTS125+ CTS1996+ CTS3331+ CTS3431+ CTS3662+ CTS4364+ CTS4368+ CTS4740+ CTS5318+ CTS5457+ CTS5532+ CTS6383+ CTS6800+ CTS6907+ CTS7922+ CTS7933+ CTS8243+ CTS8980+ CTS9828+ L566+ L781+ M139+ M168+ M294+ M42+ M94+ PF1016+ PF1029+ PF1031+ PF1040+ PF1046+ PF1061+ PF1092+ PF1097+ PF110+ PF1203+ PF1269+ PF1276+ PF15+ PF192+ PF210+ PF212+ PF223+ PF234+ PF258+ PF263+ PF272+ PF278+ PF292+ PF316+ PF325+ PF342+ PF500+ PF601+ PF667+ PF719+ PF720+ PF725+ PF779+ PF796+ PF803+ PF815+ PF821+ PF840+ PF844+ PF892+ PF937+ PF951+ PF954+ PF970+ V189+ V52+ V9+

Identified as same level as BT
Identified as same level as CT-M168
Identified as same level as P <---- Looks unreliable and maybe a false positive report.


Thursday, February 21, 2013

The Zhivotovsky Multiplier


It is reported that Zhivotovsky's effective mutation rate [1] increases the TMRCA of a lineage, as computed from microsatellite genetic distances [2], by a factor of 3-4 relative to TMRCAs computed with mutation rates observed in pedigree and family studies [3].

Using my TMRCA calculating program, I want to explore:
  1. What effect do different marker combinations have on this multiplier?
  2. What effect does marker size have on this multiplier?
  3. Does this multiplier vary between data-sets?
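
As a rough guide to where the multiplier comes from: if the TMRCA is taken as ASD divided by the mutation rate, then for the same ASD the ratio of the two TMRCAs is simply the ratio of the rates. The Python sketch below assumes this simplified pooled-rate form and uses the Zhivotovsky rate of 0.00069 per marker per generation; the pedigree rates in the example are placeholders for illustration, not measured values.

```python
ZHIVOTOVSKY_RATE = 0.00069  # effective rate per marker per generation


def tmrca_multiplier(pedigree_rates):
    """Ratio of the Zhivotovsky-rate TMRCA to the pedigree-rate TMRCA.

    Assumes TMRCA = ASD / rate with a single pooled (mean) pedigree rate,
    so the ASD cancels and only the rates remain.
    pedigree_rates: dict mapping marker name -> pedigree mutation rate.
    """
    mean_pedigree = sum(pedigree_rates.values()) / len(pedigree_rates)
    return mean_pedigree / ZHIVOTOVSKY_RATE


# Placeholder pedigree rates, for illustration only.
example_rates = {"DYS19": 0.0023, "DYS390": 0.0031, "DYS391": 0.0026}
multiplier = tmrca_multiplier(example_rates)
```

Under this simplification the multiplier depends only on which markers are in the set, which is why question 1 above is worth testing against real data rather than assuming a fixed factor.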

First, to ensure that my program correctly calculates the TMRCA when the Zhivotovsky mutation rate of 0.00069 is applied to all the markers in my database consistently (versus only the marker specific Pedigree mutation rates I have thus far been utilizing), I attempted to replicate the TMRCA computations of the following publication;


Friday, February 15, 2013

Gradient Maps for African ADMIXTURE components

Below are gradient maps for my last African ADMIXTURE run, Africa_V2b, courtesy of a demo download of Mapviewer7. The Kriging method was used for gridding and 'Grid Z limits' mode was used for color mapping.

Sampled Population's Index

Sampled Population's Location

PCA for the FST distances generated by ADMIXTURE

West-Africa Cluster Freq.

Nilo-Saharan Cluster Freq.

East-Africa-2 Cluster Freq.

North-Africa Cluster Freq.

Khoi-San Cluster Freq.

Omotic Cluster Freq.

Mbuti-Pygmy Cluster Freq.

Biaka-Pygmy Cluster Freq.

Hadza Cluster Freq.

East-Africa-1 Cluster Freq.

Isometric view of the MDS plot for all Populations sampled


UPDATE (02/18/2013): Below are gradient maps for the first African ADMIXTURE run, Africa_V1, courtesy of a demo download of Mapviewer7. The same options as above were used both for gridding and color mapping.

Friday, February 8, 2013

Sudan YDNA

This is from a relatively old study, but it seems that it is the most comprehensive YDNA breakdown we have of North and South Sudan to date.

Y-chromosome variation among Sudanese: restricted gene flow, concordance with language, geography, and history. Hassan (2008)

Here is a map of the populations tested from Fig.1 of the Study
Populations Studied

Below is the phylogeny (as known back in 2008) of the SNPs tested. Note that those in bold (E-M75, E-P2, G-M201 and T-M70) were NOT tested in the study.

SNPs tested (except those in bold)
The E-M78+ cases from above were also tested for Cruciani's V-Series SNPs for further resolution:


Cruciani's V-Series SNPs (2007)

Some notes:


  • The high level (38%) of E-M215 (x M78) in the Borgu is quite intriguing. I wonder what variant/s of E-M215 it is?
  • Almost all the J-12f2 (x M172) should be J-M267.
  • B-M60 is found in Southern Nilo-Saharan speakers and not the North Western ones, while A-M13 is found in both.
  • The F-M89 (x M52, M170, 12f2, M9) found in the north is also interesting, although at least part of it could possibly be G-M201.
  • E-V22 has a relatively high presence in these samples, even when compared to the Egyptian samples from Cruciani '07, and most certainly higher than its presence in Ethiopia.
  • The high presence of E-V12 (x V32) is also concordant with its putative area of origin. All the E-M78 found in the Nuer and the Copts is of this variety.
  • The presence of E-M78* in the Masalit and the Nuba is notable.
  • Of course, the strangest result is the 54% R-M173 (x P25) in the Fulani. This could be some R1b* (R-M343) or some type of R1a. The latter would be very out of place for the region, while the former could be reconciled with the presence of more downstream R1b variants in Africa.


Monday, February 4, 2013

A speculative superimposition of E-M35 variants onto Afroasiatic.

Here is a speculative superimposition of the variants of YDNA E-M215/M35 (E1b1b/1) onto an Afroasiatic internal classification, Lionel Bender's (1997) classification. 


The red question marks represent a less sure fit.

Monday, January 7, 2013

East African mtDNA variation has implications on the origin of Afroasiatic

Dienekes' Anthropology Blog highlights a new paper on East African mtDNA with implications for the origin of Afroasiatic, quoting in particular: "making the hypothesis of a Levantine origin of AA unlikely". Unfortunately I do not have access to the paper; if anyone does, I would greatly appreciate a copy sent to: ethiohelix@gmail.com.

Here is the abstract and the link:


Abstract

East Africa (EA) has witnessed pivotal steps in the history of human evolution. Due to its high environmental and cultural variability, and to the long-term human presence there, the genetic structure of modern EA populations is one of the most complicated puzzles in human diversity worldwide. Similarly, the widespread Afro-Asiatic (AA) linguistic phylum reaches its highest levels of internal differentiation in EA. To disentangle this complex ethno-linguistic pattern, we studied mtDNA variability in 1,671 individuals (452 of which were newly typed) from 30 EA populations and compared our data with those from 40 populations (2970 individuals) from Central and Northern Africa and the Levant, affiliated to the AA phylum. The genetic structure of the studied populations—explored using spatial Principal Component Analysis and Model-based clustering—turned out to be composed of four clusters, each with different geographic distribution and/or linguistic affiliation, and signaling different population events in the history of the region. One cluster is widespread in Ethiopia, where it is associated with different AA-speaking populations, and shows shared ancestry with Semitic-speaking groups from Yemen and Egypt and AA-Chadic-speaking groups from Central Africa. Two clusters included populations from Southern Ethiopia, Kenya and Tanzania. Despite high and recent gene-flow (Bantu, Nilo-Saharan pastoralists), one of them is associated with a more ancient AA-Cushitic stratum. Most North-African and Levantine populations (AA-Berber, AA-Semitic) were grouped in a fourth and more differentiated cluster. We therefore conclude that EA genetic variability, although heavily influenced by migration processes, conserves traces of more ancient strata. Am J Phys Anthropol, 2013. © 2013 Wiley Periodicals, Inc.

mtDNA variation in East Africa unravels the history of afro-asiatic groups

UPDATE: OK, got it. This was a nice little article to read. However, with respect to the implications of East African mtDNA variation for the origin of Afroasiatic, it did not offer anything substantially new, in terms of material evidence, that a reasonable person who has read up on this subject a little would not have known beforehand, namely:


Concerning the third point, i.e., the place of origin of AA (EA or the Levant), our results do not allow us to make conclusive statements. Indeed, coalescent simulations of different genetic parameters (Supporting Information Fig. 4) according to the two mentioned hypotheses show that—even assuming complete correlation between languages and mtDNA variability—their confidence intervals largely overlap. Thus, we limit ourselves to the following observations. First, EA shows the highest levels of nucleotide diversity among the studied populations with a decreasing cline towards NA and the Levant (Supporting Information Fig. 1 and Supporting Information Table 1). This is true not only for the Ethiopian cluster A, but also, and especially, for groups belonging to clusters B1 and B2. Second, EA hosts the two deepest clades of AA, Omotic and Cushitic. These families are found exclusively in EA, while the presence of Semitic in this area is much more recent. Third, cluster C – collecting Berber- and Semitic-speaking populations from NA and the Levant – shows only modest signals of admixture with clusters A and B (Fig. 2, Supporting Information Table 1). None of these points, taken by itself, is conclusive, but undoubtedly the hypothesis of origin of AA in EA is the most parsimonious one, if compared to the Levant.

It did also have some very nicely made contour maps for EA, as well as detailed mtDNA haplogroup assignments for some 30 or so East African groups, which I will make an interactive chart for within the next couple of days.

UPDATE2 (01/08/2013): mtDNA haplogroups (46) in 31 groups.

A note on the sources for the samples listed above:


The Dinka Samples are from Krings et al. (1999)
The Sudan and Ethiopia Samples are from Soares et al. (2011)
The Tigrai, Amhara, Gurage, Oromo and Yemeni1 Samples are from Kivisild et al. (2004)
The Beta Israel Samples are from Behar et al. (2008)
The Ethiopian Jewish Samples are from Non et al. (2011)
The Somali Samples are from Soares et al. (2011) and Watson et al. (1997)
The Daasanach and Nyangatom Samples are from Poloni et al. (2009)
The Turkana2 Samples are from Poloni et al. (2009) and Watson et al. (1997)
The Nairobi Samples are from Brandstatter et al. (2004)
The Kikuyu Samples are from Watson et al. (1997)
The Hutu Samples are from Castrì et al. (2009)
The Iraqw Samples are from Knight et al. (2003)
The Burunge and Turu Samples are from Tishkoff et al. (2007)
The Datoga and Sukuma Samples are from Tishkoff et al. (2007) and Knight et al. (2003)

All the remaining samples (Dawro Konta, Ongota, Hamer, Rendille, Elmolo, Luo, Maasai, Samburu and Turkana) are new and were sampled for this study.

Saturday, January 5, 2013

TMRCA calculations from Plaster NRY data : Correcting an Error


Previously, I had computed TMRCAs for the YDNA STR data from the additional material that was provided along with Dr. Chris Plaster's thesis. However, after a brief communication with the author, I found out that the marker order of the STRs in the Excel file was reported wrongly. The correct order of the markers is as follows:

DYS19 DYS388 DYS389I DYS389II DYS390 DYS391 DYS392 DYS393 DYS437 DYS438 DYS439 DYS448 DYS456 DYS635 Y GATA H4

This changes my TMRCA calculations because I do not compute the coalescent with a single generic mutation rate shared by all the markers; instead, each marker has its own mutation rate attributed to it.
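To illustrate why the column order matters, here is a minimal sketch. It is not the program used for the results in this post. It assumes a simple average-squared-distance (ASD) style estimate, in which each marker's mean squared distance from a founder haplotype is divided by that marker's own mutation rate. The function name, the three-marker toy haplotypes and the mutation rates are all made up for illustration.

```python
def tmrca_generations(haplotypes, founder, rates):
    """Average over markers of (mean squared distance to founder) / (marker rate).

    Assumed ASD-style estimator, used here only for illustration.
    """
    n_markers = len(founder)
    total = 0.0
    for i in range(n_markers):
        # Mean squared repeat difference from the founder allele at marker i
        asd = sum((h[i] - founder[i]) ** 2 for h in haplotypes) / len(haplotypes)
        total += asd / rates[i]
    return total / n_markers


# Toy data: three markers with made-up per-generation mutation rates
rates = [0.001, 0.002, 0.004]
founder = [14, 12, 30]
haplotypes = [[14, 12, 30], [15, 12, 31], [14, 13, 29], [14, 12, 32]]

correct = tmrca_generations(haplotypes, founder, rates)

# Same repeat counts, but each rate is now paired with the wrong column,
# which is what a misreported marker order does
misordered_rates = [0.004, 0.001, 0.002]
wrong = tmrca_generations(haplotypes, founder, misordered_rates)

print(round(correct, 1), round(wrong, 1))
```

With a single shared rate the column order would make no difference to this estimate; with per-marker rates, a fast-mutating marker paired with a slow rate (or the reverse) shifts the result.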

When I rerun my program using the newly corrected order above I get the following:


As can be seen, using the new order of markers generally reduces the number of generations to coalescent for the Plaster data-set. The previous observation of a relatively lower TMRCA for the haplozone data of E-M123 versus that of the E-M34 Plaster data-set largely disappears. 

To check whether the high number of samples (129) in the E-M123 Haplozone data-set was skewing the results, I took 23 random samples (the same number of samples available in the Plaster E-M34 data-set) from the larger E-M123 Haplozone data-set and re-ran the TMRCA calculations on just those samples. I repeated this process 300 times. Only 28% of the runs yielded a mean TMRCA less than that of the E-M34 Plaster data-set; if sample size were skewing the results, I would expect >50% of the runs to have a mean TMRCA less than that of the E-M34 Plaster data-set.
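The resampling check described above can be sketched roughly as follows. This is an assumed reconstruction, not the actual program: `fraction_below` is a hypothetical helper, the `estimator` argument stands in for the TMRCA calculation, and the 129 values are synthetic placeholders rather than Haplozone haplotypes.

```python
import random
from statistics import mean


def fraction_below(large_set, reference_value, subsample_size, runs, estimator, seed=0):
    """Share of random subsamples whose estimate falls below reference_value."""
    rng = random.Random(seed)
    below = 0
    for _ in range(runs):
        # Draw without replacement, matching the size of the smaller data-set
        subsample = rng.sample(large_set, subsample_size)
        if estimator(subsample) < reference_value:
            below += 1
    return below / runs


# Synthetic stand-in for the 129 samples of the larger data-set
data_rng = random.Random(42)
large_set = [data_rng.gauss(100, 15) for _ in range(129)]

# 300 runs of 23 samples each, compared against a placeholder reference value;
# in the real check the estimator would be the TMRCA calculation itself
share = fraction_below(large_set, reference_value=100.0,
                       subsample_size=23, runs=300, estimator=mean)
print(f"{share:.0%} of runs fell below the reference")
```

Fixing the seed keeps the check repeatable from run to run.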

That said, the E-M34 Plaster data-set still had a relatively higher number of generations to coalescent than the E-M84 Haplozone data-set. E-M84 is a subclade of E-M34, and a large majority of haplotypes that belong to E-M34 also test positive for the E-M84 SNP (at least for the non-African E-M34 haplotypes that we know of).

Other than that, the new, and corrected, ordering of the markers did not have much impact in relative TMRCA terms between the Plaster and Haplozone/FTDNA data for the other lineages I had tested.

Tuesday, December 18, 2012

Ramesses III belonged to YDNA haplogroup E1b1a


According to a study published yesterday, Revisiting the harem conspiracy and death of Ramesses III: anthropological, forensic, radiological, and genetic study, Y-STR data place his YDNA haplogroup in E1b1a using Whit Athey's haplogroup predictor:

"Genetic kinship analyses revealed identical haplotypes in both mummies (table 1); using the Whit Athey’s haplogroup predictor, we determined the Y chromosomal haplogroup E1b1a. "


His DYS repeats are listed as follows:

DYS 19 19
DYS 385a,b 20
DYS 389I 13
DYS 389II 33
DYS 390 21
DYS 391 8
DYS 392 17
DYS 393 8
DYS 437 14
DYS 438 10
DYS 448 20
DYS 456 13
YGATAH4 13


Plugging these numbers into Whit Athey's predictor does indeed indicate that his haplogroup is E1b1a, with 99.1% probability using equal priors. The decisive marker for judging between E1b1a and E1b1b is DYS 390: with DYS 390 excluded, the predictor gives his haplotype an 83.7% probability of E1b1b and a 15.8% probability of E1b1a. However, it is well known that DYS 390 = 21 is a high-probability signature for West/Central/Southern Africa, i.e. where E1b1a dominates (see below).
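To show how a single marker can swing such a prediction, here is a toy sketch of a Bayesian allele-frequency calculation with equal priors. It is not Whit Athey's predictor or its data. It assumes each haplogroup's score is the prior multiplied by the frequency of each observed allele in that haplogroup. The haplogroup labels, marker labels and frequencies are all invented.

```python
def posterior(haplotype, freqs, priors=None, skip=(), floor=1e-6):
    """Posterior probability per haplogroup from per-marker allele frequencies."""
    groups = list(freqs)
    if priors is None:
        # Equal priors: every haplogroup starts out equally likely
        priors = {g: 1.0 / len(groups) for g in groups}
    scores = {}
    for g in groups:
        p = priors[g]
        for marker, allele in haplotype.items():
            if marker in skip:
                continue  # leave this marker out of the calculation
            # Alleles never seen in a haplogroup get a small floor frequency
            p *= freqs[g][marker].get(allele, floor)
        scores[g] = p
    total = sum(scores.values())
    return {g: s / total for g, s in scores.items()}


# Invented frequencies: marker "mB" = 21 is common in hg1 and rare in hg2
freqs = {
    "hg1": {"mA": {10: 0.5, 11: 0.5}, "mB": {21: 0.9, 24: 0.1}},
    "hg2": {"mA": {10: 0.6, 11: 0.4}, "mB": {21: 0.01, 24: 0.99}},
}
haplotype = {"mA": 10, "mB": 21}

full = posterior(haplotype, freqs)
without_mb = posterior(haplotype, freqs, skip=("mB",))
print(full, without_mb)
```

In this toy case the full haplotype favours hg1 heavily, while dropping the one decisive marker tips the balance slightly towards hg2, the same kind of flip described above for DYS 390.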


UPDATE: Upon receiving an e-mail stating that the haplotype could still belong to E1b1b, on the basis of a haplotype from Chad in the FTDNA database that has DYS 390 = 21, I looked into it further. The presence of such a haplotype would not necessarily refute what the authors of this study are claiming, because if one enters the repeats for those Chad E1b1b haplotypes (but only for the same DYS#'s that are included in this study) into the predictor, i.e.:


DYS 19 13
DYS 385a,b 15
DYS 389I 12
DYS 389II 29
DYS 390 21
DYS 391 9
DYS 392 11
DYS 393 13
DYS 437 14
DYS 438 10
DYS 448 21
DYS 456 15
YGATAH4 11




one would still get an assignment to haplogroup E1b1b with 88.6% probability and only a 5.6% probability that the haplotype belongs to E1b1a. Therefore, I highly doubt that this pharaoh's haplotype, if extracted correctly and without contamination, would be anything but E1b1a. Absent an SNP test, however, one can never be 100% sure.