Ethio Helix ኢትዮ:ሒሊክስ: TMRCA calculations from Plaster NRY data : Correcting an Error

Saturday, January 5, 2013

TMRCA calculations from Plaster NRY data : Correcting an Error

Previously, I had computed TMRCAs for the Y-DNA STR data from the supplementary material provided with Dr. Chris Plaster's thesis. However, after a brief communication with the author, I found out that the marker order of the STRs in the Excel file was reported incorrectly. The correct order of the markers is as follows:

DYS19 DYS388 DYS389I DYS389II DYS390 DYS391 DYS392 DYS393 DYS437 DYS438 DYS439 DYS448 DYS456 DYS635 Y GATA H4

This changes my TMRCA calculations because I do not compute the coalescent with a single generic mutation rate shared by all the markers. Instead, each marker has its own mutation rate attributed to it, so a rate paired with the wrong marker changes the estimate.
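To illustrate why the marker order matters, here is a minimal Python sketch. It is not the program used for this post: it assumes a simple ASD-style estimator (mean squared distance from the modal haplotype, divided by the marker's rate, averaged over markers), and the haplotypes and mutation rates are made-up values for three of the markers, chosen only to show the effect.

```python
from collections import Counter
from statistics import mean

# Hypothetical repeat counts for three markers (DYS19, DYS388, DYS389I).
# These are illustrative values, not data from the Plaster thesis.
HAPLOTYPES = [
    (14, 12, 13),
    (14, 12, 13),
    (15, 12, 13),
    (14, 12, 14),
    (13, 12, 13),
]

# Hypothetical per-generation mutation rates, one per marker (assumed values).
RATES_IN_CORRECT_ORDER = [0.004, 0.0005, 0.002]
# The same rates with the first two swapped, as if the columns were mislabeled.
RATES_IN_WRONG_ORDER = [0.0005, 0.004, 0.002]


def modal_haplotype(haplotypes):
    """Return the most common repeat count at each marker position."""
    return [Counter(column).most_common(1)[0][0] for column in zip(*haplotypes)]


def generations_to_tmrca(haplotypes, rates):
    """ASD-style estimate: per-marker squared distance to the modal
    haplotype divided by that marker's rate, averaged over markers."""
    modal = modal_haplotype(haplotypes)
    per_marker = []
    for j, rate in enumerate(rates):
        squared_distance = mean((h[j] - modal[j]) ** 2 for h in haplotypes)
        per_marker.append(squared_distance / rate)
    return mean(per_marker)


if __name__ == "__main__":
    print("correct order:", generations_to_tmrca(HAPLOTYPES, RATES_IN_CORRECT_ORDER))
    print("wrong order:  ", generations_to_tmrca(HAPLOTYPES, RATES_IN_WRONG_ORDER))
```

With a single shared rate the two calls would return the same value. With per-marker rates, the variable marker in this toy set gets divided by a much slower rate when the columns are mislabeled, which inflates the estimate.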

When I rerun my program using the newly corrected order above I get the following:

As can be seen, using the new order of markers generally reduces the number of generations to coalescent for the Plaster data-set. The previous observation of a relatively lower TMRCA for the haplozone data of E-M123 versus that of the E-M34 Plaster data-set largely disappears.

To check whether the large number of samples (129) in the E-M123 Haplozone data-set was skewing the results, I took 23 random samples (the same number of samples available in the Plaster E-M34 data-set) from the larger E-M123 Haplozone data-set and re-ran the TMRCA calculations on just those samples. I repeated this process 300 times. Only 28% of the runs yielded a mean TMRCA lower than that of the E-M34 Plaster data-set; if sample size were skewing the results, I would expect more than 50% of the runs to have a mean TMRCA lower than that of the E-M34 Plaster data-set.
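The resampling check described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the original program: `toy_estimator`, `make_toy_pool`, the two mutation rates, and the reference value of 150 generations are all hypothetical stand-ins, and any TMRCA function that takes a list of haplotypes could be passed in instead.

```python
import random
from statistics import mean


def fraction_of_runs_below(pool, subsample_size, reference, estimator,
                           runs=300, seed=0):
    """Draw `runs` random subsamples (without replacement) from `pool` and
    return the fraction whose estimate falls below `reference`."""
    rng = random.Random(seed)
    below = 0
    for _ in range(runs):
        subsample = rng.sample(pool, subsample_size)
        if estimator(subsample) < reference:
            below += 1
    return below / runs


def toy_estimator(haplotypes, rates=(0.002, 0.003)):
    """Stand-in TMRCA estimator with assumed rates: squared distance to the
    rounded per-marker mean, divided by the rate, averaged over markers."""
    estimates = []
    for j, rate in enumerate(rates):
        center = round(mean(h[j] for h in haplotypes))
        estimates.append(mean((h[j] - center) ** 2 for h in haplotypes) / rate)
    return mean(estimates)


def make_toy_pool(size=129, seed=1):
    """Synthetic two-marker haplotypes standing in for the larger data-set."""
    rng = random.Random(seed)
    return [
        (14 + rng.choice([-1, 0, 0, 0, 1]), 12 + rng.choice([-1, 0, 0, 1]))
        for _ in range(size)
    ]


if __name__ == "__main__":
    pool = make_toy_pool()
    # 150.0 is a hypothetical reference estimate for the smaller data-set.
    share = fraction_of_runs_below(pool, 23, 150.0, toy_estimator)
    print(f"{share:.0%} of runs fell below the reference")
```

Fixing the seed makes the 300 draws repeatable, so the reported percentage can be reproduced exactly on a later run.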

That said, the E-M34 Plaster data-set still had a relatively higher number of generations to coalescent than the E-M84 Haplozone data-set. E-M84 is a subclade of E-M34, and a large majority of haplotypes that belong to E-M34 also test positive for the E-M84 SNP (at least for the non-African E-M34 haplotypes that we know of).

Other than that, the new, and corrected, ordering of the markers did not have much impact in relative TMRCA terms between the Plaster and Haplozone/FTDNA data for the other lineages I had tested.

9 comments:

Maju, January 5, 2013 at 10:05 PM
E1b1b-M35 is just ~5000 years old? I think that is totally off the mark by A LOT. We are talking of a widespread African and Mediterranean haplogroup with many variants that simply cannot have spread since such a recent date as the Chalcolithic (a convenient Iberian chronology reference) or Bronze Age (an also convenient Aegean chronology reference).

It'd mean that a descendant of it, E1b1b1b1a-M81, had a massive founder effect in NW Africa and then expanded to all Western Iberia and even localities as far as Britain long after that date (Bronze Age? Iron? Roman era?). It'd mean that another descendant, E1b1b1a1b-V13, reached Greece and Albania, where it became very common, almost dominant, and spread through most of Europe (at low frequencies) also already in the Iron Age or so. It just doesn't seem plausible from what I know of prehistory, much less with the extremely low levels of African (even if just North African) autosomal genetics associated with them in Europe, which seem to demand particular "filters" (repeated admixture) before their European scatter.

Being constructive, I'd say that the most recent possible calibration point would be, for arrival in Europe, in the Neolithic, when there was still room to make clear founder effects. In this sense, it must be remembered that E1b1b1a1b-V13 has been detected in Neolithic Catalans of c. 7000 years ago, along with G2a, and this must have been older in Greece and Albania (c. 9-10,000 years ago).

So if your estimated '3000 years ago' is actually 10,000, then you should multiply all figures by three or four, if you want to get real. By "you" I mean Etyopis and Plaster alike. That produces a rough corrected estimate, everything else being equal, for E1b1b of at least 15-20,000 years ago.

Similarly for J1-M267, if you get something like 12,000 years, the real thing would be rather 48,000 years, which is coherent with the way I imagine the J1 scatter in Africa and West Asia (based on correlations with material prehistory, i.e. structured archaeology).

Sorry, but a big reality check is needed in these matters, and with E1b-V13 we have aDNA references that simply do not allow for the short chronologies you are toying with.

Maju, January 8, 2013 at 10:44 PM
Zhivotovsky uses a correcting parameter (a constant) to account for demographic alterations. The pedigree rate, assuming it's correctly measured, would only apply if the novel clades (as we measure them with SRY markers - not the best possible measure) expanded infinitely fast. In reality some do, some go extinct, some remain stable, and sometimes they contract, etc. In other words: not all of Chandler's observed father-son rates survive, much less in the long term. So Dr. Z. figured out a constant correction (an estimate in itself) to compensate for all that, which produces, as you say, figures roughly 3x longer (which is reasonably correct in many cases, maybe still a bit too short, when compared with the archaeological calibration references).

The problem with the pedigree rate is that it is self-referential, and then also now X measures this and then Y measures that and claims that the actual mutation rate is twice as slow as Chandler's estimates... even Dienekes admitted to that in the recent past and angrily kicked that crazy Russian guy he used to follow (what was his name?) off his blog.

However he recently returned to those rates in an entry about Native Americans, who according to the author of the paper would follow the obsolete model of "Clovis first", based on their estimates.

There are deep problems with STR-based age estimates (probably caused by the "cannibal mum" issue where large pops. become "conservative" that I have outlined in other cases re. mtDNA - notice that this would not happen if the whole Y-chromosome were sequenced instead) and the Z. correction is just a decent but weak attempt (made in 2004) to establish a better approximation and that way save the possibilities of the molecular clock.

Probably the most realistic correction rate to be applied is not linear like Z's constant but exponential or semi-exponential, so the differences with the pedigree rate increase with time - but still there would be many exceptions, as populations are not homogeneous and a small population of a few dozen in Siberia will not behave the same as a large one of tens of thousands, maybe more, in the Tropics. The latter will be much more diverse but also more "conservative" overall, making it almost impossible for any new clade (measured in mtDNA mutations or Y-DNA STR markers - but not in Y-DNA actual SNP mutations, because the Y-chromosome is long enough for at least one mutation to happen and be added each generation no matter what, defeating the "cannibal dad" hypothesis) to consolidate. Instead, in small pops. it all depends on founder effect randomness.

astenb, February 3, 2013 at 2:43 PM
How much work would you have to go through to create a regional analysis of E1b1a lineages?