Ethio Helix ኢትዮ:ሒሊክስ: A ‘‘Copernican’’ Reassessment of the Human Mitochondrial DNA Tree from its Root

Thursday, April 5, 2012

A ‘‘Copernican’’ Reassessment of the Human Mitochondrial DNA Tree from its Root

Mutational events along the human mtDNA phylogeny are traditionally identified relative to the revised Cambridge Reference Sequence, a contemporary European sequence published in 1981. This historical choice is a continuous source of inconsistencies, misinterpretations, and errors in medical, forensic, and population genetic studies. Here, after having refined the human mtDNA phylogeny to an unprecedented level by adding information from 8,216 modern mitogenomes, we propose switching the reference to a Reconstructed Sapiens Reference Sequence, which was identified by considering all available mitogenomes from Homo neanderthalensis. This ‘‘Copernican’’ reassessment of the human mtDNA tree from its deepest root should resolve previous problems and will have a substantial practical and educational influence on the scientific and public perception of human evolution by clarifying the core principles of common ancestry for extant descendants.

Source (Open Access)

Some quotes and figures from the paper:

"Supported by a consensus of many colleagues and after a few years of hesitation, we have reached the conclusion that on the verge of the deep-sequencing revolution (47,55) when perhaps tens of thousands of additional complete mtDNA sequences are expected to be generated over the next few years, the principal change we suggest cannot be postponed any longer: an ancestral rather than a ‘‘phylogenetically peripheral’’ and modern mitogenome from Europe should serve as the epicenter of the human mtDNA reference system."

"Interestingly, the ranges of substitution counts within haplogroups M and N, which are hallmarks of the relatively recent out-of-Africa exodus of humans, are also very large. For example, within M there are two mitogenomes with 43 substitutions (in M30a and M44) and two mitogenomes with as many as 71 substitutions (in M2b1b and M7b3a). This is especially striking because the path from the RSRS to the root of M already contains 39 substitutions. Hence, the difference between the M root and its M44 descendant is only four substitutions (two in the coding region and two in the control region) as compared to 32 substitutions in the M2b1b and M7b3a mitogenomes. These observations raise the possibility that the tree in general, and haplogroup M in particular, might not adhere uniformly to the assumed molecular clock, under which substitutions occur at a fixed rate on all branches of the tree over time."
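The arithmetic in the second quote can be checked directly. The sketch below uses only the counts quoted above (39 substitutions from the RSRS to the M root, and 43 or 71 from the RSRS to the four named mitogenomes). The variable and function names are made up for illustration and do not come from the paper.

```python
# Substitution counts quoted from the paper: RSRS to the M root, and RSRS to each tip.
RSRS_TO_M_ROOT = 39
RSRS_TO_TIP = {"M30a": 43, "M44": 43, "M2b1b": 71, "M7b3a": 71}


def substitutions_since_m_root(total_from_rsrs, root_offset=RSRS_TO_M_ROOT):
    """Substitutions accumulated on the path from the M root down to a tip."""
    return total_from_rsrs - root_offset


within_m = {name: substitutions_since_m_root(n) for name, n in RSRS_TO_TIP.items()}
ratio = max(within_m.values()) / min(within_m.values())

for name, count in within_m.items():
    print(f"{name}: {count} substitutions since the M root")
print(f"longest / shortest path within M: {ratio:.1f}")
```

Under a strict clock, contemporary tips would be expected to sit at roughly similar distances from the M root, so a spread of 4 versus 32 substitutions is what prompts the authors' caution.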

Some inferred dates of interest (from the supplemental file):
L3 : 67,262 (SD 4,434)
  M : 49,590 (SD 1,824)
    M1'20'51 : 47,641 (SD 2,851)
      M1 : 23,680 (SD 4,378)
        M1a : 19,183 (SD 3,226)
          M1a1 : 12,910 (SD 3,341)
  N : 58,860 (SD 2,352)
    N1'5 : 56,547 (SD 4,705)
      N1 : 51,643 (SD 5,640)
        N1a : 18,118 (SD 5,247)

Others:
R0a1 : 20,766 (SD 5,754)
U6a1 : 20,133 (SD 4,941)
J1 : 26,935 (SD 5,273)
T : 25,149 (SD 4,668)
K : 26,682 (SD 4,339)

22 comments:

Maju, April 6, 2012 at 1:47 AM
One of the issues with molecular-clock-o-logy is that, while new mutations are invariably present each generation in the nuclear DNA (including Y-DNA), in the much shorter mtDNA mutations only appear once every many generations. As a rule of thumb, I estimated 1-10 Ka, averaging maybe 2.5 Ka, for each coding-region mutation, while the control-region ones are very messy and hence best ignored.

This fact creates a peculiar reality for any mtDNA molecular-clock-o-logy. For example (case A), in a tiny population with an effective size of two (Ne=2), each time a mutation happens (very rarely, but now and then no doubt) the chances of fixation, and hence of long-term survival, are high (50%).

However (case B), as the population grows, the chances of any mutation becoming fixed decrease very fast: with Ne=10 they are only 10%, with Ne=100 only 1%, etc. And while mutations happen more often, the chance of there ever being more than one novel mutation per generation is not really larger: innovation will happen more often but, unless the population grows A LOT (well above Paleolithic levels), the chances of two mutations per generation remain effectively zero. That means that in large (by Paleolithic standards) populations the evolution of novel mutations becomes effectively frozen: they will happen, yes, but they will also be "drifted out" by the fixed dominant clade(s).

I call this the "cannibal mum" hypothesis because, metaphorically, mum eats the daughters systematically, keeping the mtDNA evolution effectively frozen in that branch.

Now, you are surely better than I am with maths: do you think that this non-numerical explanation makes sense? The difference in length between mtDNA lines can be very extreme and has so far been unexplained. I think this can be an explanation, but I lack the mathematical "demonstration" so far.
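The fixation chances in cases A and B can be checked with a small simulation. This is only a sketch under assumptions that are not stated in the comment: a neutral mutation, a constant-size haploid Wright-Fisher population of Ne maternal lineages, and non-overlapping generations. The function name `fixation_fraction` and the trial count are invented for illustration.

```python
import random


def fixation_fraction(n_e, trials, rng):
    """Fraction of new single-copy neutral mutations that reach fixation."""
    fixed = 0
    for _ in range(trials):
        copies = 1  # the mutation starts in one lineage out of n_e
        while 0 < copies < n_e:
            p = copies / n_e
            # Each of the n_e daughters picks her mother at random:
            # the next count is a Binomial(n_e, p) draw.
            copies = sum(rng.random() < p for _ in range(n_e))
        if copies == n_e:
            fixed += 1
    return fixed / trials


rng = random.Random(42)
for n_e in (2, 10, 100):
    simulated = fixation_fraction(n_e, 4000, rng)
    print(f"Ne={n_e}: simulated {simulated:.3f}, expected 1/Ne = {1 / n_e:.3f}")
```

Under these assumptions the fixation probability of a single new copy is 1/Ne, which matches the 50%, 10% and 1% figures above. The same textbook model also says that new mutations enter the population at about Ne × μ per generation (μ being the per-lineage mutation rate), so the long-run substitution rate is Ne × μ × 1/Ne = μ, independent of Ne. Whether real Paleolithic populations were close enough to these assumptions is a separate question.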

Maju, April 6, 2012 at 8:15 PM
I can agree with Soares 2009 in principle, but I would count from the root (and not from the present), because not all extant haplotypes are 'tips' (the H(n) cases, for instance), nor do they all have the same distance, and some may have been stopped in their evolution by the "cannibal mum" process.

(I'm not sure why an ML tree would be needed anyhow: the true tree is provided by the phylogeny itself, right?)

But my real "problem" is to "prove" the "cannibal mum" hypothesis statistically correct with maths. Even if it is correct, it'd be interesting to know which ranges of Ne favor that behavior and which allow for a faster effective mutation rate, both in the low range (new mutations get fixed) and in the high range (new mutations survive drift for a long time, although they remain minor).

Software is not that important here: what I need is a maths-oriented mind. But never mind, I'll ask a friend who just loves maths and physics (although his interest in genetics and prehistory is very low). He'll be glad to help me solve these doubts, I'm quite sure.

Andrew Oh-Willeke, April 6, 2012 at 8:15 PM
Hurray for a nomenclature that makes the observation that "in general, and haplogroup M in particular, might not adhere uniformly to the assumed molecular clock, under which substitutions occur at a fixed rate on all branches of the tree over time" natural and obvious instead of obscure.

Andrew Oh-Willeke, April 6, 2012 at 8:22 PM
A "cannibal mum" hypothesis of the kind that Maju suggests would explain a discrepancy between mutation rates observed in genealogically related groups of people (which we don't really have, because mutations are too rare) and mutation rates estimated from longer-term evolution compared to known benchmarks. But unless some lineages are more cannibalistic than others (possibly true in the Holocene, but implausible before then), this mechanism should have a fairly uniform effect: it should only shift the relationship between the true mutation rate and the adjusted one that we calibrate the mutation rates against, and by a fairly constant amount.

Etyopis, April 7, 2012 at 12:06 AM
"I'm not sure why an ML tree would be needed anyhow"
You need that to validate/invalidate the molecular clock, by comparing ML inferences of branch length with and without the clock's assumption?
Aren't you saying your theory invalidates the molecular clock at the end of the day?

"Software is not that important here:"
Why reinvent the wheel? Your theory, if I understood correctly, is basically saying that mutation rates are not linear and that they may be a function of effective population size. This may be reasonable, but if there is software already out there that is able to tweak some of these existing variables, why not use it? The other problem is that you have to come up with an assumption for Ne in the first place, as you don't really know what Ne was when the mutation occurred. This could become circular if you are using the mutation information to infer Ne, unless you have an independent source for Ne inference, I'm guessing....
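The with/without-clock comparison of ML trees described above is usually done as a likelihood ratio test. The sketch below assumes you already have two maximized log-likelihoods for the same alignment and tree topology from some ML program. The numbers used here are invented placeholders, not results from the paper, and `chi2_sf` and `clock_lrt` are hypothetical helper names. It relies on the standard result that, for n sequences, the clock model has n - 2 fewer free branch-length parameters than the unconstrained model.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    y = x / 2.0
    # Start from a closed form: Q(1, y) = exp(-y) or Q(1/2, y) = erfc(sqrt(y)).
    if df % 2 == 0:
        a, q = 1.0, math.exp(-y)
    else:
        a, q = 0.5, math.erfc(math.sqrt(y))
    # Recurrence: Q(a + 1, y) = Q(a, y) + y**a * exp(-y) / Gamma(a + 1).
    while a < df / 2.0:
        q += math.exp(a * math.log(y) - y - math.lgamma(a + 1.0))
        a += 1.0
    return q


def clock_lrt(lnl_clock, lnl_free, n_taxa):
    """Likelihood ratio test of the strict clock against free branch lengths."""
    statistic = max(0.0, 2.0 * (lnl_free - lnl_clock))
    df = n_taxa - 2
    return statistic, df, chi2_sf(statistic, df)


# Placeholder values for illustration only (hypothetical, not from any real run).
statistic, df, p_value = clock_lrt(lnl_clock=-12345.6, lnl_free=-12320.1, n_taxa=20)
print(f"2*delta lnL = {statistic:.2f}, df = {df}, p = {p_value:.2g}")
```

A small p-value would suggest that the clock-constrained tree fits significantly worse, that is, that the data reject a single rate across all branches. The chi-square approximation is itself an assumption and may be rough for small data sets.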

manel, May 9, 2012 at 3:58 AM
Hello, I don't understand how the mtDNA community uses the software that Mr. Behar has put on their site in order to define the haplogroups... Who has used this software?
And what do you think about defining haplogroups just from HVS1?
Thank you