Thursday, March 7, 2013

African Sahel YDNA


Multiple and differentiated contributions to the male gene pool of pastoral and farmer populations of the African Sahel


ABSTRACT

The African Sahel is conducive to studies of divergence/admixture genetic events as a result of its population history being so closely related with past climatic changes. Today, it is a place of the co-existence of two differing food-producing subsistence systems, i.e., that of sedentary farmers and nomadic pastoralists, whose populations have likely been formed from several dispersed indigenous hunter-gatherer groups. Using new methodology, we show here that the male gene pool of the extant populations of the African Sahel harbors signatures of multiple and differentiated contributions from different genetic sources. We also show that even if the Fulani pastoralists and their neighboring farmers share high frequencies of four Y chromosome subhaplogroups of E, they have drawn on molecularly differentiated subgroups at different times. These findings, based on combinations of SNP and STR polymorphisms, add to our previous knowledge and highlight the role of differences in the demographic history and displacements of the Sahelian populations as a major factor in the segregation of the Y chromosome lineages in Africa. Interestingly, within the Fulani pastoralist population as a whole, a differentiation of the groups from Niger is characterized by their high presence of R1b-M343 and E1b1b1-M35. Moreover, the R1b-M343 is represented in our dataset exclusively in the Fulani group and our analyses infer a north-to-south African migration route during a recent past.

Closed Access



Y(xCF) Phylogeny, Red = SNPs Tested, Blue = Presumed Tested
CF Phylogeny, Red = SNPs Tested, Blue = Presumed Tested



UPDATE: TMRCA estimates from STR haplotypes of E-M35 (x M78, M81, M123), E-M2, E-M33 and R1b. Farmer and pastoralist haplotypes were combined. The markers used for the estimates were DYS19, DYS388, DYS389-1, DYS389-2, DYS390, DYS391, DYS392, DYS393 and DYS439.
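
For readers unfamiliar with the approach, here is a minimal sketch of an ASD (average squared distance) estimate of the kind mentioned in the comments. It is not the calculator that produced the output below. The toy haplotypes, the uniform rate of 0.002 mutations per marker per generation, and the function names are all assumptions for illustration; the modal haplotype is used as a stand-in for the founder, and a simple stepwise mutation model is assumed.

```python
from collections import Counter
from statistics import mean

# Marker order used in this post.
MARKERS = ["DYS19", "DYS388", "DYS389-1", "DYS389-2", "DYS390",
           "DYS391", "DYS392", "DYS393", "DYS439"]


def modal_haplotype(haplotypes):
    """Return the most common repeat count at each marker."""
    return [Counter(column).most_common(1)[0][0] for column in zip(*haplotypes)]


def asd_generations(haplotypes, rates, founder=None):
    """Estimate generations to the MRCA as mean ASD / mean mutation rate.

    ASD is the squared difference between each haplotype and the founder,
    averaged over haplotypes and then over markers. If no founder is given,
    the modal haplotype is used as a proxy (an assumption).
    """
    if founder is None:
        founder = modal_haplotype(haplotypes)
    per_marker_asd = [
        mean((h[i] - founder[i]) ** 2 for h in haplotypes)
        for i in range(len(founder))
    ]
    return mean(per_marker_asd) / mean(rates)


# Hypothetical toy data: four 9-marker haplotypes, three of them one step
# away from the first at a single marker. These are NOT the Buckova data.
TOY_HAPLOTYPES = [
    [13, 12, 12, 18, 22, 9, 11, 13, 13],
    [13, 12, 12, 18, 22, 10, 11, 13, 13],
    [13, 12, 12, 18, 23, 9, 11, 13, 13],
    [13, 12, 12, 18, 22, 9, 11, 13, 12],
]
# Hypothetical uniform rate, mutations per marker per generation.
TOY_RATES = [0.002] * len(MARKERS)

generations = asd_generations(TOY_HAPLOTYPES, TOY_RATES)
print(f"Estimated generations to MRCA: {generations:.1f}")
```

Swapping in a different rate set changes the result proportionally, which is why the same haplotypes give very different generation counts under the pedigree rates and the Zhivotovsky rates in the output below.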




Dataset:Buckova_EM35*
Marker list:9_Buckovamarkerlist
Sample size:41

Pedigree/Familial Rates Summary
Years/Generation:28 - 33
TMRCA Range:1294 - 2191
Mean TMRCA:1824
Median TMRCA:1854
SD:301

Coalescent_Detail =

{
  [1,1] = Chandler;9 Markers;  Generations(Median)--66.412 Generations(Modal)--66.412
  [1,2] = Stafford;9 Markers;  Generations(Median)--60.508 Generations(Modal)--60.508
  [1,3] = Burgarella_Navascues;9 Markers;  Generations(Median)--66.08 Generations(Modal)--66.08
  [1,4] = Ballantyne;9 Markers;  Generations(Median)--46.216 Generations(Modal)--46.216
  [1,5] = Zhivotovsky;9 Markers;  Generations(Median)--188.52 Generations(Modal)--188.52
}

Dataset:Buckova_EM78
Marker list:9_Buckovamarkerlist
Sample size:22

Pedigree/Familial Rates Summary
Years/Generation:28 - 33
TMRCA Range:20666 - 51129
Mean TMRCA:35091
Median TMRCA:34492
SD:8588

Coalescent_Detail =

{
  [1,1] = Chandler;9 Markers;  Generations(Median)--1238.6 Generations(Modal)--1549.4
  [1,2] = Stafford;9 Markers;  Generations(Median)--1039.6 Generations(Modal)--1376.3
  [1,3] = Burgarella_Navascues;9 Markers;  Generations(Median)--1009 Generations(Modal)--1348.9
  [1,4] = Ballantyne;9 Markers;  Generations(Median)--738.07 Generations(Modal)--904.69
  [1,5] = Zhivotovsky;9 Markers;  Generations(Median)--1573.7 Generations(Modal)--1807.9
}
See Comments below.

Dataset:Buckova_EM2
Marker list:9_Buckovamarkerlist
Sample size:180

Pedigree/Familial Rates Summary
Years/Generation:28 - 33
TMRCA Range:10004 - 19439
Mean TMRCA:14511
Median TMRCA:14437
SD:2662

Coalescent_Detail =

{
  [1,1] = Chandler;9 Markers;  Generations(Median)--561.59 Generations(Modal)--589.07
  [1,2] = Stafford;9 Markers;  Generations(Median)--482.51 Generations(Modal)--504.53
  [1,3] = Burgarella_Navascues;9 Markers;  Generations(Median)--446.88 Generations(Modal)--481.86
  [1,4] = Ballantyne;9 Markers;  Generations(Median)--357.31 Generations(Modal)--382.41
  [1,5] = Zhivotovsky;9 Markers;  Generations(Median)--1001.1 Generations(Modal)--1140.6
}


Dataset:Buckova_EM33
Marker list:9_Buckovamarkerlist
Sample size:60

Pedigree/Familial Rates Summary
Years/Generation:28 - 33
TMRCA Range:11684 - 26171
Mean TMRCA:18035
Median TMRCA:17756
SD:3873

Coalescent_Detail =

{
  [1,1] = Chandler;9 Markers;  Generations(Median)--678.13 Generations(Modal)--793.08
  [1,2] = Stafford;9 Markers;  Generations(Median)--539.26 Generations(Modal)--632.76
  [1,3] = Burgarella_Navascues;9 Markers;  Generations(Median)--532.99 Generations(Modal)--652.5
  [1,4] = Ballantyne;9 Markers;  Generations(Median)--417.31 Generations(Modal)--484.69
  [1,5] = Zhivotovsky;9 Markers;  Generations(Median)--1095 Generations(Modal)--1476.1
}

Dataset:Buckova_R1b
Marker list:9_Buckovamarkerlist
Sample size:15

Pedigree/Familial Rates Summary
Years/Generation:28 - 33
TMRCA Range:8024 - 12844
Mean TMRCA:10650
Median TMRCA:10711
SD:1540

Coalescent_Detail =

{
  [1,1] = Chandler;9 Markers;  Generations(Median)--389.23 Generations(Modal)--389.23
  [1,2] = Stafford;9 Markers;  Generations(Median)--375.88 Generations(Modal)--375.88
  [1,3] = Burgarella_Navascues;9 Markers;  Generations(Median)--345.04 Generations(Modal)--345.04
  [1,4] = Ballantyne;9 Markers;  Generations(Median)--286.6 Generations(Modal)--286.6
  [1,5] = Zhivotovsky;9 Markers;  Generations(Median)--901.77 Generations(Modal)--901.77
}
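
The "Pedigree/Familial Rates Summary" figures above appear to follow from the Coalescent_Detail values. The sketch below is a reconstruction inferred from the numbers, not the calculator's actual code: it takes the median and modal generation counts of the four pedigree rate sets (Zhivotovsky excluded), multiplies each by 28 and by 33 years per generation, and summarizes the resulting 16 values. The function name and data layout are assumptions. For E-M35* and E-M2 this reproduces the reported range, mean, median and SD to within about a year.

```python
import statistics

# (median, modal) generations per pedigree rate set, copied from the
# Coalescent_Detail blocks above. Zhivotovsky is left out on purpose.
PEDIGREE_GENERATIONS = {
    "Buckova_EM35*": {
        "Chandler": (66.412, 66.412),
        "Stafford": (60.508, 60.508),
        "Burgarella_Navascues": (66.08, 66.08),
        "Ballantyne": (46.216, 46.216),
    },
    "Buckova_EM2": {
        "Chandler": (561.59, 589.07),
        "Stafford": (482.51, 504.53),
        "Burgarella_Navascues": (446.88, 481.86),
        "Ballantyne": (357.31, 382.41),
    },
}
YEARS_PER_GENERATION = (28, 33)


def summarize(dataset):
    """Summarize TMRCA in years over all rate sets, estimators and generation lengths."""
    years = [
        generations * years_per_generation
        for estimates in PEDIGREE_GENERATIONS[dataset].values()
        for generations in estimates
        for years_per_generation in YEARS_PER_GENERATION
    ]
    return {
        "range": (min(years), max(years)),
        "mean": statistics.mean(years),
        "median": statistics.median(years),
        "sd": statistics.stdev(years),  # sample standard deviation
    }


for name in PEDIGREE_GENERATIONS:
    s = summarize(name)
    low, high = s["range"]
    print(f"{name}: range {low:.0f} - {high:.0f}, mean {s['mean']:.0f}, "
          f"median {s['median']:.0f}, SD {s['sd']:.0f}")
```

The printed values are rounded, whereas the reported ones look truncated, so the last digit can differ by one.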

18 comments:

  1. No idea if this R1b is R1b1c-V88 or rather R1b1a-L320, right? I ask because I imagine V88 incoming from Sudan with the Chadic migrations but this Niger Fulani is a bit perplexing and I recall reading some years ago of some paleoanthropological evidence of Late UP or Epipaleolithic "Caucasoid" remains in the Teneré, later replaced (Neolithic?) by a more typical African-looking population.

    Just wondering because, if related to this episode (the abstract claims that "our analyses infer a north-to-south African migration route during a recent past"), it could also be L320, which has some presence in North Africa. Otherwise the authors' claim may be hollow.

    1. No idea if this R1b is R1b1c-V88 or rather R1b1a-L320, right?

      Right. The information you see is what I could gather solely from the supplemental material, as the responsible parties have decided to close access to this article. I am therefore not sure exactly which binary polymorphisms of R, P or F were tested: unlike the polymorphisms tested for E, they are not listed in the supplemental material. Maybe they are specified in the main article itself.

      From my TMRCA estimate, however, the R1b looks relatively old, quite close to the age of the E-M2 and E-M33 estimates: about 3/5 to 3/4 of their age, in fact.

  2. Slight correction/elaboration: the 41 E-M35 haplotypes I tested were only the E-M35 (x M81, M78, M123) haplotypes; the estimate did not include the 22 E-M78 haplotypes or the 2 E-M123 haplotypes.

  3. Will you include the rest in a future update?

    1. The only ones I was missing were the 22 E-M78 haplotypes, which I have just appended. So 318 of the 342 haplotypes are accounted for in the 5 clades I did the time estimates on. The remaining 24 are scattered through various haplogroups (AB, E*, P*, R*, etc.), so there are not enough haplotypes per clade to do the ASD.

      However, I have no idea why the E-M78 haplotypes give such a strange result: they look like the oldest of all the ones reported, even when considering pedigree rates. The reverse holds for the E-M35(x...) haplotypes, which look like the youngest of all reported. The E-M78 haplotypes were present exclusively in the farmers, while the other E-M35 haplotypes were for the most part present only in the pastoralists. Without further SNP sub-typing of the E-M78 clades, I can't say much about why they appear so diverse in the farmers.

  4. Hmm... I thought R1b was supposed to be the "true Chadic" lineage?? But now it seems like E-M78 might be a better candidate, and R1b might represent a pre-Chadic migration. But who knows what the future might bring...

    1. The E-M78 looks strangely, extremely old for some reason or another; see above.

      For anybody who has access to this paper: are there any variance results or microsatellite network diagrams reported?

    2. So basically oldest to youngest from what I can gather, E-M78 > E-M33 > E-M2 > R1b > E-M35*

    3. Well, that's interesting. At first I thought that some of the E-M78 might have 'spilled over' from Darfur into East/Central Chadic-speaking areas, which aren't too far away from those locations. However, none of the E-M78 seem to be E-V32 (DYS19-11), which is rather peculiar. Too bad Y-chromosome studies are still largely so low-res.

    4. It could be anything: E-V22, V12*, V65, etc. There is no way to be sure without testing the V-series on the E-M78+ cases. However, I think the STR haplotypes reported for E-M78 are erroneous, as their modal should have been predicted as E1b1b by the Whit Athey predictor with a reasonable probability, as was the case for the other haplotypes. So I would not rule out E-V32 on the basis of the STR haplotypes they are reporting for E-M78.

    5. Essentially all E-V32 haplotypes have DYS19-11. It's one of those easy to identify clades based on the commonly tested STRs. See http://www.haplozone.net/e3b/project/cluster/9

      Therefore, one can assume with a high probability that these new samples don't belong to that clade. This I find rather surprising given its reported ubiquity in Darfur and considering the many historic trade networks in Sahelian Central Africa. I guess we'll have to wait for more in-depth studies to solve this mystery.

    6. “Essentially all E-V32 haplotypes have DYS19-11.”

      That may be true for the haplotypes in haplozone, as well as Cruciani '07, but out of the 18 E-V32 haplotypes in the Plaster dataset, none had DYS19 = 11. So again, I will need to see an SNP test before I can categorically rule out E-V32. However, that is not the broader issue, which is that something may be wrong with the reporting of the E-M78 haplotypes in this study altogether, judging by the low probability with which the Whit Athey tool predicts them into the correct haplogroup (E1b1b). If that is the case, they cannot be used either for further subclade prediction downstream of E-M78 or for TMRCA computations.

    7. Actually almost all (30/31) of the Plaster ones are DYS19-11 as well. But yeah I agree with you on the oddity of the TMRCA dates. Something fishy is going on.

    8. Apologies on the Plaster data, I was mistakenly looking at the unique haplotypes reported for the E-M35 (x M78,M123,V6) dataset instead of the E-V32 ones.

      With respect to these Buckova haplotypes and verification with the Whit Athey tool: the tool can be configured to run in batch mode. If one does that for these haplotypes using a North West Euro prior, one gets a >90% probable prediction for 87.8% of the 41 E-M35* haplotypes, 60% of the 180 E-M2 haplotypes, 67% of the 15 R1b haplotypes and 18% of the 22 E-M78 haplotypes. However, the fitness scores, which basically measure closeness to the modal haplotypes in the reference database, were dismally low for all haplotypes (max = 23). Nevertheless, the Bayesian probabilities are definitely off for the E-M78 haplotypes.

      The fact that there are only 9 markers to work with does not make predictions any easier either.

  5. Here is one possible scenario:
    The E-M78 lineages could be ancestral, like the few found in Nilo-Saharans in Sudan.
    The few * links in Sudan would then NOT be recent admixture from the North, as previously thought.
    Being ancestral or V12, they could represent an ancient Saharan presence pushed into the Sahel, and the Egyptian/Sudanese lineages would then represent the terminal migration of V12*.

    Just as the M2 lineages in the Sahel seem to predate those in West/Central Africa, indicating an East-to-West migration, possibly it's the opposite with V12.

    1. The problem is that the numbers I am getting for E-M78 are 'off the charts'. For the markers I am using, a set of ancestral E-M78 haplotypes would not give a mean pedigree-rate TMRCA of 35 KYA, as shown here, but something more in the range of 6-7 KYA, or, for the Zhivotovsky rates, in the range of 600-650 generations.

      The TMRCA I am getting here for E-M78 would fit more a group of haplotypes that belong to a node higher than or equal to E-M96 in the YDNA phylogeny.

  6. OK, there seem to be some definite problems with the reporting of the repeats for the 22 E-M78 haplotypes. If you enter their modal value, 15 12 13 16 23 11 11 13 12, into Whit Athey's predictor using equal priors, you get 16.4% E1b1a, 7% E1b1b, 25% H and so forth, which is basically nonsense.

    Now if you do the same for the 41 E-M35* haplotypes, which have a modal of 13 12 12 18 22 9 11 13 13, you get 95.3% E1b1b.

    And if you do the same for the 180 E-M2 haplotypes, which have a modal of 15 12 13 18 21 10 11 13 11, you get 92.8% E1b1a.

    For the 15 R1b haplotypes, which have a modal of 15 12 14 16 24 10 13 13 12, the predictor strangely returns 85.3% haplogroup T with equal priors, but if you change the priors to North Western Europe you get 95.6% R1b.

    Therefore something is definitely wrong with the E-M78 haplotypes, which is why my calculator is giving strange TMRCA results for them.

  7. http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3382471/pdf/pnas.201110442.pdf

    Data from the Sahel you may or may not have seen.
    I have tried a few different predictors and gotten mixed results.
    Perhaps with your experience in dealing with STRs it could be of some value.
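
Several figures quoted in the comments above can be cross-checked against the calculator output in the post: the 318 of 342 haplotypes covered, the oldest-to-youngest ordering, and the "3/5 to 3/4" age of R1b relative to E-M33 and E-M2. The short sketch below does that using the reported sample sizes and mean pedigree-rate TMRCAs; the variable names are illustrative only.

```python
# Sample sizes and mean pedigree-rate TMRCAs (years) as reported in the post.
SAMPLE_SIZE = {"E-M35*": 41, "E-M78": 22, "E-M2": 180, "E-M33": 60, "R1b": 15}
MEAN_TMRCA_YEARS = {"E-M35*": 1824, "E-M78": 35091, "E-M2": 14511,
                    "E-M33": 18035, "R1b": 10650}
TOTAL_HAPLOTYPES = 342  # total quoted in the comments

covered = sum(SAMPLE_SIZE.values())
remainder = TOTAL_HAPLOTYPES - covered

# Clades sorted from oldest to youngest mean TMRCA.
oldest_to_youngest = sorted(MEAN_TMRCA_YEARS, key=MEAN_TMRCA_YEARS.get, reverse=True)

# Age of R1b relative to E-M33 and E-M2.
r1b_vs_em33 = MEAN_TMRCA_YEARS["R1b"] / MEAN_TMRCA_YEARS["E-M33"]
r1b_vs_em2 = MEAN_TMRCA_YEARS["R1b"] / MEAN_TMRCA_YEARS["E-M2"]

print(f"covered {covered}/{TOTAL_HAPLOTYPES}, remainder {remainder}")
print(" > ".join(oldest_to_youngest))
print(f"R1b / E-M33 = {r1b_vs_em33:.2f}, R1b / E-M2 = {r1b_vs_em2:.2f}")
```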
