Ethio Helix ኢትዮ:ሒሊክስ: Cross Validating and K Selection

Saturday, March 31, 2012

Cross Validating and K Selection

There are two ways of choosing a K value for a dataset on which one wishes to run ADMIXTURE. One is to throw a dart at a random set of numbers and hope for the best. The other is to run ADMIXTURE at different values of K while computing a cross-validation (CV) error for each K using the --cv flag. I did this with the studentized global dataset that I discussed earlier in this post. The CV error values for K=1-14 for that dataset can be seen in the graphs below.

Close-up:

While the CV error values start flattening out at about K=10, the curve does not inflect until K=13. The K at which the CV error bottoms out is the one to prefer, meaning K=13 is the appropriate choice for this dataset.

Cross-validation can take a considerably long time to run, as each K has to be fitted and its error evaluated separately, unless of course one has access to a very fast machine.

As a reference, the Bash shell code to run cross-validation in ADMIXTURE for up to K=14 is:

# -j2 uses two threads; --cv=14 requests 14-fold cross-validation (ADMIXTURE's default is 5-fold)
for K in 1 2 3 4 5 6 7 8 9 10 11 12 13 14; do
    ./admixture32 -j2 --cv=14 "filename.bed" $K | tee log${K}.out
done

where CV error values will be recorded in the .out files for each K.
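
To compare the K values, the CV errors can be pulled back out of those logs. What follows is only a minimal sketch: it assumes each log contains a line of the form "CV error (K=3): 0.51234", which is what ADMIXTURE prints when --cv is used, and that the logs are named log<K>.out as in the loop above. The function names cv_table and best_k are made up for illustration.

```shell
# Sketch only: cv_table and best_k are hypothetical helper names.
# Assumes log lines of the form "CV error (K=3): 0.51234" in files named log<K>.out.

# Print "K<TAB>error" for every log*.out in a directory (default: current directory).
cv_table() {
    local dir="${1:-.}"
    grep -h '^CV error' "$dir"/log*.out \
        | awk -F'[=):]' '{ gsub(/[[:space:]]/, "", $4); print $2 "\t" $4 }' \
        | sort -n
}

# Print the K whose CV error is smallest.
best_k() {
    cv_table "$1" \
        | awk -F'\t' 'NR == 1 || $2 + 0 < min { min = $2 + 0; k = $1 } END { print k }'
}
```

Running cv_table in the directory that holds the logs gives a two-column listing that can be plotted directly, and best_k prints the K with the lowest error. The lowest value alone does not show how flat the curve is around it, so it is still worth looking at the plot.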

Peaking populations for each cluster for K=2-13

K=2

Cluster1: pygmy,mbutipygmy,sotho/tswana,biakapygmy,fang

Cluster2: chinese-americans,tujia,miao,hezhen,han

East Asians and Africans split. West Asians and Europeans come out as 1/3 African and 2/3 East Asian; the reverse is seen with Ethiopians, who come out as 2/3 African and 1/3 East Asian.

K=3

Cluster1: sardinian,basque,tuscans,italian,spaniards

Cluster2: pygmy,mbutipygmy,sotho/tswana,biakapygmy,bantusouthafrica

Cluster3: she,chinese-americans,han,singapore-chinese,chinese

West Asians split off.

K=4

Cluster1: sardinian,basque,tuscans,italian,cypriots

Cluster2: pygmy,mbutipygmy,sotho/tswana,biakapygmy,bantusouthafrica

Cluster3: colombian,karitiana,surui,pima,totonac

Cluster4: she,han,singapore-chinese,chinese,miao

Native Americans split off.

K=5

Cluster1: she,han,chinese-americans,chinese,singapore-chinese

Cluster2: surui,karitiana,colombian,pima,totonac

Cluster3: sardinian,basque,spaniards,italian,tuscans

Cluster4: pygmy,mbutipygmy,biakapygmy,bantusouthafrica,sotho/tswana

Cluster5: papuan,irula,tn-dalit,ap-mala,malayan

Oceanians and South Asians split off together.

K=6

Cluster1: papuan,melanesian,tongan,samoan,paniya

Cluster2: pygmy,mbutipygmy,biakapygmy,bantusouthafrica,sotho/tswana

Cluster3: karitiana,colombian,surui,pima,totonac

Cluster4: she,han,chinese-americans,singapore-chinese,chinese

Cluster5: sardinian,basque,spaniards,italian,tuscans

Cluster6: irula,tn-dalit,ap-madiga,ap-mala,north-kannadi

Oceanians and South Asians split off from each other.

K=7

Cluster1: sardinian,basque,spaniards,italian,tuscans

Cluster2: dogon,yoruba,bambaran,hausa,igbo

Cluster3: irula,tn-dalit,ap-mala,ap-madiga,north-kannadi

Cluster4: san-nb,san,!kung,pygmy,mbutipygmy

Cluster5: papuan,melanesian,tongan,samoan,paniya

Cluster6: colombian,surui,karitiana,pima,totonac

Cluster7: she,han,chinese-americans,singapore-chinese,chinese

San split off from the African component.

K=8

Cluster1: dogon,yoruba,bambaran,hausa,igbo

Cluster2: irula,tn-dalit,ap-mala,ap-madiga,north-kannadi

Cluster3: papuan,melanesian,tongan,samoan,paniya

Cluster4: koryaks,nganassans,chukchis,evenkis,yakut

Cluster5: dai,vietnamese,singapore-chinese,she,han

Cluster6: sardinian,basque,spaniards,italian,tuscans

Cluster7: san-nb,san,!kung,pygmy,mbutipygmy

Cluster8: surui,karitiana,colombian,pima,totonac

Siberians split off from the East Asian component.

K=9

Cluster1: papuan,melanesian,tongan,samoan,paniya

Cluster2: iban,samoan,tongan,singapore-malay,dai

Cluster3: japanese,hezhen,han-nchina,xibo,beijing-chinese

Cluster4: sardinian,basque,spaniards,italian,tuscans

Cluster5: san-nb,san,!kung,pygmy,mbutipygmy

Cluster6: dogon,yoruba,bambaran,hausa,igbo

Cluster7: surui,karitiana,colombian,pima,totonac

Cluster8: irula,tn-dalit,ap-mala,ap-madiga,north-kannadi

Cluster9: koryaks,chukchis,nganassans,east-greenlanders,kets

A South East Asian Component forms.

K=10

Cluster1: saudis,bedouin,yemen-jews,samaritians,tunisia

Cluster2: papuan,melanesian,tongan,samoan,paniya

Cluster3: dai,vietnamese,iban,singapore-chinese,she

Cluster4: hadza,maasai,ethiopians,ethiopian-jews,bulala

Cluster5: irula,tn-dalit,ap-madiga,ap-mala,north-kannadi

Cluster6: surui,karitiana,colombian,pima,totonac

Cluster7: koryaks,nganassans,chukchis,evenkis,yakut

Cluster8: dogon,yoruba,brong,igbo,bambaran

Cluster9: san-nb,san,!kung,pygmy,mbutipygmy

Cluster10: lithuanians,belorussian,orcadian,n-european,utahn-whites

The West Asian component splits into two components: North European and Middle East & North African (MENA). An East African component that was previously concealed by the West Asian and African components forms. The previous South East Asian component disappears.

K=11

Cluster1: dai,vietnamese,singapore-chinese,she,han

Cluster2: koryaks,nganassans,chukchis,evenkis,yakut

Cluster3: surui,karitiana,colombian,pima,totonac

Cluster4: tunisia,bedouin,saudis,sahara-occ,yemen-jews

Cluster5: dogon,yoruba,brong,igbo,bambaran

Cluster6: lithuanians,belorussian,orcadian,n-european,utahn-whites

Cluster7: papuan,melanesian,tongan,samoan,paniya

Cluster8: san-nb,san,!kung,pygmy,mbutipygmy

Cluster9: irula,malayan,tn-dalit,ap-mala,ap-madiga

Cluster10: hadza,maasai,ethiopians,sandawe,bulala

Cluster11: kalash,brahui,balochi,makrani,georgians

A Central Asian component forms.

K=12

Cluster1: surui,karitiana,colombian,pima,totonac

Cluster2: lithuanians,belorussian,orcadian,n-european,utahn-whites

Cluster3: san-nb,san,!kung,pygmy,mbutipygmy

Cluster4: iban,samoan,tongan,singapore-malay,cambodian

Cluster5: bedouin,saudis,yemen-jews,samaritians,tunisia

Cluster6: papuan,melanesian,tongan,samoan,paniya

Cluster7: japanese,beijing-chinese,han-nchina,chinese-americans,xibo

Cluster8: koryaks,chukchis,east-greenlanders,west-greenlanders,kets

Cluster9: irula,tn-dalit,ap-madiga,ap-mala,north-kannadi

Cluster10: dogon,yoruba,brong,igbo,bambaran

Cluster11: nganassans,evenkis,yakut,dolgans,kets

Cluster12: hadza,maasai,ethiopians,ethiopian-jews,bulala

The Central Asian component disappears, a second Siberian component forms, and the South East Asian component reappears.

K=13

Cluster1: san-nb,san,!kung,xhosa,bantusouthafrica

Cluster2: surui,karitiana,colombian,pima,totonac

Cluster3: papuan,melanesian,tongan,samoan,paniya

Cluster4: japanese,han-nchina,beijing-chinese,xibo,hezhen

Cluster5: hadza,maasai,ethiopians,sandawe,bulala

Cluster6: lithuanians,belorussian,orcadian,n-european,utahn-whites

Cluster7: koryaks,chukchis,nganassans,evenkis,east-greenlanders

Cluster8: tunisia,bedouin,saudis,yemen-jews,sahara-occ

Cluster9: kalash,brahui,balochi,makrani,georgians

Cluster10: pygmy,mbutipygmy,biakapygmy,alur,fang

Cluster11: irula,malayan,tn-dalit,ap-mala,ap-madiga

Cluster12: dogon,yoruba,brong,bambaran,igbo

Cluster13: iban,samoan,tongan,singapore-malay,dai

The Central Asian component reappears, a new Pygmy component forms, and the second Siberian component disappears.

Fst for K=13.

UPDATE: Median cluster % for all populations, K13.


ADMIXTURE, Global K13	N	San	N. American	Oceanian	E. Asian	E. African	N. European	Siberian	MENA	Central Asian	Pygmy	S. Asian	W. African	S.E. Asian
!kung	8	78%	0%	0%	0%	2%	0%	0%	0%	0%	2%	0%	16%	0%
adygei	11	0%	1%	0%	3%	0%	32%	3%	20%	42%	0%	1%	0%	0%
african-americans	37	2%	1%	0%	0%	1%	13%	0%	1%	3%	3%	0%	72%	0%
algeria	12	0%	0%	0%	0%	5%	22%	1%	48%	5%	0%	3%	13%	0%
altaians	8	0%	2%	0%	37%	0%	12%	31%	0%	12%	0%	0%	0%	0%
alur	7	0%	0%	0%	0%	34%	0%	0%	0%	0%	17%	0%	50%	0%
ap-brahmin	14	0%	1%	2%	1%	0%	8%	2%	1%	36%	0%	48%	0%	2%
ap-madiga	5	0%	0%	2%	2%	0%	0%	0%	0%	24%	0%	66%	0%	5%
ap-mala	8	0%	0%	2%	2%	0%	0%	0%	0%	22%	0%	67%	0%	5%
armenians	11	0%	0%	0%	0%	0%	19%	0%	34%	43%	0%	2%	0%	0%
armenians-b	3	0%	0%	1%	0%	0%	48%	4%	17%	26%	0%	1%	0%	0%
ashkenazy-jews	15	0%	0%	0%	1%	0%	37%	0%	34%	24%	0%	1%	0%	0%
azerbaijan-jews	6	0%	1%	0%	0%	0%	15%	0%	37%	44%	0%	0%	0%	1%
balochi	18	0%	1%	0%	1%	0%	7%	1%	13%	53%	0%	20%	0%	0%
bambaran	14	3%	1%	0%	0%	1%	0%	0%	1%	0%	1%	0%	91%	0%
bamoun	10	3%	0%	0%	0%	4%	0%	0%	0%	0%	7%	0%	85%	0%
bantukenya	5	3%	0%	0%	0%	20%	0%	0%	2%	0%	5%	0%	67%	0%
bantusouthafrica	3	24%	0%	0%	1%	6%	0%	0%	0%	0%	4%	0%	65%	0%
basque	24	0%	0%	1%	0%	0%	75%	0%	16%	6%	0%	1%	0%	0%
bedouin	33	0%	0%	0%	0%	3%	0%	0%	65%	27%	0%	0%	2%	0%
beijing-chinese	91	0%	0%	0%	68%	0%	0%	2%	0%	0%	0%	0%	0%	28%
belorussian	4	0%	1%	1%	0%	0%	77%	4%	3%	15%	0%	1%	0%	0%
biakapygmy	12	17%	0%	0%	0%	1%	0%	0%	0%	0%	33%	0%	45%	0%
bnei-menashe-jews	4	0%	0%	2%	1%	0%	7%	0%	16%	34%	0%	34%	0%	3%
bolivian	17	0%	95%	0%	1%	0%	1%	3%	0%	0%	0%	0%	0%	0%
brahui	18	0%	1%	0%	0%	0%	8%	1%	13%	55%	0%	20%	0%	0%
brong	4	4%	0%	0%	0%	0%	0%	0%	0%	1%	3%	0%	91%	0%
bulala	12	0%	0%	0%	0%	38%	0%	0%	3%	0%	0%	0%	57%	0%
burusho	17	0%	2%	1%	7%	0%	13%	4%	2%	41%	0%	27%	0%	2%
buryat	16	0%	0%	1%	49%	0%	5%	38%	1%	5%	0%	0%	0%	1%
buryats	13	0%	0%	1%	47%	0%	5%	38%	0%	5%	0%	1%	0%	0%
cambodian	5	0%	0%	1%	31%	0%	0%	0%	0%	1%	0%	11%	0%	57%
chinese	5	0%	0%	0%	60%	0%	0%	0%	0%	0%	0%	0%	0%	38%
chinese-americans	73	0%	0%	0%	63%	0%	0%	0%	0%	0%	0%	0%	0%	36%
chukchis	11	0%	17%	0%	0%	0%	0%	80%	0%	0%	0%	0%	0%	2%
chuvashs	12	0%	2%	0%	6%	0%	54%	19%	1%	15%	0%	2%	0%	0%
cochin-jews	4	0%	2%	2%	0%	1%	5%	2%	8%	34%	0%	46%	0%	1%
colombian	6	0%	100%	0%	0%	0%	0%	0%	0%	0%	0%	0%	0%	0%
cypriots	7	0%	0%	1%	1%	0%	29%	0%	39%	30%	0%	0%	0%	0%
dai	6	0%	0%	0%	36%	0%	0%	0%	0%	0%	0%	3%	0%	62%
daur	8	0%	1%	1%	63%	0%	1%	25%	0%	1%	0%	0%	0%	8%
dogon	24	1%	0%	0%	0%	0%	0%	0%	1%	0%	0%	0%	94%	0%
dolgans	5	0%	0%	0%	28%	0%	10%	56%	0%	3%	0%	2%	0%	0%
druze	30	0%	0%	0%	0%	0%	17%	0%	42%	38%	0%	0%	0%	0%
east-greenlanders	6	0%	35%	0%	0%	0%	4%	60%	0%	0%	0%	0%	0%	0%
egypt	12	0%	0%	0%	0%	7%	11%	0%	47%	24%	0%	0%	7%	0%
egyptans	7	0%	0%	0%	0%	8%	10%	0%	49%	23%	0%	0%	7%	0%
ethiopian-jews	12	1%	0%	1%	0%	37%	0%	0%	38%	8%	0%	0%	11%	0%
ethiopians	12	1%	0%	0%	1%	36%	0%	1%	39%	7%	0%	0%	11%	0%
evenkis	11	0%	0%	0%	34%	0%	3%	61%	0%	2%	0%	0%	0%	0%
fang	7	6%	0%	0%	0%	5%	0%	0%	0%	0%	7%	0%	80%	0%
french	22	0%	1%	0%	0%	0%	70%	0%	14%	12%	0%	1%	0%	0%
fulani	7	2%	0%	0%	1%	5%	7%	1%	25%	0%	0%	2%	58%	0%
georgia-jews	4	0%	0%	0%	1%	0%	16%	0%	37%	43%	0%	0%	0%	0%
georgians	17	0%	0%	0%	0%	0%	23%	0%	28%	46%	0%	0%	0%	0%
gujaratis	53	0%	1%	1%	1%	0%	2%	0%	0%	37%	0%	55%	0%	2%
gujaratis-b	14	0%	2%	1%	0%	0%	13%	2%	0%	40%	0%	40%	0%	1%
hadza	11	19%	0%	0%	0%	80%	0%	0%	0%	0%	0%	0%	0%	0%
han	24	0%	0%	0%	60%	0%	0%	0%	0%	0%	0%	0%	0%	39%
han-nchina	6	0%	0%	0%	68%	0%	0%	4%	0%	2%	0%	0%	0%	24%
hausa	9	1%	0%	0%	0%	2%	0%	0%	0%	0%	3%	0%	90%	0%
hazara	16	0%	1%	0%	31%	0%	14%	16%	6%	23%	0%	8%	0%	4%
hema	11	3%	0%	1%	0%	31%	0%	1%	10%	2%	4%	0%	46%	0%
hezhen	4	0%	1%	0%	66%	0%	0%	28%	0%	0%	0%	0%	0%	6%
hungarians	9	0%	2%	0%	0%	0%	69%	2%	10%	15%	0%	1%	0%	0%
iban	15	0%	0%	2%	11%	0%	0%	2%	0%	0%	0%	7%	0%	77%
igbo	10	3%	0%	0%	0%	1%	0%	0%	0%	0%	2%	0%	90%	0%
iranian-jews	4	0%	0%	0%	1%	0%	12%	1%	39%	44%	0%	2%	0%	0%
iranians	12	0%	1%	1%	0%	0%	16%	1%	28%	45%	1%	7%	1%	0%
iraq-jews	8	0%	0%	1%	0%	0%	14%	0%	41%	40%	1%	1%	0%	1%
irula	24	0%	0%	0%	0%	0%	1%	0%	2%	1%	0%	89%	0%	0%
italian	8	0%	0%	1%	0%	0%	60%	0%	23%	14%	0%	0%	0%	1%
japanese	154	0%	0%	1%	91%	0%	0%	1%	0%	0%	0%	0%	0%	6%
jordanians	14	1%	0%	0%	0%	3%	16%	1%	42%	33%	0%	1%	3%	1%
kaba	9	2%	0%	0%	1%	10%	0%	0%	0%	0%	4%	0%	80%	0%
kalash	16	0%	2%	1%	0%	0%	10%	3%	0%	65%	0%	16%	0%	2%
karitiana	14	0%	100%	0%	0%	0%	0%	0%	0%	0%	0%	0%	0%	0%
kets	2	0%	5%	0%	13%	0%	19%	54%	0%	8%	0%	1%	0%	0%
khmer-cambodian	3	0%	0%	3%	27%	0%	0%	0%	0%	0%	0%	13%	0%	55%
kongo	5	3%	0%	0%	0%	5%	0%	0%	0%	0%	6%	0%	83%	0%
koryaks	13	0%	7%	0%	0%	0%	0%	93%	0%	0%	0%	0%	0%	0%
kurd	16	0%	1%	1%	0%	0%	19%	0%	29%	46%	0%	3%	0%	0%
kyrgyzstani	15	0%	1%	0%	40%	0%	13%	24%	3%	12%	0%	2%	0%	3%
lahu	5	0%	0%	1%	42%	0%	0%	1%	0%	0%	0%	3%	0%	52%
lebanese	3	0%	1%	2%	0%	1%	20%	0%	40%	33%	0%	2%	2%	0%
lezgins	13	0%	2%	0%	0%	0%	32%	2%	16%	45%	0%	1%	0%	0%
libya	9	0%	1%	1%	0%	7%	17%	0%	50%	10%	0%	2%	9%	0%
lithuanians	6	0%	1%	0%	0%	0%	80%	2%	0%	12%	0%	3%	0%	0%
luhya	73	2%	0%	0%	0%	22%	0%	0%	0%	0%	6%	0%	67%	0%
maasai	100	2%	0%	0%	0%	55%	0%	0%	14%	0%	1%	0%	24%	0%
mada	8	0%	1%	0%	0%	22%	0%	0%	0%	0%	3%	0%	73%	0%
makrani	19	0%	1%	0%	0%	0%	7%	0%	15%	54%	0%	18%	3%	0%
malayan	2	0%	1%	5%	3%	0%	1%	2%	0%	12%	1%	70%	0%	6%
mandenka	13	3%	0%	0%	0%	2%	0%	0%	3%	0%	1%	0%	88%	0%
maya	12	0%	86%	0%	1%	0%	3%	3%	2%	1%	0%	0%	0%	0%
mbutipygmy	13	0%	0%	0%	0%	0%	0%	0%	0%	0%	100%	0%	0%	0%
melanesian	7	0%	0%	74%	0%	0%	0%	0%	0%	0%	0%	0%	0%	25%
mexicans	38	0%	44%	0%	1%	0%	27%	2%	12%	6%	0%	1%	3%	0%
miao	6	0%	0%	0%	56%	0%	0%	1%	0%	0%	0%	0%	0%	42%
mongola	6	0%	1%	0%	64%	0%	4%	14%	1%	1%	0%	0%	0%	13%
mongolians	8	0%	2%	1%	46%	0%	10%	30%	2%	7%	0%	0%	0%	2%
moroccans	5	1%	0%	0%	0%	3%	18%	1%	54%	0%	1%	3%	15%	0%
morocco-jews	7	0%	0%	0%	0%	1%	32%	0%	39%	23%	0%	1%	2%	1%
morocco-n	12	0%	1%	0%	0%	3%	27%	0%	49%	1%	0%	4%	12%	0%
morocco-s	13	0%	0%	0%	0%	5%	18%	0%	50%	0%	1%	3%	16%	0%
mozabite	21	0%	0%	0%	0%	3%	20%	0%	53%	0%	0%	4%	16%	0%
n-european	14	0%	1%	0%	0%	0%	74%	1%	8%	13%	0%	0%	0%	0%
naxi	5	0%	0%	1%	63%	0%	0%	6%	0%	0%	0%	4%	0%	26%
nepalese	17	0%	1%	1%	7%	0%	11%	3%	0%	35%	0%	35%	0%	4%
nganassans	15	0%	0%	0%	11%	0%	0%	88%	0%	0%	0%	0%	0%	0%
nguni	4	18%	0%	1%	0%	6%	0%	0%	0%	0%	4%	0%	71%	0%
north-kannadi	6	0%	0%	3%	3%	0%	0%	0%	0%	23%	0%	65%	0%	3%
orcadian	9	0%	1%	0%	0%	0%	75%	2%	7%	14%	0%	0%	0%	0%
oroqen	7	0%	0%	0%	52%	0%	0%	40%	0%	0%	0%	0%	0%	5%
palestinian	27	0%	1%	1%	0%	3%	14%	0%	46%	32%	0%	1%	2%	0%
paniya	4	0%	0%	13%	16%	0%	0%	1%	0%	0%	1%	14%	1%	48%
papuan	17	0%	0%	100%	0%	0%	0%	0%	0%	0%	0%	0%	0%	0%
pathan	14	0%	2%	0%	1%	0%	17%	1%	6%	44%	0%	26%	0%	1%
pedi	8	18%	0%	0%	0%	5%	0%	0%	0%	1%	4%	0%	71%	0%
pima	11	0%	95%	0%	0%	0%	0%	5%	0%	0%	0%	0%	0%	0%
punjabi-arain	15	0%	2%	1%	0%	0%	10%	1%	4%	45%	0%	34%	0%	0%
pygmy	17	0%	0%	0%	0%	0%	0%	0%	0%	0%	100%	0%	0%	0%
romanians	9	0%	0%	0%	0%	0%	55%	3%	19%	19%	0%	0%	0%	0%
russian	20	0%	2%	0%	0%	0%	70%	9%	1%	14%	0%	2%	0%	1%
sahara-occ	10	0%	0%	0%	0%	6%	16%	1%	57%	0%	0%	3%	15%	0%
sakilli	4	0%	0%	3%	3%	0%	1%	0%	0%	25%	0%	64%	0%	2%
samaritians	3	1%	0%	2%	0%	0%	11%	0%	49%	35%	0%	1%	0%	0%
samoan	11	0%	0%	25%	0%	0%	0%	0%	0%	0%	0%	0%	0%	74%
san	24	88%	0%	0%	0%	0%	0%	0%	0%	0%	0%	0%	0%	0%
san-nb	12	100%	0%	0%	0%	0%	0%	0%	0%	0%	0%	0%	0%	0%
sandawe	17	12%	1%	0%	0%	38%	0%	0%	13%	1%	5%	0%	29%	0%
sardinian	22	0%	0%	0%	0%	0%	59%	0%	35%	4%	0%	0%	0%	0%
saudis	15	0%	0%	0%	0%	4%	0%	0%	63%	30%	0%	0%	0%	0%
selkups	7	0%	5%	0%	9%	0%	26%	47%	0%	10%	0%	1%	0%	0%
sephardic-jews	13	0%	0%	0%	0%	0%	33%	0%	37%	26%	0%	1%	0%	0%
she	9	0%	0%	0%	59%	0%	0%	0%	0%	0%	0%	0%	0%	40%
sindhi	15	0%	2%	1%	0%	0%	11%	1%	5%	44%	0%	35%	0%	0%
singapore-chinese	70	0%	0%	0%	60%	0%	0%	0%	0%	0%	0%	0%	0%	40%
singapore-indians	53	0%	1%	2%	1%	0%	2%	1%	1%	32%	0%	54%	0%	3%
singapore-malay	59	0%	1%	4%	15%	0%	0%	1%	0%	1%	0%	10%	0%	65%
slovenian	17	0%	1%	0%	0%	0%	70%	2%	9%	15%	0%	1%	0%	0%
sotho/tswana	5	25%	0%	0%	0%	3%	0%	0%	0%	0%	4%	0%	67%	0%
spaniards	5	0%	0%	0%	0%	0%	68%	1%	19%	10%	0%	0%	1%	1%
stalskoe	5	0%	2%	0%	2%	0%	34%	3%	16%	39%	0%	2%	0%	0%
surui	7	0%	100%	0%	0%	0%	0%	0%	0%	0%	0%	0%	0%	0%
syrians	10	0%	1%	0%	0%	1%	16%	0%	40%	35%	0%	3%	2%	0%
thai	17	0%	1%	2%	15%	0%	1%	2%	1%	3%	0%	16%	0%	57%
tn-brahmin	9	0%	2%	2%	0%	0%	8%	2%	0%	36%	0%	48%	0%	1%
tn-dalit	7	0%	0%	3%	0%	0%	0%	1%	0%	23%	0%	67%	0%	5%
tongan	11	0%	0%	30%	0%	0%	0%	0%	0%	0%	0%	0%	0%	70%
totonac	15	0%	91%	0%	1%	0%	3%	5%	0%	0%	0%	0%	0%	0%
tu	7	0%	1%	1%	63%	0%	3%	8%	1%	3%	0%	1%	0%	18%
tujia	5	0%	0%	0%	62%	0%	0%	0%	0%	0%	0%	0%	0%	36%
tunisia	11	0%	0%	0%	0%	1%	20%	0%	59%	0%	0%	4%	13%	0%
turks	13	0%	1%	0%	4%	0%	26%	3%	28%	35%	0%	2%	0%	0%
tuscans	79	0%	0%	0%	0%	0%	53%	0%	26%	18%	0%	0%	0%	0%
tuvinians	11	0%	1%	1%	41%	0%	9%	40%	0%	6%	0%	0%	0%	1%
urkarah	11	0%	2%	0%	0%	0%	36%	2%	11%	45%	0%	0%	0%	0%
utahn-whites	72	0%	1%	0%	0%	0%	75%	1%	7%	12%	0%	1%	0%	0%
uygur	7	0%	2%	0%	29%	0%	17%	12%	5%	22%	0%	7%	0%	6%
uzbekistan-jews	2	0%	1%	1%	0%	0%	18%	1%	35%	42%	0%	2%	0%	1%
uzbeks	10	0%	1%	0%	27%	0%	21%	17%	6%	20%	0%	6%	0%	1%
vietnamese	4	0%	0%	1%	42%	0%	0%	0%	0%	0%	0%	4%	0%	52%
west-greenlanders	8	0%	26%	0%	0%	0%	23%	45%	1%	2%	0%	2%	0%	0%
xhosa	3	27%	0%	0%	0%	7%	0%	0%	1%	0%	2%	0%	61%	0%
xibo	6	0%	0%	1%	67%	0%	1%	15%	0%	2%	0%	0%	0%	13%
yakut	18	0%	0%	1%	37%	0%	3%	53%	1%	4%	0%	0%	0%	0%
yemen-jews	12	0%	0%	1%	0%	4%	3%	0%	58%	31%	0%	1%	0%	0%
yemenese	7	1%	0%	1%	1%	5%	3%	1%	42%	28%	1%	3%	7%	1%
yi	6	0%	0%	1%	62%	0%	0%	7%	0%	0%	0%	3%	0%	26%
yoruba	92	2%	0%	0%	0%	0%	0%	0%	0%	0%	2%	0%	93%	0%
yukaghirs	6	0%	0%	0%	16%	0%	31%	42%	0%	6%	0%	1%	0%	0%
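
A table like the one above can be ranked by any one cluster to see which populations peak in it. This is a minimal sketch, assuming the table has been saved as a tab-separated file with the header row intact; the function name top_pops and the file name used below are made up for illustration. It ranks by whatever column is supplied, and since the post does not say which statistic the cluster key was built from, ranking the median column will not necessarily reproduce that key exactly.

```shell
# Sketch only: top_pops is a hypothetical helper name.
# Usage: top_pops FILE CLUSTER_NAME [N]
# FILE is a tab-separated table whose first row names the clusters and whose
# first column names the populations, with cell values such as "78%".
top_pops() {
    local file="$1" cluster="$2" n="${3:-5}"
    awk -F'\t' -v want="$cluster" '
        # Header row: find the column whose name matches the requested cluster.
        NR == 1 { for (i = 1; i <= NF; i++) if ($i == want) col = i; next }
        # Data rows: print "value<TAB>population" with the percent sign removed.
        col     { v = $col; sub(/%$/, "", v); print v "\t" $1 }
    ' "$file" | LC_ALL=C sort -k1,1nr | head -n "$n" | cut -f2
}
```

For example, top_pops median_k13.tsv Siberian would list the five populations with the highest Siberian percentage, one per line. Populations that tie on the value come out in no particular order.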

All results can be downloaded here: ADMIXTURE_K1-14.tar.gz
which contains:
PLINK formatted *.bed, *.bim, *.fam files
*.txt file with complete list of samples
K folders containing:
*.P and *.Q ADMIXTURE output files
log file, with Fst distances and CV errors
Processed Output folder containing:
Median Cluster %
Average Cluster %
Standard Deviations
Cluster Key: Top five populations in each cluster
list of Unique Populations
GNU OCTAVE variable loading file, *.mat
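
The median cluster percentages could in principle be recomputed from the raw files in the archive. The sketch below is not the procedure used for this post; it assumes the population label is stored in the first column of the .fam file (which may not be how this dataset is laid out) and relies on the .Q file having one row per individual in the same order as the .fam file. The function name median_by_pop is made up for illustration.

```shell
# Sketch only: median_by_pop is a hypothetical helper name.
# Usage: median_by_pop FAM_FILE Q_FILE COLUMN
# Assumes column 1 of FAM_FILE is the population label and that Q_FILE has
# one row per individual, in the same order, with one column per cluster.
median_by_pop() {
    local fam="$1" q="$2" col="$3"
    # Pair each individual's population label with its value in the chosen column.
    awk -v c="$col" 'NR == FNR { pop[FNR] = $1; next } { print pop[FNR], $c }' "$fam" "$q" \
        | LC_ALL=C sort -k1,1 -k2,2n \
        | awk '
            # Print the median of the values collected for the current population.
            function emit() {
                if (n == 0) return
                m = (n % 2) ? v[(n + 1) / 2] : (v[n / 2] + v[n / 2 + 1]) / 2
                printf "%s\t%.0f%%\n", pop, m * 100
            }
            $1 != pop { emit(); pop = $1; n = 0 }
            { v[++n] = $2 }
            END { emit() }
        '
}
```

The output has one line per population, sorted by label, with the median rounded to a whole percent. The cluster order in a .Q file is arbitrary for each run, so the column number has to be matched to a named component by inspection.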

10 comments:

Anonymous, April 11, 2012 at 6:19 PM
Why do Mandenka have a MENA component? Also, many of the North Africans have a significant West African component. This is not seen in other analyses of this type, except maybe for some of the South Morocco.

Lemba, April 18, 2012 at 10:47 AM
Etyopsis, first I want to thank you greatly for this post. It's very informative, and the dataset will help me get started on a "New World" ancestry project, in which it is vital for the African components to be as broken down as possible, but which will also need to keep a lot of the populations for people with Native, East Asian, South Asian, Southeast Asian, Siberian, South/North Euro, Mideast and North African ancestry. Do you have any tips or suggestions for getting Western Bantu and Eastern Bantu clusters to form? Any suggestions at all will be very welcome. I am Lemba from ABF.

Anonymous, April 25, 2012 at 11:19 PM
Why doesn't the MENA cluster break up?