u/Eurasiatic

▲ 16 r/hebrew+1 crossposts

I built a YouTube caption dashboard that turns pointed Hebrew captions into broad Tiberian-style IPA

Credit / Attribution:

This project uses Hebrew/Biblical reading material from RatzonShalomMeoded,

YouTube channel ID UCDvFuAnOKFOlr9hb4ol53Gg:

https://www.youtube.com/channel/UCDvFuAnOKFOlr9hb4ol53Gg

Original video/audio/caption content © the original creator/channel.

Used here for language-learning, caption study, and Tiberian Hebrew pronunciation analysis.

Broad IPA transcription, caption-dashboard tooling, and study interface by Eurasiatic.

No endorsement by the original creator is implied.

Music credit:

“Enclosure” by Ajwaa

Royalty-free music used as background music.

All music rights belong to the original creator/licensor.

u/Eurasiatic — 2 days ago

qpAdm results for half Amhara-Ethiopian / half Anglo-American ancestry — Horn + Egyptian/North African + Satsurblia-like models

I ran a qpAdm batch on my own raw DNA to explore how my ancestry models when I combine Horn African, ancient North African / Nile Valley / Levantine, Caucasus-related, and NW European-related sources.

Background:
I am half Amhara-Ethiopian and half Anglo-American. For the Ethiopian/Horn data, I used the Ethiopian dataset from:

Alkorta-Aranburu, Gorka, et al. “The Genetic Architecture of Adaptations to High Altitude in Ethiopia.” PLOS Genetics, vol. 8, no. 12, 2012, e1003110. https://doi.org/10.1371/journal.pgen.1003110.

For ancient/reference populations, I used AADR v66 1240k. My personal genotype input was my FTDNA Family Finder raw data.

I filtered for qpAdm models that were both feasible and passing at p ≥ 0.05. That gave me 47 feasible passing models.
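As a minimal sketch of that filter, assuming the batch output is a table with one row per model: the column names (`model`, `p`, `weights`) and the two `hypothetical_` rows are made up for illustration, while the other two rows reuse values quoted in this post. "Feasible" is taken here to mean every weight lies between 0 and 1.

```python
import csv
import io

# Hypothetical results table; the column names are assumptions, and the last
# two rows are invented to show what gets filtered out.
RESULTS_CSV = """model,p,weights
3way__ORO2__Egypt_AbusirelMeleq_Ptolemaic__Satsurblia,0.2392,0.4485;0.4618;0.0897
3way__CEU__ORO2__Egypt_AbusirelMeleq_Ptolemaic,0.0548,0.1040;0.4390;0.4571
hypothetical_infeasible_model,0.3000,1.4000;-0.4000
hypothetical_rejected_model,0.0010,0.6000;0.4000
"""

def feasible_passing(csv_text, alpha=0.05):
    """Keep models whose weights all lie in [0, 1] and whose p-value is >= alpha."""
    kept = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        weights = [float(w) for w in row["weights"].split(";")]
        feasible = all(0.0 <= w <= 1.0 for w in weights)
        if feasible and float(row["p"]) >= alpha:
            kept.append((row["model"], float(row["p"])))
    # Highest p-value first.
    return sorted(kept, key=lambda item: item[1], reverse=True)

passing = feasible_passing(RESULTS_CSV)
for name, p in passing:
    print(f"{name}\tp = {p:.4f}")
```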

The main right-set/outgroup setup was:

  • Mbuti
  • Russia_UstIshim_IUP
  • Russia_Kostenki_UP
  • Georgia_KotiasKlde_Mesolithic
  • Iran_GanjDareh_N
  • Israel_Natufian
  • Papuan
  • Karitiana
  • Mixe

Most passing models used the reduced right set without Han and without Iberomaurusian. A smaller number used a plus_Iberomaurusian right set that added Morocco_Iberomaurusian.

My highest-p model overall was:

old_3way__Tanzania_Swahili-oNearEast__Egypt_AbusirelMeleq_ThirdIntermediatePeriod__Satsurblia
p = 0.7644

Weights:

  • Tanzania_Swahili-oNearEast: 38.47% ± 2.33%
  • Egypt_AbusirelMeleq_ThirdIntermediatePeriod: 55.36% ± 5.04%
  • Georgia_Satsurblia_LateUP: 6.17% ± 4.06%

The second-highest model was especially interesting because it used the Early Dynastic Egyptian sample:

old_3way__Tanzania_Swahili-oNearEast__Egypt_Nuwayrat_EDynastic__Satsurblia
p = 0.6413

Weights:

  • Tanzania_Swahili-oNearEast: 36.57% ± 2.48%
  • Egypt_Nuwayrat_EDynastic: 52.43% ± 4.87%
  • Georgia_Satsurblia_LateUP: 11.00% ± 3.68%

However, I do not want to overinterpret the “old_” models, because Tanzania_Swahili-oNearEast may be acting as a composite proxy rather than a clean Horn source.

The more directly interpretable Horn-specific non-old models were:

3way__ORO2__Egypt_AbusirelMeleq_Ptolemaic__Satsurblia
p = 0.2392

  • ORO2: 44.85% ± 2.62%
  • Egypt_AbusirelMeleq_Ptolemaic: 46.18% ± 5.62%
  • Georgia_Satsurblia_LateUP: 8.97% ± 4.04%

3way__ORO1__Egypt_AbusirelMeleq_Ptolemaic__Satsurblia
p = 0.2163

  • ORO1: 47.99% ± 2.79%
  • Egypt_AbusirelMeleq_Ptolemaic: 43.41% ± 5.70%
  • Georgia_Satsurblia_LateUP: 8.60% ± 3.99%

There were also AMH models:

3way__AMH2__Egypt_AbusirelMeleq_Ptolemaic__Satsurblia
p = 0.1225

  • AMH2: 57.86% ± 3.32%
  • Egypt_AbusirelMeleq_Ptolemaic: 33.48% ± 6.10%
  • Georgia_Satsurblia_LateUP: 8.66% ± 3.90%

3way__AMH1__Egypt_AbusirelMeleq_Ptolemaic__Satsurblia
p = 0.1125

  • AMH1: 59.13% ± 3.37%
  • Egypt_AbusirelMeleq_Ptolemaic: 32.23% ± 6.14%
  • Georgia_Satsurblia_LateUP: 8.63% ± 3.89%

My tentative interpretation is that the best fits are repeatedly asking for:

  1. a Horn/Ethiopian-related source,
  2. an Egyptian / North African / Nile Valley-related source, and
  3. a smaller Satsurblia-like Caucasus-related source.

But I am not treating these as literal ancestry percentages. I see them as qpAdm proxy coefficients showing which reference combinations explain my allele-sharing pattern under this setup.

One issue: I still feel like I need better NW European populations. Since I am half Anglo-American, the model probably needs stronger or more specific NW European sources than what I currently included. CEU appears in a few passing models, but the standard errors are large, so I do not think the NW European side is being modeled as cleanly as it could be.

Examples:

4way__CEU__ORO2__Egypt_AbusirelMeleq_Ptolemaic__Satsurblia
p = 0.1063

  • CEU: 5.84% ± 8.15%
  • ORO2: 45.15% ± 2.66%
  • Egypt_AbusirelMeleq_Ptolemaic: 40.99% ± 9.11%
  • Georgia_Satsurblia_LateUP: 8.03% ± 4.19%

3way__CEU__ORO2__Egypt_AbusirelMeleq_Ptolemaic
p = 0.0548

  • CEU: 10.40% ± 8.16%
  • ORO2: 43.90% ± 2.69%
  • Egypt_AbusirelMeleq_Ptolemaic: 45.71% ± 9.72%
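One way to see how loosely the CEU weight is constrained is to divide each weight by its standard error. The sketch below does that for the two CEU models above; the cutoff of 2 is only an assumed rule of thumb, not something qpAdm reports.

```python
# Weights and standard errors in percent, copied from the two CEU models above.
MODELS = {
    "4way__CEU__ORO2__Egypt_AbusirelMeleq_Ptolemaic__Satsurblia": {
        "CEU": (5.84, 8.15),
        "ORO2": (45.15, 2.66),
        "Egypt_AbusirelMeleq_Ptolemaic": (40.99, 9.11),
        "Georgia_Satsurblia_LateUP": (8.03, 4.19),
    },
    "3way__CEU__ORO2__Egypt_AbusirelMeleq_Ptolemaic": {
        "CEU": (10.40, 8.16),
        "ORO2": (43.90, 2.69),
        "Egypt_AbusirelMeleq_Ptolemaic": (45.71, 9.72),
    },
}

Z_CUTOFF = 2.0  # assumed rule of thumb, not a qpAdm setting

def z_scores(components):
    """Return weight / standard error for each source in one model."""
    return {source: weight / se for source, (weight, se) in components.items()}

for model, components in MODELS.items():
    print(model)
    for source, z in z_scores(components).items():
        flag = "" if z >= Z_CUTOFF else "  <- within about 2 SE of zero"
        print(f"  {source}: z = {z:.2f}{flag}")
```

In both models the CEU weight sits within about 2 standard errors of zero, while ORO2 is many standard errors away from it.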

So my current question is: what would be better NW European source populations to add for someone who is half Anglo-American? Would it make sense to test GBR, CEU, Irish, Scottish, English, Dutch, German, Scandinavian, or ancient/post-medieval British Isles references if available?

I am also curious whether people here would trust the higher-p Swahili-oNearEast models more, or the lower-p but more directly Horn-specific ORO/AMH models more.

My current take is:

  • The high-p “old_” models are statistically strong but may be using composite proxies.
  • The ORO/AMH models are probably more interpretable for the Horn side.
  • Egypt_AbusirelMeleq_Ptolemaic, Egypt_AbusirelMeleq_ThirdIntermediatePeriod, Egypt_Nuwayrat_EDynastic, Morocco_LN, and related sources seem to be absorbing some kind of North African / Nile Valley / West Eurasian-related component.
  • Satsurblia-like ancestry appears repeatedly at roughly 6–13% in many 3-way models, but I am not sure whether this is a real signal or a proxy artifact.

I would appreciate feedback on:

  1. whether these source choices make sense,
  2. how to improve the NW European side,
  3. whether the Egyptian/North African component is being overused as a proxy, and
  4. whether the ORO/AMH models should be preferred over the higher-p Swahili-oNearEast models.
u/Eurasiatic — 5 days ago

qpAdm results for Horn/Ethiopian-related modeling — best passing models and interpretation

I recently ran a qpAdm batch focused on Horn/Ethiopian-related models and wanted to share the feasible passing results for feedback from people who know Horn African ancestry modeling better than I do.

The batch tested multiple left-source combinations and right-set variants. The results discussed here are the top rows visible in top_feasible_passing.csv, the file of feasible passing models.

Best overall visible passing model

The highest p-value model was:

old_3way__Tanzania_Swahili-oNearEast__Egypt_AbusirelMeleq_ThirdIntermediatePeriod__Satsurblia

With:

  • p-value: 0.764
  • Tanzania_Swahili-oNearEast: 38.5% ± 2.3%
  • Egypt_AbusirelMeleq_ThirdIntermediatePeriod: 55.4% ± 5.0%
  • Georgia_Satsurblia_LateUP / Satsurblia: 6.2% ± 4.1%

The top three visible passing models were all 3-way models:

  1. Tanzania_Swahili-oNearEast + Egypt Abusir el-Meleq Third Intermediate Period + Satsurblia (p = 0.764)
  2. Tanzania_Swahili-oNearEast + Egypt Nuwayrat Early Dynastic + Satsurblia (p = 0.641)
  3. Tanzania_Swahili-oNearEast + Morocco_LN + Satsurblia (p = 0.629)

So, in the visible top results, the best-fitting models generally seem to combine:

  • an East African / Swahili-related proxy,
  • a North African / Egyptian / Northeast African or West Eurasian-shifted ancient source,
  • and a small Satsurblia-like component.

I am not reading this literally as “I am 55% ancient Egyptian” or “6% Satsurblia.” I am interpreting these as qpAdm proxy components that may be capturing deeper West Eurasian / Northeast African structure in a Horn African-related genome.

More Horn-specific non-old models

The best visible non-old_ Horn model was:

3way__ORO2__Egypt_AbusirelMeleq_Ptolemaic__Satsurblia

With:

  • p-value: 0.239
  • ORO2: 44.9% ± 2.6%
  • Egypt_AbusirelMeleq_Ptolemaic: 46.2% ± 5.6%
  • Georgia_Satsurblia_LateUP / Satsurblia: 9.0% ± 4.0%

The second visible non-old_ Horn model was:

3way__ORO1__Egypt_AbusirelMeleq_Ptolemaic__Satsurblia

With:

  • p-value: 0.216
  • ORO1: 48.0% ± 2.8%
  • Egypt_AbusirelMeleq_Ptolemaic: 43.4% ± 5.7%
  • Georgia_Satsurblia_LateUP / Satsurblia: 8.6% ± 4.0%

These two are interesting because both pass and both use the same general structure:

Oromo-like Horn source + Ptolemaic Egyptian / Northeast African-West Eurasian proxy + small Satsurblia-like component

The ORO2 model has the slightly better p-value, but the ORO1 model has a slightly higher Oromo-like proportion. The Satsurblia-like component is very similar in both, around 9%.

My tentative interpretation

My read is that the non-old Horn models are probably more directly interpretable than the old_ Swahili-style models, even though their p-values are lower. The old_ models fit better statistically in the visible top list, but they may be acting as broad composite proxies rather than clean population-historical sources.

The more relevant Horn-specific signal, at least from the visible results, seems to be something like:

  • about 45–48% Oromo-like Horn African
  • about 43–46% Egyptian/Northeast African-West Eurasian shifted
  • about 9% Satsurblia-like / Caucasus-Upper Paleolithic-like proxy

Again, I am not interpreting the ancient references literally. I am treating them as proxy populations that help qpAdm explain allele-sharing patterns.

Questions for the subreddit

Does this kind of model structure make sense for a Horn African / Ethiopian-related result?

Would you trust the higher-p old_ Swahili-style models more, or would you focus more on the lower-p but more Horn-specific ORO1/ORO2 models?

Also, what would you suggest testing next?

Some ideas I had:

  • more Ethiopian/Horn sources if available: Amhara, Tigray, Oromo, Somali, Afar, Beta Israel, etc.
  • more Nile Valley sources: Kulubnarti, ancient Nubian, Mota-like, different Egyptian periods
  • more Levant/Red Sea sources: Israel_MLBA, Jordan_LBA, Lebanon_IA, Natufian-related sources
  • testing whether Satsurblia is just acting as a stand-in for broader Caucasus/Iran/Near Eastern ancestry
  • using alternative right sets to see how stable the ORO + Egypt + Satsurblia structure is

I would appreciate any feedback on whether these models are meaningful, overfit, or missing better source populations.

u/Eurasiatic — 6 days ago
▲ 1 r/Amhara

qpAdm: my best passing reduced-right model is Tanzania_Swahili-oNearEast + Ptolemaic Egyptian + Satsurblia-like

I ran a qpAdm grid for my target, labeled Eurasiatic, using an AADR-based PLINK dataset plus my target sample.

The full initial grid ran 160/160 models, but none of the feasible models passed with the original right set. The best “least bad” direction was repeatedly around:

Tanzania_Swahili-oNearEast / Tanzania_Swahili / Kenya_Swahili + Egyptian / Northeast African / Levant-like sources

The strongest reduced-right result I found was:

Target: Eurasiatic
Left/source model:

  • Tanzania_Swahili-oNearEast
  • Egypt_AbusirelMeleq_Ptolemaic
  • Georgia_Satsurblia_LateUP

Right set:

  • Mbuti
  • Russia_UstIshim_IUP
  • Russia_Kostenki_UP
  • Georgia_KotiasKlde_Mesolithic
  • Iran_GanjDareh_N
  • Israel_Natufian
  • Papuan
  • Karitiana
  • Mixe

This model excludes Han and Morocco_Iberomaurusian from the earlier right set.

qpAdm result:

  • p = 0.555
  • chisq = 4.91
  • dof = 6
  • feasible = TRUE
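The p-value can be sanity-checked from the chi-square and degrees of freedom. The sketch below assumes the usual qpAdm setup of one target plus n sources, where the degrees of freedom work out to the number of right populations minus the number of sources, and it uses a closed form for the chi-square tail that only holds for even degrees of freedom. The function names are my own.

```python
import math

def qpadm_dof(n_sources, n_right):
    """Degrees of freedom assumed for a target modeled with n_sources sources."""
    return n_right - n_sources

def chi2_sf_even_dof(chisq, dof):
    """Upper-tail chi-square probability; closed form valid for even dof only."""
    if dof <= 0 or dof % 2 != 0:
        raise ValueError("this closed form only covers positive even dof")
    half = chisq / 2.0
    terms = sum(half ** k / math.factorial(k) for k in range(dof // 2))
    return math.exp(-half) * terms

dof = qpadm_dof(n_sources=3, n_right=9)  # 3 sources, 9 right populations
p = chi2_sf_even_dof(4.91, dof)
print(f"dof = {dof}, p = {p:.3f}")
```

With 3 sources and the 9 right populations listed above, this reproduces dof = 6 and a p-value that rounds to the reported 0.555.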

Weights:

  • Tanzania_Swahili-oNearEast = 47.1% ± 2.8%
  • Egypt_AbusirelMeleq_Ptolemaic = 45.9% ± 6.5%
  • Georgia_Satsurblia_LateUP = 7.0% ± 5.1%

My interpretation is that I am being modeled best as a mix of an East African / Swahili-oNearEast-like proxy, an Egyptian/Northeast-African-like proxy, and a small Caucasus/Upper-Paleolithic West-Eurasian-like correction.

I would not take the labels literally as exact ancestry percentages. These are formal qpAdm proxies, not proof that the ancestry is literally “Swahili,” “Ptolemaic Egyptian,” or “Satsurblia.” The Satsurblia-like component is also small and imprecise, so I would interpret it cautiously.

The right-set diagnostics were interesting: the model improved a lot when Han was removed, and then passed strongly when Morocco_Iberomaurusian was also removed. That suggests the original rejection was not just because of the source model, but also because the right set was exposing tension around East Asian-related and Iberomaurusian/North-African-related contrasts.
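A rough sketch of how those right-set variants can be enumerated, so that each one is run against the same source model. The variant labels are hypothetical names chosen for this example.

```python
from itertools import combinations

# The reduced right set listed above.
REDUCED_RIGHT = [
    "Mbuti",
    "Russia_UstIshim_IUP",
    "Russia_Kostenki_UP",
    "Georgia_KotiasKlde_Mesolithic",
    "Iran_GanjDareh_N",
    "Israel_Natufian",
    "Papuan",
    "Karitiana",
    "Mixe",
]

# The two populations that were dropped from the earlier right set.
OPTIONAL_RIGHT = ["Han", "Morocco_Iberomaurusian"]

def right_set_variants(base, extras):
    """Yield (label, right_set) for every subset of the optional populations."""
    for k in range(len(extras) + 1):
        for subset in combinations(extras, k):
            label = "reduced" if not subset else "plus_" + "_".join(subset)
            yield label, base + list(subset)

variants = dict(right_set_variants(REDUCED_RIGHT, OPTIONAL_RIGHT))
for label, pops in variants.items():
    print(f"{label}: {len(pops)} right populations")
```

Running the same left set against all four variants shows whether the pass depends on removing Han, Morocco_Iberomaurusian, or both.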

Would be interested in feedback from people who work with Horn/East African qpAdm models: does this look like a reasonable reduced-right exploratory model, or would you recommend a stricter/better right set or better Horn/Northeast African proxies?

u/Eurasiatic — 8 days ago

Exploratory qpAdm rejected, but my G25 “Eurasiatic” distances keep pulling Maghrebi / Punic / Guanche-like. How should I interpret this?

I wanted to share three exploratory results and get feedback from people who understand Horn African ancestry modeling, especially the behavior of Levantine / North African proxies in mixed Horn-related models.

For context, I am not treating any of this as a formal ancestry claim. I know G25 distances are not proof of descent, and my qpAdm model was formally rejected. I am mainly trying to understand why the same broad pattern keeps appearing.

My qpAdm exploratory run used public AADR references. Across several rounds, no tested model formally passed:

  • Round 2: 24 tested, 0 accepted
  • Round 3: 96 tested, 0 accepted
  • Round 4: 64 tested, 0 accepted
  • Round 5: 64 tested, 0 accepted

The least-bad cleaned model was:

CEU + Tanzania_Swahili-oNearEast + Lebanon_IA3

With approximate weights:

  • CEU: 41.2% ± 9.2%
  • Tanzania_Swahili-oNearEast: 38.8% ± 2.6%
  • Lebanon_IA3: 20.0% ± 9.5%

But the p-value was around:

p ≈ 4.0e-168

So this is clearly not a valid qpAdm ancestry model. I am reading Lebanon_IA3 only as a possible stand-in for some broader ancient Levantine / Red Sea / Near Eastern-related affinity that is not being captured well by the public reference set.

What makes it interesting is that my G25 “Eurasiatic_Array_Scaled” distances seem to pull in two related directions.

First, when compared to modern-ish populations, my closest results are heavily North African:

  • Tunisia: Kef_Sousse — 0.0837
  • Algerian: Algerian43A22 — 0.0871
  • Tunisia: Tunisois — 0.0875
  • Tunisia: Beja — 0.0883
  • Algerian: ALG100 — 0.0893
  • Tunisia: Msaken — 0.0895
  • Berber_Tunisia_Sen: BerSF8 — 0.0898
  • Berber_Tunisia_Sen: BerSF5 — 0.0897
  • Tunisian: Tunisian20D4 — 0.0903
  • Tunisian: Tunisian20F4 — 0.0926

Then, when compared to ancient samples, the closest hits include a lot of African-shifted Mediterranean, Punic, Guanche, and Islamic-period Iberian / North African-like references:

  • Austria_Ovilava_Roman_oAfrica.SG:R10667.SG — 0.0992
  • CanaryIslands_Guanche.SG:gun005_noUDG.SG — 0.1003
  • England_Saxon_oAfrica.SG:EA503.SG — 0.1050
  • Spain_NazariPeriod_LateMuslim:I8146 — 0.1052
  • Punic:I24215 — 0.1080
  • Punic:I1735 — 0.1118
  • Punic:I24673 — 0.1128
  • Italy_Imperial_oAfrica.SG:R132.SG — 0.1131
  • Portugal_Miroico_LateRoman_oAfrica.SG:R10503.SG — 0.1140
  • Punic:I3528 — 0.1168
  • CanaryIslands_Guanche.SG:gun008_noUDG.SG — 0.1167
  • Punic:I2200 — 0.1208
  • CanaryIslands_Guanche.SG:gun011_noUDG.SG — 0.1209
  • CanaryIslands_Guanche.SG:gun012_noUDG.SG — 0.1222
  • Spain_BellBeaker_oAfrica:I4246 — 0.1240
  • Punic:I2405 — 0.1235
  • Spain_NazariPeriod_Muslim:I7425 — 0.1263
  • Turkey_Byzantine_oAfrica:I8372 — 0.1270
  • Italy_Sardinia_IA_Punic_1:VIL011 — 0.1287
  • Tunisia_Punic_oAfrica2.SG:R1778.SG — 0.1289
  • Punic:I24039 — 0.1299

My current interpretation is:

The qpAdm result does not prove Lebanon_IA3 ancestry, and the G25 results do not prove Punic, Guanche, Tunisian, or North African ancestry. But the repeated pattern may be showing that my “Eurasiatic” side is best approximated by a combination of Northwest European-like ancestry plus Horn/East African-like ancestry plus a Levantine / North African / Red Sea-related component.

In other words, I am wondering whether Lebanon_IA3 is simply functioning as a generic ancient Near Eastern proxy for something closer to Horn African / Red Sea admixture that is missing from the reference set.
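A toy illustration of why close distances to already-mixed references need care, assuming the distances are plain Euclidean distances between coordinate vectors. The population names and the 2-D coordinates are entirely hypothetical (real G25 vectors have 25 dimensions).

```python
import math

def euclidean(a, b):
    """Straight-line distance between two coordinate vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Hypothetical 2-D coordinates: two unmixed reference clusters and one
# reference that is itself intermediate between them.
REFERENCES = {
    "ParentPop_A": (0.00, 0.00),
    "ParentPop_B": (1.00, 0.00),
    "Intermediate_C": (0.50, 0.05),
}

# A 50/50 mix of A and B lands at the midpoint of the two clusters.
target = tuple(
    (a + b) / 2
    for a, b in zip(REFERENCES["ParentPop_A"], REFERENCES["ParentPop_B"])
)

ranked = sorted((euclidean(target, coords), name) for name, coords in REFERENCES.items())
for dist, name in ranked:
    print(f"{name}: {dist:.4f}")
```

In this toy case the closest reference is the intermediate one, even though the target has no ancestry from it by construction.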

Questions for the subreddit:

  1. In Horn African qpAdm modeling, does Lebanon_IA3 often appear as a stand-in when closer ancient Ethiopian / Eritrean / Red Sea / Sudanese references are missing?
  2. Do close G25 distances to Tunisian, Algerian, Punic, Guanche, and oAfrica Mediterranean samples usually indicate real North African-like affinity, or can they appear because those samples are themselves mixtures of African + Mediterranean + Levantine-like ancestry?
  3. Would a better qpAdm model probably need ancient or more specific references from Ethiopia, Eritrea, Sudan/Nubia, the Red Sea corridor, or Arabia?
  4. Is it reasonable to interpret this as a broad Horn/East African + West Eurasian + Levantine/North African-like proxy effect, rather than a literal Punic or Lebanese signal?

Again, I am treating this as exploratory only. I am mainly looking for methodological feedback on proxy behavior, not trying to make a literal ethnic claim from rejected qpAdm or G25 distances.

u/Eurasiatic — 10 days ago

Exploratory qpAdm shows a rejected but recurring Lebanon_IA3-like component; G25 distances pull toward Punic/Guanche/North African proxies

I wanted to share an exploratory result and get feedback from people who understand Horn/East African + Near Eastern proxy behavior in qpAdm and G25.

I ran an exploratory qpAdm setup using public AADR references. None of the tested models formally passed. In fact, the best cleaned model was still strongly rejected:

CEU + Tanzania_Swahili-oNearEast + Lebanon_IA3

Estimated weights:

  • CEU: 41.2% ± 9.2%
  • Tanzania_Swahili-oNearEast: 38.8% ± 2.6%
  • Lebanon_IA3: 20.0% ± 9.5%
  • p ≈ 4.0e-168, so this is clearly not a formally accepted qpAdm model

I am not interpreting Lebanon_IA3 literally as “Lebanese ancestry.” My read is that, in this setup, Lebanon_IA3 is probably acting as a broad ancient Levantine / Near Eastern stand-in for ancestry not well captured by the available public AADR Horn/Ethiopian references. The same applies to Tanzania_Swahili-oNearEast: I am treating it as a proxy for an East African/Horn-adjacent component with some Near Eastern affinity, not as a literal Swahili ancestry claim.

What makes it interesting is that my G25 distance results for the same “Eurasiatic” side are repeatedly pulling toward North African / Punic / Guanche / Mediterranean outlier references. Some of the closest distances include:

  • Austria_Ovilava_Roman_oAfrica.SG:R10667.SG — 0.0992
  • CanaryIslands_Guanche.SG:gun005_noUDG.SG — 0.1003
  • England_Saxon_oAfrica.SG:EA503.SG — 0.1050
  • Spain_NazariPeriod_LateMuslim — 0.1052
  • Punic:I24215 — 0.1080
  • Punic:I1735 — 0.1118
  • Punic:I24673 — 0.1128
  • Italy_Imperial_oAfrica.SG:R132.SG — 0.1131
  • Portugal_Miroico_LateRoman_oAfrica.SG:R10503.SG — 0.1140
  • Punic:I3528 — 0.1168
  • CanaryIslands_Guanche.SG:gun008_noUDG.SG — 0.1167
  • Punic:I2200 — 0.1208
  • CanaryIslands_Guanche.SG:gun011_noUDG.SG — 0.1209
  • CanaryIslands_Guanche.SG:gun012_noUDG.SG — 0.1222
  • Spain_BellBeaker_oAfrica:I4246 — 0.1240
  • Punic:I2405 — 0.1235
  • Spain_NazariPeriod_Muslim:I7425 — 0.1263
  • Turkey_Byzantine_oAfrica:I8372 — 0.1270
  • Italy_Sardinia_IA_Punic_1:VIL011 — 0.1287
  • Tunisia_Punic_oAfrica2.SG:R1778.SG — 0.1289
  • Punic:I24039 — 0.1299

So my current interpretation is:

The qpAdm result does not prove a literal Lebanon_IA3 source, and the Punic/Guanche G25 distances do not prove that I am Punic, Guanche, or North African. But both analyses seem to point in the same general direction: my non-Horn/non-European side may be expressing some kind of older Northeast African / Red Sea / Levantine / North African-related affinity that is not being modeled cleanly with the available public references.

I would be interested in feedback on whether this is a reasonable interpretation, or whether Lebanon_IA3 is simply acting as a generic West Eurasian / Levantine “catch-all” proxy in a rejected qpAdm model.

In particular, I am wondering:

  1. In Horn African qpAdm models, how often does Lebanon_IA3 or a similar Levantine ancient proxy appear as a stand-in when closer Ethiopian/Eritrean ancient references are missing?
  2. Do Punic/Guanche/North African G25 distances usually reflect real North African-like affinity, or can they simply appear because those samples themselves are mixed African + Mediterranean + Levantine-like proxies?
  3. Would a better model require ancient Ethiopian/Eritrean, Sudanese/Nubian, or Red Sea references that are not available in the public AADR set?

I am treating all of this as exploratory only. I am mainly trying to understand the pattern rather than make a literal ancestry claim.

u/Eurasiatic — 10 days ago

Exploratory qpAdm results for mixed NW European + Ethiopian/Horn ancestry — no formal pass, but consistent broad signal

Hi everyone,

I’ve been working through an exploratory qpAdm analysis using public AADR references for my target sample, labeled Eurasiatic. My known background is broadly maternal NW European / Anglo-American and paternal Ethiopian/Horn African, likely Amhara-related.

I tested several rounds of qpAdm models:

Round 2: 2-way models
Round 3: broad 3-way models
Round 4: Horn-core 3-way models
Round 5: cleaned right-pop sensitivity test

None of the models formally passed qpAdm. Even the best valid-weight model had a very low p-value, so I’m not treating this as a formally accepted ancestry model.

The best cleaned valid-weight proxy model was:

Eurasiatic = CEU + Tanzania_Swahili-oNearEast + Lebanon_IA3

Approximate weights:

CEU: ~41.2% ± 9.2%
Tanzania_Swahili-oNearEast: ~38.8% ± 2.6%
Lebanon_IA3: ~20.0% ± 9.5%

p ≈ 4.0e-168, so formally rejected.

My interpretation is that the broad signal is still directionally meaningful:

NW European-like + Horn/East African-like + Near Eastern/Levant-like

But the public AADR set does not seem to have a close enough Ethiopian/Amhara reference to produce a proper qpAdm pass. I’m treating labels like Tanzania_Swahili-oNearEast, Kenya_Somali, Lebanon_IA3, and Turkey_N as proxies only, not literal ancestral sources.

I’d be interested in feedback from people here, especially on better public proxies or right-pop setups for modeling Ethiopian/Horn ancestry in qpAdm.

u/Eurasiatic — 13 days ago
▲ 9 r/HornAfricanAncestry+4 crossposts

Merged my FTDNA data with AADR and ran qpAdm: broad NW European + East African/Horn-like signal, but no accepted model yet

I finally got my FTDNA autosomal raw data converted and merged with the AADR 1240K dataset, then ran ADMIXTOOLS 2 / qpAdm models against ancient and modern proxy populations.

Basic workflow:

  1. Converted FTDNA raw CSV into PLINK PED/MAP, then BED/BIM/FAM.
  2. Converted AADR v66 1240K EIGENSTRAT/TGENO into PLINK using PLINK2.
  3. Merged my sample with AADR.
  4. Fixed the .fam population labels so ADMIXTOOLS could recognize the AADR groups.
  5. Computed and cached f2 statistics with ADMIXTOOLS 2.
  6. Tested qpAdm models using different combinations of European, North African, Levantine, Punic, Horn/East African, Nile Valley, Swahili, and Pastoral Neolithic proxies.
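Step 4 above can be sketched roughly as follows. This assumes that, for PLINK input, ADMIXTOOLS 2 takes the population label from the first (family ID) column of the .fam file; the sample IDs and the mapping are hypothetical.

```python
def relabel_fam(fam_lines, sample_to_pop):
    """Set the family-ID column (column 1) to a population label per sample."""
    relabeled = []
    for line in fam_lines:
        fields = line.split()
        if len(fields) != 6:
            raise ValueError(f"expected 6 .fam columns, got {len(fields)}: {line!r}")
        sample_id = fields[1]
        # Samples missing from the mapping keep their existing family ID.
        fields[0] = sample_to_pop.get(sample_id, fields[0])
        relabeled.append(" ".join(fields))
    return relabeled

# Hypothetical .fam rows: family ID, sample ID, father, mother, sex, phenotype.
fam_lines = [
    "1 RefSample001 0 0 0 -9",
    "2 Eurasiatic_FTDNA 0 0 1 -9",
    "3 UnmappedSample 0 0 0 -9",
]
sample_to_pop = {
    "RefSample001": "Israel_Natufian",
    "Eurasiatic_FTDNA": "Eurasiatic",
}

for row in relabel_fam(fam_lines, sample_to_pop):
    print(row)
```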

The early North African / Levantine / Punic models failed badly. They produced impossible weights like huge positive North African ancestry paired with huge negative Levantine or Roman ancestry, so I treated those as invalid.

The models became more sensible once I added northwest European proxies like English, CEU, and GBR, plus East African / Horn / Nile-related proxies like Kenya_Somali, Sudan_Kulubnarti, Tanzania_Swahili, and Kenya_PastoralN_Nderit.

The best broad signal was consistently:

Northwest European + East African / Nile-Horn / Swahili-Pastoral-like

The most useful feasible proxy models were roughly:

  • English + Tanzania_Swahili-oNearEast: ~47.5% English / ~52.5% Tanzania Swahili-oNearEast
  • English + Kenya_PastoralN_Nderit: ~41.4% English / ~58.6% Kenya Pastoral Neolithic Nderit
  • English + Kenya_Somali: ~57.4% English / ~42.6% Kenya Somali

Important caveat: none of these were formally accepted qpAdm models. The p-values were still extremely low, so I’m treating them as exploratory proxy positions rather than final ancestry proportions.

My takeaway is that with the AADR references I had available, qpAdm could place the broad axis pretty clearly, but it could not find a clean formally accepted source model for a modern mixed individual. Better Ethiopian/Eritrean/Somali/Afar/Oromo/Tigray/Amhara references would probably improve the modeling a lot.

u/Eurasiatic — 14 days ago

You do not upload an FTDNA raw-data file directly into the ADMIXTOOLS 2 Shiny app.

FTDNA Family Finder raw data is a single-person CSV/GZ file with columns like RSID, chromosome, position, and result, usually on Build 37 / GRCh37. ADMIXTOOLS 2, however, expects a real genotype dataset in formats such as PLINK, PACKEDANCESTRYMAP, or EIGENSTRAT, each with separate genotype/SNP/sample metadata files. (FamilyTreeDNA Help)
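To show what the conversion involves, here is a minimal sketch that turns rows in that layout into PLINK text format (one PED line plus MAP lines). The exact header spelling (`RSID,CHROMOSOME,POSITION,RESULT`), the "--" no-call marker, and the example rows are assumptions for illustration, and the function name is my own.

```python
import csv
import io

# Made-up rows in the column layout described above; real files are much larger.
RAW = '''RSID,CHROMOSOME,POSITION,RESULT
"rs0000001","1","1000","AG"
"rs0000002","1","2000","--"
"rs0000003","2","1500","CC"
'''

def ftdna_to_ped_map(csv_text, sample_id, family_id):
    """Return (ped_line, map_lines) for a single-sample raw-data CSV."""
    map_lines, alleles = [], []
    for row in csv.DictReader(io.StringIO(csv_text)):
        call = row["RESULT"].strip().upper()
        # Anything that is not two A/C/G/T letters is written as missing ("0 0").
        if len(call) == 2 and set(call) <= set("ACGT"):
            a1, a2 = call[0], call[1]
        else:
            a1, a2 = "0", "0"
        # MAP columns: chromosome, variant ID, genetic distance (0 = unknown), position.
        map_lines.append(f'{row["CHROMOSOME"]}\t{row["RSID"]}\t0\t{row["POSITION"]}')
        alleles.extend([a1, a2])
    # PED columns: family ID, sample ID, father, mother, sex (0 = unknown), phenotype (-9).
    ped_line = " ".join([family_id, sample_id, "0", "0", "0", "-9"] + alleles)
    return ped_line, map_lines

ped_line, map_lines = ftdna_to_ped_map(RAW, sample_id="Sample1", family_id="MyPop")
print(ped_line)
print("\n".join(map_lines))
```

A real conversion also has to handle genome build and strand before merging with a reference panel, which this sketch ignores.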

The practical workflow

You need this pipeline:

FTDNA raw CSV/GZ
→ convert to PLINK or EIGENSTRAT
→ merge with a reference dataset such as AADR/1240K
→ compute f2-statistics
→ load f2-statistics into ADMIXTOOLS 2 Shiny
→ run qpAdm

ADMIXTOOLS 2 works by first computing or loading f2-statistics, then using those f2-statistics for qpAdm/qpWave/other analyses. It is not designed to take a consumer-DNA raw file by itself. (Uqrmaie1)

What you should do

Best realistic route

Ask someone with genetics/bioinformatics experience to prepare one of these for you:

  1. A precomputed f2-statistics folder that includes your FTDNA sample, or
  2. A merged EIGENSTRAT dataset containing your sample plus ancient/modern reference populations.

Then, in the Shiny app, you load the prepared f2 folder, not the original FTDNA file.

What to give the person preparing it

Give them:

1. Your FTDNA Family Finder Build 37 raw autosomal file
2. The reference dataset you want, usually AADR/1240K
3. Your preferred sample label, e.g. Mattios_Girma_FTDNA
4. Your preferred population label, e.g. Mattios
5. Request: “Please merge this FTDNA sample into AADR/1240K and create ADMIXTOOLS 2 f2-statistics for qpAdm.”

Make sure you use Family Finder autosomal data, not Big Y, Y-SNP, or mtDNA files. FTDNA’s Family Finder file is the autosomal file with SNP results across chromosomes 1–22 and X; Y-DNA and mtDNA files are not useful for qpAdm autosomal modeling. (FamilyTreeDNA Help)

In the Shiny app

Once you have the prepared f2 folder:

  1. Open RStudio.
  2. Run:

library(admixtools)
run_shiny_admixtools()

  3. In the Shiny browser app, go to the data/f2 section.
  4. Load the precomputed f2 directory.
  5. Go to the qpAdm tab.
  6. Choose:
    • Target: your sample label or population label
    • Sources/Left: candidate ancestral populations
    • Right/Outgroups: outgroup set
  7. Run qpAdm.

What not to do

Do not try to upload this directly:

Family_Finder_Autosomal_Raw_Data.csv

or:

Family_Finder_Autosomal_Raw_Data.csv.gz

That file is not in ADMIXTOOLS input format.

Short answer

You need to convert and merge your FTDNA autosomal raw data first. The Shiny app should receive either:

a prepared f2-statistics folder

or a properly formatted genotype dataset such as:

PLINK .bed/.bim/.fam
EIGENSTRAT .geno/.snp/.ind
PACKEDANCESTRYMAP .geno/.snp/.ind
u/Eurasiatic — 18 days ago