u/Abjectionova

Large language models predict cognition and education close to or better than genomics or expert assessment

Essay-based NLP predictions of reading, verbal, and mathematical ability hit R² ≈ 0.55–0.59, nearly matching teacher assessments (0.56–0.62) at the same age. Polygenic scores (PGS) across 33 traits had an R² ≈ 0.11–0.17.

When all three sources (essays + teacher assessments + PGS) are stacked in a SuperLearner ensemble, cognitive ability prediction reaches R² ≈ 0.70, approaching the test-retest reliability of gold-standard IQ tests. The combined model explains 38% of variance in educational attainment at age 33, roughly double what the Fragile Families Challenge achieved with 12,000+ survey variables.
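For anyone unfamiliar with stacking, here is a minimal sketch of the general idea, not the paper's pipeline. Everything in it is an assumption for illustration: the data are synthetic, the noise levels are arbitrary, each "base learner" is a one-predictor linear regression, and the meta-learner is a plain grid search over convex weights (a simplified stand-in for what a SuperLearner does).

```python
import random

def fit_line(x, y):
    """Ordinary least squares with one predictor; returns (intercept, slope)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

def out_of_fold(x, y, k=5):
    """Predict every point from a model fit on the other k-1 folds."""
    n = len(x)
    preds = [0.0] * n
    for fold in range(k):
        train = [i for i in range(n) if i % k != fold]
        b0, b1 = fit_line([x[i] for i in train], [y[i] for i in train])
        for i in range(fold, n, k):  # indices belonging to this fold
            preds[i] = b0 + b1 * x[i]
    return preds

def r_squared(y, pred):
    my = sum(y) / len(y)
    ss_res = sum((a - p) ** 2 for a, p in zip(y, pred))
    ss_tot = sum((a - my) ** 2 for a in y)
    return 1 - ss_res / ss_tot

def blend(columns, weights):
    """Weighted sum of several prediction columns."""
    n = len(columns[0])
    return [sum(w * col[i] for w, col in zip(weights, columns)) for i in range(n)]

# Synthetic stand-ins (hypothetical, NOT the study's data): one latent
# ability score and three noisy views of it with made-up noise levels.
rng = random.Random(0)
n = 500
y = [rng.gauss(0, 1) for _ in range(n)]
sources = {
    "essay": [v + rng.gauss(0, 0.9) for v in y],
    "teacher": [v + rng.gauss(0, 0.8) for v in y],
    "pgs": [v + rng.gauss(0, 2.5) for v in y],
}

# Level 0: cross-validated predictions from each source on its own.
oof = {name: out_of_fold(x, y) for name, x in sources.items()}
base_r2 = {name: r_squared(y, p) for name, p in oof.items()}

# Level 1: pick non-negative weights summing to 1 that maximise R² of the blend.
# Scoring on the same out-of-fold predictions is slightly optimistic.
steps = 20
columns = list(oof.values())
best_w, best_r2 = None, float("-inf")
for a in range(steps + 1):
    for b in range(steps + 1 - a):
        w = (a / steps, b / steps, (steps - a - b) / steps)
        score = r_squared(y, blend(columns, w))
        if score > best_r2:
            best_w, best_r2 = w, score

print("base R²:", {k: round(v, 3) for k, v in base_r2.items()})
print("stacked R²:", round(best_r2, 3), "weights:", best_w)
```

Because the weight grid includes putting all the weight on a single source, the blend can never score worse than the best individual source on these predictions. How much it gains depends on how independent the sources' errors are.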

For non-cognitive traits (grit, motivation, externalizing/internalizing behavior), predictions are considerably weaker across all three modalities: cognition is just more textually legible than personality. Also worth noting: almost all of the predictive signal in the text came from the LLM embeddings specifically. Readability scores and linguistic metrics added almost nothing on top of them.

Caveats worth noting: single British birth cohort (1958), homogeneous sample, no external validation, and the author discloses working at a genomics company. The non-cognitive prediction results are also mediocre, which the paper somewhat undersells.

nature.com
u/Abjectionova — 4 days ago

Smarter people are, on average, better at recognizing intelligence in others, though emotional awareness plays a role as well

The core finding: some people are just better at reading intelligence in others. 198 participants rated the intelligence of 50 targets from 1-minute video clips. Overall, people were more accurate than chance, but accuracy varied significantly across judges, confirming that the "good judge" effect is real.

The three most important factors were:

IQ (r = 0.21)

Better emotion perception ability — recognizing facial expressions (r = 0.19)

Higher life satisfaction / subjective well-being (r = 0.23)
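To put those correlations on the same footing as an R² figure, squaring r gives the share of variance in judging accuracy that each factor accounts for on its own. A quick sketch (the dictionary labels are just my shorthand for the three factors above):

```python
# Correlations reported in the post; labels are informal shorthand.
correlations = {
    "IQ": 0.21,
    "emotion perception": 0.19,
    "life satisfaction": 0.23,
}

# For a single predictor, variance explained is simply r squared.
variance_explained = {name: r ** 2 for name, r in correlations.items()}

for name, share in variance_explained.items():
    print(f"{name}: r = {correlations[name]:.2f}, r^2 = {share:.3f}")
# IQ: r = 0.21, r^2 = 0.044
# emotion perception: r = 0.19, r^2 = 0.036
# life satisfaction: r = 0.23, r^2 = 0.053
```

So each factor on its own accounts for roughly 4–5% of the variance. These shares can't simply be added together, since the predictors may overlap with one another.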

Gender, empathy, openness, social curiosity, and task enjoyment did not significantly predict accuracy. The two valid behavioral cues for intelligence were articulation and speech content, and better judges relied more heavily on exactly those two cues.

The intelligence finding is somewhat consistent with the Dunning-Kruger effect: you need the ability yourself to recognize it in others.

Caveats worth mentioning to preempt criticism:

Reliability of individual slope estimates was modest (~0.42), so individual differences were real but noisy

Sample was mostly German university students, limiting generalizability

sciencedirect.com
u/Abjectionova — 10 days ago