
The #1 model on the leaderboard dropped to #14 when I included the benchmarks they didn't report.
u/testofschool — 1 day ago

Before an LLM generates anything, PCR uses Item Response Theory (IRT, 1968) to pick which content chunks to feed it, matching chunk difficulty to the user's ability level.
Tested on Llama 3.3 70B with 20 science passages and 6 user profiles:
- PCR: 6.06/10
- Hardest-for-everyone: 3.67/10
- 15/18 pairwise wins (p=0.004, Cohen's d=1.23)
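
For anyone who wants to sanity-check the 15/18 figure: the post doesn't say which test produced p=0.004, so this is only a sketch that assumes a one-sided exact sign test (each pairwise comparison treated as a fair coin flip under the null). The function name `sign_test_one_sided` is made up for illustration.

```python
from math import comb

def sign_test_one_sided(wins: int, n: int) -> float:
    """P(X >= wins) for X ~ Binomial(n, 0.5), i.e. a one-sided exact sign test.

    Assumption: this may not be the test the original numbers came from.
    """
    return sum(comb(n, k) for k in range(wins, n + 1)) / 2 ** n

p = sign_test_one_sided(15, 18)
print(round(p, 4))  # 0.0038
```

Under that assumption the result rounds to 0.004. A two-sided version would be twice that, about 0.0075.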
One sigmoid, one sort, zero GPU cost for routing.
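
To make "one sigmoid, one sort" concrete, here is a minimal sketch of what IRT-based routing could look like. It is not the actual PCR code. It assumes the standard two-parameter logistic (2PL) IRT model, and it assumes the router prefers chunks whose predicted comprehension probability is closest to a target value. `Chunk`, `p_comprehend`, `route_chunks`, the `target=0.7` default, and the difficulty numbers are all hypothetical.

```python
import math
from dataclasses import dataclass

@dataclass
class Chunk:
    text: str
    difficulty: float            # IRT "b" parameter (made-up scale)
    discrimination: float = 1.0  # IRT "a" parameter

def p_comprehend(theta: float, chunk: Chunk) -> float:
    """2PL IRT: sigmoid(a * (theta - b)) for a user with ability theta."""
    z = chunk.discrimination * (theta - chunk.difficulty)
    return 1.0 / (1.0 + math.exp(-z))

def route_chunks(theta: float, chunks: list[Chunk], k: int = 2,
                 target: float = 0.7) -> list[Chunk]:
    """Sort chunks by distance from the target probability and keep the top k.

    The target is an assumption, not something stated in the post.
    """
    ranked = sorted(chunks, key=lambda c: abs(p_comprehend(theta, c) - target))
    return ranked[:k]

chunks = [
    Chunk("intro", difficulty=-1.5),
    Chunk("standard", difficulty=0.0),
    Chunk("advanced", difficulty=1.5),
]

novice = route_chunks(theta=-1.0, chunks=chunks, k=1)
expert = route_chunks(theta=2.0, chunks=chunks, k=1)
print(novice[0].text, expert[0].text)  # intro advanced
```

The selected chunks would then go into the prompt. All of this is plain CPU arithmetic, which fits the "zero GPU cost for routing" claim.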