My model’s strength differentials for regionals, now with Home Field Advantage calculated in!
So I added a Home Field Advantage (HFA) estimator to my strength-estimating algorithm. To verify that it works properly, I took the 2026 season, added one run to every home team’s score, and re-ran it. The estimated HFA came back 0.8 runs higher than it is if I do NOT mess with the scores, which is decent estimation performance for something simple.
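For anyone who wants to try this kind of "add a run and see if it comes back" check themselves, here is a minimal sketch. It assumes a plain least-squares model where margin = home strength - away strength + HFA, which is only a stand-in for illustration and not the actual algorithm behind the numbers in this post. The function name, the team counts, and the synthetic games are all made up.

```python
import numpy as np

def estimate_hfa(home_idx, away_idx, margins, n_teams):
    """Least-squares fit of margin = s[home] - s[away] + hfa; returns hfa.

    Hypothetical helper for illustration, not the model used in this post.
    """
    n_games = len(margins)
    X = np.zeros((n_games, n_teams + 1))
    rows = np.arange(n_games)
    X[rows, home_idx] = 1.0    # home team's strength adds to the margin
    X[rows, away_idx] = -1.0   # away team's strength subtracts from it
    X[:, -1] = 1.0             # last column is the shared HFA term
    coef, *_ = np.linalg.lstsq(X, margins, rcond=None)
    return coef[-1]

# Synthetic season (made-up data): 20 teams, 400 games, small true HFA.
rng = np.random.default_rng(0)
n_teams, n_games = 20, 400
strength = rng.normal(0.0, 1.5, n_teams)
home = rng.integers(0, n_teams, n_games)
# Offset of 1..n_teams-1 guarantees the away team differs from the home team.
away = (home + rng.integers(1, n_teams, n_games)) % n_teams
true_hfa = 0.1
margins = (strength[home] - strength[away] + true_hfa
           + rng.normal(0.0, 3.0, n_games))

# Adding one run to every home score raises every home-minus-away margin by 1.
base = estimate_hfa(home, away, margins, n_teams)
bumped = estimate_hfa(home, away, margins + 1.0, n_teams)
print(f"baseline HFA estimate:        {base:.3f}")
print(f"estimate after +1 home run:   {bumped:.3f}")
print(f"recovered shift:              {bumped - base:.3f}")
```

A purely linear fit like this one gets the extra run back essentially exactly, so the sketch only shows the mechanics of the test. An estimator that is not a simple linear fit can return less than the full run.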
So the estimator is working fine. But… it tells me that the HFA is 0.003 runs, on average. Which I guess is reasonable, but also negligible. I estimated it using only games between March 10th and May 3rd, since that’s the part of the season where being the home team almost always means you’re actually playing at home: in your own stadium, sleeping in your own bed, practicing in your own facility, and playing in front of your home fans. That window starts after the early tournaments and ends the Sunday before the conference tournament. I also limited it to games where at least one team was in the top 50. There were about 870 such games, so PLENTY of games to estimate one little parameter from without issue. But… it’s just not there. If I include ALL the games in that time period, bringing it up to 3800 games, the HFA goes up to a whopping 0.1 runs. I can also run the estimator another way and get a value of 0.07 runs using the top 50 teams. So that’s as big as I can make it. But I think it’s actually near-zero.
Anyway, I was surprised, and thought I would share.
So I’m giving my model’s estimates WITHOUT HFA, since it seems to be zero-ish anyway.
Alabama by 1.5 runs over LSU
Arkansas by 2.8 runs over Duke
Texas by 3.0 runs over Arizona State
Texas Tech by 0.8 runs over Florida (so here I’m saying it’s more likely that the host loses)
Oklahoma by 3.9 runs over Miss State
Georgia by 0.7 runs over Tennessee (again, my model says it’s more likely that Georgia wins here as the visitor)
Nebraska by 3.0 runs over OK State
UCLA by 3.6 runs over UCF
So by the numbers, it looks like the two that were closest are the two that my model sees opposite to how the NCAA saw them. And maybe Alabama will get a good scare.
As always, my strength numbers are an estimate of a team’s ability to score runs and also prevent the other team from scoring runs. When you subtract those numbers for two teams, you get an estimated score differential. That is what I gave here: the on-average expected score differentials. Use them however you wish. I do have a money-back policy (if you don’t like my opinion I’ll return whatever you paid me for it).
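In code form, the subtraction is about as simple as it sounds. This is just a sketch: the team names and strength values below are made up for illustration and are not my actual ratings, and the function name is hypothetical.

```python
def expected_differential(strengths, team_a, team_b):
    """Expected run margin of team_a over team_b (positive favors team_a)."""
    return strengths[team_a] - strengths[team_b]

# Made-up strength ratings in runs, purely for illustration.
strengths = {"Team A": 4.2, "Team B": 2.7, "Team C": 3.5}

diff = expected_differential(strengths, "Team A", "Team B")
favorite = "Team A" if diff > 0 else "Team B"
print(f"{favorite} by {abs(diff):.1f} runs")
```

Swapping the two teams just flips the sign, so a negative number means the second team is the favorite.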
Of the results posted here, I dislike Tennessee losing, and I dislike Texas winning, but I don’t know enough detail to disagree with my model. We get to start finding out on Thursday.
If anyone is curious about the statistical performance of my model during regionals, I have some interesting but maybe a bit complicated plots I can post. They show that, statistically, the teams played more or less like normal, with normal amounts of randomness, which is what you’d want if you were making statistical models.