r/sportsanalytics

My edge

Hey guys. So, I have an edge, or at least I believe I have an edge. This edge relies on a simple logistic regression model and an implied-probability filter based on moneyline opening odds. It uses simple features like run differential and team-level stats, and it can predict game outcomes as soon as odds drop at a sportsbook.
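
For anyone unfamiliar with the implied-probability filter idea, here is a minimal sketch of one way it could work. The 3-point `min_edge` threshold and the example numbers are assumptions for illustration, not the actual settings behind the model described above.

```python
def implied_prob(moneyline: int) -> float:
    """Convert an American moneyline to its raw implied probability (vig included)."""
    if moneyline < 0:
        return -moneyline / (-moneyline + 100)
    return 100 / (moneyline + 100)


def passes_filter(model_prob: float, moneyline: int, min_edge: float = 0.03) -> bool:
    """Flag a bet only when the model probability beats the line by min_edge.

    min_edge is a hypothetical threshold, chosen here only for illustration.
    """
    return model_prob - implied_prob(moneyline) >= min_edge


# Hypothetical example: the model gives a team 58% at an opening line of -120.
print(round(implied_prob(-120), 4))  # 0.5455
print(passes_filter(0.58, -120))     # True
```

The filter compares against the raw implied probability, so the vig acts as a built-in safety margin; a stricter version could remove the vig first.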

You can look at my website’s history page to see the walk-forward backtests and the current forward test:
https://baseballpredictions.net/history?mode=featured
No, I’m not trying to advertise the website here; it just has the history already collated.

Anyways, my edge seems to be somewhere around 6-20% ROI on 30-140 bets per season (in the backtests, of course; only this year is an actual forward test, so we will see how the season ends). Either way, the problem is that even if I place every bet for the whole season and it hits the maximum profit, I still don’t make anywhere near enough money to retire on, given the amount of bankroll I am willing to put on each bet.
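
To put rough numbers on that, here is a small sketch of the flat-stake arithmetic. The $100 stake is purely hypothetical (the post does not state one); the ROI and bet-count ranges are the ones quoted above.

```python
def season_profit(roi: float, bets: int, stake: float) -> float:
    """Expected flat-stake profit for a season: ROI times total amount staked."""
    return roi * bets * stake


STAKE = 100.0  # hypothetical flat stake per bet, not from the post

low = season_profit(0.06, 30, STAKE)    # worst case quoted: 6% ROI on 30 bets
high = season_profit(0.20, 140, STAKE)  # best case quoted: 20% ROI on 140 bets
print(f"{low:.0f} to {high:.0f}")       # 180 to 2800
```

Profit scales linearly with stake and bet count, so at a fixed ROI the only levers are a bigger stake or more qualifying bets.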

The question is, what do I do from here? Obviously I will finish betting this season and see if my backtesting is at all accurate in the forward test, but even if it is accurate, I still need to somehow improve my edge.

Pretty much rambling but I hope you got the point. I have an edge supposedly, and the edge seems pretty good, but apparently it’s not even good and I feel stuck? Thoughts??!

Bets attached (only 20 pictures allowed; I can post the rest in a separate post if you want, or maybe comment them?)

u/Prudent_Student2839 — 20 hours ago
▲ 5 r/sportsanalytics+3 crossposts

Need testers for my sports bracket app! I will test yours back immediately

Hi everyone!

I need some help reaching the 20 testers requirement for my app. It's the mobile version of my website bracketmundial.com, designed to create and manage sports tournament brackets.

I will 100% test your app in return and keep it installed for the required 14 days. Just leave your links in the comments!

Here are the links to test mine:

1. Join the Google Group:

https://groups.google.com/g/bracketmundial

2. Opt-in web link:

https://play.google.com/apps/testing/com.jesusprodriguez.bracketmundial

3. Android app link:

https://play.google.com/store/apps/details?id=com.jesusprodriguez.bracketmundial

u/Academic-Maize-4071 — 16 hours ago

How we built an MLB prop model using Statcast — and where books consistently leave money on the table

The hits prop. The strikeout over. The home run parlay leg you picked because he's been "on a tear."

Most bettors approach player props the same way books want them to — reacting to recent results, anchoring to season averages, and guessing on matchups they haven't actually researched. This post is about why that process finds almost no edge, and what the data actually says instead.

Why books misprice props at scale

On a full 15-game MLB slate, books are setting 300–400 individual player props. Game lines get precise models and fast sharp corrections. Props don't. The volume makes it structurally impossible to price every prop with the same precision applied to the main line — especially for mid-slate games and less-followed prop types like hits and RBI.

The mispricing isn't random. It clusters in three specific conditions:

Recency bias. Books anchor on recent results. A batter going 1-for-18 gets his hit line dropped even if his xBA, Barrel%, and hard-hit rate haven't moved. The book is reacting to outcomes. The data is telling a different story.

Platoon lag. Most books use blended season splits. A left-handed batter facing a right-handed pitcher tonight has a meaningfully different true hit probability than his season average against all pitchers suggests. The gap between blended and split-specific is often 25–40 points of batting average — that's the entire margin on a prop line.

Workload blindspots. A pitcher with a 9.0 K/9 on a strikeout line priced for 6.5 Ks is a bad bet if his team is pulling him at 85 pitches. That workload pattern is in the data. The book's line often doesn't reflect it.

The metrics that actually predict prop outcomes

For hit props — xBA over batting average. xBA (Expected Batting Average) calculates what a batter's average should be based on exit velocity, launch angle, and spray direction against historical outcomes on balls hit the same way. A batter hitting .231 with a .298 xBA is making contact the scoreboard doesn't reflect yet. That gap closes. The book anchors on .231. The model sees .298.

For strikeout props — SwStr% over K/9. Swinging Strike Rate measures swing-and-miss generation per pitch thrown, independent of whether those misses converted to strikeouts. It leads K/9 by 2–3 starts when a pitcher's stuff is improving or declining. A pitcher with elite SwStr% and a suppressed K rate is due for a spike. K/9 misses this entirely.

For home run props — Barrel% × park factor interaction. Barrel% (the optimal exit velocity + launch angle combination) is the cleanest measure of true power. Pair it with the park's HR factor and the opposing pitcher's hard-hit rate allowed, and you have a model input the book's generalized HR line doesn't fully capture.

https://preview.redd.it/fzkwrmkh8e2h1.png?width=2564&format=png&auto=webp&s=91e76ded9c194cc672501405613228ed9a57be3c

How we score it

Each of those inputs feeds into an EdgeScore from 0–100. The score isn't a win probability — it's a measure of how strongly the Statcast data diverges from the book's implied probability. High score means the inputs are stacked in the bettor's favour and the line hasn't caught up.

https://preview.redd.it/cu8sluxm8e2h1.png?width=2526&format=png&auto=webp&s=de87e9d9f80c1d9ab522ea255d200212aa419b28

The EV layer sits on top. Once the true probability is modelled using a Poisson distribution for counting stats, we strip the book's vig and calculate the exact gap between true probability and implied probability. That output — LineCheck — tells you whether the edge is real before you place.

A prop with a true over probability of 61% offered at -115 (53.5% raw implied probability, vig still included) has a 7.5-point probability edge, which works out to roughly +14% EV per unit staked. That's the signal. Everything else is noise.
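
A minimal sketch of the EV step described above, under stated assumptions: the Poisson mean, the -105 under price, and the proportional vig removal are illustrative choices, not ProprStats' actual method.

```python
import math


def poisson_over_prob(mean: float, line: float) -> float:
    """P(X > line) for X ~ Poisson(mean); a 6.5 line means 7 or more."""
    k_max = math.floor(line)  # largest count that does not clear the line
    cdf = sum(math.exp(-mean) * mean**k / math.factorial(k) for k in range(k_max + 1))
    return 1.0 - cdf


def implied_prob(american: int) -> float:
    """Raw implied probability of an American price (vig included)."""
    return -american / (-american + 100) if american < 0 else 100 / (american + 100)


def devig_two_way(over_odds: int, under_odds: int) -> tuple[float, float]:
    """Remove vig by proportional normalization (one of several possible methods)."""
    p_over, p_under = implied_prob(over_odds), implied_prob(under_odds)
    total = p_over + p_under
    return p_over / total, p_under / total


def ev_per_unit(true_prob: float, american: int) -> float:
    """Expected profit per 1 unit staked at the given American price."""
    payout = 100 / -american if american < 0 else american / 100
    return true_prob * payout - (1.0 - true_prob)


# The 61% at -115 example from the post; the -105 under price is hypothetical.
fair_over, fair_under = devig_two_way(-115, -105)
print(round(ev_per_unit(0.61, -115), 4))  # 0.1404
print(round(fair_over, 4))
# Hypothetical strikeout prop: modelled mean of 7.2 Ks against a 6.5 line.
print(round(poisson_over_prob(7.2, 6.5), 4))
```

Note the difference between the probability gap (7.5 points against the raw implied number) and the EV per unit staked, which also depends on the payout.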

https://preview.redd.it/enyuav409e2h1.png?width=2528&format=png&auto=webp&s=5d49263e20549b1e032e2d37355094efdf02d26e

What this looks like in practice

The dashboard ranks every MLB prop across the full slate by EdgeScore in one view. The hover breakdown shows which specific factors are driving the score — so you know whether you're betting a strong platoon advantage, elite contact quality, or a pitcher workload mismatch. The Parlay Builder stacks the highest EV legs automatically.

This is what we built at ProprStats. The Statcast model does the research. You see the reasoning, not just the output.

Happy to go deep on any part of the methodology in the comments — the Poisson modelling, the platoon weighting, or how we handle small-sample noise on the xBA inputs.

u/StatsPropGuy — 22 hours ago
▲ 0 r/sportsanalytics+1 crossposts

From Crypto Bots to Football: I applied my 400+ model "Quant" system to sports betting

https://preview.redd.it/3d3n5qc6ha1h1.png?width=1920&format=png&auto=webp&s=3d4008b8b6d332acb5a9286dca0c2bd8222501d5

(This is a repost, because for some reason Reddit didn't like the language I used on my previous post)

Hey everyone,

I’m an IT student currently refining a quantitative framework I’ve been developing over the last two years. Initially, my focus was on high-volatility assets like crypto, where I spent time building models to account for market sentiment and volume. Recently, I decided to take on a much more complex challenge: the football (soccer) market.

For the last 6 months, I’ve been migrating my "Quant" architecture to handle sports metrics. My goal shifted from managing capital to building a purely statistical "second brain"—a research tool designed to strip away the "gut feeling" and narratives that usually cloud this space.

  • The Architecture: A pipeline of 400+ ensemble models processing 60+ variables per fixture.
  • Data Inputs: I focus strictly on market movement (odds) and team-level performance metrics (xG, possession, etc.). I’ve deliberately excluded individual player data to reduce noise and maintain model scalability.
  • Markets: Currently modelling HDA, Over/Under, Goals and BTTS. I’ve recently added Corners, and Double Chance is currently in testing.
  • Coverage: Wide-scale coverage including the big 5 European leagues, Championship, MLS, and several Scandinavian and South American divisions.

The approach is 100% mathematical. I’m looking at this as a probability problem rather than a sports problem. In early testing with a small group of users, the model has shown a consistent ability to identify value in high-variance markets (specifically the MLS and lower European divisions).

I’ve reached a point where my own backtesting and limited forward-testing show a steady statistical edge (maintaining a 60%+ hit rate on primary markets during the last cycle), but I need more "stress testing" from people who understand algorithmic modelling.

I’ve built a dashboard to host these daily statistical projections to keep the project organised. It’s a completely free research tool—I’m not looking to sell anything.

I’m looking for "power users" and fellow quants/developers to help me refine the logic. I want to confirm these data points are useful to other researchers before I look into scaling the infrastructure or approaching potential investors.

I’m happy to share the data or the link for the dashboard in the comments if anyone wants to look at the projections for this weekend’s fixtures.

I'm keen to hear your critiques on the methodology:

  • Are there specific high-signal variables I might be missing?
  • Should I expand into player-specific props, or does that introduce too much variance for a quant model?
  • What other niche leagues tend to follow statistical trends better than the "top-heavy" ones like the Premier League?

Looking forward to some technical feedback! I have also shared some pictures of my system, how the process is going, and my own picks for the day based on what the system suggested.

https://preview.redd.it/on9g2xp4ha1h1.jpg?width=800&format=pjpg&auto=webp&s=be4521807ecdf26e0ed979a10480102d97cffc1a

https://preview.redd.it/6zyq0oc6ha1h1.png?width=1038&format=png&auto=webp&s=7d6f09c1cf6c727a33c4c7ed420fb7ef38e7eac7

https://preview.redd.it/bw732pc6ha1h1.png?width=1201&format=png&auto=webp&s=0551841ec420cdf9ec4e893bdb1406b861a32b5b

u/Internal-Cover-339 — 1 day ago

Looking for SofaScore Player Ratings Dataset for Football Finance MSc Thesis

Hi everyone,

I am currently working on my MSc thesis on football finance and data-driven player acquisition. For my research, I am looking for a SofaScore dataset with the following variables:

  • Player name
  • Average rating
  • Team name
  • Season

Ideally, the dataset would cover the seasons 2019/20, 2020/21, 2021/22, 2022/23, and 2023/24.

The leagues I am most interested in are:

  • Eredivisie
  • Campeonato Brasileiro Série A
  • Primeira Liga / Liga Portugal
  • Belgian Pro League
  • EFL Championship
  • Süper Lig

Does anyone happen to have access to this kind of data, or know where I might be able to find it?

If not, I would also really appreciate any guidance on how to scrape this data from SofaScore or similar platforms in a reliable way. I am fairly new to coding, so even pointers to useful tools, scripts, APIs, or tutorials would be very helpful.

Thanks in advance!


Give me your weirdest NBA prop theory and I’ll try to backtest it

I’m building an NBA prop model and want more angles to test.
Not obvious stuff like “minutes matter” or “usage matters.”
I mean weird but testable theories.
Examples:
  • guards on B2Bs go under assists
  • rested bigs rebound better vs tired teams
  • co-star out can hurt points overs because defenses load up
  • rebound props depend more on opponent shot profile than rebounding rank

The more specific, the better.

I’m trying to separate angles that sound sharp from angles that actually survive data.

Drop one you’d want tested, and I’ll report back what I find.

u/TallPassenger2738 — 1 day ago

I built a playoff model before Round 1 and just tested it through two full rounds — 9/12 series correct so far.

The point of the model was not to predict every series perfectly. It was to separate structural contenders from teams that only look good in the regular season.

My main takeaway after two rounds: the model is very good at identifying large structural mismatches, but it struggles more in coin-flip series where one player can swing the entire matchup.
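
For context on the 9/12 record, here is a quick sketch of how often a pure coin-flip picker would do at least that well. The 50% baseline is an assumption; a baseline that always picks the higher seed would score better than a coin flip, so treat this as a loose lower bar.

```python
from math import comb


def prob_at_least(k: int, n: int, p: float = 0.5) -> float:
    """Probability of k or more correct picks out of n at per-pick accuracy p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


print(round(prob_at_least(9, 12), 3))  # 0.073
```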

A few things the model seems to capture well:

  • shot creation under playoff pressure
  • defensive scalability across series
  • weak-link exposure
  • clutch decision-making
  • roster optionality

A few things it still misses:

  • matchup-specific solutions
  • individual playoff variance
  • in-series injuries
  • momentum / confidence shifts

The biggest lesson so far is this:

The model understands systems, not solutions.
That’s why it’s been strong on the obvious series, and weaker when a specific player or matchup changes everything.

I wrote the full report card here: https://open.substack.com/pub/atakankaraoban/p/the-playoff-viability-model-conference?r=6fb0sd&utm_campaign=post&utm_medium=web&showWelcomeOnShare=true

Curious what people think:
In playoff basketball, what matters more — team structure or elite individual variance?

u/Logical_Demand435 — 1 day ago

Football API for analytics dashboard

Hi everyone,

I'm working on a small football analytics/dashboard project and I'm trying to choose the right data provider before building the database structure.

What I need is mainly:

  • Fixtures, teams, players, squads
  • Lineups, formations and substitutions
  • Match events: goals, cards, substitutions, shots, penalties, VAR events if available
  • Detailed team and player match statistics, not just goals and assists
  • Player stats such as shots, shots on target, passes, key passes, crosses, tackles, interceptions, clearances, aerial duels, ground duels, fouls, dribbles, possession lost/won, goalkeeper saves, etc.
  • Historical data for the main European leagues
  • Ideally player heatmaps or some kind of positional/event-level data
  • Stable IDs for teams, players and matches
  • Decent documentation and predictable pricing

So far I'm looking at SportMonks as the main provider, and possibly using other sources only to fill gaps, especially for things like heatmaps, event-level data or more detailed player stats.

I've also seen SofaScore APIs on RapidAPI/APIDojo, API-Football, StatsBomb open data, football-data.org, Understat/FBref scraping, etc., but I'm not sure which ones are reliable enough in practice.

For those who have actually used these sources:

  1. Which provider would you trust as the main data source for a football analytics project?
  2. Is SportMonks reliable enough for a structured database/dashboard?
  3. Are RapidAPI football APIs stable enough, or should they only be used for prototyping/enrichment?
  4. Where would you get player heatmaps, detailed player stats or event-level data without paying enterprise-level prices?
  5. Any hidden issues with coverage, rate limits, missing fields, unstable IDs or inconsistent historical data?

I'm not looking for a perfect enterprise solution, just something solid enough to build a serious prototype without having to rebuild everything later.

Any real-world experience would be super helpful. Thanks!

u/sevenx986 — 2 days ago

I built a football analytics platform that goes beyond standard xG to evaluate "deserved" outcomes.

I’ve spent the last few months building numbertwenty.io, a football data platform designed to calculate the true "deserved" outcomes of matches by filtering out the game's "aleatoric" noise. I just wanted to share it to get some fresh eyes and constructive feedback, so I can improve my model / platform.

The problem I'm trying to tackle:

We all know football is inherently chaotic (according to statistics). A single rare event can flip a result, which is why relying solely on the final score can miss the dynamics of a match. We often use standard Expected Goals (xG) to assess the fair result, but it also has limits when analyzing a single game, such as:

  • The "Draw" blindspot: Football has 3 outcomes (1-X-2). But xG models (even Poisson-derived ones) rarely make the draw the most probable outcome unless both xG values are low and nearly identical.
  • Context is ignored: Generating 1.5 xG away from home is inherently harder than doing it at home, but raw xG doesn't capture this dynamic.
  • Volume vs. Control: A team spamming low-probability shots can inflate their xG without actually controlling the game.
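
The draw blindspot in the list above can be illustrated with a minimal sketch that treats each side's goals as independent Poisson variables with mean equal to its xG. That independence assumption is a common simplification, not necessarily how any particular xG model derives outcomes, and the xG pairs are made up.

```python
import math


def poisson_pmf(k: int, mean: float) -> float:
    """Probability of exactly k goals under a Poisson distribution."""
    return math.exp(-mean) * mean**k / math.factorial(k)


def outcome_probs(xg_home: float, xg_away: float, max_goals: int = 10):
    """Return (home win, draw, away win) from independent Poisson scorelines."""
    home = draw = away = 0.0
    for h in range(max_goals + 1):
        for a in range(max_goals + 1):
            p = poisson_pmf(h, xg_home) * poisson_pmf(a, xg_away)
            if h > a:
                home += p
            elif h == a:
                draw += p
            else:
                away += p
    return home, draw, away


# Hypothetical xG pairs: low and equal, moderate and equal, moderate and unequal.
for xg in [(0.5, 0.5), (1.0, 1.0), (1.5, 1.2)]:
    home, draw, away = outcome_probs(*xg)
    print(xg, round(home, 3), round(draw, 3), round(away, 3))
```

Under these assumptions the draw only comes out on top when both xG values are low; at 1.0 vs 1.0 each win is already more likely than the draw.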

As a result, there is no direct metric that quantifies the "Fair Result" of a football match.

The core idea of numbertwenty.io:

To tackle this inverse problem (and cut through the match's aleatoric noise), I use the very simple principle of similarity search (statistical neighbors).

The pipeline compares a match's statistics (derived from raw stats) against thousands of football games (each feature being weighted according to its relevance in the competition!). By finding a match's closest statistical neighbors, and after performing a calibration to match the observed distribution of 1-X-2 in the competition, the model surfaces a realistic probability distribution of what the outcome truly deserved to be.
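
As a rough illustration of the neighbor-search principle described above, here is a minimal sketch using a weighted Euclidean distance. The feature names, weights, and toy history are hypothetical, and the calibration step against the competition's observed 1-X-2 distribution is left out.

```python
import math
from collections import Counter


def weighted_distance(a, b, weights):
    """Weighted Euclidean distance between two feature vectors."""
    return math.sqrt(sum(w * (x - y) ** 2 for x, y, w in zip(a, b, weights)))


def deserved_distribution(match, history, weights, k=3):
    """Share of each outcome ('1', 'X', '2') among the k closest past matches."""
    neighbors = sorted(history, key=lambda item: weighted_distance(match, item[0], weights))[:k]
    counts = Counter(outcome for _, outcome in neighbors)
    return {o: counts.get(o, 0) / len(neighbors) for o in ("1", "X", "2")}


# Hypothetical features: (xG difference, shots-on-target difference, possession difference)
history = [
    ((1.2, 4, 10), "1"),
    ((0.9, 3, 5), "1"),
    ((0.1, 0, 2), "X"),
    ((-0.2, -1, -3), "X"),
    ((-1.0, -3, -8), "2"),
    ((0.8, 2, 12), "X"),
]
weights = (1.0, 0.2, 0.01)  # hypothetical per-feature relevance weights

result = deserved_distribution((1.0, 3, 8), history, weights, k=3)
print(result)
```

With thousands of matches instead of six, k would be much larger and the resulting shares would then be calibrated as the post describes.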

I detail the whole process a bit more in the about section of the website. The current model is surely not the final version and can evolve over time. I also added a simple predictive algorithm based on the same principle as the post-match analysis, but it's not the main purpose of the website, and I will try to improve it in future updates. I really focus on post-match analysis, which also highlights just how random results can be, and why betting is highly uncertain!

Beyond all of that, I tried to add plenty of other tools on the platform for you to check out, like a dynamic Fair Elo ranking, an automatically generated analysis of football matches according to statistics (experimental)...

This is my first time building and deploying a full-stack platform, so any feedback is welcome!

(Quick note on ads: they are managed automatically by Google with most settings kept to the bare minimum. If you find them too intrusive or if they ruin the UX, please let me know and I will try to adjust them manually).

Here are some screenshots from desktop:

Main menu (Green background = deserved result, Red background= unfair outcome)

Match details (showing the 'Similar Matches' neighbors, resulting probability distributions, and more features in tabs)

Competition overview

Team profile

u/onenumbertwenty — 2 days ago
▲ 5 r/sportsanalytics+2 crossposts

Not All Sprint Training Should Be the Same

I took real data and mapped all runs above 5.5 m/s per position and type of event.
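
For anyone wanting to try something similar, here is a minimal sketch of pulling out runs above 5.5 m/s from a sampled speed trace. The sample values and the `min_samples` cutoff are made up for illustration; this is not the pipeline used for the clip.

```python
def high_speed_runs(speeds, threshold=5.5, min_samples=2):
    """Return (start, end) index pairs (end exclusive) of consecutive samples above threshold."""
    runs, start = [], None
    for i, v in enumerate(speeds):
        if v > threshold and start is None:
            start = i  # a new run begins
        elif v <= threshold and start is not None:
            if i - start >= min_samples:
                runs.append((start, i))
            start = None  # run ended (kept only if long enough)
    if start is not None and len(speeds) - start >= min_samples:
        runs.append((start, len(speeds)))  # run still in progress at the end
    return runs


speeds = [3.1, 5.6, 6.2, 6.0, 4.8, 5.7, 5.2, 5.9, 6.4, 7.1]  # hypothetical m/s samples
print(high_speed_runs(speeds))  # [(1, 4), (7, 10)]
```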

The clip I included shows a right center back while in possession of the ball. What I found is that not all high-speed running and sprints are the same.

I included only a small sample so we can see that most runs are not linear. Adding different types of sprints is crucial for better training and for injury reduction.

The data science techniques used were clustering and PCA, to find features and relate player positions.

With this knowledge, we can create better training regimens.

u/URThrillingMeSmalls — 2 days ago
▲ 1 r/sportsanalytics+1 crossposts

We just launched a football community app focused on live match discussions and predictions, looking for honest feedback

Hey everyone,

I’ve been building something called Fanverse since February 2025 and we just launched the Android version.

It’s a football community platform built around:

  • live match chats during games
  • match predictions
  • football debates and discussions
  • tournament-based communities (World Cup, leagues, etc.)

The idea is to make football conversations feel more real-time and connected instead of scattered across different platforms.

We’re still very early and I’m not here to promote aggressively — I’m mainly looking for honest feedback from football fans or people who’ve built/used community platforms before.

What do you think works well in early-stage community products, and what usually causes them to fail at the start?

Android link : https://play.google.com/store/apps/details?id=com.fanverse.sportshub

u/FanverseSports — 2 days ago
▲ 18 r/sportsanalytics+12 crossposts

I built a sports analytics app for player prop research — would love feedback

Hey everyone! I just launched the iOS version of AlgoSwish, a sports analytics app I’ve been working on for a while.

The app is built for bettors who want to research smarter without having to jump between a bunch of different sites. Right now, it’s focused on NBA analytics and includes things like:

  • Player prop research
  • Player analysis and stat breakdowns
  • Market movement
  • Bet tracking
  • Model picks
  • Parlay builder

A big thing I want to mention: most of the app is free to use, and you don’t even need an account to try some of the free research tools. The main Pro feature right now is the Picks screen.

iOS is live now, and Android should be coming sometime next week.

Current roadmap:

  • WNBA next
  • MLB after that
  • NFL after that
  • More sportsbooks, exchanges, and DFS-style books planned too, including platforms like Kalshi, PrizePicks, Novig, and more
  • Currently launching in the US and Canada first, with UK, EU, and Australia planned later

Since I’m launching late into the NBA season during the playoffs, I also want to be transparent: the model picks and parlay builder have tested really well during the previous seasons, but playoff rotations and matchups can get weird, so I’d be more cautious/selective with picks during this stretch. I’m going to keep improving the app as more data comes in and more sports are added.

I’m also offering a welcome deal for early users:

50% off the 1-month Pro subscription for the first 100 people who redeem it. Code expires May 31, 2026.

Redeem promo code:
https://apps.apple.com/redeem?ctx=offercodes&id=6764715128&code=WELCOME

App Store Download link:
https://apps.apple.com/us/app/algoswish-sports-picks/id6764715128

I’d genuinely appreciate any feedback, good or bad. I’m trying to build this into a high-quality sports analytics app with fair pricing and useful tools, not just another hard-paywalled betting research app.

Thanks for checking it out.

u/dubuckets — 3 days ago
▲ 11 r/sportsanalytics+2 crossposts

Using Clustering to Discover Patterns in Sprinting

This is a cool visual aid I made in Python to explain clustering.

I've taken sprints across a game and tried to determine which positions relate to others. I believe this will help with sprint training, since there are many ways we can train them, such as short sprints, curved sprints, cutting, plyometrics, etc.

u/URThrillingMeSmalls — 3 days ago
▲ 1 r/sportsanalytics+1 crossposts

Sports betting app tool?

Y’all know the name of this sports betting app tool?

u/x_IGHT_x — 3 days ago
▲ 4 r/sportsanalytics+4 crossposts

Sport Predicting RPG!

https://apps.apple.com/us/app/betsfriends/id6761031275

Imagine a game like Clash of Clans, but in order to level up, your XP and in-game currency are earned by your ability to predict real-life sporting events. That’s where we are heading with this. All the fun of gambling without any risk of financial ruin. Hope you can see the value in that!

Hey everyone, I'm the solo developer behind an app called BetsFriends, and I'm at the point where I really need real people to try it and tell me the truth about it.

Here is the simple version of what it is. On the surface it is a game. You make sports picks, and when you get them right you earn XP and units, you level up through tiers, and you build out a custom avatar with accessories that reflects how good you actually are. It is genuinely fun and a little addicting to climb. But underneath that game is something I built for people who actually follow sports closely: a real tool for tracking picks, comparing yourself to other people, and seeing who actually knows what they are talking about.

Here is what is in it right now:

Make your picks. Call games across multiple sports and markets, and your results follow you.

XP, units, and tiers. Correct picks earn XP and units. You level up through tiers, and your profile shows your standing. Units also let you unlock avatar accessories, so progression actually means something visually.

Custom avatar system. Build a character with layered accessories. It is tied to your record, so your avatar is basically your reputation that other people can see.

See everyone's picks. This is the real handicapping layer. You can see other people's picks, share your own, and follow sharp pickers instead of guessing in the dark.

Records and history. Look at anyone's full pick history and track record over time. No more people claiming they "had it" after the fact. The receipts are right there.

Best picker per team. See who is actually the sharpest on a specific team, not just overall. If someone is elite on one team's games, that is incredibly useful information, and the app surfaces it.

Live scoreboard with live chat. Follow games live and talk through them in real time with other people who have action on the same game. The games are more fun when you are watching them with people.

Leaderboards and competition. It is built around friendly competition. The whole point is proving you are the sharpest, with zero money involved. No deposits, no losses, no chasing anything. Just skill and bragging rights.

The reason I am posting is honest. It is still early and there are not a lot of users yet. An app like this only comes alive when real people are in it, picking against each other and sharing records. I would rather have a small group of people who actually use it and tell me what is broken than a big number that means nothing.

So here is my ask. If any of this sounds interesting, I would really appreciate it if you would download it, make picks for a few days, and then tell me what you actually think. What is confusing, what is missing, what made you want to come back or not. Brutal feedback helps me more than polite feedback. Being this early means the things you flag genuinely shape where this goes next.

I also want to be straight about why this matters. I am one person building this, and I have a lot more I want to add. Things like a battle pass, deeper progression, bigger competition, and more reasons to keep coming back. But I can only justify pushing that further if there are real people here using what already exists. Every person who tries it and gives me honest feedback is literally what lets me keep building.

It is just me. No big company behind this. I am building something I think should exist, and your support and your honesty are what move it forward.

If you have questions, or you just want to tell me what sucks and why, reply here or DM me. I read and answer everything.

u/MOONNNMANNN — 3 days ago
▲ 1 r/sportsanalytics+2 crossposts

Per-bucket Platt scaling on a 425-bet sports model: global per-sport fit was making it worse

Working on a sports prediction model in my spare time and ran into a calibration issue that surprised me. Sharing in case it's useful, or anyone has thoughts.

Setup: my journal has ~425 graded singles across NBA, MLB, and MMA. I was applying per-sport Platt scaling (one A/B sigmoid fit per sport) as a final residual correction after the Elo + form + market-anchor blend. Standard pattern from the binary-classifier literature.

Worked fine until I started bucketing the journal by verdict tier (STRONG BET vs GOOD BET) and noticed the two tiers were miscalibrated in opposite directions:

NBA STRONG (eff_n=28, 30d half-life):   A = -6.83, B = +4.23
NBA GOOD   (eff_n=22):                  A = +2.65, B = -1.77
MLB STRONG (eff_n=46):                  A = +3.57, B = -2.52
MLB GOOD   (eff_n=127):                 A = +2.10, B = -1.12

The NBA STRONG and GOOD slopes disagree on sign. The single per-sport NBA fit (A=-1.36) was averaging those two errors and correcting both buckets wrong.

Fixed by switching to per-(sport, verdict_tier) Platt with a fallback chain: per-bucket when eff_n ≥ 12, else per-sport, else global, else identity. Verdict tier at inference is inferred from the probability band (p ≥ 0.68 = STRONG, 0.55-0.68 = GOOD) since the actual verdict label isn't known until edge is computed downstream.

Delta on synthetic predictions:

NBA STRONG @ p=0.78:  per-sport -> 0.358,  per-bucket -> 0.251  (-10.7pt)
MLB GOOD   @ p=0.62:  per-sport -> 0.516,  per-bucket -> 0.546  (+3.0pt)
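
A minimal sketch of the fallback chain, assuming the correction has the form sigmoid(A * p + B) applied to the raw probability. That form roughly reproduces the per-bucket numbers above, but the exact parameterization is an assumption, as is applying the eff_n cutoff only at the bucket level.

```python
import math

MIN_EFF_N = 12  # per-bucket cutoff from the post

# (A, B, eff_n) per (sport, tier), copied from the coefficients listed above
BUCKET_FITS = {
    ("NBA", "STRONG"): (-6.83, 4.23, 28),
    ("NBA", "GOOD"): (2.65, -1.77, 22),
    ("MLB", "STRONG"): (3.57, -2.52, 46),
    ("MLB", "GOOD"): (2.10, -1.12, 127),
}


def platt(p: float, a: float, b: float) -> float:
    """Platt-style correction on the raw probability (assumed form)."""
    return 1.0 / (1.0 + math.exp(-(a * p + b)))


def infer_tier(p: float):
    """Infer the verdict tier from the probability band."""
    if p >= 0.68:
        return "STRONG"
    if p >= 0.55:
        return "GOOD"
    return None


def calibrate(p, sport, bucket_fits, sport_fits=None, global_fit=None):
    """Per-bucket fit if eff_n is large enough, else per-sport, else global, else identity."""
    bucket = bucket_fits.get((sport, infer_tier(p)))
    if bucket is not None and bucket[2] >= MIN_EFF_N:
        return platt(p, bucket[0], bucket[1])
    if sport_fits and sport in sport_fits:
        return platt(p, *sport_fits[sport])  # (A, B) per sport
    if global_fit is not None:
        return platt(p, *global_fit)
    return p  # identity fallback


print(round(calibrate(0.78, "NBA", BUCKET_FITS), 3))  # close to the 0.251 reported above
print(round(calibrate(0.62, "MLB", BUCKET_FITS), 3))  # close to the 0.546 reported above
```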

Calibration audit is live at lakeshore-edge.com/model if anyone wants the raw data. Per-bucket coefficients update on the journal reflect loop.

Caveats I'm aware of:

  • 425 bets is still tiny for a 2-parameter sigmoid per bucket
  • Verdict-tier inference from p-band has its own selection bias (high-p picks become STRONG more often, so the bucket fit is fitting on a non-random subset)
  • Time-decay weighting (30d half-life) is plausible but not validated against a held-out window

Question for the sub: anyone done per-bucket calibration in this kind of small-sample regime? Specifically deciding between hierarchical Bayes (pool partial information across buckets) and the simpler fallback chain I'm running now.

u/mangoman40114 — 3 days ago
▲ 5 r/sportsanalytics+1 crossposts

Which tools do you use for long-term player development tracking in your club?

As a young coach trying to modernize our assessment process, I'm looking for recommendations. We typically do CMJ, attack and block jump recordings monthly, plus endurance testing and COD (change-of-direction) drills, but the manual recording is just too time-consuming. What tools do you use to make it trackable over the long run?

u/Hungry_Raspberry1768 — 3 days ago
▲ 10 r/sportsanalytics+4 crossposts

I built a website for the 2026 World Cup bracket: free, with Excel export

With the World Cup less than a month away, I wanted somewhere to fill in my predictions and share them with my WhatsApp group.

The app lets you fill in the complete bracket, round by round, and at the end you can export it to Excel to send or print.

No sign-up, no ads.

🔗 https://bracketmundial.com

If you find any bugs or have suggestions, I read them all.

u/Academic-Maize-4071 — 4 days ago

6 weeks, 889 bets, +16% ROI flat stake — full breakdown by league (one league is genuinely broken)

6 weeks in, 889 bets tracked: here's what the data actually looks like

Back in my first post I shared the early methodology. Enough data has accumulated to post a proper update. All figures are virtual flat-stake (€10/bet), real resolved predictions.


Overall performance

| Metric | Value |
|---|---|
| Total bets | 889 |
| Win rate | 69.3% |
| Avg odds | 1.72 |
| Net profit | +142 units |
| ROI | +15.99% |

At €10/bet that's +€1,421 profit on €8,890 wagered. I'll be honest — I didn't expect this to hold up at scale. The early sample was noisy.


By league (top 15, min 20 bets)

| League | Bets | Win% | ROI |
|---|---|---|---|
| MLS | 41 | 82.9% | +41.0% |
| UAE Pro League | 31 | 77.4% | +38.0% |
| Bundesliga | 24 | 79.2% | +30.9% |
| La Liga | 22 | 72.7% | +29.7% |
| Trendyol 1. Lig | 24 | 75.0% | +25.3% |
| Brasileirão | 24 | 79.2% | +25.3% |
| Premier League | 28 | 71.4% | +12.3% |
| Ligue 1 | 33 | 69.7% | +9.5% |
| Eliteserien | 37 | 59.5% | +3.9% |
| Ekstraklasa | 23 | 52.2% | -21.8% ⚠️ |

What I found interesting

The big leagues aren't necessarily the best performers. Bundesliga (+30.9%) and La Liga (+29.7%) are outperforming Premier League (+12.3%) by a wide margin. My working hypothesis: top leagues have tighter markets and more efficient odds, which compresses edge. Lower-tier competitions with less liquidity seem to be where the model finds more signal.

MLS and UAE at +38-41% are statistical noise for now — both under 50 bets. I'm not drawing conclusions from those yet.

Ekstraklasa is a genuine problem. 23 bets, 52.2% win rate, -21.8% ROI. That's not noise — something about Polish football specifically isn't fitting the model's assumptions. My best guess: physical, high-press style produces momentum readings that look threatening but don't convert. I'm tightening the gates specifically for that league.
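
One way to gauge how much a per-league sample of this size can say is a Wilson score interval on the win rate. This is a minimal sketch; the win counts (34 of 41 and 12 of 23) are reconstructed from the rounded win rates in the table, which is an assumption.

```python
import math


def wilson_interval(wins: int, n: int, z: float = 1.96):
    """Approximate 95% Wilson score interval for a win rate."""
    p = wins / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half


# Win counts reconstructed from the rounded win rates (assumption).
for league, wins, n in [("MLS", 34, 41), ("Ekstraklasa", 12, 23)]:
    lo, hi = wilson_interval(wins, n)
    print(f"{league}: {wins}/{n} wins, interval {lo:.3f} to {hi:.3f}")
```

With 23 bets the interval spans well over 30 points of win rate, so a single league's record at this size is compatible with a wide range of true performance.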


The dynamic vs flat stake question

Someone will ask this so I'll address it upfront: I'm sticking with flat stake for now.

My confidence calibration error is currently around 5pp — meaning when the model says 75% confident, the actual win rate is closer to 70%. Dynamic staking would amplify that miscalibration. Kelly criterion at 69.3% win rate / 1.72 avg odds suggests ~27% of bankroll per bet, which is recklessly aggressive. +16% ROI flat is already exceptional — chasing more variance isn't worth it at this sample size.
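
For anyone who wants to check the Kelly figure, a minimal sketch using the aggregate win rate and average odds from the table. The second call shaves 5pp off the win rate to show how sensitive the stake is to the calibration error mentioned above; that adjustment is an illustration, not part of the system.

```python
def kelly_fraction(win_prob: float, decimal_odds: float) -> float:
    """Full-Kelly stake as a fraction of bankroll: (b*p - q) / b, with b = odds - 1."""
    b = decimal_odds - 1.0
    return (b * win_prob - (1.0 - win_prob)) / b


full = kelly_fraction(0.693, 1.72)
haircut = kelly_fraction(0.693 - 0.05, 1.72)  # same odds, win rate reduced by 5pp
print(round(full, 3))     # 0.267
print(round(haircut, 3))  # 0.147
```

A 5-point overestimate of the win rate nearly halves the suggested stake, which is the practical argument for staying flat until calibration improves.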

Once calibration error drops below 2pp and I have 200+ bets per league, I'll revisit.


What changed since the first post

A few meaningful updates to the system:

  • Added bookmaker AH (Asian Handicap) signal from live odds — when bookmaker and model agree on direction, confidence adjusts upward. When they conflict, it adjusts down. Still evaluating impact.
  • Pressure threshold is now bucket-specific: it's a ceiling signal in the first half (high pressure = overheating) and a floor signal in the second half (low pressure = cold match). The global threshold was actively wrong.
  • Hard block on minute ≥ 80. Accuracy in that window was 21.4% — essentially noise. Removing those predictions cleaned up the overall numbers.
  • Score states high (5+ goals) and 2-1 are now blocked. Both historically below 56% accuracy.

I'll post again at 1,500 bets. Happy to answer methodology questions below.


Virtual results only. Not financial advice.

u/Dry-Jello194 — 5 days ago