r/algobetting

▲ 0 r/algobetting

Just wanted to share a snippet of the GUI tool I've been building for football betting. The log loss of Pinnacle's closing line (CLV) against true outcomes is shown as the grey bar.

u/FlatChannel4114 — 9 hours ago

▲ 41 r/algobetting+31 crossposts

This channel predicted the events of October 7 and the "12-Day War", and it also predicted "Operation True Promise 4". Its forecasts for 2027 are catastrophic and horrifying.

youtube.com

u/thedowcast — 16 hours ago

▲ 1 r/algobetting

Database structure for an arb finder

Hello, I'm trying to make an in-play arb/+EV scanner, the typical kind you've all seen a billion times, and I'm wondering what the best way to store and retrieve the odds would be.

Do you have any idea that would be better than just making one giant table where each row contains the sport, league, teams, bet type, etc.?

reddit.com

u/chemoltv — 16 hours ago

▲ 0 r/algobetting

sportsbet.io limits after 2 bets

I'm doing arb betting and last week I incorporated sportsbet.io.

For example, I got limited in MLB after 3 bets, all of $250 in the Total Hits market. My max bet on MLB moneylines is now around 10 dollars.

Is this normal? Am I doing something wrong?

reddit.com

u/Chuti0800 — 1 day ago

▲ 6 r/algobetting

What CLV taught me about edge

What tracking CLV taught me about my own "edge" (a humbling writeup)
Built a Poisson-ELO model for football and tracked closing line value on every pick. Two lessons that hurt: (1) my "+14% CLV" on cards was a mirage — median was 0%, all outliers from stale thin-market lines. (2) The real, repeatable edge lives in secondary markets (corners), not 1X2 where books are razor-sharp.
Anyone else find their edge evaporates once you filter for line liquidity + min bookmakers? Curious how you gate for stale lines.
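The mean-versus-median point and the liquidity gate can be sketched in a few lines. This is only an illustration: the records, the `MIN_BOOKS` gate, and the definition of CLV as placed odds divided by closing odds minus one are assumptions, not the poster's actual data or pipeline.

```python
from statistics import mean, median


def clv(placed_odds: float, closing_odds: float) -> float:
    """Closing line value as a fraction: positive when the price taken beat the close."""
    return placed_odds / closing_odds - 1.0


# Hypothetical records: (placed odds, closing odds, bookmakers quoting the market)
picks = [
    (2.00, 2.00, 12),
    (1.90, 1.90, 9),
    (2.10, 2.10, 15),
    (3.50, 2.00, 2),   # stale line in a thin market
    (2.05, 2.00, 11),
]


def summarize(rows):
    """Return (mean CLV, median CLV) for a list of records."""
    values = [clv(placed, close) for placed, close, _ in rows]
    return mean(values), median(values)


MIN_BOOKS = 5  # assumed liquidity gate: ignore markets quoted by fewer books

all_mean, all_median = summarize(picks)
gated = [row for row in picks if row[2] >= MIN_BOOKS]
gated_mean, gated_median = summarize(gated)

print(f"all picks: mean {all_mean:+.1%}, median {all_median:+.1%}")
print(f"gated:     mean {gated_mean:+.1%}, median {gated_median:+.1%}")
```

A single stale outlier is enough to lift the mean well above the median here, which is the same shape as the "mirage" described above.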

reddit.com

u/DrCesarMg91 — 1 day ago

▲ 0 r/algobetting

66% prediction accuracy

I have been building a football prediction model that covers 20 leagues in Europe. It's built using approx 20 seasons' worth of data for most of the leagues. Metric coverage is patchy at points, particularly in the earlier seasons, but in total I have approx 350 million data points.

I have got it to a point where it is 66%+ accurate in its predictions. I'm fairly confident I can push this up to 70%. That is currently the rate when run indiscriminately against all fixtures, but I plan on filtering it further as some analysis indicators produce much higher accuracy rates.

However, I understand all of the above is meaningless unless I can establish that the odds offered on the games were such that a profit could be generated.
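To make that concrete, here is a rough sketch of how accuracy and price interact under flat staking. The odds values are hypothetical and the model is assumed to hit 66% at every price, which will not hold in practice since favourites win more often.

```python
def breakeven_accuracy(decimal_odds: float) -> float:
    """Hit rate needed to break even when every bet is placed at these odds."""
    return 1.0 / decimal_odds


def flat_stake_roi(accuracy: float, decimal_odds: float) -> float:
    """Expected profit per unit staked at a fixed hit rate and fixed odds."""
    return accuracy * decimal_odds - 1.0


ACCURACY = 0.66  # figure quoted in the post

# Hypothetical average odds on the predicted outcome
for odds in (1.40, 1.52, 1.70):
    print(
        f"odds {odds:.2f}: breakeven {breakeven_accuracy(odds):.1%}, "
        f"expected ROI {flat_stake_roi(ACCURACY, odds):+.1%}"
    )
```

If the picks are mostly short-priced favourites, 66% can still lose money, so the historical odds matter as much as the hit rate.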

Does anybody know an accurate historical odds database that I can pull from that would allow me to get exact profit/loss figures for the model?

Any advice appreciated.

reddit.com

u/Due_Butterscotch8814 — 3 days ago

▲ 1 r/algobetting

[model log boxing] timestamped predictions, this weekend's fights + backtesting update

Unfortunately, with a busy World Cup weekend, boxing promoters are understandably reluctant to put on any high-profile fights, so only a meagre four predictions have made it through the pipeline so far this week. I guess we will just have to be patient.

I’ll do my best to make this week's log a bit more “in depth” than usual so that it doesn't descend into a boring picks post.

Here are this weekend's predictions.

https://preview.redd.it/1bh5pbeat1bh1.png?width=1388&format=png&auto=webp&s=602ce0292d0eae794c49cea78d5d0ac829a0c8e8

https://fitequant.com/upcoming

Out of this week’s predictions this seems like an interesting bout.

https://fitequant.com/compare/12498-tsubasa-narai/12504-yamato-hata?bout_id=242

Tsubasa Narai vs Yamato Hata

FiteQuant likes Hata, but the market likes him a bit more, so the interesting thing is not the pick — it’s why the model refuses value.

The first interesting thing is that the raw fighter scores are basically level: Narai 64.90 / rank #229 and Hata 65.16 / rank #214, both with roughly Medium fighter confidence around 0.50. Yet the locked model still gets to Hata 61.58%. That means this is not being driven by a crude structured subjective inference (SSI) fighter-rating effect; the separation is coming from the interaction layer.

The market is also making this interesting. Hata at 1.50 implies 66.67%, while FiteQuant has him at 61.58%, hence the visible -7.64% ROI / -5.09% edge / No Value result. But after removing the overround from the 2.63 / 1.50 prices, the market is roughly 63.7% Hata / 36.3% Narai, so FiteQuant is not wildly disagreeing with the market; it is saying “Hata, but not quite at that price.”
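The arithmetic in that paragraph can be checked with a short sketch. Proportional normalisation is used here to strip the overround; whether FiteQuant uses that exact method is an assumption, and the function names are illustrative only.

```python
def implied_probability(decimal_odds: float) -> float:
    """Raw implied probability, overround still included."""
    return 1.0 / decimal_odds


def remove_overround(odds_a: float, odds_b: float) -> tuple[float, float]:
    """Proportionally normalise a two-way market so the probabilities sum to 1."""
    raw_a = implied_probability(odds_a)
    raw_b = implied_probability(odds_b)
    total = raw_a + raw_b
    return raw_a / total, raw_b / total


def expected_roi(model_prob: float, decimal_odds: float) -> float:
    """Expected return per unit staked if the model probability is correct."""
    return model_prob * decimal_odds - 1.0


narai_odds, hata_odds = 2.63, 1.50  # prices quoted in the post
model_hata = 0.6158                 # model probability quoted in the post

fair_narai, fair_hata = remove_overround(narai_odds, hata_odds)
edge_vs_raw = model_hata - implied_probability(hata_odds)
roi = expected_roi(model_hata, hata_odds)

print(f"no-vig market: Hata {fair_hata:.1%} / Narai {fair_narai:.1%}")
print(f"edge vs raw implied: {edge_vs_raw:+.2%}, expected ROI: {roi:+.2%}")
```

With the rounded 61.58% input the ROI comes out at roughly -7.6%, in line with the figure shown on the site.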

There is also a subtle guardrail worth spotlighting here: Narai’s implied probability at 2.63 is 38.02%, while FiteQuant’s inverse Hata probability gives Narai about 38.42%. Purely mechanically, that is a tiny theoretical underdog twitch, but the “strict pick” rule avoids calling that value because the model still expects Narai to lose outright. This is a very good example of why the strict “value only when the model expects the fighter to win” rule exists.

Without it, the system could start surfacing thin, ugly underdog pseudo-value in exactly the kind of fight where boxing variance tempts over-interpretation.
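A minimal sketch of that guardrail, assuming "expects the fighter to win" means a model probability above 50% and "value" means positive expected return at the offered price. The function name and the `strict` flag are hypothetical, not FiteQuant's code.

```python
def is_value_pick(model_prob: float, decimal_odds: float, strict: bool = True) -> bool:
    """Flag value only when EV is positive and, in strict mode, the model favours the fighter."""
    positive_ev = model_prob * decimal_odds > 1.0
    expected_to_win = model_prob > 0.5
    return positive_ev and (expected_to_win or not strict)


# Figures from the post: Hata 61.58% at 1.50, Narai 38.42% at 2.63
print("Hata:", is_value_pick(0.6158, 1.50))
print("Narai (strict):", is_value_pick(0.3842, 2.63))
print("Narai (non-strict):", is_value_pick(0.3842, 2.63, strict=False))
```

Without the strict flag the Narai side would be surfaced on an expected return of about 1%, which is exactly the thin underdog pseudo-value described above.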

The matchup itself is interesting because the two fighters are very closely ranked on pure SSI “subjective stats”, with very similar fighter scores in the model.

But the compound “matchup factors” make the real difference here with multiple interesting factors at play.

Hata gets a medium height/reach advantage and a medium stance-interaction advantage as southpaw versus orthodox, while Narai gets the medium style-interaction advantage as boxer-puncher versus power slugger.

Increasingly I'm thinking that compound factors (sorry ML guys, I refuse to call them features) are what's important in systemized modeling/FiteQuant. SSI can drive a rich abstraction of a fighter as actor, but then I guess an ELO approach could get something not a million miles away.

But in systemized modeling it's actually the matchup factors that drive the disagreement with the odds, and that disagreement is what seemingly produces consistent, reliable +EV.

Backtesting coverage update

It occurred to me that because my data pipeline rates fighters with SSI even if their canonical bout doesn't pass strict data quality checks, it might be possible to get more time-safe result data even if a few fighters are missing a “height”, “reach”, or other critical datapoint.

Currently my time-safe data pipeline strictly rejects bouts where even one fighter doesn't have a key datapoint. In boxing this happens more than you’d think.

For example, there is a fight this weekend between boxers Erik Hanley and Ibrahim Mason. FiteQuant has freshly rated these fighters, and you can view a custom matchup prediction for that bout as follows (the model leans Mason):

https://fitequant.com/compare/12429-erik-hanley/12433-ibrahim-mason?fighter_a_profile_source=default&fighter_b_profile_source=default

So if you use a resource like proboxingodds (or equivalent) to check for upcoming fights with odds, you can often get pseudo-predictions for bouts even if they aren't strictly marked upcoming by the system.

Anyway the result of all this was me realizing I probably had a bunch of effectively “timesafe results” just sitting there, even if it was lower quality data.

So I’ve literally just implemented this in backtesting in “non strict time safe mode”, where pleasingly the bout coverage is now up to 86 bouts, and I expect it to keep growing from here.
I did some quick backtesting, admittedly on only a 50 sample with 86 results and a batch test of 10, and got the following data with the default model.

https://preview.redd.it/gjkairjbt1bh1.png?width=1398&format=png&auto=webp&s=70a53f173bacc5c433c60bd3b399b20c97f35177

https://fitequant.com/testing

Obviously this data should be treated with caution, because it's explicitly non-strict. But seeing as the relevant factor is just nulled when a reach or similar datapoint is missing, I think it does suggest that FiteQuant may be useful for bouts that aren't strictly marked upcoming by the system.

As I say, I’ve literally just implemented this and haven't done much experimenting yet, but I don’t mind saying I’m excited and pleased that this is now available for user research.

As always if anyone has any questions feel free to reach out.

Thanks, Dan

reddit.com

u/Character_Pie_277 — 2 days ago

▲ 37 r/algobetting+1 crossposts

What do you think? Working almost a month on this!

What do you think? I've been working on this for almost a month! I'm willing to maybe sell it if someone wants it. It's a Python project with many files and a lot already added. The ML model is fully trained.

u/AccomplishedWorkDONE — 3 days ago

▲ 5 r/algobetting+1 crossposts

Built a real-time odds API: 11 bookies, ~1s WebSocket refresh, free tier

Hey all! I've been working on PulseScore (https://pulsescore.net), a real-time sports odds API aimed at people building trading bots, arb scanners, and dashboards. Wanted to share it here and get feedback from people who actually use this kind of data.

What it covers:

- Bookmakers: Bet365, FanDuel, Bwin, Unibet AU, Paddy Power, BetOnline, PS3838, William Hill, Ladbrokes, DraftKings, Betano DE

- Live in-play via WebSocket, 1–2s refresh

- Pre-match odds across all books on your plan

- 14+ sports (soccer, tennis, basketball, ice hockey, AF, cricket, horse racing, etc.)

- Bet365 goes deepest — 50+ markets per event (1X2, AH, O/U, BTTS, correct score…)

Same JSON shape across every bookmaker — swap /bet365 for /fanduel, /bwin, /ps3838 and the rest. REST + WebSocket, X-Secret header auth.

Pricing:

- Free: 500 req/month

- Starter €20/mo: 30k req/month

- Pro €79/mo: unlimited req + 1 WebSocket (7-day trial)

- Max €149/mo: unlimited + 3 WebSockets multi-sport + /all, /count endpoints

Free tier is enough to actually test it, no card needed. Happy to answer anything about latency, market depth, or how it compares to OddsAPI / TheOddsAPI / RapidAPI alternatives. Roast away.

u/Hot-Muscle-7021 — 3 days ago

▲ 3 r/algobetting

Recent Historical Odds -- International Football (Soccer)

Hey everyone,

I am trying to test a strategy I have developed, but for that I need to collect a couple hundred games of historical per-goal or over/under odds from the past 3 years on international football matches. It is imperative that this is for international matches as that is part of my hypothesis. It cannot be any older due to training data constraints. While I found a couple of Kaggle datasets, they were too old, and the APIs I have tried either did not work or required a hefty subscription (this is a university project and my funding is non-existent).

I do not care whether these are betting exchange, prediction market, or bookmaker odds -- I would make anything work.

I was wondering if any of you here might have some recommendations on where I could find this data.

Thanks so much!

reddit.com

u/alpha-trotsky — 3 days ago

▲ 0 r/algobetting

High school beginner. How is the algotrading system I made?

import nest_asyncio
nest_asyncio.apply()

from pandas_ta import macd
import pandas as pd
from datetime import datetime, timedelta, timezone
import numpy as np
import asyncio
from types import SimpleNamespace
import math

from alpaca.trading.client import TradingClient
from alpaca.trading.enums import OrderSide, TimeInForce
from alpaca.trading.requests import (
    MarketOrderRequest,
    LimitOrderRequest,
    StopOrderRequest
)
from alpaca.data.historical import StockHistoricalDataClient
from alpaca.data.requests import StockBarsRequest
from alpaca.data.timeframe import TimeFrame
from alpaca.trading.stream import TradingStream
from alpaca.data.live import StockDataStream
from alpaca.data.enums import DataFeed

# Configured explicitly with your secret key and the free IEX data feed
stock_data_stream = StockDataStream(
    'YOUR_API_KEY',
    'YOUR_SECRET_KEY',
    feed=DataFeed.IEX
)

trading_stream = TradingStream(
    'YOUR_API_KEY',
    'YOUR_SECRET_KEY',
    paper=True
)

# Alpaca Paper Trading
trading_client = TradingClient(
    'YOUR_API_KEY',
    'YOUR_SECRET_KEY',
    paper=True
)

stock_historical_data_client = StockHistoricalDataClient(
    'YOUR_API_KEY',
    'YOUR_SECRET_KEY'
)

stock_bars_request = StockBarsRequest(
    symbol_or_symbols="TSLA",
    timeframe=TimeFrame.Day,
    start=datetime(2026, 6, 10, 15),
    end=datetime(2026, 6, 25, 15)
)

account = trading_client.get_account()
print("Account Number:", account.account_number)
print("Buying Power:", account.buying_power)
print("Currency:", account.currency)


class Util:
    @staticmethod
    def to_dataframe(data):
        # Accept a single item or a list, and fall back through the
        # available serialisation methods (pydantic v2, pydantic v1, plain object)
        data_list = data if isinstance(data, list) else [data]
        try:
            return pd.DataFrame([item.model_dump() for item in data_list])
        except AttributeError:
            try:
                return pd.DataFrame([item.dict() for item in data_list])
            except AttributeError:
                return pd.DataFrame([vars(item) for item in data_list])


async def handle_order_update(update):
    print(f"Order update: {update.order.id}")
    print(f"Filled QTY: {update.qty}")
    print(f"Filled Price: {update.price}")
    print(f"Status:{update.order.status}")


async def handle_quotes(quote):
    print("New Quote")
    print(quote)


async def handle_trades(trade):
    print("New Trade")
    print(trade)


async def handle_bars(bar):
    print("New Bar")
    print(bar)


print("Fetching real historical bars to prime MACD...")
historical_request = StockBarsRequest(
    symbol_or_symbols="TSLA",
    timeframe=TimeFrame.Minute,
    start=datetime.now(timezone.utc) - timedelta(minutes=60),
    end=datetime.now(timezone.utc),
    feed=DataFeed.IEX
)
historical_data = stock_historical_data_client.get_stock_bars(historical_request)
tsla_bars = historical_data.data.get("TSLA", [])

# Keep only the OHLCV fields needed for the indicator
initial_test_bars = [
    SimpleNamespace(
        open=b.open,
        high=b.high,
        low=b.low,
        close=b.close,
        volume=b.volume
    )
    for b in tsla_bars
]

bars = []
has_position = False


async def on_new_bar(bar):
    global bars
    global has_position

    bars.append(bar)
    if len(bars) < 35:
        print(f"Not enough data for macd Calculation: {len(bars)}/35")
    else:
        df = Util.to_dataframe(bars)
        macd_did = macd(df['close'], fast=12, slow=26, signal=9)
        print(f"macd:\n{macd_did.tail(1)}")
        df['MACD_Histogram'] = macd_did['MACDh_12_26_9']

        # Buy when the histogram is positive and flat; sell when it is negative and long
        if df['MACD_Histogram'].iloc[-1] > 0 and not has_position:
            print("BUY TSLA")
            has_position = True
            trading_client.submit_order(
                order_data=MarketOrderRequest(
                    symbol="TSLA",
                    qty=100,
                    side=OrderSide.BUY,
                    time_in_force=TimeInForce.DAY
                )
            )
        elif df['MACD_Histogram'].iloc[-1] < 0 and has_position:
            print("SELL TSLA")
            has_position = False
            trading_client.submit_order(
                order_data=MarketOrderRequest(
                    symbol="TS

reddit.com

u/Fit_Time_7861 — 3 days ago

▲ 2 r/algobetting

Feature selection for LLM prediction

Interested in whether anyone has built out a simple pipeline for LLM information gathering and which features they found valuable or not valuable. Let's say you pipe in the injury report, the past 15 games' box scores, top news headlines, season-to-date advanced stats, and rest days, then ask the models to pick the slate.

I don't think this is ultimately profitable but there is definite value in collecting and parsing soft information like this, and would be very interesting to compare/contrast what information is most or least additive for a prediction task like this.

I'm a current researcher for my master's degree at a top-20 university; if anyone is in the research field and interested in collaborating on this subject please reach out.

reddit.com

u/airplainfood — 5 days ago

▲ 4 r/algobetting

Looking for people to build a profitable soccer model together

I have built multiple datasets from different data sources and APIs, but there is only so much I can do alone.

I am looking for people who have a good understanding of the game and statistics.

Optional

If you have built a model in the past (even an unprofitable one), that would be much appreciated.

I would like to build a small supportive community to share datasets, ideas and findings.

If you are interested, please shoot me a DM. Thanks

reddit.com

u/Icy_Court_5780 — 4 days ago

▲ 0 r/algobetting

What if AI agents predicted the World Cup?

More and more people are discussing applications and products that combine prediction markets with AI-assisted money-making systems, using an AI agent to predict trends and assess risks. I’m thinking about trying anvita cyber cup. Let me know if you know any other great ones. Would love to hear real experiences from members of this community.

reddit.com

u/mattdingus2002 — 6 days ago

▲ 1 r/algobetting+1 crossposts

Trading in-game volatility (not winners) on live Polymarket MLB - where does this break?

I've been building an automated system to trade Polymarket's live MLB moneyline markets. Want to lay out the core logic and have people poke holes before I risk real money.

The core mechanic (legging into a locked position):

Buy one side of a game cheap early — e.g. an underdog at ~$0.35 once odds settle after first pitch
If the game swings and the other side gets cheap (my team takes a lead, opponent now ~$0.35), I buy that side too
Combined cost under $1.00 = locked profit no matter who wins

The catch: this only completes if the game actually swings. In a blowout my first leg drifts toward zero and I eat the loss. So I'm really buying volatility - betting on games that swing more than the market expects, not on a winner.
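The payoff structure can be sketched as below. It ignores fees, spread, and fill risk, and the swing probability and the chance that the unhedged first leg wins anyway are made-up inputs, so treat it as a toy under those assumptions rather than a model of Polymarket.

```python
def locked_profit(first_leg_price: float, second_leg_price: float) -> float:
    """Profit per share pair once both legs are filled; exactly one side pays $1."""
    return 1.0 - (first_leg_price + second_leg_price)


def expected_profit(
    first_leg_price: float,
    second_leg_price: float,
    p_swing: float,
    p_first_wins_no_swing: float,
) -> float:
    """Expected profit per share when the second leg only fills if the game swings."""
    completed = locked_profit(first_leg_price, second_leg_price)
    win_unhedged = 1.0 - first_leg_price   # no swing, first leg wins anyway
    lose_unhedged = -first_leg_price       # no swing, first leg goes to zero
    no_swing = (
        p_first_wins_no_swing * win_unhedged
        + (1.0 - p_first_wins_no_swing) * lose_unhedged
    )
    return p_swing * completed + (1.0 - p_swing) * no_swing


# Prices from the post, with hypothetical probabilities
print(f"locked profit at 0.35 + 0.35: {locked_profit(0.35, 0.35):.2f}")
print(f"expected profit: {expected_profit(0.35, 0.35, 0.40, 0.20):+.3f}")
```

The expected value depends almost entirely on `p_swing`, which is why the strategy is a bet on volatility rather than on a winner.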

Where I think the edge is:
I model per-game volatility from Statcast pitcher-vs-batter matchup data — basically how prone a game is to scoring bursts and lead changes. The thesis is that Polymarket's live odds underprice volatility in specific matchups.

What I want advice on:

Is "buy volatility by legging in" actually viable given the market makers active in live MLB markets on Polymarket?
I have a volatility model but no standalone win-probability model yet. Am I right that I need both — one to pick games, one to price the first leg?

Not looking for picks. Looking for the reasons this won't work.

reddit.com

u/Otherwise_Sport9684 — 8 days ago

▲ 5 r/algobetting+2 crossposts

Betfair Historical Pro data (Soccer)

Hi, does anyone have access to Betfair Pro historical data for 2026 that they'd be willing to sell at a discounted price? Alternatively, I'm happy to buy the data myself and split the cost with someone if they're willing to contribute. I already have access to the 2024 and 2025 Pro data, so I'd also be open to a trade where I provide the 2024/2025 data in exchange for the 2026 data.

reddit.com

u/Complete_Okra_679 — 9 days ago

▲ 2 r/algobetting

Any good international sportsbooks for someone in Korea?

What’s up fellas, I want to start sports betting but the rules here in Korea are pretty restrictive. Before anyone asks, yes, South Korea, not North haha. So, I’m thinking about using an overseas site.

Anyone got good recommendations? Ideally, it needs to accept international users and handle bet sizes ranging from small amounts up to at least $100+.

FYI, I'm already signed up on Pinnacle and Sbobet. Hit me up if you know any other great alternatives! Thanks in advance!

reddit.com

u/Mozzi6623 — 10 days ago

▲ 2 r/algobetting

What sports data API is currently anchoring your betting stack?

Hey guys,

Just curious to see what the current landscape looks like for everyone here. Data quality obviously makes or breaks any algorithm.

If you don't mind sharing:

Which sports API do you currently use for your primary data feed?
What sport/league are you pulling data for?
How happy are you with their uptime and data accuracy?

I see a lot of debate between scraping vs. paying for official API access, so I'm trying to gauge what the community consensus is. Let’s hear your setups!

reddit.com

u/MitchellSadie — 12 days ago

▲ 6 r/algobetting

+10.8% ROI in my MLB strikeout model's first live month, a teardown of why I don't trust it

Posting the full teardown because this is the one sub where the methodology is the point.

KIT is a hierarchical Bayesian model that predicts a full strikeout PMF for a starting pitcher. Negative binomial likelihood, because strikeout counts are overdispersed and a Poisson underfits the tail, pooled across pitchers and handedness. I place a bet when the model's implied P(under) or P(over) diverges far enough from the book's no-vig price to clear an EV threshold. Inputs are the usual pitcher form, velocity, whiff and spin trends, plus the opposing offense's rolling strikeout tendencies. That last input is rolling team rates rather than the posted lineup, which is why the model is comfortable pricing six or more hours before first pitch.
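For readers who want the decision step spelled out, here is a rough sketch of pricing a half-point strikeout line from a negative binomial PMF and comparing it with the no-vig price. It is not KIT: the mean, dispersion, odds, and `EV_THRESHOLD` are hypothetical, and the hierarchical Bayesian fit that would produce the parameters is not shown.

```python
import math


def nb_pmf(k: int, mean: float, dispersion: float) -> float:
    """Negative binomial PMF by mean and dispersion r (variance = mean + mean**2 / r)."""
    r = dispersion
    log_p = (
        math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)
        + r * math.log(r / (r + mean))
        + k * math.log(mean / (r + mean))
    )
    return math.exp(log_p)


def prob_under(line: float, mean: float, dispersion: float) -> float:
    """P(strikeouts < line) for a half-point line such as 5.5."""
    return sum(nb_pmf(k, mean, dispersion) for k in range(math.floor(line) + 1))


def no_vig_probs(over_odds: float, under_odds: float) -> tuple[float, float]:
    """Proportionally normalised (over, under) probabilities from decimal odds."""
    raw_over, raw_under = 1.0 / over_odds, 1.0 / under_odds
    total = raw_over + raw_under
    return raw_over / total, raw_under / total


def expected_value(prob: float, decimal_odds: float) -> float:
    """Expected return per unit staked at the offered price."""
    return prob * decimal_odds - 1.0


EV_THRESHOLD = 0.03  # hypothetical minimum EV before betting

# Hypothetical pitcher and market
mean_ks, dispersion, line = 5.8, 20.0, 5.5
over_odds, under_odds = 1.87, 1.95

p_under = prob_under(line, mean_ks, dispersion)
_, market_under = no_vig_probs(over_odds, under_odds)
ev_under = expected_value(p_under, under_odds)

print(f"model P(under) {p_under:.3f} vs no-vig {market_under:.3f}")
print(f"EV on under {ev_under:+.3f}, bet: {ev_under > EV_THRESHOLD}")
```

The dispersion parameter is what fattens the tail relative to a Poisson with the same mean; as it grows large the two distributions converge.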

April was the first real-money month. 96-86 over 184 settled bets, +$966 on about $8,900, +10.86%. Here is why I think that number is mostly seasonal softness and noise.

The bets were 86% one-directional. 159 unders, 25 overs. Unders +$1,280 at +16.7%, overs -$315 at -25%. Effectively all the profit is on one side. A one-sided book is a directional bet on a market regime, not a demonstrated pricing edge, unless you can show conditional selection within that side. Most of what follows is me trying to show that and failing.

I ruled out a standing mispricing first. Over 113,053 closing main lines the no-vig under price is calibrated, implied 50.5% against a realized 51.3%, residual 0.8pp, every populated bucket within about 2pp of the diagonal. K-prop unders are not structurally underpriced, so the direction had to come from something conditional rather than a price you can flat-bet.

The one real effect is seasonal. Bucketing the historical under gap by month across those same closing lines, the under is genuinely underpriced in cold weather. A flat under returns about +3.6% in March, stays positive in April, then flips negative in May (worse than -7%) and stays dead June through September. April 2026 happened to be a softer-than-usual April, gap near 3.6pp. My live sample sat entirely inside that window, so calendar timing explains the direction of the bets with no model skill required. I did not find a door, I stood in front of one that props open every spring and shuts in May.

The selection signal dissolves under resampling. The most defensible skill claim was my alternate unders, 47 bets placed exactly one strikeout below the consensus main line, which returned +24.1% against a blind below-the-line rule that loses about 8% on the same population. That gap looks like the model picking the good ones out of a bad pool, which is the one thing a flat-bet backtest cannot see. Then I stress it. Fragility ladder: the top 5 of those 47 bets are about 105% of the profit, so the other 42 are collectively negative, and dropping the top 5 takes the subset to -1.4%. Bootstrap the per-bet P/L 10,000 times and the 95% interval is [-16%, +65%], with about 12% of resamples losing money. Hit rate was 48.9% against a 36.2% breakeven, which sounds like a lot until you see that interval. At n=47, or even n=159, you cannot separate a real conditional edge from five lucky tails.
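The two stress tests in that paragraph can be sketched as follows. The per-bet P/L list is synthetic (47 flat one-unit bets with a few large winners) and will not reproduce the figures above; the function names and seed are illustrative assumptions.

```python
import random


def drop_top(pnl: list[float], n: int) -> list[float]:
    """Per-bet P/L with the n most profitable bets removed (fragility ladder step)."""
    return sorted(pnl)[:-n] if n else list(pnl)


def bootstrap_roi_interval(
    pnl: list[float],
    stake: float = 1.0,
    n_resamples: int = 10_000,
    seed: int = 0,
) -> tuple[float, float, float]:
    """Percentile 95% interval for ROI, plus the share of resamples that lose money."""
    rng = random.Random(seed)
    n = len(pnl)
    rois = []
    for _ in range(n_resamples):
        sample = rng.choices(pnl, k=n)  # resample bets with replacement
        rois.append(sum(sample) / (stake * n))
    rois.sort()
    lower = rois[int(0.025 * n_resamples)]
    upper = rois[int(0.975 * n_resamples) - 1]
    share_losing = sum(r < 0 for r in rois) / n_resamples
    return lower, upper, share_losing


# Synthetic sample: 5 long-odds winners, 18 ordinary winners, 24 losers
pnl = [4.0] * 5 + [1.0] * 18 + [-1.0] * 24

print(f"ROI, all bets: {sum(pnl) / len(pnl):+.1%}")
trimmed = drop_top(pnl, 5)
print(f"ROI, top 5 removed: {sum(trimmed) / len(trimmed):+.1%}")

lower, upper, share_losing = bootstrap_roi_interval(pnl)
print(f"95% interval: [{lower:+.1%}, {upper:+.1%}], losing resamples: {share_losing:.1%}")
```

When a handful of bets carry the profit, the interval is wide and the trimmed ROI flips sign, which is the pattern described for the 47 alternate unders.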

CLV, the actual truth-teller, is thin, and I had to clean it before trusting it. Two corrections mattered. First, the EV figure my tracker logs is the market-derived no-vig EV at placement, not my model's edge, so it cannot be used as evidence of skill. Second, my exchange fills inflate mean CLV because of how exchange pricing moves relative to sportsbook closes, so they have to come out for a clean read. After that, CLV averages under 2pp and only 39% of bets beat the closing price. If I were actually beating the market you would expect that well above half. Early placement, median around eight hours out into morning lines that have not sharpened, is a real and repeatable mechanism, but the thin CLV says I am barely converting it into anything.

The profit pooled exactly where the limits fall. About three-quarters of it came from books that throttle winners, and the throttling already started. theScore halved my limits and took nothing after the 22nd, BetRivers limited me after a single $14 bet, and DraftKings, the open book I leaned on most, lost me money. An edge that only exists in the accounts that cut you off is not a sustainable edge.

So: real money, probably not a real edge. Seasonal regime plus a small early-line timing effect plus variance. I am taking it into June for the only honest test, more out-of-sample, while the books close the door.

A few open problems I would genuinely want this sub's read on.

What's the right benchmark for measuring CLV? I track closing line value, but I'm not settled on which closing price to grade against. The options I'm weighing are the best closing price at the same book I bet at, the best closing price anywhere in the market, the fair no-vig market consensus, or a sharp reference like Pinnacle or an average of sharp books. Each one tells a slightly different story about whether I had an edge. Which do you treat as the real benchmark, especially for a bet placed hours before close at a soft book?
Calibrating a model that refits faster than calibration data accumulates. I refit weekly, but a leak-free isotonic calibration layer needs months of out-of-sample results to be anything but noise, and every refit resets the pool. Backfilling 200 to 300 pairs just produces garbage curves. How do you maintain a live calibration layer when your model's lifetime is shorter than the timescale your calibration data needs? Pool across model versions and eat the stationarity assumption, switch to a parametric calibrator that survives small samples, or just stop refitting so often?
Edges that only appear after the line moves against you. Plenty of my bets don't clear my EV threshold at open. The gap only opens up after the market moves against the model's side. Betting those feels like textbook adverse selection. I'm taking the side the market is walking away from, usually for a reason I don't have. Occasionally it's a real overreaction I'm fading correctly, and at this sample size I can't tell the two apart. Lately I skip any bet where the line has already moved against the model by more than a set percentage, which kills the obvious adverse-selection cases but almost certainly throws out some correct fades too. Is a flat movement threshold the right instrument, or is there a cleaner way to separate an overreaction worth fading from a move that is correctly pricing something the model missed?

Full writeup with charts: https://medium.com/@billyweingarten/blinded-by-the-win-526a024e6788

reddit.com

u/bwista — 12 days ago

▲ 1 r/algobetting

Any way to bet substantial amounts on lower-ranked tennis matches?

Preferably ITF tournaments.

reddit.com

u/ReconizedV — 11 days ago