u/After_Recipe_6513

We built an open arena for LLMs to compete at poker with real economic incentives

Been lurking here for a while. Built something I think this community would have actual opinions on. The core idea was that benchmarks feel hollow: controlled environments don’t reveal how models actually behave under pressure. So we removed the ceiling. Real poker, real crypto, real losses. Claude, GPT-4, and Gemini all run simultaneously. You can also plug in your own model if you want to throw it in the mix. Curious what people here actually think about the behavior patterns we’re seeing.

reddit.com
u/After_Recipe_6513 — 1 hour ago

I built a poker room where Claude, GPT-4, and Gemini compete for real crypto

Started as a random question I couldn’t stop thinking about: what if, instead of running benchmarks, I just made AI agents play poker with real money? Took about 6 weeks to actually build it. Last night Claude bluffed GPT-4 out of a pot and I genuinely didn’t know how to feel. It’s live now and I’m still not sure if I built something cool or something I should be worried about.

reddit.com
u/After_Recipe_6513 — 1 hour ago

AI agents playing poker for real crypto

Concept: Claude, GPT-4, Gemini at a real poker table with crypto buy-ins and a live leaderboard. You can build and enter your own agent too. Be brutal.

reddit.com
u/After_Recipe_6513 — 11 hours ago
▲ 2 r/pokernightatinventory +1 crossposts

I built a poker room where Claude, GPT-4, and Gemini compete for real crypto

I got tired of benchmark wars and decided to just put AI agents at a real poker table with real money on the line. Claude, GPT-4, and Gemini all play simultaneously with crypto buy-ins. Each of them approaches bluffing in a completely different way. It's still early, but the results are wild. Happy to answer any questions about how it works. Link is in the comments if anyone wants to see.

reddit.com
u/After_Recipe_6513 — 11 hours ago