r/REBubble
I kept getting different answers from the three frontier models on the same question, so I built something that makes them fight it out in 4 rounds:
- Each model answers blind (no knowledge of the others)
- They read each other's answers and have to ACCEPT or REJECT, with reasoning
- If two of them agree, the third is FORCED to dissent (no consensus allowed)
- A judge round picks a winner and a confidence score
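For anyone who wants to picture the flow, here is a rough Python sketch of the four rounds. It is not the site's actual code. Several things in it are assumptions made for illustration: a "model" is just a function from prompt to text, the first word of each blind answer is treated as its stance (e.g. BUY or RENT) so agreement can be detected, and if all three agree the last model listed is the one forced to dissent. The names `run_debate`, `pick_dissenter`, `stance`, and `Verdict` are all hypothetical.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple

Model = Callable[[str], str]  # hypothetical stand-in: prompt in, text out


@dataclass
class Verdict:
    winner: str
    confidence: float  # assumed to be a score in [0, 1]


def stance(answer: str) -> str:
    """Assumption: the first word of an answer is its stance, e.g. BUY or RENT."""
    words = answer.split()
    return words[0].strip(".,:;!").upper() if words else ""


def pick_dissenter(answers: Dict[str, str]) -> Optional[str]:
    """Return the model that must dissent, or None if no two models agree."""
    counts = Counter(stance(a) for a in answers.values())
    top, n = counts.most_common(1)[0]
    if n < 2:
        return None
    outsiders = [name for name, a in answers.items() if stance(a) != top]
    # Assumption: if everyone agrees, the last model listed is forced to dissent.
    return outsiders[0] if outsiders else list(answers)[-1]


def run_debate(
    question: str,
    models: Dict[str, Model],
    judge: Callable[[dict], Verdict],
) -> Tuple[Verdict, dict]:
    # Round 1: every model answers blind, seeing only the question.
    answers = {name: ask(question) for name, ask in models.items()}

    # Round 2: every model reads the others' answers and must ACCEPT or REJECT.
    reviews = {}
    for name, ask in models.items():
        others = "\n".join(f"{n}: {a}" for n, a in answers.items() if n != name)
        reviews[name] = ask(
            f"{question}\n\nOther answers:\n{others}\n\n"
            "Reply ACCEPT or REJECT, with reasoning."
        )

    # Round 3: if two models agree, one model is forced to argue the other side.
    dissent = {}
    dissenter = pick_dissenter(answers)
    if dissenter is not None:
        dissent[dissenter] = models[dissenter](
            f"{question}\n\nYou must argue against the majority position."
        )

    # Round 4: the judge sees the whole transcript and returns a verdict.
    transcript = {"blind": answers, "review": reviews, "dissent": dissent}
    return judge(transcript), transcript
```

Passing the models and the judge in as plain functions keeps the sketch runnable with stubs, so the round logic can be checked without calling any real API.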
Test question: "Should I buy a house in 2026 or keep renting?"
https://raresightai.com/d/f844a549-cbe7-4790-9c3e-a8e88eb2797e
GPT-5.2 won this one, but the forced-dissent round, where Gemini had to fight back, is the best part. It changed my mind twice while reading it.
Curious what this sub thinks: does forced dissent actually surface better reasoning, or does it just make the models hallucinate harder? I also built a public leaderboard that tracks which model wins most often by category (career, finance, product): raresightai.com/leaderboard
Free to try, no signup.
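A "wins by category" leaderboard like the one mentioned above can be sketched as a simple tally. This is only an illustration of the idea, not how the site computes it: the input shape (one `(category, winner)` pair per judged debate) and the names `tally_wins` and `leader` are assumptions.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Tuple


def tally_wins(results: Iterable[Tuple[str, str]]) -> Dict[str, Counter]:
    """Count wins per model within each category.

    Assumed input: one (category, winning_model) pair per judged debate.
    """
    board: Dict[str, Counter] = defaultdict(Counter)
    for category, winner in results:
        board[category][winner] += 1
    return dict(board)


def leader(board: Dict[str, Counter], category: str) -> str:
    """Return the model with the most wins in a category."""
    return board[category].most_common(1)[0][0]
```

A raw win count favors whichever model has been entered in the most debates, so a fuller version would probably divide by the number of debates per category.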
u/Aelevel — 20 days ago