
Curious why so many people love Sonnet 4.5, so I tested it against 6 other models with a work problem
Lots of discussion around Sonnet 4.5 lately, some people comparing it with GPT-5.1 or even 4o. Got curious and ran a real work problem through a bunch of models.
Prompt/Question: client asked for 40% off. We've worked together for 2 years and they've referred me other clients, so I can't just say no.
Screenshots below. Some highlights:
Claude Opus 4.7 told me NOT to mention the referrals upfront: "that's leverage you save for if they push back"
Gemini: "We don't apologize for our pricing"
Sonnet 4.5 ended with "just not on working for free 😅"
GPT-5.1 wrote out what to say if they counter with "others can do 40%"
Grok just gave me 4 sentences and called it a day
My rankings:
- Warmth: Sonnet 4.5 > GPT-5.1 > Opus > GPT-4o > Grok > GPT-5.5 > Gemini
- Strategy: Opus > GPT-5.1 > Gemini > Sonnet 4.5 > GPT-5.5 > GPT-4o > Grok
btw I tend to use Sonnet 4.5 for work writing (emails, comms, etc). Hadn't compared it much before, it just always felt like it finds the right balance and communicates in a non-aggressive way.
Also curious why Anthropic pushed the Sonnet 4.5 retirement date from the 16th to the 18th, and now it's just "soon". Hope they actually listen to user feedback and at least keep the API alive.