Google is focusing on the wrong thing. We don't want faster LLMs, we want more usage of them
As you may know, in this subreddit it is forbidden to complain about the obvious, so I can't use certain words, otherwise my post will get instantly deleted.
Google talks about how fast the new Flash 3.5 model is. And it really is fast and good, definitely the best model they have ever made.
But we never complained about their models being slow, right? As long as the old models worked (and we weren't constantly bombarded with errors), they were fast enough too. Gemini-cli was an exception, since during peak hours it could take HOURS to respond, but Antigravity didn't have this issue.
I'd rather get a 3x slower model with the same usage the old models used to have. I don't mind browsing around while my code is being prepared, or even reviewing the old changes before the new ones are finished, since it takes me much more time to review the code and prepare the next prompt than it takes the LLM to generate that code.
I've lost hope in Google and doubt they'll listen to us, as they never have.