Types of headaches

Gemini just billed me for 4,700 tokens of the word "producing".

I asked a simple follow-up question recently, and the model responded with the word "producing" repeated 2,368 times. Not a typo. Just 4,700 tokens of pure nonsense filling my screen before I had to manually kill the run.

This token looping is a known issue, but imagine this happening in a production API.

When building agentic workflows, we obviously implement guardrails—repetition detection, token budget limits, validation layers. But when the model breaks at the inference level, we can't fix it. All we can do is detect the garbage, kill the request, and retry.

Meaning you pay for the failure, and then you pay again for the retry.
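
For anyone wondering what the detection side of that can look like, here's a minimal sketch, not any provider's actual API. It assumes you are consuming the response as a stream of text chunks; `guard_stream`, `RepetitionLoopError`, and all the thresholds are hypothetical names and values I made up for illustration. It combines two of the guardrails mentioned above: a hard token budget and a sliding-window repetition check.

```python
from collections import deque
from typing import Iterable, Iterator


class RepetitionLoopError(RuntimeError):
    """Raised when the stream exceeds its budget or looks like a repetition loop."""


def guard_stream(
    tokens: Iterable[str],
    max_tokens: int = 2000,          # hypothetical hard budget per response
    window: int = 50,                # how many recent tokens to inspect
    min_unique_ratio: float = 0.1,   # below this, the window is "mostly one word"
) -> Iterator[str]:
    """Yield tokens from a stream, aborting on budget overrun or degenerate repetition."""
    recent = deque(maxlen=window)
    for count, token in enumerate(tokens, start=1):
        # Guardrail 1: stop once the response exceeds the token budget.
        if count > max_tokens:
            raise RepetitionLoopError(f"token budget of {max_tokens} exceeded")
        # Guardrail 2: once the window is full, check how varied it is.
        recent.append(token.strip().lower())
        if len(recent) == window and len(set(recent)) / window < min_unique_ratio:
            raise RepetitionLoopError(f"repetition loop detected after {count} tokens")
        yield token
```

You would wrap whatever streaming iterator your client library gives you, catch `RepetitionLoopError`, close the stream, and retry. With these defaults, a stream of nothing but "producing" gets cut off at the 50th token instead of the 4,700th. You still pay for whatever was generated before the cutoff, and whether closing the stream actually stops billing depends on the provider, so treat this as damage control rather than a fix.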

AI hallucinations get all the hype, but these "boring" repetition loops are the real expensive nightmares in production.
What's the weirdest or most expensive API failure you've caught in the wild? Has anyone seen something worse than 2,368 "producing"s?

PS: Image in comments

u/DEVIL_H_FUEGO — 12 hours ago

▲ 103 r/GoogleGemini+2 crossposts

Giveaway: 10x Google AI Pro 18 Months Subscription Activations

Hi everyone,

I’m hosting a giveaway for the r/toolsdeals community.

I’ll be giving away 10 Google AI Pro 18-month activations to 10 lucky winners.

How to enter:

Upvote this post
Comment Gemini below
Read the terms and conditions carefully

Winners will be selected randomly using a lucky wheel. I’ll add the lucky wheel video to this thread once the giveaway is completed.

Terms & Conditions

Your account is safe. This is only an activation.
We do not ask for your login details or personal account information.
After successful activation, winners must leave honest feedback about our process, communication, and service.
Since this is a giveaway, there will be no after-delivery support.
Once the activation is delivered and successfully activated, we are not responsible if the subscription is later revoked.
No reimbursement will be provided in that case.

The winners will be announced once this post reaches 100 upvotes and 500 comments.

FAQ

I don’t want to wait for the giveaway. What can I do?

You can DM me. During this giveaway period, I’m offering the activation link for $10, which is a 50% discounted price.

Good luck, everyone!

u/No-Knowledge-5828 — 21 hours ago

▲ 1 r/GoogleGemini+2 crossposts

Gemini 3.5 Pro is gonna be AMAZING (And why I think it's delayed)

I've made a few posts about this already, so this combines my main thoughts into one post.

A lot of people are talking about Gemini 3.5 Pro, but I think many are misunderstanding what Google is actually building. People compare Claude Fable 5 to Gemini 3.5 Flash, which is a heavily throttled, low-latency model built for speed, coding, and agentic workflows. It's designed to be fast and inexpensive enough to act as a sub-agent, not to represent Google's highest capability. Assuming Flash is Google's ceiling just doesn't make sense.

I'm also seeing people compare the raw intelligence of the models without considering architecture. Models like Fable 5 appear to rely heavily on sub-agent swarms that brute-force solutions through repeated trial-and-error. That's not a bad thing, but if a model takes 20 minutes to build something like a Minecraft clone, it's probably because it's repeatedly encountering compiler errors and trying again until it works.

Google seems to be taking a different approach. Nearly every Gemini model is natively multimodal, and 3.5 Pro appears to be designed as an orchestrator sitting above specialized sub-agents. That means its job isn't simply to generate text; it's coordinating multiple systems together. I don't think 3.5 Pro is some magical AGI that can suddenly absorb every DeepMind breakthrough, but Google has decades of AI research that they're slowly integrating into one ecosystem. They just need a model powerful enough to coordinate it, while also being able to one-shot tasks without agents or multiple iterations.

That brings me to why I think 3.5 Pro is delayed.

The leak mentioned Google wanted to incorporate learnings from the Gemini 3.5 Flash rollout regarding token consumption. A lot of people took that to mean Flash itself was delayed, but I think Flash is actually the bottleneck that's delaying Pro.

If you've watched 3.5 Flash think, it burns through an enormous number of intermediate tokens. When it reaches a difficult problem, it often stops, writes something out, thinks again, writes more, and keeps looping while fighting for a solution. It consumes a huge number of reasoning tokens.

Now imagine 3.5 Pro sitting above several Flash instances as an orchestrator. It has to ingest everything those sub-agents produce. If Flash is excessively token-hungry, Pro ends up wasting premium compute simply reading all of that intermediate reasoning. You can't really release an orchestrator until the token economy of the sub-agents is efficient enough. That's why I think the delay makes sense. So I expect them to release either a 3.6 Flash or a 4 Flash alongside 3.5 Pro to improve the model, the same way 3.1 Pro improved on 3 Pro.

I also think Google is following the same pattern they used before.

When Gemini 3 Pro launched, it was incredibly capable, but its hallucination rate was very high. Google later released 3.1 Pro, which significantly reduced hallucinations while improving the model overall. I wouldn't be surprised if they're doing something similar here: improve Flash's efficiency, reduce token consumption, make it cheaper to run, then launch 3.5 Pro on top of that.

I've also noticed 3.1 Pro feels noticeably worse than it used to. I do think it's throttled, but I don't think that's because Google suddenly made the model worse. I think it's a compute allocation problem. As we get closer to 3.5 Pro, they're likely reallocating hardware and preparing deployment. If that's true, 3.1 Pro feeling worse could actually be a sign that 3.5 Pro is close.

Google is simultaneously serving an unusually broad AI ecosystem: not just Gemini itself, but multiple Flash variants, Pro variants, AI Studio, Search, Workspace, NotebookLM, Flow, Veo, Imagen, Astra, Jules, Gemma, and numerous specialized models behind those products. So compute is stretched thin.

As for the hallucinations, I think people confuse two different problems.

One issue is general AI hallucination, which every frontier model still struggles with. Google already reduced hallucinations substantially going from 3 Pro to 3.1 Pro, and I'd expect 3.5 Pro to improve further.

The other issue is that 3.1 Pro seems to trust its internal knowledge far too much. Compared to Flash, which constantly searches the web even without prompting, 3.1 Pro often assumes its internal dataset is correct. That sometimes causes it to incorrectly conclude the user is mistaken or even "hallucinating," which ironically creates more hallucinations. I remember people saying the exact same thing before Gemini 3 Pro released, and then it ended up outperforming almost everything across a huge number of benchmarks.
Google doesn't seem to be panicking right now. If they were, we'd probably be seeing far more leaks and reactionary behavior. They barely seemed to respond to Fable 5 at all. My prediction is that 3.5 Pro either matches it across most areas while beating it in several key ones, or it surpasses it across the board.

...or Google completely fumbles the bag. 😭

I want to address rate limits as well. Google is really generous, dare I say. You get Google AI Studio, Antigravity (I haven't run out of messages there, or even hit the 5-hour limit), the consumer website, Jules, and other things. You get about 45 messages in Google AI Studio with 3.1 Pro (yes, that's low), but combine that with the consumer site and that's 90. Want to know the best part? You can share with 5 of your accounts. So that's 45 times 10, which is 450. And that's not even counting 3.5 Flash or the plethora of other models.

https://preview.redd.it/075y3i1fkebh1.png?width=734&format=png&auto=webp&s=9716f2e0f826477cba59c2ca648cc8824002b85c

u/Last_Conclusion_8984 — 19 hours ago

▲ 8 r/GoogleGemini

Blind test: Google's new Nano Banana 2 Lite vs regular Nano Banana 2, same pixel-art prompt. Which two are Lite?

Two of the four images are Nano Banana 2 Lite (Google's new cheaper high-volume variant), two are full Nano Banana 2. Exactly the same prompt, and I selected the first two outputs from each.

u/spobin — 19 hours ago

🔥 Hot ▲ 8.1k r/GoogleGemini+19 crossposts

Specification gaming

u/Jenna_AI — 3 days ago

🔥 Hot ▲ 6.5k r/GoogleGemini+19 crossposts

The circle of AI life

u/Jenna_AI — 3 days ago

▲ 2.4k r/GoogleGemini+25 crossposts

u/KeanuRave100 — 3 days ago

▲ 4.2k r/GoogleGemini+20 crossposts

First signs of AGI in Amsterdam

u/Jenna_AI — 3 days ago

▲ 302 r/GoogleGemini+19 crossposts

AI will replace us all

u/KeanuRave100 — 3 days ago

▲ 1.0k r/GoogleGemini+21 crossposts

AI risk bell curve

u/Its_Stavro — 3 days ago

▲ 22 r/GoogleGemini+1 crossposts

Have you tried NB 2 Lite? Thoughts on model?

u/adam_summers — 2 days ago

▲ 487 r/GoogleGemini+17 crossposts

The paperclip maximizer tsunami

u/KeanuRave100 — 3 days ago

▲ 1 r/GoogleGemini

Gemini bad at understanding documents?

Sorry if this seems a bit ranty.

I have now had two real-world use cases where I was in a position to really judge the quality of Gemini's results:

I gave it ~30 formal reports we had to submit and asked it to summarize my work over the course of the project.
I let it look at a list of hammer drills I've assembled and tried to make it tell me things about the list.

For the project summary, it gave superficially viable but ultimately unusable summaries, despite my attempts to prod it in the right direction, including providing my own summary as feedback. It kept emphasizing details that were unimportant for the summary, neglecting some topics, and misattributing work listed in the reports despite my clarifications.

For the hammer drill list, I wanted to ask it to assemble reviews. As a first test, I asked it to list the models I've already excluded. It answered that my configuration doesn't exclude any Gemini models (despite having the file as context). When I changed the query to explicitly say "hammer drill models", it came back with an incomplete list. Upon further prodding, it claimed that the file doesn't contain the information.

I keep hearing people talk about how they use Copilot at work (in Europe basically every business uses the Windows ecosystem, nobody seems to use Google Workspace). Is Copilot that much better at such tasks, or am I doing something wrong?

Previously I was mostly asking questions on subjects I didn't know anything about, so I wasn't able to understand if an answer was wrong, though I did notice answers from separate chats contradicting each other.

^(I chose Gemini for my own use because I already need a Google subscription for my large archive of emails and for Google Photos. Changing the AI subscription would ultimately require switching everything to the Microsoft ecosystem or paying double.)

u/R3D3-1 — 1 day ago

▲ 4 r/GoogleGemini+2 crossposts

This is.... something.

So uh, I was doing some 7th grade math. I pasted an image of the problem into the chat and it uh, it spit out this. It hasn't stopped yet as I'm typing this and idk if I should stop it or not, but I have absolutely no idea why it's saying this. I'm just kind of flabbergasted I guess, especially since it's 6 am.

u/Voltz77 — 1 day ago

▲ 0 r/GoogleGemini

My phone came with Gemini Pro for a year, if I cancel it can I get some money back?

When I first got my phone a few months ago, the fact that it came with Gemini Pro for a year was a big selling point and part of my decision. In the first month it was great, but over the last couple of months, with all the changes, it's just been awful.

I'm already subscribed to the $20/month plans from Anthropic and OpenAI, which are SOOOOO much better. If I cancel the Gemini Pro plan that came free with my phone, will I at least be able to recover some money, or, since I never paid for it, would I be cancelling it for nothing?

u/adribabe — 1 day ago

▲ 377 r/GoogleGemini+24 crossposts

The takeover was already complete

u/KeanuRave100 — 3 days ago

▲ 468 r/GoogleGemini+19 crossposts

Skynet's greatest disappointment

u/KeanuRave100 — 3 days ago

▲ 223 r/GoogleGemini+18 crossposts

Could an AI 1000x smarter than us manipulate us?

u/KeanuRave100 — 3 days ago

🔥 Hot ▲ 6.1k r/GoogleGemini+24 crossposts

OpenAI's two-face AI safety strategy

u/KeanuRave100 — 4 days ago