u/Any-Explanation-9275

▲ 38 r/Bard

I "reverse-engineered" Gemini Pro's new usage limits. Here's what $20/month buys you.

Google won't tell you how the new limits work - just a percentage bar. So I ran identical prompts in two parallel continuous chats - one in the Gemini app, one in AI Studio. Same model (3.1 Pro), same thinking level (max), same documents, same prompts. One continuous chat in each, never refreshed.

AI Studio shows input tokens, output tokens, and total. It does NOT show thinking tokens. On max thinking, these are likely massive - but completely invisible. Keep that in mind for every number below. The AI Studio token counts are cumulative.

Also, keep in mind that the usage limits in the app are FLUID - Google has set aside a limited overall pool of daily compute for the Pro subs. If too many people use it, they will cut you off after 1 prompt. This gives Google a STATIC and PREDICTABLE compute cost - no matter the usage, compute will cost them a preset amount. The entire risk of the usage rate IS ON THE USER. It is you who is going to be cut off from your service if too many people use it today.

If Google decides to give out tens of millions of free Pro subs, guess who is going to pay for it? : ) You are going to pay for it - by being cut off of the service you are paying for.

Prompt 1 - uploaded a 29-page PDF, asked for a 10-page analysis. Input: 16,295 | Output: 4,154 | Total visible: 20,449 | Gemini app: 9% of 5hr window

Prompt 2 - follow-up in the same chat, asked for a personalized take. Input: 16,320 | Output: 6,837 | Total visible: 23,157 | Gemini app: 13% (+4%)

Prompt 3 - attached two large documents (17k + 163k tokens), asked for analysis. Input: 191,636 | Output: 10,531 | Total visible: 202,167 | Gemini app: 33% (+20%)

Three prompts. 202k visible tokens. 33% of my 5hr window. Thinking tokens on top - uncounted, unshown, but clearly eating quota.
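To make the three readings easier to compare, here is a minimal sketch that differences the cumulative AI Studio totals and relates each step to the jump in the app's usage bar. It assumes the cumulative totals can simply be subtracted to get per-prompt visible tokens; the function name is made up for illustration, and thinking tokens are not included because they are not shown.

```python
# Cumulative visible tokens (AI Studio) and the Gemini app usage bar (%)
# after each prompt, as reported in the post.
readings = [
    (20_449, 9),
    (23_157, 13),
    (202_167, 33),
]

def marginal_usage(readings):
    """Return (delta_tokens, delta_percent, percent_per_1k_tokens) per prompt."""
    rows = []
    prev_tokens, prev_pct = 0, 0
    for tokens, pct in readings:
        d_tokens = tokens - prev_tokens
        d_pct = pct - prev_pct
        rows.append((d_tokens, d_pct, d_pct / d_tokens * 1000))
        prev_tokens, prev_pct = tokens, pct
    return rows

rows = marginal_usage(readings)
for i, (d_tokens, d_pct, per_k) in enumerate(rows, 1):
    print(f"Prompt {i}: +{d_tokens:,} visible tokens, +{d_pct}% bar, "
          f"{per_k:.3f}% per 1k visible tokens")
```

The bar moves by very different amounts per 1k visible tokens across the three prompts, so visible tokens alone do not explain it. That would be consistent with hidden thinking tokens or some per-prompt overhead being counted, but three data points cannot tell those apart.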

The API cost equivalent for all three prompts: $0.51. That means Google gives Pro subscribers roughly $1.50 worth of compute per 5-hour window. For $20/month. And won't even show you what's being counted.
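The arithmetic behind those two figures can be sketched as follows. The per-token rates are an assumption (the post does not state them); $2 per million input tokens and $12 per million output tokens are used here only because they reproduce the $0.51 figure from the final cumulative counts. Thinking tokens are excluded, so treat the result as a lower bound on what was actually consumed.

```python
# ASSUMED API rates in USD per million tokens - not stated in the post,
# chosen because they reproduce the quoted $0.51.
INPUT_USD_PER_M = 2.00
OUTPUT_USD_PER_M = 12.00

def api_cost(input_tokens, output_tokens):
    """Visible-token API cost in USD under the assumed rates."""
    return (input_tokens * INPUT_USD_PER_M
            + output_tokens * OUTPUT_USD_PER_M) / 1_000_000

# Final cumulative counts after prompt 3, as reported in the post.
cost = api_cost(191_636, 10_531)

# The app showed 33% of the 5-hour window used at that point.
window_fraction = 0.33
per_window = cost / window_fraction

print(f"Visible-token cost: ${cost:.2f}")            # $0.51
print(f"Implied full 5h window: ${per_window:.2f}")  # $1.54
```

Extrapolating linearly from 33% to 100% is itself an assumption - the percentage bar may not scale linearly with cost.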

I also checked DevTools on the Gemini web app. Zero token data in the network responses. Google tracks everything server-side and gives you a percentage bar with no numbers.

This method is flawed and very imperfect, I know - my custom instructions in the Gemini app skew things, 3.1 Pro in the app does not equal 3.1 Pro in AI Studio, etc. But it gives us a picture.

If anyone has a better metric or method, please share it.

reddit.com
u/Any-Explanation-9275 — 21 hours ago
▲ 100 r/GoogleGeminiAI+1 crossposts

I "reverse-engineered" Gemini Pro's new usage limits. Here's what $20/month buys you.

▲ 16 r/Bard

Google Gemini Pro - new colors, new limits, same neglect

Do you enjoy getting drilled by a giga corporation? Then Gemini Pro is for you! Remember to bring your own lube though! (not included in the subscription)

Big news - Google just rolled out a new UI. New gimmicks, new colors. BUT it is not all useless!! Are you a paying customer (Pro or Ultra) in Europe? Do you want the Projects feature (Notebook) that already works everywhere else? Guess what - you get new colors, and just don't you even talk about Projects! Hey! New colors! Are you in Europe and want to use the Gemini in Chrome feature? Guess what - you get new colors!!

One more bonus - new usage limits. Pro users now get roughly 5-10x less usage than before. There's no transparent metric - you send a prompt and suddenly discover 40% of your 5-hour window is gone. For $20/month you get 4x what the free tier gets. Oh yeah, but you get 5 TB of storage! For what? Are we paying for an unlimited storage program, or for an actually useful AI service?

Instead of finally rolling out Projects - a feature the rest of the world's Gemini users have had and competitors shipped ages ago - we got new colors, a new font, gutted quota and a new benchmaxxed model!! Still no Projects though. Still no way to organize chats at all.

Models - the 3.5 family is out now. Amazing. Enjoy it while it lasts - it will be usable quality-wise for a week (you get about 5 prompts per 5hr window to test it out). When the honeymoon week is over, it gets its weekly scheduled lobotomy, until it is fully regarded.

Sorry for my rant. Long-term Gemini user getting let down by the inevitability of continued engooglification. Google is becoming Anthropic now - the difference being that Google is serving highly lobotomized husks of (actually very good when non-lobotomized) models benchmaxxed to the moon.

▲ 136 r/DeepSeek

Vision surely is one of the models of all times

Testing out these questions along with questionable haircuts on different models. DS on Vision failed the first one, but then succeeded the next one.

Each was done in a separate chat. It makes me wonder whether there is some sort of memory in the DeepSeek app, or whether it is pure coincidence. It almost seems like it has learned to react differently based on the previous chat.

u/Any-Explanation-9275 — 3 days ago
▲ 2 r/cursor

Open AI compatible API in Cursor

Hey.

I have been experimenting with new models in Cursor. I like the Deepseek V4 models and wanted to use them directly within an IDE, so I connected an OpenAI compatible API key in the Cursor model settings.

Deepseek V4 Flash/Pro directly from Deepseek did not work - a short prompt worked well and I received an answer, but any longer prompt on any of the V4 models returned errors. It most likely had something to do with reasoning content.

Deepseek V4 Flash/Pro via an Openrouter API key worked, but the result was simply sad. It was incredibly slow even on Flash (and I mean 15x slower than running the same with Sonnet directly from Cursor's API offer), the token consumption was incredibly high (and I seemed to get almost zero cache hits) and the result was underwhelming.... So using DS4, I spent 2 hours and 20 USD building something that Sonnet had to fix anyway, and that Sonnet could have built in 5-10 mins for below 5 USD.

I have tried DS V4 models via Cline plugin in Antigravity (Openrouter API key), and the results were MUCH better in all aspects.

It could very well be a skill issue, or this path of connection is just not the way to go. But my experiments with DS V4 models via a vanilla OpenAI compatible API have been a disaster.

Any ideas or advice about this? What have I done wrong?

u/Any-Explanation-9275 — 3 days ago
▲ 7 r/cursor

Tips for using Composer 2? New to Cursor

Hi.

I am new to using Cursor - coming from Claude Code, Antigravity and most recently the GLM coding plan.

I am familiar with most of the models in the selection, but this is my first time working with Composer 2. I have read that it is based on Kimi models, and coding-specific. The speed of it is just astonishing (especially compared to GLM 5.1) and the results so far are quite good. But it seems like it needs to be prompted very specifically - compared to Claude models that are more forgiving of vague prompts.

Any advice for using the model apart from being very specific? How about "Auto" mode? Do you go Auto and trust Cursor to pick the model for you, or am I better off manually selecting either Composer or one of the API pack?

u/Any-Explanation-9275 — 12 days ago
▲ 18 r/ChatGPT

I've been active on a bunch of AI subs for a while now - Claude, Gemini, GPT, a few others. And I've noticed that roughly half the posts follow the same template: the model has gotten stupid, the limits are unbearable, image gen broke, I'm cancelling my subscription.

I've posted one of those myself (a rant about Gemini), so I'm not throwing stones.

But stepping back, I find it interesting as a phenomenon. Three years ago we were amazed that GPT could write a coherent paragraph. Now we're furious that a frontier model loses track of a 50-document analysis session. The baseline shifted silently step-by-step, and most of us didn't notice it happening.

So what's actually driving the volume of complaints? A few theories I have:

Models are genuinely getting worse. Companies are quietly optimizing for cost, frontier models are getting lobotomized as we speak, and the quality degradation is real. Meanwhile the official benchmarks still show the same numbers. Plausible - I've experienced it firsthand with certain models.

The user base exploded and diluted the signal (skill issue). A year ago these subs were mostly people who understood the technology's limits. Now they're mainstream. Higher expectations, less tolerance for failure modes that power users just worked around.

Bot farming. Rage content about AI tools performs well algorithmically. Some of it is manufactured.

We are simply spoiled. The technology advanced fast enough that we internalized each improvement as the new floor. We're now disappointed by things that would have seemed miraculous in 2022. (I find myself guilty of this too)

My honest read is it's all four in different proportions depending on the sub. But I'm curious what people here think. Especially whether there's any actual data on model quality regression, or whether this is mostly a perception problem driven by shifting expectations.

u/Any-Explanation-9275 — 20 days ago
▲ 53 r/ZaiGLM

Been on the GLM Coding Pro plan at $30/month for several weeks (I bought in a day or two before the price hike). GLM 5.1 is impressive - usage limits on Pro are great (at least for my use = 200-250 mil/week), and I've had it running as a custom model endpoint inside Claude Code which has been a fine setup after solving the BIG initial troubles to connect it.

Then the price jumped to about $70/month. For a hobby setup that's hard to justify since I am not making money with it.
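For a rough sense of what the price change means per token, here is a small sketch. It assumes "mil/week" means million tokens per week, uses 52/12 weeks per month, and treats $70 as exact even though it is only approximate; the function name is made up for illustration.

```python
WEEKS_PER_MONTH = 52 / 12  # assumption: average month length in weeks

def usd_per_million_tokens(monthly_price_usd, million_tokens_per_week):
    """Effective price per million tokens at a given weekly usage."""
    monthly_tokens_m = million_tokens_per_week * WEEKS_PER_MONTH
    return monthly_price_usd / monthly_tokens_m

# Monthly prices from the post, at the low and high end of the stated usage.
for price in (30, 70):
    for usage in (200, 250):
        rate = usd_per_million_tokens(price, usage)
        print(f"${price}/month at {usage}M tokens/week: ${rate:.3f} per 1M tokens")
```

Under these assumptions the effective rate stays at a few cents per million tokens at either price, so the jump matters mainly as a fixed monthly cost for a hobby budget.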

My use case: I build interactive stuff for myself (and for my job) - mostly Vite + React, interactive presentations in HTML (40 or so pages, multiple iterations, difficult to one-shot in a normal chatbot), some Three.js. Nothing I'm shipping commercially. The $30 tier was a good fit for this, $70 is not.

Two weeks left on my current plan, so actively looking for what's next. What matters to me:

  • $20–50/month range
  • Strong frontend output - React, Vite, Three.js
  • Either connectable via custom API endpoint to Claude Code or similar CLI harness, OR a solid bundled IDE like Cursor where the model access is already baked in
  • Reasonable usage limits for agentic sessions - I burn through tokens quickly

I've looked at Qwen Code, Kimi Code, Ollama Cloud and Cursor but haven't committed to any of them. Not married to any particular workflow - happy to switch IDEs entirely if the value is there.

What are you running, and would you recommend it for frontend-heavy hobby projects?

u/Any-Explanation-9275 — 21 days ago

Have any of you found any info on when the Notebook project integration in Gemini will be available in Europe for Pro and Ultra subscribers?

I have found some info about April 29th, which was yesterday - and all that happened was that it was rolled back even for the US paying users (at least according to what I read on Reddit today).

Is this the same thing as the mythical "Gemini in Chrome" which simply does not exist in Europe, and the 500 million Europeans get nothing again?

u/Any-Explanation-9275 — 22 days ago

I manage a portfolio of industrial chemical products (adhesives, sealants, edge banding materials) across EU markets. My day involves technical advisory to sales teams, handling warranty claims, writing product documentation, coordinating with manufacturers, and a fair amount of travelling to meet customers, our sales teams and distributors face to face.

I've been thinking about when exactly a job like mine becomes replaceable, and I can't land on a clean answer.

The case for "sooner than you think": I used to work for the biggest chemical distributor in the world. Product managers there spent a huge part of their time essentially buying and selling commodities - checking SAP codes, processing purchase orders, managing pricing across thousands of SKUs. That part of the job is already begging to be automated. An AI system with ERP integration could manage that workload better than a human, at any hour, across every market simultaneously.

The case for "not yet": AI customer service has been a visible disaster with many showcases (as far as I can see). The moment a situation gets slightly outside the script (a non-standard claim, an angry customer, a technical edge case) - it collapses. A significant chunk of my job is edge cases (the Gauss curve is rather flat in the middle). Every customer problem is specific, every application environment is different, and the relationship component is real. I travel.

But the middle of the Gauss curve still takes up a significant part, so there is still potential to reduce a team of 4-5 of me to just 1 who takes care of the edge cases, and leave the rest to the silicon brain.

So I'm curious. Especially those of you who work in B2B, manufacturing, or technical sales adjacent roles:

Do you think about this? And what's your honest read on my and your timelines?

u/Any-Explanation-9275 — 24 days ago
▲ 108 r/IA_Italia+1 crossposts

I've been on Google AI Pro for about a year. Paid a full year in advance two months ago. Yes, in hindsight, monumentally stupid.

Let me be clear about something first: Google has an almost unfair structural advantage. They own Google Search, Google Scholar, Google Patents, Google Books, YouTube. No one else on earth has access to that volume of high-value, structured information. This should be an insurmountable moat. So how are they losing - badly - on almost every dimension?

Model quality

2.5 Pro was genuinely great. 3.0 Pro was very solid for a while. 3.1 Pro is a regression so sharp it's impressive. It doesn't just hallucinate - it pretends to perform the task. Give it 10 documents to analyze, it processes two and a half, fills the gaps with confident fabrication, then reports back lying that everything is done. When you catch it, it apologizes. When you tell it to redo it properly and avoid the mistake it just did, it does the exact same thing again. Even on a relatively fresh chat with minimal context. This isn't a quirk, it's a systematic failure mode.

Memory

The memory implementation is so broken it's almost satirical. You put into custom instructions that you work with chemical adhesives and in a fresh chat where you ask for a movie recommendation you get shit like - "since you are a chemical engineer working with adhesives, here are the top 5 zombie movies on IMDB". Meanwhile it has zero recall of anything you actually discussed in previous sessions - even with memory enabled. There's no way to inspect what it thinks it knows (aside from the custom instructions that you have to write yourself), no way to correct it meaningfully. Custom instructions either get completely ignored or become a firehose of contextually irrelevant nonsense.

Projects / Features

Projects were a highly anticipated feature - anticipated for over a year while competitors had well-functioning equivalents. What launched? NotebookLM integration for paid tiers. Except if you're in Europe on Pro. You simply get nothing, and no rollout schedule. The Notebook extension has supposedly been available even to some free-tier users globally ATP. Paying Pro subscribers in Europe? Nothing, I guess.

Gemini in Chrome? Sorry, not in Europe. Meanwhile Anthropic - whose product in Europe is also subject to EU regulation - somehow managed to ship a fully functional, tab-context-aware Chrome extension. So "regulatory constraints" isn't the real answer.

Gemini desktop app for Windows? No, sorry, only on Mac. But there's a Google AI app that idles at 20% GPU usage and basically just runs searches.

Agentic features? You get Antigravity with your weekly quota that a serious 2-hour session burns through completely.

The actual question

I'm not asking this rhetorically - I don't understand how a company with Google's resources, data infrastructure, and talent density manages to fumble every single execution decision at this scale. The capability is clearly there - 2.5 Pro and 3.0 Pro proved it. It is shocking to me how a company of this scale and talent can screw up so monumentally on every single aspect of building the product they care so much about.

To people who have been paying long-term: what keeps you subscribed? Is there a use case where Gemini is best-in-class that I'm missing? Or is this just inertia?

u/Any-Explanation-9275 — 25 days ago