I "reverse-engineered" Gemini Pro's new usage limits. Here's what $20/month buys you.
Google won't tell you how the new limits work - just a percentage bar. So I ran identical prompts in two parallel continuous chats - one in the Gemini app, one in AI Studio. Same model (3.1 Pro), same thinking level (max), same documents, same prompts. One continuous chat in each, never refreshed.
AI Studio shows input tokens, output tokens, and total. It does NOT show thinking tokens. On max thinking, these are likely massive - but completely invisible. Keep that in mind for every number below. The AI Studio token counts are cumulative across the chat.
Also, keep in mind that the usage limits in the app are FLUID - as far as I can tell, Google has set aside a limited overall pool of daily compute for Pro subs. If too many people use it, they can cut you off after 1 prompt. This gives Google a STATIC and PREDICTABLE compute cost - no matter the usage, compute costs them a preset amount. The entire risk of the usage rate IS ON THE USER. You are the one who gets cut off from the service if too many people use it today.
If Google decides to give out tens of millions of free Pro subs, guess who is going to pay for it? : ) You are - by being cut off from the service you are paying for.
Prompt 1 - uploaded a 29-page PDF, asked for a 10-page analysis. Input: 16,295 | Output: 4,154 | Total visible: 20,449 | Gemini app: 9% of 5hr window
Prompt 2 - follow-up in the same chat, asked for a personalized take. Input: 16,320 | Output: 6,837 | Total visible: 23,157 | Gemini app: 13% (+4%)
Prompt 3 - attached two large documents (17k + 163k tokens), asked for analysis. Input: 191,636 | Output: 10,531 | Total visible: 202,167 | Gemini app: 33% (+20%)
Three prompts. 202k visible tokens. 33% of my 5hr window. Thinking tokens on top - uncounted, unshown, but clearly eating quota.
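To make the comparison concrete, here is a rough sketch that divides the visible token totals by the percentage bar readings. It assumes both the AI Studio totals and the app percentages are cumulative, as reported above; the function name and the `readings` list are just my own labels for illustration.

```python
# Each entry: (label, cumulative visible tokens in AI Studio,
#              cumulative % of the 5hr window shown in the Gemini app).
# Numbers are copied from the three prompts above; thinking tokens are NOT included.
readings = [
    ("Prompt 1", 20_449, 9),
    ("Prompt 2", 23_157, 13),
    ("Prompt 3", 202_167, 33),
]

def tokens_per_point(rows):
    """Return visible tokens per 1% of quota for each reading (rounded)."""
    return [(label, round(tokens / pct)) for label, tokens, pct in rows]

for label, ratio in tokens_per_point(readings):
    print(f"{label}: ~{ratio:,} visible tokens per 1% of quota")
```

The ratio is nowhere near constant (roughly 2.3k, 1.8k, then 6.1k visible tokens per 1%), so whatever the bar counts, it is not simply visible tokens. That fits the idea that thinking tokens are being charged, but it does not prove it.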
The API cost equivalent for all three prompts: $0.51. That means Google gives Pro subscribers roughly $1.50 worth of compute per 5-hour window. For $20/month. And won't even show you what's being counted.
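The arithmetic behind that estimate, as a small sketch. It assumes the percentage bar scales linearly with cost, which I can't verify; the $0.51 figure and the 33% reading are taken from above, and the function name is hypothetical.

```python
def implied_window_value(cost_usd: float, fraction_used: float) -> float:
    """Extrapolate the API-equivalent value of a full window,
    assuming quota usage scales linearly with cost."""
    if not 0 < fraction_used <= 1:
        raise ValueError("fraction_used must be in (0, 1]")
    return cost_usd / fraction_used

api_cost_usd = 0.51      # API-equivalent cost of the three prompts
fraction_used = 0.33     # 33% of the 5hr window in the Gemini app

per_window = implied_window_value(api_cost_usd, fraction_used)
print(f"~${per_window:.2f} of API-equivalent compute per 5hr window")
```

This comes out to about $1.55, which is the "roughly $1.50" above. If thinking tokens were priced in, the API-equivalent cost (and so the per-window value) would be higher.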
I also checked DevTools on the Gemini web app. Zero token data in the network responses. Google tracks everything server-side and gives you a percentage bar with no numbers.
I know this method is flawed and very imperfect - my Gemini app has custom instructions, 3.1 Pro in the app is not necessarily the same as 3.1 Pro in AI Studio, etc. But it gives us a picture.
If anyone has a better metric or method, please share it.