u/Charming_Support726 — reddlx

Rate/Quota Limits on Azure

Hi there,

I’ve been using OpenCode with Azure AI Foundry for quite a while now. It was a bit rough at the beginning since OC didn’t fully support the API yet, but it ran smoothly for months after that.

Azure is my provider of choice because I have been purchasing Microsoft membership packs for my company for years. The included Azure credits are a paid component, not cheap or free. I use them for LLMs (they help pay the API PAYG costs) and to host some support containers and VMs.

Last weekend, I had to redeem the next voucher, and massive problems started:

The new redemption process always creates a fresh subscription. Since you’re now limited to one voucher per person per year, this forces you to abandon already deployed resources.

This is the bigger issue: despite the high rate limits configured on AI Foundry, MS seems to artificially throttle the Tokens-Per-Minute (TPM) throughput by a massive factor. My quota shows 10M TPM, but after about a day, my deployment gets automatically scaled down to 100k or 50k. It looks like it’s being adjusted by some background script. I’ve verified this multiple times. My paid subscriptions and old credit-based ones stay at their original high TPM limits. Even 100k is hardly usable once the context grows past 50-80k tokens, because tool calling breaks.
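To put rough numbers on that, here is a back-of-the-envelope sketch in Python. It assumes every agent turn resends the full context, so one request costs about as many tokens as the context size. It ignores output tokens and however Azure actually meters usage, and the function name is made up for illustration.

```python
def max_requests_per_minute(tpm_limit: int, tokens_per_request: int) -> int:
    """Whole requests of a given size that fit into a tokens-per-minute budget."""
    if tokens_per_request <= 0:
        raise ValueError("tokens_per_request must be positive")
    return tpm_limit // tokens_per_request


# TPM values are the ones mentioned in this post; context sizes are the 50-80k range.
for tpm in (10_000_000, 100_000, 50_000, 5_000):
    for context in (50_000, 80_000):
        count = max_requests_per_minute(tpm, context)
        print(f"{tpm:>10,} TPM, {context:,}-token requests: {count} per minute")
```

Under these assumptions, a 100k limit leaves room for one or two large-context requests per minute, which is not much for an agent that makes several tool calls in a row.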

Man, I actually paid for this stuff. Is anyone else experiencing the same? Just to give a clear example: Kimi K2.6 dropped to 5k TPM - this is not even usable for chatting!

P.S.: Posting here because I suspect more developers are affected. The main Azure sub is mostly populated by non-AI Foundry users.

EDIT: Meanwhile I received additional information. MS seems to regard the new subscription as a "new customer without payment history", so agent coding seems to trigger the new usage limits system much harder than in my 8-year-old sub with a huge payment history. That's called a "Temporary Rate Limit Adjustment".

reddit.com

u/Charming_Support726 — 7 days ago

▲ 0 r/AZURE

AI Foundry Rate-Limits

I think similar things happened to a few people. Last year I received 2 sponsorships, each running for a year, by buying products (over $3k in credits in total, until August).

Last year I enabled the first one, which was working until today. I used the credits from time to time for coding. Today the first sponsorship ran out, and for the second I was confronted with the new process. I managed to enable it, but I needed to create a new subscription.

Quite happy that OpenAI was still working (and that I did not need to get an approval again), I tried to finish a project. After some use I encountered an issue:

Although I provisioned every model with a 10M rate limit (and that shows in the model card), after a while K2.6, gpt-5.5 and gpt-5.4-mini are hardwired to 100k tokens per minute, which is fairly unusable. My old subscription is still on 10M.

Any experience from people here? Any chance to increase this?

reddit.com

u/Charming_Support726 — 12 days ago

▲ 2 r/opencodeCLI

Previous Sessions gone or archived

Not sure if it is only me.

I upgraded OpenCode to the latest version, and with the upgrade all previous sessions disappeared. I saw similar issues (in a pile of 4k issues) on GitHub. It seems to be "not a bug", but sessions being archived.

No way to unarchive? Seriously? Removing them from display or selection without asking the user? If archiving can't be undone, it is deleting.

EDIT: After some additional searching I found it. It was not the archive problem from the GitHub issues. It is similar to something that happened to me at the beginning of the year. The DB migration script didn't find the right version label in my environment and created a new DB under "~/.local/share/opencode/opencode-.db".

I backed up the new empty DB and created a link to the old DB in its place. Works.

I'll leave this post in place, in case somebody else is facing the same issue.
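For anyone who wants to script that fix, here is a minimal sketch. It assumes a POSIX system where symlinks are available, and the name of the old DB file is hypothetical since it depends on your install. The demo runs in a throwaway temp directory, so adapt the paths yourself and stop OpenCode before touching the real files.

```python
import tempfile
from pathlib import Path


def relink_database(new_db: Path, old_db: Path) -> Path:
    """Move the new (empty) DB aside and put a symlink to the old DB in its place.

    Returns the path of the backup of the new DB.
    """
    if not old_db.is_file():
        raise FileNotFoundError(f"old database not found: {old_db}")
    if not new_db.exists():
        raise FileNotFoundError(f"new database not found: {new_db}")
    backup = new_db.with_name(new_db.name + ".bak")
    new_db.rename(backup)                 # keep the empty DB as a backup
    new_db.symlink_to(old_db.resolve())   # new path now points at the old data
    return backup


# Demo in a temporary directory so no real data is touched.
with tempfile.TemporaryDirectory() as tmp:
    data_dir = Path(tmp)
    old_db = data_dir / "opencode-old.db"  # hypothetical name for the previous DB
    new_db = data_dir / "opencode-.db"     # the file name the migration created
    old_db.write_bytes(b"previous sessions")
    new_db.write_bytes(b"")
    backup = relink_database(new_db, old_db)
    print(new_db.is_symlink(), backup.name)
```

A symlink rather than a copy means both paths refer to the same data, so sessions created after the fix end up in the old DB.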

reddit.com

u/Charming_Support726 — 13 days ago

▲ 4 r/opencodeCLI+1 crossposts

I gave Mistral Medium 3.5 a go in OpenCode and don't receive any thinking tokens at all. I can't switch to a thinking variant either. The results when testing it are subpar. Is anyone else noticing this? Any solutions?

I wanna get this one running, so there is at least one European model in the list.

reddit.com

u/Charming_Support726 — 18 days ago