u/BarracudaNumerous824

I've been using Claude Code (Pro plan) for my side projects for a while now, and overall it's been great. But lately, I feel like I'm not fully utilizing the subscription because of the 5-hour usage windows.

The main issue is that I don't always get long, uninterrupted blocks of time to code, so a lot of those 5-hour windows end up partially wasted. Then, when I do get time for a deep coding session, I sometimes hit the limits faster than I'd like.

Because of that, I've been considering moving part of my workflow to open-source models like GLM 5.1, MiniMax 2.5, and DeepSeek V4 Pro for coding tasks, using Pi as the agentic harness. As a developer, I already prefer steering the model closely and keeping tighter control over the generated code, so I'm wondering if this setup might actually fit my workflow better.

DeepInfra's API pricing and latency also look really appealing, which is why I'm considering it as the provider.

Curious if anyone here has tried a similar hybrid workflow:

Claude/OpenAI for higher-level reasoning
OSS models for iterative coding tasks
Pi or another agentic framework for orchestration

Also, if you've used DeepInfra extensively for coding workflows:

How reliable has it been?
How's the latency under sustained usage?
Any issues with rate limits, downtime, or context handling?
Does it actually end up being significantly cheaper in practice?

Would love to hear real-world experiences, especially from people who moved away from Claude Pro/Max because of usage caps.

Anyone running Pi CLI with DeepSeek/GLM via DeepInfra for coding tasks?