u/Jsquared534 — reddlx

I have been testing local models with Continue and Cline. I came very close to giving up on agents after June 1st because of how terrible the experience was. But I figured out that was just Continue being so buggy with the latest Qwen releases. Cline has been great on an M5 Pro MacBook Pro with 48 GB of RAM.

Cline shows token usage for each session. I've gone through three sessions in roughly 2 hours this evening: a total of 3 million tokens, roughly 40k of which were "output tokens" in the sense the frontier model APIs use for billing. These were not massive features. My workflow is intentionally built around small features. That would be the entire $10 per month plan burned through in 2 hours. Even if you look at that very conservatively and say that's the maximum daily cost, you're still looking at roughly $300 a month worth of API usage. That's a non-starter for me.
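For anyone who wants to plug in their own numbers, here is a rough sketch of that back-of-the-envelope math. It only uses the figures from this post (3 million tokens, 40k output, $10 burned in one evening); the 30-day month and the function names are my own assumptions, and the "implied rate" is just $10 divided by the token count, not any provider's actual price list.

```python
# Back-of-the-envelope cost math using only the figures quoted in the post.
# Nothing here is real API pricing; the helper names are hypothetical.

TOTAL_TOKENS = 3_000_000   # all tokens across the three Cline sessions
OUTPUT_TOKENS = 40_000     # the part reported as "output tokens"
EVENING_COST_USD = 10.0    # the $10 plan, burned through in ~2 hours
DAYS_PER_MONTH = 30        # assumption: one evening like this every day


def monthly_cost(daily_cost_usd: float, days: int = DAYS_PER_MONTH) -> float:
    """Worst-case monthly spend if every day costs `daily_cost_usd`."""
    return daily_cost_usd * days


def implied_rate_per_million(cost_usd: float, tokens: int) -> float:
    """Blended dollars per million tokens implied by a cost and a token count."""
    return cost_usd / (tokens / 1_000_000)


def output_share(output_tokens: int, total_tokens: int) -> float:
    """Fraction of all tokens that were output tokens."""
    return output_tokens / total_tokens


if __name__ == "__main__":
    print(f"Monthly cost: ${monthly_cost(EVENING_COST_USD):.0f}")
    print(f"Implied rate: ${implied_rate_per_million(EVENING_COST_USD, TOTAL_TOKENS):.2f} per 1M tokens")
    print(f"Output share: {output_share(OUTPUT_TOKENS, TOTAL_TOKENS):.1%}")
```

The thing that stands out is how small the output share is: almost all of the token count is context being sent to the model, not code coming back.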

I've adjusted my workflow. I now use the Claude web interface to read and enhance context files covering the project overview and the current feature, plus some coding and AI-interaction context, and then hand the actual work to Qwen 3.6 35b, which runs on the Mac without constant memory pressure as long as you close Xcode when it's not in active use. It's actually been just as performant as Claude Sonnet 4.6 was. Keep in mind that I'm having the Claude web interface do a lot of the thinking on the front end based on my original engineering plans, and then Qwen is doing its thinking based on the updated context instructions I paste into it.