u/amu4biz

Cost-routing tasks across models in one session - cheap model for grunt work, frontier for reasoning, local for sensitive code

Been experimenting with model routing at the workflow level instead of the app level and wanted to share what it looks like in practice.

The setup: Zero, an open source coding agent (github.com/gitlawb/zero) that treats the model as a swappable component. It talks to 25+ providers - OpenAI, Anthropic, Gemini, DeepSeek, Qwen, Groq, plus local models through Ollama or LM Studio - and you switch mid-session with /model without losing context.

The routing pattern I've settled into:

- Cheap/fast model for scaffolding, file reads, summaries, boilerplate
- Frontier model only for the steps that need real reasoning - the escalation is a single command, same context
- Local model for anything touching code or data I don't want leaving the machine
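
Written down as code, the pattern above is roughly the following. This is a sketch of how I think about it, not anything Zero ships: the tier labels, the Task fields, and the task kinds are all made-up names, and the "unknown kinds start cheap" default is my own assumption.

```python
from dataclasses import dataclass

# Hypothetical tier labels, not Zero identifiers or real model names.
CHEAP, FRONTIER, LOCAL = "cheap", "frontier", "local"

# Task kinds I hand to the cheap tier (illustrative names for the list above).
GRUNT_WORK = {"scaffolding", "file_read", "summary", "boilerplate"}


@dataclass
class Task:
    kind: str
    sensitive: bool = False        # touches code/data that must stay on the machine
    needs_reasoning: bool = False  # currently set by hand, i.e. by vibes


def route(task: Task) -> str:
    """Pick a model tier for one step of the session."""
    # Privacy wins over everything else: sensitive work never leaves the machine.
    if task.sensitive:
        return LOCAL
    # Pay frontier prices only when the step needs real reasoning.
    if task.needs_reasoning:
        return FRONTIER
    # Known grunt work goes to the cheap tier.
    if task.kind in GRUNT_WORK:
        return CHEAP
    # Assumption: unknown kinds also start cheap and get escalated manually.
    return CHEAP
```

The order of the checks is the design choice that matters: the sensitivity check comes first so a task that is both sensitive and hard still stays local.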

The cost curve changes completely. Instead of paying frontier prices for 100% of tokens, you pay them for the 20% of steps that actually need it. Over a week of heavy use the difference is not subtle.
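
To put rough numbers on that, here is a back-of-the-envelope calculation. The prices are made up for illustration (not any provider's real rates), it assumes the frontier share of tokens matches the 20% share of steps, and it ignores the local model, which costs hardware time rather than per-token fees.

```python
def blended_cost_per_mtok(frontier_price, cheap_price, frontier_share):
    """Average price per million tokens when `frontier_share` of tokens
    (0.0 to 1.0) go to the frontier model and the rest go to the cheap one."""
    if not 0.0 <= frontier_share <= 1.0:
        raise ValueError("frontier_share must be between 0 and 1")
    return frontier_share * frontier_price + (1.0 - frontier_share) * cheap_price


# Made-up prices in dollars per million tokens, for illustration only.
FRONTIER_PRICE = 10.00
CHEAP_PRICE = 0.50

all_frontier = blended_cost_per_mtok(FRONTIER_PRICE, CHEAP_PRICE, 1.0)
routed = blended_cost_per_mtok(FRONTIER_PRICE, CHEAP_PRICE, 0.2)
savings = 1.0 - routed / all_frontier

print(f"all-frontier: ${all_frontier:.2f}/Mtok")
print(f"routed:       ${routed:.2f}/Mtok")
print(f"saved:        {savings:.0%}")
```

With those placeholder prices the blended rate works out to about $2.40 per million tokens against $10.00, so roughly three quarters of the bill goes away. The real figure depends entirely on your own price gap and on how many tokens the frontier steps actually consume.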

Implementation details that matter: sessions are files on disk (resumable/forkable, so routing decisions survive restarts), it's a single Go binary, no telemetry, and there's a headless mode (zero exec, streams JSON) if you want to wire the routing into scripts or CI instead of doing it interactively.

Open question I haven't solved: my escalation decisions are still vibes-based. Has anyone built actual heuristics for when a task deserves the expensive model - token-count thresholds, retry-on-failure escalation, task classification? Curious what's working for people running this at scale.
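
To make the question concrete, this is the kind of rule set I have in mind, combining the three ideas above. It is an untested starting point, not a policy I'm running: every threshold, the Step fields, and the task kinds are placeholder guesses.

```python
from dataclasses import dataclass

# All values below are guesses to tune, not measured thresholds.
TOKEN_THRESHOLD = 8_000     # escalate when the prompt context gets this large
MAX_CHEAP_FAILURES = 2      # escalate after this many failed cheap attempts
REASONING_KINDS = {"debugging", "architecture", "refactor_design"}


@dataclass
class Step:
    kind: str                # hypothetical task label from some classifier
    prompt_tokens: int       # size of the context sent with this step
    cheap_failures: int = 0  # failed attempts on the cheap model so far


def should_escalate(step: Step) -> tuple[bool, str]:
    """Return (escalate?, reason) for one step of the session."""
    # Task classification: some kinds always get the expensive model.
    if step.kind in REASONING_KINDS:
        return True, "task classification"
    # Retry-on-failure: give up on the cheap model after repeated misses.
    if step.cheap_failures >= MAX_CHEAP_FAILURES:
        return True, "retry-on-failure"
    # Token-count threshold: large contexts are treated as hard by default.
    if step.prompt_tokens >= TOKEN_THRESHOLD:
        return True, "token-count threshold"
    return False, "stay on cheap model"
```

Returning the reason alongside the decision means each escalation can be logged and checked later to see which rule is actually earning its cost.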
