u/BookwormSarah1

Cut my LangGraph agent from $300/day to $63 by routing boring sub tasks off Opus 4.1

I've been running a fairly typical LangGraph agent that does research, writes code, and deploys. The loop was eating around $300 a day on Opus 4.1, and most of those calls weren't hard reasoning. They were things like reading a file, summarizing a log, or calling a search tool and reformatting the result. Pure overhead that happened to run on the most expensive model in the stack.

So I split the agent into two tiers. Hard sub tasks (architectural decisions, debugging unfamiliar code) still hit Opus 4.1. Everything else, the routine tool calling and summarization work, now goes through a cheap default model. For the past week that default has been a mix of DeepSeek V4 Pro and Tencent Hunyuan Hy3 preview, with the Hy3 preview handling most steps that involve many tool calls.

The routing lives in a LangGraph conditional edge. The router node inspects the task metadata and the graph branches accordingly. Something like:

# Branch out of the "router" node based on the key route_task returns.
builder.add_conditional_edges(
    "router",
    route_task,
    {
        "hard": "opus_node",
        "cheap": "hy3_node",
    },
)

The route_task function checks if the step touches more than three files in an unfamiliar repo or asks for an architectural decision. If so, it hits Opus 4.1. Otherwise, it goes to the cheap tier.
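A minimal sketch of what such a route_task could look like, assuming the graph state is a dict. The state keys (files_touched, repo_is_familiar, needs_architecture_decision) are hypothetical names picked for illustration and are not from the original graph. The returned strings match the keys of the mapping passed to add_conditional_edges.

```python
from typing import Literal, TypedDict


class TaskState(TypedDict, total=False):
    # Hypothetical metadata keys, chosen for illustration only.
    files_touched: list[str]
    repo_is_familiar: bool
    needs_architecture_decision: bool


def route_task(state: TaskState) -> Literal["hard", "cheap"]:
    """Return the branch key used in the conditional edge mapping."""
    # Architectural decisions always go to the expensive tier.
    if state.get("needs_architecture_decision", False):
        return "hard"
    # More than three files in an unfamiliar repo also counts as hard.
    unfamiliar = not state.get("repo_is_familiar", True)
    if unfamiliar and len(state.get("files_touched", [])) > 3:
        return "hard"
    # Everything else is routine work for the cheap tier.
    return "cheap"
```

Missing keys default to the cheap tier here, which is a design choice: a step only gets escalated when the metadata explicitly says it is hard.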

I run the cheap tier on a refurbished Mac Studio M2 Ultra with 192GB of unified memory. Cost me around $5,500. The official deployment path from Tencent is vLLM or SGLang on eight H200 class GPUs, which isn't happening in a home lab. The Apple Silicon route works because the 4 bit quantized weights land around 165GB and fit in unified memory with some headroom. Setup was conda plus the community MLX port from Hugging Face. Hours of fiddling, not a clean afternoon. Throughput lands around 5 to 12 tokens per second depending on context length. That sounds slow, but most of my agent steps spend their wall clock time waiting on tool execution anyway, so it doesn't bottleneck the loop. I'd like to try the 8 bit MLX build once someone publishes it, mainly to see if reasoning across files gets stronger.

The model itself is a 295B MoE with 21B active parameters per token and a 256K context window. For tool calling specifically, OpenRouter had it ranked first by call volume shortly after launch, which is what made me try it. In my own loop it's been reliable across workflows that run 200 to 300 tool calls without derailing.

Opus 4.1 costs roughly $15 per million input, $75 per million output. My daily burn is about 10M input and 2M output. Running everything on Opus lands around $300. Now I send 80% of that through the cheap tier at $0.18 per million input and $0.59 per million output. That part costs under $3. Opus handles the remaining 20%, roughly $60. Total lands around $63.
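The arithmetic above can be checked with a few lines of Python. This is a rough sketch that uses the prices and volumes quoted in the post and assumes the 80% share applies equally to input and output tokens, which the post doesn't spell out.

```python
def cost_usd(input_mtok: float, output_mtok: float,
             in_price: float, out_price: float) -> float:
    """Dollar cost for token volumes given in millions of tokens."""
    return input_mtok * in_price + output_mtok * out_price


DAILY_IN, DAILY_OUT = 10.0, 2.0  # millions of tokens per day
OPUS = (15.00, 75.00)            # $ per million input, output
CHEAP = (0.18, 0.59)             # $ per million input, output
CHEAP_SHARE = 0.80               # assumption: same share for input and output

all_opus = cost_usd(DAILY_IN, DAILY_OUT, *OPUS)
cheap_part = cost_usd(DAILY_IN * CHEAP_SHARE, DAILY_OUT * CHEAP_SHARE, *CHEAP)
opus_part = cost_usd(DAILY_IN * (1 - CHEAP_SHARE),
                     DAILY_OUT * (1 - CHEAP_SHARE), *OPUS)

print(f"all Opus:   ${all_opus:.2f}")
print(f"cheap tier: ${cheap_part:.2f}")
print(f"Opus share: ${opus_part:.2f}")
print(f"total:      ${cheap_part + opus_part:.2f}")
```

Under those assumptions the cheap tier comes to about $2.38 and the two-tier total to about $62.38, which matches the "around $63" figure once the cheap tier is rounded up to $3.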

A concrete example from this week. I had the agent convert a long Notion export into a slide deck. That single run burned 4.2 million output tokens. On Opus 4.1 the output alone would have been over $300. The cheap tier handled it for roughly $2.50 and the slide quality was fine. Not Opus level on design taste, but completely usable for an internal draft. I wouldn't use it for a deck going to a client without a final polish pass.

Where the cheap tier isn't the right choice, and I still reach for Opus every time, is deep debugging across a codebase I don't know well, or tasks that need holding a very precise spec in memory across many turns. It also struggles with long chains of math proofs where one wrong step cascades. For those, the cost of Opus 4.1 is worth it.

Honestly the thing I overlooked at first was tool latency. I kept blaming the model for slow responses when it was actually a webhook I wrote that was sleeping on cold starts. Took me three days of staring at LangSmith traces to realize the bottleneck was a 2 second cold boot on a lambda, not the LLM. The routing pattern only started paying off after I fixed that.
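One way to catch that kind of thing sooner is to time tool calls and model calls separately. A minimal sketch, using only the standard library; the labels and the fake_webhook function are placeholders for illustration, not the actual webhook or anything LangSmith provides.

```python
import time
from collections import defaultdict
from contextlib import contextmanager

# Accumulated wall-clock seconds per label, e.g. "tool:webhook" or "llm:cheap".
timings: dict[str, float] = defaultdict(float)


@contextmanager
def timed(label: str):
    """Add the elapsed wall-clock time of the wrapped block to timings[label]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[label] += time.perf_counter() - start


def fake_webhook() -> str:
    # Stand-in for a slow tool; the sleep imitates a cold start.
    time.sleep(0.05)
    return "ok"


with timed("tool:webhook"):
    fake_webhook()

for label, seconds in sorted(timings.items()):
    print(f"{label}: {seconds:.3f}s")
```

Wrapping each tool and each model call in its own label makes it obvious which side of the loop the wall clock time is going to.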

reddit.com
u/BookwormSarah1 — 14 hours ago

spent years hiding my IP and encrypting DNS only to find out my browser fingerprint was identifying me the whole time

We focus a lot on VPNs and legislation to stop corporate tracking, but browser fingerprinting is basically a giant loophole for mass surveillance. I spent years hiding my IP address and encrypting my DNS queries, thinking my browser configuration was private, but I recently realized that fingerprinting completely bypasses these protections. I used Leakish to audit my configuration (the source is published on GitHub, and I read the egress handler before running it locally) and found that my Canvas rendering and AudioContext signatures were still completely unique. It seems my VPN was just hiding my location while my browser was still broadcasting a unique identity. How do you mitigate AudioContext tracking without breaking the web?

u/BookwormSarah1 — 1 day ago

Little changes that actually save money on Amazon

I started paying more attention to my Amazon purchases and realized I was almost never getting a coupon discount: I'd paste in expired codes and end up paying full price. Then I started using a plugin like Coupert that automatically tries codes for me and tracks cashback, and it made a huge difference. Just waiting a day before any impulse buy and letting the plugin do its thing has saved me both time and money on stuff I was already buying.

u/BookwormSarah1 — 4 days ago
r/DIY

A few things I learned putting up a hardtop gazebo this weekend

Spent most of the weekend putting together a hardtop gazebo and a couple things would've saved me a lot of time if I'd known them earlier 😂. First, don't fully tighten the frame too early. Leaving everything slightly loose until the roof sections line up makes life way easier. Another thing that confused me: the roof panels looked like they were peeling after installation. Turns out it was just the protective shipping film and you're supposed to remove it afterward. I thought I had somehow scratched the panels already. Overall pretty happy with how it turned out though. Quick question for anyone who's built one before: did you leave the protective film on until the very end? I removed part of mine too early and it made handling the roof panels way more awkward during assembly. Mine's a Costway model if that matters.

u/BookwormSarah1 — 10 days ago

starting to realize most people don’t fail at side hustles, they just run out of mental energy first

Lately I've noticed almost everyone around me wants some kind of side hustle now. People talk about ecommerce, reselling, digital products, content creation, flipping stuff online… It feels like everyone wants an extra income stream because relying on one paycheck just doesn't feel safe anymore. But at the same time, most people never actually start anything. And honestly I understand why now.

After work your brain is already exhausted. You sit down thinking you're finally going to work on your side hustle, then somehow an hour disappears just researching, comparing ideas, checking products, watching videos, overthinking everything. I've been stuck in that cycle a lot lately. One night I'm looking at products, next night I'm checking suppliers, then I convince myself the margins probably aren't worth it anyway and close all the tabs. A few days later I repeat the exact same process again.

The weird thing is I don't even think fear of failure is the biggest issue anymore. Feels more like people are mentally overloaded all the time and don't have enough energy left after work to deal with uncertainty on top of everything else. Recently I've been trying to stop obsessing over finding the "perfect" side hustle idea and instead just pay attention to what regular people are already selling successfully. Feels more practical than endlessly brainstorming ideas that never go anywhere.

Still haven't fully started anything yet, but honestly this already feels more productive than spending months watching "best side hustle ideas for 2025" videos.

u/BookwormSarah1 — 11 days ago

Hi everyone!

I've recently been looking more closely at AI 3D tools that offer native 3MF support instead of just the classic STL format. In my opinion, the importance of 3MF is often underestimated.

Sure, for simple single-color prints it doesn't make much of a difference. But as soon as multicolor printing or more complex slicing workflows come into play, 3MF offers real advantages: color regions, material assignments, and even the orientation on the print bed are preserved on export instead of being lost right away.

That doesn't mean AI models are suddenly perfect for precise functional parts. But in my experience, the step from "generated model" to "print-ready project" gets noticeably smoother.

I'd be interested in your opinion: do you find that 3MF actually saves time in day-to-day printing? Or do you have to do so much manual cleanup on AI models anyway that the file format doesn't matter in the end?

Looking forward to your views!

u/BookwormSarah1 — 25 days ago

Hi guys, Avata360 has been selling crazy well since launch and they just opened up official sales in Europe. I've been looking into it more, and this comparison scene is actually a pretty solid one to test with: it's a harbor at sunset with a bunch of fishing boats parked out front, coastline and mountains way off in the distance, and the sun right in the middle of going down, so the dynamic range challenge is real.

First thing that jumps out is the overall color tone, and the two drones are honestly going in pretty different directions here. Avata360 leans warmer: the golden tones around the sunset area are more concentrated and the sky transitions from warm to cool pretty smoothly. The A1 side runs a bit cooler overall: the sky is brighter and the warm colors around the sun spread out more, and there's even a slightly pinkish tint to it. Neither one is really wrong though, it's more just how each drone interprets color differently. Now the sun itself: Avata360 keeps the flare way tighter so the edges are still kinda there, while the A1 lets it bloom out more, which looks moodier but eats up some of the highlight detail around the edges. The water follows the same trend too. Avata360 has a golden reflection that's more focused, with the ocean looking a deeper teal, and the A1 has a bigger reflection area that's more pinkish warm, with the water a lighter cyan. Both look nice, just in their own way.

Frame rate wise, Avata360 running 8K/60 gives double what the A1 does at 8K/30, which means way more room for speed ramps or pulling frames in post. For anyone who does a lot of slow mo or speed transitions, that's gonna be a noticeable difference in the editing workflow.

Both drones got the sunset looking good, not gonna lie. It's really just a color style thing and how they deal with highlights: it comes down to what kinda vibe fits the edit more and whether that 60fps is actually gonna get used in post or not.

u/BookwormSarah1 — 28 days ago