u/ChampionshipNo2815

▲ 2 r/WOZCODE+1 crossposts

How WOZCODE cuts file reading tokens by 40 to 60% without losing any useful information

Something I found interesting digging into how WOZCODE actually works under the hood.

When a coding session reads a file it normally gets the whole thing dumped into context. Every function body, every implementation detail, every line. The model has it whether it needs it right now or not.

WOZCODE does something called AST truncation instead. It parses the file structure and stubs out function bodies while keeping types, exports, and signatures fully intact. So the model gets the shape of the code without the implementation details it does not need yet.
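To make the idea concrete, here is a minimal sketch of the same technique using Python's built-in `ast` module. This is not WOZCODE's implementation, which I have not seen. The function name `stub_function_bodies` and the sample class are made up for illustration, and I am assuming Python 3.9+ for `ast.unparse`.

```python
import ast

def stub_function_bodies(source: str) -> str:
    """Replace every function body with `...`, keeping signatures and docstrings.

    Illustrative only: a stand-in for the idea of AST truncation,
    not any tool's actual implementation.
    """
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            stub = ast.Expr(value=ast.Constant(value=Ellipsis))
            # Keep the docstring (if any) since it describes the interface.
            has_doc = ast.get_docstring(node) is not None
            node.body = ([node.body[0]] if has_doc else []) + [stub]
    return ast.unparse(tree)

# Hypothetical input file contents.
SAMPLE = '''
class Cart:
    items: list[str]

    def total(self, tax_rate: float) -> float:
        """Return the cart total including tax."""
        subtotal = sum(self.prices())
        return subtotal * (1 + tax_rate)
'''

if __name__ == "__main__":
    print(stub_function_bodies(SAMPLE))
```

The class attribute, the signature, the type hints, and the docstring survive. Only the two implementation lines are replaced by `...`.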

That works out to 40 to 60% fewer tokens per file read. And because every token in context gets re-ingested on every call after that, a smaller read early in the session stays smaller across everything that follows. The savings stack up across the whole session, not just at the point of reading.

The implementation details are not gone either. They load when the model actually needs to work inside a specific function. It is just not front-loading everything when the model only needs to understand the structure.

Found it to be a genuinely interesting approach to the context bloat problem. Curious if others have looked into this or found other ways to keep file reads lean in longer sessions.

reddit.com
u/ChampionshipNo2815 — 18 hours ago
▲ 12 r/LLMDevs

Started measuring actual API call counts on my Claude Code sessions. The numbers are worse than I expected.

Been integrating Claude Code into our engineering workflow for a few months. Started noticing the costs were higher than made sense for the tasks we were running so I actually sat down and traced what was happening.

For a straightforward refactor task, say renaming a hook across a few files, Claude Code runs Glob to find the files, Grep to filter, Read on each file individually, Edit on each file individually, then Read again on each to verify the edit landed. That is north of 10 API calls for something that structurally needs 2. And each call re-ingests everything before it as input tokens, so the cost compounds across the session.
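Here is a rough back-of-the-envelope sketch of why the compounding matters. The token counts are hypothetical placeholders, not measurements from my sessions, and the model is simplified: it assumes every call re-sends the full context so far and ignores caching.

```python
def cumulative_input_tokens(base_context: int, tokens_added_per_call: list[int]) -> int:
    """Total input tokens billed when every call re-sends all prior context."""
    total = 0
    context = base_context
    for added in tokens_added_per_call:
        total += context   # this call re-ingests everything so far
        context += added   # its result joins the context for the next call
    return total

# Hypothetical numbers: 2,000 tokens of starting context, and the same
# 5,000 tokens of tool output split across 10 small calls or 2 batched calls.
many_small = cumulative_input_tokens(2_000, [500] * 10)
two_batched = cumulative_input_tokens(2_000, [2_500, 2_500])

print(many_small)   # 42500
print(two_batched)  # 6500
```

Same amount of new information in both cases, but under these assumptions the ten-call version pays for about 6.5 times as many input tokens.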

I started benchmarking specific tasks before and after any tooling change. Same prompt, clean state, real API usage fields, not estimates. The turn count gap on complex multi-file work was significant enough to change how we structure sessions.

Curious whether other engineering teams are actually measuring this or just absorbing the cost and moving on. Would be interested in what numbers others are seeing on real workloads.

reddit.com
u/ChampionshipNo2815 — 3 days ago

▲ 2 r/WOZCODE+1 crossposts

I stopped trusting AI coding benchmarks and started measuring my own sessions instead

I kept seeing benchmark numbers from various tools but something always felt off. They all test on the same kind of controlled repos. Perfect little codebases that look nothing like the messy real-world stuff I actually work on.

So I started measuring my own sessions instead.

The process I landed on was simple. Pick a few tasks I run regularly. Run each one twice from a clean state. Compare turn count, cost, and time. No estimates, just real API usage data.
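If you want to script the comparison, something like this is enough. It is only a sketch: `RunResult` and `compare` are names I made up, and you would fill in the numbers by hand from your own usage data.

```python
from dataclasses import dataclass

@dataclass
class RunResult:
    """One run of a task from a clean state (fields are illustrative)."""
    turns: int
    cost_usd: float
    seconds: float

def percent_change(before: float, after: float) -> float:
    """Signed percent change from before to after; negative means a reduction."""
    if before == 0:
        raise ValueError("baseline value must be non-zero")
    return (after - before) / before * 100

def compare(before: RunResult, after: RunResult) -> dict[str, float]:
    """Percent change for each metric, rounded to one decimal place."""
    return {
        name: round(percent_change(getattr(before, name), getattr(after, name)), 1)
        for name in ("turns", "cost_usd", "seconds")
    }

# Hypothetical numbers, not results from any real session.
baseline = RunResult(turns=20, cost_usd=1.00, seconds=120.0)
candidate = RunResult(turns=10, cost_usd=0.75, seconds=90.0)
print(compare(baseline, candidate))
```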

The results on my own code were different from every published benchmark I had read. Not wildly different but different enough to matter. And seeing the actual turn count on tasks I had been running for months without thinking about it was genuinely surprising. Some of them were way more expensive than I assumed.

I now do this before trying any new tool or workflow change. It takes about five minutes and tells you way more than any benchmark someone else ran on a demo repo.

reddit.com
u/ChampionshipNo2815 — 3 days ago
▲ 62 r/WOZCODE+1 crossposts

I figured out why I keep hitting my Claude Code session limit before lunch. It's not what I thought.

Been on Max for a while. Kept hitting my limit mid-task and assumed I was just doing too much. Turned out I wasn't doing too much. My tools were just incredibly inefficient under the hood.

Traced a single refactor task. One rename across a few files. Claude Code ran 161 turns to finish it. Every read, every grep, every edit is its own API call. Each one re-ingests everything before it as input tokens. By turn sixty you're paying context cost on the entire session history.

Once I understood that I started looking for ways to batch the calls. Found a Claude Code plugin that collapses search and read into one call and stacks all edits into one roundtrip. Same task finished in 52 turns.

Didn't change my plan. Didn't change my model. Just changed how many roundtrips each task makes.

reddit.com
u/ChampionshipNo2815 — 5 days ago
▲ 5 r/WOZCODE+2 crossposts

I traced every API call Claude Code made during a refactor. Here's what I found.

Been using Claude Code heavily for the past few months. Started noticing sessions on larger codebases felt sluggish in a way I couldn't explain. The model wasn't struggling. It just felt... slow between actions.

So I actually traced what was happening under the hood during a straightforward task: renaming a hook across three files.

This is what Claude Code ran:

Glob to find the files. Grep to filter which ones had the hook. Read on file one. Read on file two. Read on file three. Edit on file one. Read on file one again to verify. Edit on file two. Read on file two again to verify. Edit on file three. Read on file three to verify.

That's 11 calls. For one rename.

The part that surprised me most: each call re-ingests the output of everything before it as input tokens. So by call 11 you're paying input cost on the entire session history. The slowness wasn't latency. It was context ballooning across a dozen roundtrips.

The fix is simple in theory. Batch the reads into one discovery call. Batch the writes into one edit call. Same outcome, fraction of the roundtrips.
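As a plain-Python illustration of the batching idea (not how Claude Code or any plugin actually implements its tools), the eleven steps collapse into two functions. The names `discover` and `rename_everywhere` are hypothetical, and the rename is a naive string replace that would also hit partial matches.

```python
from pathlib import Path

def discover(root: Path, pattern: str, needle: str) -> dict[Path, str]:
    """One discovery step: glob, filter, and read in a single pass."""
    hits: dict[Path, str] = {}
    for path in sorted(root.rglob(pattern)):
        if path.is_file():
            text = path.read_text(encoding="utf-8")
            if needle in text:
                hits[path] = text
    return hits

def rename_everywhere(files: dict[Path, str], old: str, new: str) -> int:
    """One edit step: apply the rename to every file and verify each write."""
    verified = 0
    for path, text in files.items():
        path.write_text(text.replace(old, new), encoding="utf-8")
        # Re-read to confirm the old name is gone, mirroring the verify reads.
        if old not in path.read_text(encoding="utf-8"):
            verified += 1
    return verified
```

Usage would be `rename_everywhere(discover(Path("src"), "*.ts", "useOldHook"), "useOldHook", "useNewHook")`, where the directory, glob, and hook names are placeholders.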

I ended up switching to a Claude Code plugin called WOZCODE that handles exactly this. Cut the same task from 11 calls down to 2. But even without it, just knowing what's happening under the hood changed how I think about structuring tasks.

Curious if anyone else has dug into their call counts or found other ways to reduce roundtrips.

https://reddit.com/link/1tdfqik/video/u27qzhjjz61h1/player

reddit.com
u/ChampionshipNo2815 — 9 days ago
▲ 8 r/WOZCODE+3 crossposts

I had no idea how much I was actually spending on Claude Code until I ran one command

You know that feeling when you open a SaaS bill and it's way higher than you expected? I had that with my AI API costs last month.

The frustrating thing was I couldn't even explain why. I was just... using Claude Code, building stuff, and somewhere tokens were piling up. I had no visibility into whether a single session cost me $0.10 or $2.

So I started digging into what was actually happening under the hood.

Turns out the problem isn't how much you're spending. It's that Claude Code sends a new API call for almost every single action. Context loading, tool calls, follow-ups: it all adds up in ways that are completely invisible to you while you're working.

Once I understood that, batching the calls was obvious. But the part that surprised me was realizing I had no idea what I'd been spending before I fixed it. No baseline. No comparison. Just vibes.

The thing I wish existed from day one: a simple command that shows you, for the current session vs lifetime, how many calls were made, how many tokens moved, and what it translated to in dollars. Real numbers from your actual usage, not estimates.
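A minimal sketch of what that summary could compute, assuming you log one record per API call with its input and output token counts. The `Usage` record, the `summarize` function, and the per-million-token prices are all placeholders I made up, so plug in your provider's real pricing.

```python
from dataclasses import dataclass

@dataclass
class Usage:
    """One logged API call (hypothetical record shape)."""
    session_id: str
    input_tokens: int
    output_tokens: int

def summarize(records: list[Usage], input_price_per_mtok: float,
              output_price_per_mtok: float) -> dict[str, float]:
    """Call count, token totals, and dollar cost for a set of records."""
    tokens_in = sum(r.input_tokens for r in records)
    tokens_out = sum(r.output_tokens for r in records)
    cost = (tokens_in / 1e6 * input_price_per_mtok
            + tokens_out / 1e6 * output_price_per_mtok)
    return {
        "calls": len(records),
        "input_tokens": tokens_in,
        "output_tokens": tokens_out,
        "cost_usd": round(cost, 4),
    }

# Hypothetical log and placeholder prices (USD per million tokens).
LOG = [
    Usage("s1", 1_000_000, 100_000),
    Usage("s1", 500_000, 50_000),
    Usage("s2", 500_000, 50_000),
]
session = summarize([r for r in LOG if r.session_id == "s1"], 2.0, 10.0)
lifetime = summarize(LOG, 2.0, 10.0)
print("session:", session)
print("lifetime:", lifetime)
```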

For anyone else building on Claude Code, do you actually track what you spend per session? Curious if this is just me or if everyone's flying blind here.

u/ChampionshipNo2815 — 10 days ago
▲ 4 r/WOZCODE+1 crossposts

I had no idea how much I was actually spending on Claude Code until I ran this benchmark

Been using Claude Code pretty heavily for the past few months. Bug fixes, refactors, shipping features, the whole thing. At some point I realized I had zero idea what each session was actually costing me. I knew it was adding up but I could not see the breakdown anywhere. Tokens, turns, money per task: all just running in the background.

So we built something to fix that. It is called WozCode and it runs as a plugin directly inside Claude Code. No separate tool, no demo repo, no fake workload.

You run /woz-benchmark inside your actual codebase, pick a real task you would normally do, and it shows you exactly what that task cost: tokens used, turns taken, total spend. Then it shows you what the same task costs with WozCode optimizing the session.

https://reddit.com/link/1tbdr1i/video/mdcs6n9qnr0h1/player

The numbers were honestly kind of embarrassing when I first saw them. Not in a catastrophic way but in a "I cannot believe I did not know this" way.

If you are on Claude Code and curious what your workflow is actually costing you, just run it in your own repo. Takes 30 seconds to set up and you do not need to create an account to start.

reddit.com
u/ChampionshipNo2815 — 11 days ago
▲ 5 r/TestFlight+4 crossposts

Dino Initiative

This is Dino Initiative

We have 258k followers on Instagram & 95k on TikTok

We want to make this a global brand like Hello Kitty.

This is a mental health app that’s more like your companion.

Please leave any feedback you have.

testflight.apple.com
u/ChampionshipNo2815 — 8 days ago
▲ 3 r/WOZCODE+1 crossposts

Getting started with WOZCODE takes 30 seconds.

https://reddit.com/link/1taas41/video/6xwo0uskqj0h1/player

If you're already on Claude Code, open your terminal and run:

claude plugin marketplace add WithWoz/wozcode-plugin

claude plugin install woz@wozcode-marketplace

Launch Claude Code, type /woz-login, authenticate in the browser, and you're live.

The only prerequisite is an active Claude subscription. No extra setup. No new workflow to learn. WOZCODE runs directly inside the environment you're already using.

From your very first session, every token used, every dollar spent, and every minute saved gets tracked automatically on your dashboard.

That's it. Two commands and you're in.

wozcode.com

reddit.com
u/ChampionshipNo2815 — 12 days ago
▲ 36 r/replit

I was invited to Replit HQ for lunch and I’m totally surprised

I was invited to Replit HQ for lunch by Zheng, who’s the agent creator, and I was totally surprised to see that one of the apps I built was on their office wall. I didn’t even know I was one of the finalists. That was a really sweet gesture by the team and it honestly made my day. They are really amazing. I also had really good conversations with Milos, who’s the Replit mobile app builder. It was a really amazing experience.

u/ChampionshipNo2815 — 16 days ago

I was at the Code for Claude event yesterday and there were two guys outside holding WOZCODE signs talking to people about token costs.

I was surprised to see that a bunch of founders and engineers kept stopping to talk to them.

One guy said his company burned through like $1M just on tokens.

Security eventually came over because a crowd started forming outside the venue 😭

AI companies are really fighting token wars in public now.

u/ChampionshipNo2815 — 16 days ago
▲ 10 r/WOZCODE+2 crossposts

We pulled up outside the Anthropic event with WOZCODE signs and somehow ended up in the middle of a token-cost therapy session.

People kept stopping to talk to us, and almost everyone had the same complaint: AI bills are getting insane.

One company told us they’ve burned $1M just on tokens.

Security eventually came over because too many people were gathering around us, which honestly made the whole thing even funnier.

Peak SF moment: standing outside an AI event, talking to founders about token spend, while holding signs about making AI cheaper.

Token wars are getting real.

u/ChampionshipNo2815 — 16 days ago
▲ 11 r/AIDiscussion+5 crossposts

I was just looking over some benchmarks of WozCode vs vanilla Claude Opus, and it’s honestly wild how much money we are actually wasting. If you look at the raw stats, WozCode is like 67% cheaper and finished the whole test suite way faster. I ran this benchmark on my current repo.

u/ChampionshipNo2815 — 24 days ago