u/DullContribution3191

after almost getting a surprise bill i started logging every interaction by model and task type. ran this for 14 days on my telegram + discord agent

heartbeats (every 30 mins, 672 total)... 38% of my token usage. was running on opus. genuinely insane waste for a status ping

file reads and summaries... 29% of usage. also on opus. flash handles this identically

actual conversations where model quality mattered... 22% of usage

complex tasks where opus was genuinely better than flash... 11% of usage
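if you want to build a breakdown like this yourself, here's a minimal sketch of the tallying step. it assumes a hypothetical CSV log with `task_type`, `model` and `tokens` columns. that layout is made up for illustration and is not necessarily how the numbers above were logged:

```python
import csv
import io
from collections import Counter


def usage_share_by_task(log_file):
    """Return each task type's share of total tokens, as a fraction of 1."""
    totals = Counter()
    for row in csv.DictReader(log_file):
        totals[row["task_type"]] += int(row["tokens"])
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}  # empty log, nothing to report
    return {task: tokens / grand_total for task, tokens in totals.items()}


# hypothetical sample log, only here to show the expected shape
SAMPLE_LOG = """task_type,model,tokens
heartbeat,opus,380
file_read,opus,290
conversation,opus,220
complex,opus,110
"""

if __name__ == "__main__":
    shares = usage_share_by_task(io.StringIO(SAMPLE_LOG))
    for task, share in shares.items():
        print(f"{task}: {share:.0%}")
```

swap `io.StringIO(SAMPLE_LOG)` for an open file handle to run it on a real log.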

so 67% of my spend (the 38% on heartbeats plus the 29% on file reads) was on tasks where the cheapest model (v4 flash at $0.14/M) would have given identical quality to opus ($6.75/M effective after tokenizer)
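to see what that split does to the bill, here's a rough sketch of the arithmetic. it uses the shares and the two per-million prices quoted above, and it assumes (my simplification) that everything was billed at the opus rate before the switch and that one blended price per million tokens is a fair model of cost:

```python
# token shares from the 14-day log (fractions of total usage)
SHARES = {
    "heartbeats": 0.38,
    "file_reads": 0.29,
    "conversations": 0.22,
    "complex_tasks": 0.11,
}

# prices quoted in the post, in dollars per million tokens
OPUS_PRICE = 6.75
FLASH_PRICE = 0.14


def blended_price(on_flash):
    """Average $/M tokens when the task types in `on_flash` run on flash
    and everything else stays on opus."""
    total = 0.0
    for task, share in SHARES.items():
        price = FLASH_PRICE if task in on_flash else OPUS_PRICE
        total += share * price
    return total


def relative_cost(on_flash):
    """Cost as a fraction of the all-opus baseline (assumed starting point)."""
    return blended_price(on_flash) / OPUS_PRICE


if __name__ == "__main__":
    scenarios = [
        ("heartbeats + file reads on flash", {"heartbeats", "file_reads"}),
        ("everything except complex tasks on flash",
         {"heartbeats", "file_reads", "conversations"}),
    ]
    for label, tasks in scenarios:
        print(f"{label}: {relative_cost(tasks):.1%} of the all-opus cost")
```

real bills won't match this exactly since input and output tokens are priced differently, but it shows why moving the cheap task types matters so much more than tuning the hard ones.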

the fix... switch your primary model to deepseek/deepseek-v4-flash in your openclaw.json under agents.defaults.model.primary. then use /model anthropic/claude-opus-4-7 mid-session only when you actually need it for something hard. switches instantly, no restart, same session. type /model deepseek/deepseek-v4-flash when youre done with the hard part and go back to cheap
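if you'd rather script the config change than hand-edit it, something like this could work. it's a sketch that only relies on the key path named above (agents.defaults.model.primary). it assumes the file is plain JSON with no comments and that any existing `model` entry is an object, and the path you pass in is up to you:

```python
import json
from pathlib import Path


def set_primary_model(config, model):
    """Return a copy of `config` with agents.defaults.model.primary set."""
    updated = json.loads(json.dumps(config))  # deep copy of JSON-compatible data
    node = updated
    for key in ("agents", "defaults", "model"):
        node = node.setdefault(key, {})  # create missing levels as empty objects
    node["primary"] = model
    return updated


def update_config_file(path, model):
    """Rewrite the JSON config at `path` in place. Back the file up first."""
    path = Path(path)
    config = json.loads(path.read_text())
    updated = set_primary_model(config, model)
    path.write_text(json.dumps(updated, indent=2) + "\n")


if __name__ == "__main__":
    example = {"agents": {"defaults": {"model": {"primary": "anthropic/claude-opus-4-7"}}}}
    print(json.dumps(set_primary_model(example, "deepseek/deepseek-v4-flash"), indent=2))
```

`set_primary_model` leaves every other key alone, so the rest of your config survives the rewrite.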

went from ~$170/month to about $35 with this approach. the quality difference on heartbeats, file reads, and simple questions is genuinely zero

honestly the most frustrating part was spending 2 weeks manually logging everything just to find this out. i run my gmail agent on betterclaw free tier with BYOK and they recently added an update that shows exactly how your api key is spending per task which is genuinely a great update... caught my heartbeat waste there instantly instead of 2 weeks of manual tracking. but yeah switching your primary to flash and /model-ing up to opus only when needed is the move

reddit.com
u/DullContribution3191 — 15 days ago
▲ 58 r/ClaudeCowork+2 crossposts

opus 4.7 is now default on max and team. but claude code v2.1.100+ is silently burning 40% more tokens

two things happened this week that ppl need to know about

first... opus 4.7 is now the default model on max and team premium plans. xhigh effort level is the recommended setting for coding work. you can adjust with /effort in your session. this is genuinely a step up for agentic coding

but second... claude code v2.1.100+ added roughly 20,000 invisible tokens to every request. confirmed via proxy logs by multiple users. your quotas burn ~40% faster than expected. the current version is v2.1.126 (may 1) which fixed the opus 4.7 /context percentage bug but did NOT fix the underlying token inflation

the community is calling it "tokenocalypse" lol. workaround is downgrading to v2.1.34 or reinstalling via npm instead of native binary (the native binary has a TTL/cache regression that shrinks prompt-cache TTL from 60 mins to 5)

check your version with claude --version. if youre on v2.1.100+ and hitting limits faster than expected... this is probably why. not your prompts getting longer, the tool itself is using more tokens behind the scenes
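if you want to automate that check across a few machines, here's a small sketch. it assumes the output of `claude --version` contains an x.y.z number somewhere in it (the exact output format is an assumption), and it treats v2.1.100 as the first affected release, per the reports above:

```python
import re

# first release the post reports as affected
FIRST_AFFECTED = (2, 1, 100)


def parse_version(text):
    """Pull the first x.y.z version number out of a string, or None."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


def is_affected(version_output):
    """True if the reported version is at or past the first affected release."""
    version = parse_version(version_output)
    return version is not None and version >= FIRST_AFFECTED


if __name__ == "__main__":
    for sample in ("2.1.34", "2.1.100", "2.1.126"):
        print(sample, "affected" if is_affected(sample) else "ok")
```

comparing as integer tuples matters here: as plain strings, "2.1.34" sorts after "2.1.100", which would flag the safe version and clear the broken one.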

u/DullContribution3191 — 15 days ago
▲ 39 r/better_claw+2 crossposts

ok seeing a lot of "4.24 broke everything" posts today so heres what actually works because i went through this yesterday on 2 of my self-hosted instances before giving up and moving them to betterclaw

the problem: the postinstall-bundled-plugins.mjs script deletes files it shouldnt (#72042). your npm package ships with 4,116 js files. after the postinstall script runs only 2,499 remain. 1,617 files just gone. thats why you get ERR_MODULE_NOT_FOUND for restart-sentinel services codex/provider and a bunch of other modules
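a quick way to tell whether your install got hit is to count the js files that are actually on disk. this is a sketch: the 4,116 figure is the one quoted above, and the install directory is something you have to supply yourself (for a global npm install it's usually under whatever `npm root -g` prints, but check your own setup):

```python
import sys
from pathlib import Path

EXPECTED_JS_FILES = 4116  # count the post reports for the shipped package


def count_js_files(package_dir):
    """Count .js files anywhere under package_dir."""
    return sum(1 for path in Path(package_dir).rglob("*.js") if path.is_file())


def missing_js_files(package_dir, expected=EXPECTED_JS_FILES):
    """How many .js files short of `expected` the install is (0 if none)."""
    return max(0, expected - count_js_files(package_dir))


if __name__ == "__main__":
    if len(sys.argv) != 2:
        sys.exit("usage: python count_js.py <path-to-installed-openclaw-package>")
    found = count_js_files(sys.argv[1])
    print(f"{found} js files found, {missing_js_files(sys.argv[1])} short of {EXPECTED_JS_FILES}")
```

the expected count will differ between versions, so treat a mismatch as a hint to look closer and not as proof.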

this has been happening since at least march (#54790 same pattern on 3.24, #53818 on 3.22, #61787 on 4.3-4.5, #70343 on 4.21). its a recurring packaging bug that never gets properly fixed

what actually works:

step 1: stop your gateway completely first. this is important because if the gateway is running while you reinstall it creates hash mismatches that make everything worse

on mac: launchctl unload ~/Library/LaunchAgents/openclaw.launch-agent.plist

on linux systemd: systemctl stop openclaw or systemctl --user stop openclaw

manual: just kill the node process

step 2: fully remove the broken install

npm uninstall -g openclaw

npm cache clean --force

step 3: install 4.23 specifically (last stable version before the postinstall bug)

npm install -g openclaw@2026.4.23

step 4: restart your gateway

on mac: launchctl load ~/Library/LaunchAgents/openclaw.launch-agent.plist

on linux: systemctl start openclaw or openclaw gateway restart

step 5: verify

openclaw status --deep

your config, workspace, memory files, and cron jobs are all safe in ~/.openclaw/ which npm uninstall doesnt touch

if doctor loops forever:

dont run openclaw doctor --fix on a broken 4.24 install. the doctor itself depends on the missing dist files so it loops trying to repair something it cant load. uninstall completely first then reinstall 4.23

if youre on pnpm:

some people report that pnpm add -g openclaw@2026.4.24 works where npm doesnt because pnpm handles the dependency hoisting differently. worth trying if you want 4.24 features specifically. but honestly 4.23 is fine

if you dont want to deal with this ever again:

this is the 5th time in 2 months that an openclaw update has self-destructed during installation (3.22, 3.24, 4.3, 4.21, 4.24). the root cause is the postinstall script and nobody has fixed it properly. every version just patches the specific failure without fixing the architecture

i moved my production agents to betterclaw after this happened to me twice. they test updates before pushing them and i havent touched a terminal for agent maintenance since march. free plan to try. but even if you stay self-hosted: pin your version and never use @latest
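on the pinning point: if your installs are scripted, a tiny guard like this can refuse anything that isn't an exact version. it's a sketch with a deliberately simple idea of "exact" (dotted digits, optionally followed by a pre-release tag) and it doesn't try to cover every spec npm accepts:

```python
import re

# an exact version like 2026.4.23 or 2026.4.25-beta
EXACT_VERSION = re.compile(r"\d+(\.\d+)+(-[0-9A-Za-z.]+)?")


def is_pinned(spec):
    """True if an install spec like 'openclaw@2026.4.23' names one exact version."""
    name, sep, version = spec.rpartition("@")
    if not sep or not name:
        return False  # no version given at all, so npm would resolve to latest
    return EXACT_VERSION.fullmatch(version) is not None


if __name__ == "__main__":
    for spec in ("openclaw@2026.4.23", "openclaw@latest", "openclaw"):
        print(spec, "pinned" if is_pinned(spec) else "NOT pinned")
```

tags like `latest` and ranges like `^2026.4.23` both fail the check, which is the point: either one can pull in a new release without you asking for it.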

the 4.25-beta might fix this (the beta notes mention plugin install fixes) but its beta so dont run it on production

anyone find a different fix?? curious if theres a way to make 4.24 work without the pnpm workaround

u/DullContribution3191 — 10 days ago

spent 2 hours debugging a 400 error this morning before finding the answer buried in the migration docs. opus 4.7 removed temperature, top_p, and top_k as api parameters. any code that passes them now returns an error. no deprecation warning. no graceful fallback. just broken.

also thinking.budget_tokens is gone. if you were using that to cap reasoning costs, you need to switch to the effort level system (low, high, xhigh, max). and thinking.display now defaults to "omitted" which means your ui shows blank where reasoning used to be unless you explicitly request "summarized."

three breaking api changes in a single model release with no transition period. if you have production code calling claude, check your parameters today. the errors look like random api failures unless you know what changed.
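if you have a lot of call sites, one stopgap is to clean the request parameters in one place before they go out. this is a sketch that works on a plain dict of request parameters and makes no SDK calls. the field names are the ones listed above, it does not set an effort level for you (where that goes in the request isn't covered here), and you'd only want to apply it to requests that target opus 4.7:

```python
# parameters the post says opus 4.7 now rejects
REMOVED_TOP_LEVEL = ("temperature", "top_p", "top_k")


def strip_removed_params(params):
    """Return (cleaned_params, removed_names) without mutating the input."""
    cleaned = dict(params)
    removed = []
    for name in REMOVED_TOP_LEVEL:
        if name in cleaned:
            del cleaned[name]
            removed.append(name)

    thinking = cleaned.get("thinking")
    if isinstance(thinking, dict):
        thinking = dict(thinking)  # copy so the caller's dict is untouched
        if "budget_tokens" in thinking:
            del thinking["budget_tokens"]
            removed.append("thinking.budget_tokens")
        # ask for summaries explicitly, since the default is now "omitted"
        thinking.setdefault("display", "summarized")
        cleaned["thinking"] = thinking
    return cleaned, removed


if __name__ == "__main__":
    request = {"max_tokens": 1024, "temperature": 0.2, "thinking": {"budget_tokens": 2000}}
    cleaned, removed = strip_removed_params(request)
    print("sending:", cleaned)
    print("dropped:", removed)
```

logging the `removed` list is worth doing, since silently dropping `temperature` changes behavior for any code that depended on it.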

u/DullContribution3191 — 1 month ago