u/Electrical-Ad-9808

Check it out if your OpenClaw pay-per-token cost is skyrocketing, too. 😄

I've been running a few OpenClaw instances on a Raspberry Pi at home, mostly as my assistants.

After Anthropic's April 4 policy change cut OpenClaw off from Claude Code subscriptions, the Pi started burning pay-per-token API rates for work it used to do under my subscription. The Agent SDK credit pool (the new mechanism) drained faster than I could keep up with.

So I wrote otterly -

npmjs.com/package/otterly

It's a small npm package that exposes the local Claude CLI as an OpenAI-compatible HTTP server on localhost:11434. Anything that speaks the OpenAI chat completions API (Cursor, Aider, Continue, OpenClaw, plain curl) can point at it. Every request is billed to the Claude Code subscription you're already paying for.

The OpenClaw recipe is documented in the README. There's also an FAQ on the website that's honest about what otterly is and isn't, particularly around self-hosting, team use, and Anthropic's terms.

I'm posting this because OpenClaw users on Pro and Max plans are still paying the second bill needlessly. If that's you, this could be the solution

What's janky (because you'll find out anyway)

It piggybacks on your authenticated Claude Code session. That means it shares Claude Code's rate limits. If you hit the 5-hour cap in Claude Code, Otterly's down until it resets.
Some weird OpenAI params (logit_bias, n>1, seed) are stubbed.
First request after a long idle can be slow while the session re-auths. Subsequent requests are ~3–7ms overhead over raw API.
Single user, your machine, your subscription. Don't expose it publicly. Don't resell. Don't be that person.

I got tired of paying for Claude twice. I use Claude Code for all my coding needs. I also have a bunch of side stuff - Open claw agents, batch scripts, an eval harness, and continue in my editor that all wanted to hit Sonnet. Every one of those was burning API credits in parallel with my subscription.

Same model, two bills, felt dumb.

Meanwhile, my local LLM rigs were sitting idle because, honestly, Sonnet smokes anything I can run on 24GB of VRAM for the kind of work I do. I didn't want a local model. I wanted local *plumbing* to a remote model I was already paying for.

So I built otterly

https://www.npmjs.com/package/otterly

What's janky (because you'll find out anyway)

It piggybacks on your authenticated Claude Code session. That means it shares Claude Code's rate limits. If you hit the 5-hour cap in Claude Code, Otterly's down until it resets.
Some weird OpenAI params (logit_bias, n>1, seed) are stubbed.
First request after a long idle can be slow while the session re-auths. Subsequent requests are ~3–7ms overhead over raw API.
Single user, your machine, your subscription. Don't expose it publicly. Don't resell. Don't be that person.

Use Claude Code subscription for OpenClaw again!

Use your Claude Code subscription as a local AI API. For free