u/CommunityThink3735

▲ 0 r/ollama

I'm building local AI apps as a side project and I keep hitting the same wall: how do you charge users per inference when the whole point is that the app runs on their hardware, offline?

The options I've found all have downsides:

  • License server pings — kills offline use, defeats the point
  • One-time license keys — fine for unlimited usage, useless for credit-based pricing
  • Cloud API in front of the local model — defeats the point even harder
  • Honor system + local counter — trivially bypassable

So I started prototyping a small library: server signs an Ed25519 credit bundle, ships it to the device, the device verifies the signature locally and decrements a counter per inference. No network call at inference time. Bundle expires, user tops up. The signing key never leaves the server, the verify code is open source.

It works as a v0.1 (Node + Python verifiers, hosted sign API), but before I build out the dashboard / Stripe / docs, I want to gut-check:

  1. If you already sell a local AI app — what are you actually doing for usage limits today? Even "we don't bother" is a useful answer.
  2. If you'd like to sell one but haven't shipped paid yet — is metering the blocker, or is it something else?
  3. Is the bearer-token weakness (user can copy the bundle to N devices) a dealbreaker for you, or acceptable with short expiry?

I'd rather hear "this isn't a real problem, just charge a flat license fee" now than ship a SaaS into a market that doesn't want it.

Happy to share the repo and design notes in comments if anyone wants to pick it apart — leaving the link out of the OP because I want answers about the problem, not opinions about my code.

reddit.com
u/CommunityThink3735 — 19 days ago