How do you know if an AI feature is actually profitable?
I’ve been digging into pricing for AI products and one thing keeps coming up.
A founder can see revenue in Stripe and model spend in the OpenAI dashboard, but still not know which feature or customer is killing the margin.
For example:
A user pays $49/mo.
They run a PDF Q&A workflow a lot.
Some runs retry.
Some answers fail.
Some files create huge context windows.
At the end of the month the OpenAI bill is higher than expected, but it’s hard to tell what actually caused it.
Tokens help, but they only tell you how much you spent, not whether that spend produced an answer the user could actually use. The thing I’d want to know is cost per usable outcome.
Like:
PDF answer delivered: $0.42
Failed/retried spend: 18%
Customer A exceeded expected usage
This workflow probably needs a soft cap or a different plan limit
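Here is a rough sketch of how those numbers could be computed, in Python. It assumes you already log one record per run with the customer, the model cost of that run, and whether it delivered a usable answer. The `Run`, `Report`, and `summarize` names and the $1.00 expected-spend threshold are made up for illustration, and the runs are toy data, not real figures.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass
class Run:
    customer: str    # who triggered the run
    cost_usd: float  # model spend attributed to this run (hypothetical field)
    delivered: bool  # True if the user got a usable answer; failed attempts are False


@dataclass
class Report:
    cost_per_outcome: Optional[float]  # None when nothing was delivered
    wasted_share: float                # fraction of spend on runs that delivered nothing
    over_budget: list[str]             # customers whose spend exceeded the expected amount


def summarize(runs: list[Run], expected_cost_per_customer: float) -> Report:
    total = sum(r.cost_usd for r in runs)
    delivered = sum(1 for r in runs if r.delivered)
    # Failed attempts (including ones that were later retried) count as wasted spend.
    wasted = sum(r.cost_usd for r in runs if not r.delivered)

    spend_by_customer: dict[str, float] = defaultdict(float)
    for r in runs:
        spend_by_customer[r.customer] += r.cost_usd

    return Report(
        # Total spend, including failures, divided by usable outcomes only.
        cost_per_outcome=total / delivered if delivered else None,
        wasted_share=wasted / total if total else 0.0,
        over_budget=sorted(
            c for c, s in spend_by_customer.items() if s > expected_cost_per_customer
        ),
    )


# Toy data: two customers, a few PDF Q&A runs, two failed attempts.
runs = [
    Run("A", 0.30, True),
    Run("A", 0.12, False),
    Run("A", 0.30, True),
    Run("A", 0.40, True),
    Run("B", 0.20, True),
    Run("B", 0.06, False),
    Run("B", 0.10, True),
]

report = summarize(runs, expected_cost_per_customer=1.00)
print(f"Cost per delivered answer: ${report.cost_per_outcome:.2f}")
print(f"Failed/retried spend: {report.wasted_share:.0%}")
print(f"Over expected usage: {report.over_budget}")
```

The point of dividing total spend by delivered outcomes, rather than by runs, is that failed and retried attempts get folded into the price of each usable answer, which is the number that has to fit inside the $49/mo.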
I’m trying to figure out if this is painful enough that people would pay for a small diagnostic before building a whole internal usage system.
If you run a paid AI feature, how are you tracking this today?
Do you look at tokens only, or do you track cost per workflow/outcome?
And would you pay someone to audit one workflow and tell you where the margin risk is?