u/Adr-740

90% of LLM classification calls are unnecessary - we measured it and built a drop-in fix (open source)
▲ 27 r/AIDeveloperNews+1 crossposts

90% of LLM classification calls are unnecessary - we measured it and built a drop-in fix (open source)

I kept running into the same pattern in production:

LLMs being used for things like:

- intent detection

- tagging

- moderation

…but most of those calls are actually very simple.

So I tested it.

On a standard benchmark (Banking77):

→ ~90%+ of inputs can be handled by a lightweight ML model

→ while keeping ~95% agreement with the LLM

Built a small library around that idea:

→ It learns from your LLM outputs

→ routes “easy” cases to a cheap model

→ keeps hard ones on the LLM

→ with a guarantee on quality (you set the threshold)

Result:

massive cost reduction without noticeable degradation

Fully open-sourced here:

https://github.com/adrida/tracer

Would love feedback from people running high-volume LLM pipelines - curious if you’re seeing the same pattern.

u/Adr-740 — 2 days ago