LLM Guard has a 3.3% false positive rate. Arc Sentry has 0%. Here’s the full comparison.
LLM Guard is what most people reach for when they need prompt injection detection on self-hosted models. So I ran both on the same 130-prompt deployment benchmark with the same configuration.
Arc Sentry: 92% detection, 0% false positives.
LLM Guard: 70% detection, 3.3% false positives.
The false positive gap is the one that matters in production. A 3.3% FPR means your security layer blocks roughly 1 in 30 legitimate user requests. At any real traffic volume that’s a support nightmare.
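To put the false positive rate in concrete terms, here is a quick back-of-the-envelope calculation. It assumes the benchmark FPR carries over unchanged to production traffic, which is an assumption and not something the benchmark proves. The function name and the traffic volumes are made up for illustration.

```python
def expected_false_blocks(legit_requests: int, fpr: float) -> float:
    """Expected number of legitimate requests wrongly flagged at a given FPR."""
    return legit_requests * fpr


# Hypothetical daily volumes of legitimate traffic (illustrative only).
for volume in (1_000, 10_000, 100_000):
    blocked = round(expected_false_blocks(volume, 0.033))
    print(f"{volume:>7} legitimate requests/day -> ~{blocked} wrongly blocked")
```

At a 0% FPR the same calculation gives zero at every volume, which is why the gap grows with traffic.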
The architectural reason for the difference: LLM Guard uses a generic classifier trained on attack datasets. Arc Sentry calibrates on your actual deployment traffic. It learns what your users normally say, then flags prompts that push the model’s internal state away from that baseline. A prompt that looks suspicious to a generic classifier might be completely normal for your users — and Arc Sentry won’t flag it.
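The calibrate-then-flag idea can be sketched roughly like this. This is not Arc Sentry's actual API or algorithm: the class name, the `margin` parameter, and the "max warmup distance times a margin" threshold are all assumptions made for illustration, and the short lists of floats stand in for whatever internal-state vectors the real tool reads from the model.

```python
import math
from statistics import mean


class BaselineDetector:
    """Hypothetical illustration of baseline calibration, not Arc Sentry's API."""

    def __init__(self, margin: float = 1.5):
        self.margin = margin      # assumed slack factor on top of the warmup spread
        self.center = None
        self.threshold = None

    def calibrate(self, warmup_states):
        """Learn a baseline from state vectors of normal deployment prompts."""
        dims = len(warmup_states[0])
        # Per-dimension mean of the warmup vectors is the baseline center.
        self.center = [mean(s[i] for s in warmup_states) for i in range(dims)]
        # Threshold: farthest warmup point from the center, times the margin.
        distances = [math.dist(s, self.center) for s in warmup_states]
        self.threshold = max(distances) * self.margin

    def is_anomalous(self, state) -> bool:
        """Flag a prompt whose state sits farther from the baseline than allowed."""
        if self.center is None:
            raise RuntimeError("call calibrate() before is_anomalous()")
        return math.dist(state, self.center) > self.threshold


# Toy 2-D "states" standing in for real model activations.
warmup = [[1.0, 0.0], [0.9, 0.1], [1.1, -0.1], [1.0, 0.1]]
detector = BaselineDetector()
detector.calibrate(warmup)
print(detector.is_anomalous([1.0, 0.05]))   # close to the baseline
print(detector.is_anomalous([-3.0, 4.0]))   # far from the baseline
```

The point of the sketch is that the threshold comes from your own traffic, so a prompt is only unusual relative to what your users normally send, not relative to a generic attack dataset.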
Arc Sentry also caught Crescendo multi-turn attacks at Turn 2 with 75% confidence. LLM Guard caught 0 out of 8 turns.
Arc Sentry works on Mistral, Llama, and Qwen. It needs ~20 warmup prompts to calibrate. The whitebox layers need a GPU; the behavioral pre-filter runs on CPU.
GitHub: https://github.com/9hannahnine-jpg/arc-sentry
PyPI: https://pypi.org/project/arc-sentry/
If you’re using OpenAI, Anthropic, or any hosted API instead of self-hosting — Arc Gate is the proxy version. Same governance layer, no GPU required, one URL change.
https://github.com/9hannahnine-jpg/arc-gate — $29/month for production, 500 free requests to try it.