u/Turbulent-Tap6723

LLM Guard has a 3.3% false positive rate. Arc Sentry has 0%. Here’s the full comparison.

LLM Guard is what most people reach for when they need prompt injection detection on self-hosted models. So I ran both on the same 130-prompt deployment benchmark with the same configuration.

Arc Sentry: 92% detection, 0% false positives.
LLM Guard: 70% detection, 3.3% false positives.

The false positive gap is the one that matters in production. A 3.3% FPR means your security layer blocks roughly 1 in 30 legitimate user requests. At any real traffic volume that’s a support nightmare.
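To put that in numbers, here is a quick back-of-the-envelope sketch. The daily traffic volumes are made-up examples, not measurements, and it assumes the 3.3% benchmark rate carries over unchanged to your real traffic.

```python
def expected_false_positives(benign_requests: int, fpr: float) -> float:
    """Expected number of legitimate requests blocked at a given false positive rate."""
    if not 0.0 <= fpr <= 1.0:
        raise ValueError("fpr must be between 0 and 1")
    return benign_requests * fpr


# Hypothetical daily volumes of legitimate requests (illustrative only).
for daily in (1_000, 10_000, 100_000):
    blocked = expected_false_positives(daily, 0.033)
    print(f"{daily:>7} benign requests/day -> ~{blocked:.0f} blocked")
```

Even at the low end of those assumed volumes, that is dozens of legitimate users per day hitting a block they did nothing to earn.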

The architectural reason for the difference: LLM Guard uses a generic classifier trained on attack datasets. Arc Sentry calibrates on your actual deployment traffic. It learns what your users normally say, then flags prompts that push the model’s internal state away from that baseline. A prompt that looks suspicious to a generic classifier might be completely normal for your users — and Arc Sentry won’t flag it.
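Roughly, the calibrate-then-flag idea looks like the sketch below. This is not Arc Sentry's actual code or API: the `BaselineDetector` class, the threshold of 3.0, and the toy 3-number feature vectors are all assumptions for illustration. The real tool works on the model's internal state; here plain lists of floats stand in for that, and a simple z-score distance stands in for whatever metric it really uses.

```python
import math
from statistics import mean, pstdev


class BaselineDetector:
    """Hypothetical sketch: learn a per-feature baseline, flag large deviations."""

    def __init__(self, threshold: float = 3.0):
        self.threshold = threshold  # assumed cutoff, in baseline standard deviations
        self.means = []
        self.stds = []

    def calibrate(self, warmup_vectors):
        """Fit per-dimension mean and spread from normal (benign) traffic."""
        columns = list(zip(*warmup_vectors))
        self.means = [mean(col) for col in columns]
        # Floor the spread so a constant feature does not cause division by zero.
        self.stds = [max(pstdev(col), 1e-6) for col in columns]

    def score(self, vector):
        """Root-mean-square z-score of a vector relative to the baseline."""
        z = [(x - m) / s for x, m, s in zip(vector, self.means, self.stds)]
        return math.sqrt(sum(v * v for v in z) / len(z))

    def is_anomalous(self, vector):
        return self.score(vector) > self.threshold


# Toy stand-ins for internal-state features of 20 normal warmup prompts.
warmup = [[1.0 + 0.01 * i, 2.0 - 0.01 * i, 0.5 + 0.005 * (i % 4)] for i in range(20)]
detector = BaselineDetector()
detector.calibrate(warmup)

print(detector.is_anomalous([1.1, 1.9, 0.505]))  # close to the baseline
print(detector.is_anomalous([4.0, -1.0, 3.0]))   # far from the baseline
```

The point of the design is that "normal" is defined by the warmup data rather than by a fixed attack dataset, so the same prompt can score differently in two different deployments.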

Arc Sentry also caught Crescendo multi-turn attacks at Turn 2 with 75% confidence. LLM Guard flagged 0 out of 8 turns.

Works on Mistral, Llama, Qwen. ~20 warmup prompts to calibrate. GPU for whitebox layers, CPU for the behavioral pre-filter.

GitHub: https://github.com/9hannahnine-jpg/arc-sentry

PyPI: https://pypi.org/project/arc-sentry/

If you’re using OpenAI, Anthropic, or any other hosted API instead of self-hosting, Arc Gate is the proxy version. Same governance layer, no GPU required, one URL change.

https://github.com/9hannahnine-jpg/arc-gate — $29/month for production, 500 free requests to try it.
