u/Specialist-Bee9801

AI chatbots can pass QA and still fail badly

AI chatbots can look perfectly fine during demos and QA testing — then fail once real users start interacting with them.

Some of the issues I kept seeing while stress-testing chatbot APIs and AI agents:

  • hidden instructions leaking
  • support bots inventing policies
  • tools/actions triggered unexpectedly
  • memory/context confusion between sessions
  • indirect prompt injection through retrieved content

The scary part is that many of these systems still technically “work” while producing the wrong outcome for the business or customer.

That’s why I built PromptBrake.

It stress-tests the actual AI/chatbot endpoint companies ship using repeatable adversarial scenarios to help catch risky behavior before deployment.

I also recently added a self-hosted deployment option so teams can run scans inside their own infrastructure without sending prompts, responses, or internal workflows to a third party.

I recorded a short demo showing a real chatbot API scan here: YouTube demo

Would genuinely love feedback from others building AI products:

  • Are you testing chatbot behavior before launch?
  • Are teams around you asking for self-hosted AI tooling yet?
  • What’s the hardest part of validating AI agent behavior today?
reddit.com
u/Specialist-Bee9801 — 6 days ago

The biggest AI agent risk we found wasn’t hallucinations

We’ve been spending a lot of time stress-testing AI agents lately, and honestly, the biggest surprise wasn’t hallucinations.

It was how easy it was to change an agent’s behavior through completely normal-looking conversations.

A few things we kept running into:

  • agents leaking hidden instructions
  • support bots making up policies that don’t exist
  • tools getting triggered in ways they shouldn’t
  • memory/context getting mixed between sessions
  • prompts that looked “safe” failing after a few back-and-forth messages
  • indirect prompt injection through retrieved content

What really stood out was that most of these systems actually looked fine during normal QA/testing.

The issues only started showing up once we interacted with them more like real users… or people intentionally trying to push boundaries.

Feels like a lot of teams are still mainly testing:
“Does the agent work?”

instead of:
“How does the agent behave when things get weird?”

Curious how others here are approaching testing before deployment.

Are you mostly doing:

  • manual testing?
  • adversarial prompting?
  • eval pipelines?
  • simulations?
  • red teaming?
reddit.com
u/Specialist-Bee9801 — 8 days ago

We added a self-hosted mode to PromptBrake

A lot of people liked PromptBrake but didn’t want AI traffic or prompts leaving their environment.

Especially teams working on internal copilots or customer AI systems.

So we spent the last few weeks building a self-hosted deployment mode.

Now PromptBrake can run locally using a lightweight Docker runner, while scans remain within your infrastructure.

Honestly, the hardest part wasn’t Docker — it was keeping the product simple without turning it into bloated enterprise software.

We still iterating on this. How do other teams here handle private AI testing or internal AI security workflows?

reddit.com
u/Specialist-Bee9801 — 10 days ago

We added a self-hosted deployment mode to PromptBrake

One thing we kept hearing while building PromptBrake was:

“Can we run this inside our own infrastructure?”

Especially for teams working with internal copilots, private LLMs, customer support AI, or sensitive prompts.

So we added a new self-hosted Enterprise deployment mode.

PromptBrake can now run locally with a lightweight Docker-based runner, keeping prompts, scan traffic, and AI interactions within your environment.

The interesting part for us wasn’t just the deployment itself — it was figuring out how to keep the product lightweight/simple while still supporting private infrastructure workflows.

Curious how others here are approaching:

  • self-hosted AI tooling
  • private AI testing
  • internal AI security workflows
  • enterprise deployment friction

Would love feedback from people building in this space.

reddit.com
u/Specialist-Bee9801 — 10 days ago

Howdy all 👋

I relaunched PromptBrake on Product Hunt a while ago, but it didn't get much traction.

Would really appreciate some honest feedback on the page:
https://www.producthunt.com/products/promptbrake/launches/promptbrake-2

It's a tool for testing AI APIs before release (prompt injection, data leaks, unsafe tool behavior, etc).

I simplified the messaging this time to focus on the "before you ship" use case, but the value isn't coming across clearly.

Curious what you think:

  • Is the problem clear?
  • Does this feel useful or too niche?
  • What would make you actually try it?

Appreciate any blunt feedback — trying to figure out what's not clicking 🙏

u/Specialist-Bee9801 — 19 days ago

Hey — would really appreciate some honest feedback on something I’ve been building.

It’s called PromptBrake. The idea is pretty simple: before you ship an AI-powered API, you run a scan and see if it breaks in obvious ways (prompt injection, data leaks, unsafe tool behavior, etc).

You just point it at your endpoint (OpenAI, Claude, Gemini, or your own), run a scan, and it shows:

  • what failed
  • why it failed
  • and how to fix it

Right now, there’s a free trial on both plans — you can run scans without adding a credit card or committing to anything.

I’m mainly trying to understand:

  • does this feel useful?
  • is the value clear?
  • anything confusing or missing?

Happy to share the link if anyone wants to try it — just didn’t want this to come off as spam.

Appreciate any feedback 🙏

reddit.com
u/Specialist-Bee9801 — 19 days ago

I’m building PromptBrake, a pre-release security scanner for LLM-powered API endpoints.

It runs repeatable scans against your real endpoint to catch issues like prompt injection, data leaks, unsafe tool use, schema/output failures, and other risky behavior before production.

I’m looking for feedback from people building AI apps with OpenAI, Claude, Gemini, RAG, agents, or custom LLM APIs.

Specific feedback I’d value:

  • Is the value clear from the homepage?
  • Would you trust this with a staging/dev endpoint?
  • Are the findings and remediation guidance useful?
  • What would stop you from trying it?

There’s a free trial and demo here:
https://promptbrake.com

reddit.com
u/Specialist-Bee9801 — 24 days ago

What it is:
PromptBrake is a pre-release security testing tool for LLM-powered APIs. It runs attack scenarios against the endpoint you actually ship, and reports PASS/WARN/FAIL findings on issues such as prompt injection, system prompt leakage, cross-user data leakage, unsafe tool use, sensitive data echo, and schema/output bypasses.

who it’s for:
Teams building products on top of OpenAI, Claude, Gemini, or custom LLM-backed API endpoints. The main use case is checking an AI feature before launch, after prompt/model/tool changes, or as part of a release gate.

What I need help with:
I’m looking for blunt feedback on positioning and usefulness:

  1. Is “pre-release security testing for LLM APIs” clear, or would you describe this differently?
  2. If you ship AI features, would you run a tool like this before production?
  3. Which finding would matter most to you: prompt injection, data leakage, tool abuse, or output/schema bypass?
  4. Does the product feel too security-team-focused, or is it useful for normal product/engineering teams too?
  5. What would you need to trust the scan results?

link:
https://promptbrake.com

reddit.com
u/Specialist-Bee9801 — 25 days ago