u/CheesecakeScared1807

I built an open-source tool that finds reliability failures in multi-agent AI systems

I built an open-source tool that finds reliability failures in multi-agent AI systems

I've been building multi-agent systems and kept hitting the same problem: every testing tool tests individual agents, but production failures happen in the interactions BETWEEN agents.

So I built swarm-test — it builds a NetworkX interaction graph of your agent system and runs 5 chaos engineering tests:

  1. **Cascade Failure** — which agents bring down the whole system if they fail

  2. **Context Leakage** — sensitive data crossing agent boundaries

  3. **Intent Drift** — agents acting outside their role or being manipulated

  4. **Collusion Detection** — agents bypassing safety rules together

  5. **Blast Radius** — single points of failure and critical dependency paths

**Real results from my own 14-agent system:**

- 32 findings total

- 14 CRITICAL cascade failures (every agent had 92%+ blast radius)

- 3 agent collusion cliques communicating outside the orchestrator

- 13 privilege escalation paths

- 1 single point of failure affecting 92% of the system

**3-line API:**

Works with CrewAI today. LangGraph support coming in v0.2.

- PyPI: `pip install swarm-test`

- GitHub: https://github.com/surajkumar811/swarm-test

Open source, MIT licensed. Solo founder, built with Claude Code.

Would love feedback — what tests would YOU want for your multi-agent systems?

https://preview.redd.it/itgxmlm3lr2h1.png?width=2920&format=png&auto=webp&s=9260c0c7934c69772a58f4c5ead22bb373ef4363

https://preview.redd.it/s4z3qp06lr2h1.png?width=2848&format=png&auto=webp&s=d5e12511e4b643491ed8991ebfe62cd258c875cd

reddit.com