
I built an open-source tool that finds reliability failures in multi-agent AI systems
I've been building multi-agent systems and kept hitting the same problem: every testing tool tests individual agents, but production failures happen in the interactions BETWEEN agents.
So I built swarm-test — it builds a NetworkX interaction graph of your agent system and runs 5 chaos engineering tests:
**Cascade Failure** — which agents bring down the whole system if they fail
**Context Leakage** — sensitive data crossing agent boundaries
**Intent Drift** — agents acting outside their role or being manipulated
**Collusion Detection** — agents bypassing safety rules together
**Blast Radius** — single points of failure and critical dependency paths
**Real results from my own 14-agent system:**
- 32 findings total
- 14 CRITICAL cascade failures (every agent had 92%+ blast radius)
- 3 agent collusion cliques communicating outside the orchestrator
- 13 privilege escalation paths
- 1 single point of failure affecting 92% of the system
**3-line API:**
Works with CrewAI today. LangGraph support coming in v0.2.
- PyPI: `pip install swarm-test`
- GitHub: https://github.com/surajkumar811/swarm-test
Open source, MIT licensed. Solo founder, built with Claude Code.
Would love feedback — what tests would YOU want for your multi-agent systems?