r/AI_Agents
Are you actually running AI agents in production? What’s failing the most?
I'm doing research into production AI agent systems and trying to separate real-world problems from demo-level success.
A lot of agent demos look impressive until they hit:
- long-running workflows
- inconsistent tool outputs
- permission boundaries
- retries/recovery
- memory drift
- context loss
- hidden hallucinations
- orchestration complexity
What surprised me is that the actual “reasoning” often isn’t the biggest problem.
The bigger issues seem to be:
- reliability
- state management
- workflow continuity
- evaluation/testing
- governance
- infrastructure costs
For people actually running agents in production (or even serious internal tooling):
- what stack are you using?
- what works better than expected?
- what constantly breaks?
- what problem became bigger than you originally thought?
Especially curious about:
- memory systems
- multi-agent coordination
- long-term context
- human approval flows
- observability/debugging
Would love to hear real experiences rather than hype.
Even failed experiments are useful.
u/Comfortable_Way8312 — 10 days ago