u/SprinklesPutrid5892

Looking for real AI agent workflows to test a review evidence format

I’m looking for a few people who are already using AI agents or AI coding agents in real workflows.

I’m testing a lightweight review format for AI-assisted work: what the agent was asked to do, what it touched, what checks actually ran, what failed, what was only scaffolded, and what remains unverified.
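
As a rough illustration of that format, one packet could be modeled as a small record. This is only a sketch: the class name, field names, and sample values are hypothetical, not a fixed schema.

```python
from dataclasses import dataclass, field, asdict
import json


@dataclass
class ReviewPacket:
    """Hypothetical record for one unit of AI-assisted work."""
    task: str                                            # what the agent was asked to do
    touched: list[str] = field(default_factory=list)     # files, tickets, tools it touched
    checks_run: list[str] = field(default_factory=list)  # checks that actually ran
    failures: list[str] = field(default_factory=list)    # checks that failed
    scaffolded: list[str] = field(default_factory=list)  # stubbed, not really implemented
    unverified: list[str] = field(default_factory=list)  # claims nobody has confirmed

    def needs_attention(self) -> bool:
        # Anything failed, stubbed, or unverified deserves a human look.
        return bool(self.failures or self.scaffolded or self.unverified)


# Made-up example values, for illustration only.
packet = ReviewPacket(
    task="Add retry logic to the payment webhook handler",
    touched=["webhooks/payment.py", "tests/test_payment.py"],
    checks_run=["unit tests"],
    scaffolded=["backoff config loader"],
    unverified=["behavior under real provider timeouts"],
)
print(json.dumps(asdict(packet), indent=2))
print("needs attention:", packet.needs_attention())
```

Keeping "scaffolded" and "unverified" as separate lists is the point of the sketch: a reviewer can see what was never actually exercised without reading the full diff.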

Good fit:

  • AI coding PRs
  • support agents
  • ops/internal workflow agents
  • agents touching tools, tickets, repos, or customer-facing processes

I’m not trying to sell a control plane or another agent platform. I’m trying to understand what evidence humans need before trusting AI-assisted work.

If you have one real workflow, I’d be happy to help turn it into a review packet and learn from what breaks.

Happy to share more details if relevant.

u/SprinklesPutrid5892 — 3 days ago

How should teams review AI-assisted work before trusting it?

A lot of agent demos show the action.

Fewer show the review trail behind the action.

That is the gap I’ve been working on.

I built MindForge Guard as a CLI-first evidence layer for single-agent AI workflows. The idea is to turn an agent workflow into a report that a human can review, and to make that report deterministic: the same input should always produce the same report.

The report focuses on:

  • what the agent was asked to do
  • what scope it had
  • what evidence supports the action
  • what is missing
  • what risk/drift signals are visible
  • what still needs human review
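
For readers wondering what "deterministic" could mean in practice, here is a minimal sketch. It is not MindForge Guard's actual implementation: the function name and the evidence keys are assumptions made up for illustration. The report avoids timestamps and randomness, sorts keys, and includes a hash of the canonical input so two reviewers can confirm they are looking at the same evidence.

```python
import hashlib
import json


def render_report(evidence: dict) -> str:
    """Render a plain-text report that depends only on the evidence content."""
    # Canonical JSON: sorted keys and fixed separators, so input key order is irrelevant.
    canonical = json.dumps(evidence, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()

    lines = [f"evidence_sha256: {digest}"]
    for key in sorted(evidence):
        lines.append(f"{key}: {json.dumps(evidence[key], sort_keys=True)}")
    return "\n".join(lines)


# Hypothetical evidence, given in two different key orders.
a = render_report({"task": "rotate API keys", "scope": ["staging"]})
b = render_report({"scope": ["staging"], "task": "rotate API keys"})
print(a)
print("identical:", a == b)
```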

It is intentionally not an agent runtime, not an approval system, not a blocker, and not a control plane.

The goal is narrower: review before trust.

I’m looking for feedback from people building or operating agents:

Would you maintain an Evidence Pack for agent actions?
What evidence would make an agent workflow more reviewable?
Where does this break down?

u/SprinklesPutrid5892 — 9 days ago
▲ 6 r/AIAgentsInAction+2 crossposts

How should teams review AI-assisted work before trusting it?

One governance problem I’m seeing more often: AI-assisted work is becoming harder to review after the fact.

Not because the output is always bad, but because the surrounding evidence is fragmented.

For a single-agent workflow, reviewers often need to reconstruct:

  • what the agent was asked to do
  • what authority or scope it had
  • what tools/data it relied on
  • what evidence supports the result
  • what evidence is missing
  • whether the next decision still needs a human
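
The reconstruction list above can be read as a checklist. A minimal sketch of that idea, with hypothetical key names and a made-up example pack (not a real schema):

```python
# Hypothetical keys mirroring the reconstruction list; not a real schema.
REQUIRED_EVIDENCE = (
    "request",              # what the agent was asked to do
    "scope",                # what authority or scope it had
    "tools_and_data",       # what tools/data it relied on
    "supporting_evidence",  # what evidence supports the result
)


def review_gaps(pack: dict) -> list[str]:
    """Return the required keys that are absent or empty in the pack."""
    return [key for key in REQUIRED_EVIDENCE if not pack.get(key)]


pack = {
    "request": "Close duplicate support tickets older than 30 days",
    "scope": "read/write on the ticket queue, no customer email",
    "tools_and_data": ["ticket API"],
    "supporting_evidence": [],
}
gaps = review_gaps(pack)
print("missing evidence:", gaps)
# In this sketch, any gap means the next decision still needs a human.
print("needs human review:", bool(gaps))
```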

I’ve been building MindForge Guard around this narrow problem.

It takes an Evidence Pack and produces a deterministic governance report for human review.

It does not approve, block, deploy, certify, or act as a runtime control plane. The point is not automated enforcement. The point is review evidence before trust.

I’m doing a small soft launch and would genuinely appreciate critique from this community.

Questions I’m trying to pressure-test:

  1. Is “single-agent governance evidence” a useful category?
  2. Where would this fit in an enterprise review process?
  3. What evidence would you expect to see before trusting AI-assisted work?
  4. What should a tool like this absolutely not claim to do?

Link: https://mindforge.run

u/SprinklesPutrid5892 — 8 days ago