r/Agentic_Coding

I made an evidence-gate workflow for coding agents — Codex + Claude Code support

I’m the maker of Superloopy, an MIT-licensed workflow layer for coding agents.

The pattern I’m trying to make practical is an “evidence gate” before an agent claims a task is done:

  1. turn the task into explicit acceptance criteria

  2. ask the agent to leave receipts under `.superloopy/evidence/`

  3. use command-backed checks where possible, not just prose summaries

  4. keep manual/visual proof separate from deterministic proof

  5. finish with a report that says what passed, what still needs judgment, and where the artifacts are
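The five steps above can be sketched roughly as follows. This is a minimal illustration in Python, not Superloopy's actual implementation: the `Criterion` class, the `run_gate` function, the receipt file naming, and the report keys are all hypothetical names chosen for the example. Only the `.superloopy/evidence/` directory comes from the post.

```python
import json
import subprocess
from dataclasses import dataclass
from pathlib import Path
from typing import List, Optional


@dataclass
class Criterion:
    """One acceptance criterion (hypothetical shape, not Superloopy's schema)."""
    name: str
    # A command makes the criterion deterministic; None means it needs
    # manual or visual proof and is reported separately.
    command: Optional[List[str]] = None


def run_gate(criteria: List[Criterion], evidence_dir: Path) -> dict:
    """Run command-backed checks, write one receipt per check, build a report."""
    evidence_dir.mkdir(parents=True, exist_ok=True)
    report = {"passed": [], "failed": [], "needs_judgment": [], "artifacts": []}

    for index, criterion in enumerate(criteria):
        if criterion.command is None:
            # Keep manual/visual proof apart from deterministic proof.
            report["needs_judgment"].append(criterion.name)
            continue

        result = subprocess.run(criterion.command, capture_output=True, text=True)

        # Leave a receipt on disk so the claim can be audited later.
        receipt = evidence_dir / f"{index:02d}-receipt.json"
        receipt.write_text(json.dumps({
            "criterion": criterion.name,
            "command": criterion.command,
            "exit_code": result.returncode,
            "stdout": result.stdout,
            "stderr": result.stderr,
        }, indent=2))
        report["artifacts"].append(str(receipt))

        bucket = "passed" if result.returncode == 0 else "failed"
        report[bucket].append(criterion.name)

    # The gate opens only when no deterministic check failed; items in
    # "needs_judgment" are still left for a human to review.
    report["gate_open"] = not report["failed"]
    return report
```

Calling `run_gate(criteria, Path(".superloopy/evidence"))` with a mix of command-backed and manual criteria returns a report that says what passed, what failed, what still needs judgment, and where the receipts are.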

It now works with both Codex and Claude Code. The implementation is intentionally thin: plugin hooks, skills/subagents, and a small CLI around evidence + final gates. The goal is not to create another agent, but to make existing coding agents easier to audit when they say “done.”

Repo:

https://github.com/beefiker/superloopy

I’m curious how other people building with coding agents structure this. Do you keep evidence artifacts? Do you require tests/screenshots/logs before accepting work? Or does that add too much ceremony for your workflow?

u/Simple_Somewhere7662 — 15 hours ago

I built a Codex session review app using Codex. How are you tracking your AI coding workflows?

I built a small free macOS tool for reviewing Codex sessions using the Codex desktop app. Are people here using anything similar to improve their AI coding workflows?

After longer Codex runs, I kept finding that the transcript was technically available, but hard to review.

The things I wanted to inspect were:

- What changed

- Which files were touched

- Where tokens went

- Which tool calls mattered

- Whether the prompt/context was good enough to reuse

- What context would be useful to share during code review

So I made BuildrAI, a local-first app that turns Codex session artifacts into timelines, token usage, prompt/session evaluation, changed-file context, and shareable reports.

I’m curious how other people are handling this.

Do you review Codex sessions after the fact, or do you mostly trust the final diff?

u/michaliskarag — 1 day ago

What are your most useful agent hooks?

I started using the stop hook in most of my projects, so I don't need to trust the agent to "remember" that it has to validate its changes.
My hook uses git to check whether any files have changed. If so, it runs a few scripts, and if any of them fail, their output is piped back to the agent.

I always had git hooks in place that would run type checks, linting and unit tests (only the ones related to the changes), but in my current workflow my agents don't commit themselves.
Usually I do the commits myself, and I often got annoyed when broken tests or linting issues were still left to fix, so I switched to agent hooks that run pretty much what I have in my lint-staged config.
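A stop hook along these lines might look like the sketch below. It is an illustration, not the actual hook from this post: the commands in `CHECKS` are placeholders (the real scripts are not listed), and the function names are made up for the example. It also assumes the agent treats exit code 2 as "blocked, feed stderr back to the model", which is how Claude Code documents its hooks; other agents may use a different convention.

```python
import subprocess
import sys
from typing import List, Sequence, Tuple

# Placeholder checks; swap in whatever your lint-staged config runs.
CHECKS: List[List[str]] = [
    ["npx", "tsc", "--noEmit"],
    ["npx", "oxlint"],
]

# Assumption: exit code 2 tells the agent to keep going and read stderr.
BLOCK_EXIT_CODE = 2


def changed_files(porcelain: str) -> List[str]:
    """Parse `git status --porcelain` output (two status chars, a space, a path)."""
    return [line[3:] for line in porcelain.splitlines() if line.strip()]


def run_checks(commands: Sequence[Sequence[str]]) -> List[Tuple[str, str]]:
    """Run each command and return (command, output) for every failure."""
    failures = []
    for command in commands:
        label = " ".join(command)
        try:
            result = subprocess.run(list(command), capture_output=True, text=True)
        except FileNotFoundError:
            # A missing tool counts as a failed check rather than a crash.
            failures.append((label, "command not found"))
            continue
        if result.returncode != 0:
            failures.append((label, result.stdout + result.stderr))
    return failures


def main(commands: Sequence[Sequence[str]] = CHECKS) -> int:
    status = subprocess.run(
        ["git", "status", "--porcelain"], capture_output=True, text=True
    )
    if not changed_files(status.stdout):
        return 0  # Clean tree: nothing to validate, let the agent stop.

    failures = run_checks(commands)
    for label, output in failures:
        # stderr is what gets piped back to the agent.
        print(f"check failed: {label}\n{output}", file=sys.stderr)
    return BLOCK_EXIT_CODE if failures else 0
```

To use it as a hook script, end the file with `sys.exit(main())` and register the script under your agent's stop hook.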

Some downsides to this approach:
You need to keep the hook in mind and avoid asking the agent questions about a codebase that is currently in a dirty state; otherwise it will start cleaning up the issues after it answers you.

This isn't watertight either. I once had a run where Sonnet 4.6 couldn't fix a linting issue, so it changed my oxlint config instead.

---
Are you using agent hooks?
Which ones do you find most useful?

u/T4212 — 2 days ago