I'm a CFO who built AI agents that replaced 80% of my monthly close variance analysis. AMA on the architecture.

15-year CFO here. Got tired of burning 3-5 days every month on variance narratives.

So I built an LLM agent that does it in minutes. Board-ready output. Not a dashboard. Not a summary. The actual narrative my board reads.

My team now does the 20% that requires judgment — interpreting what the board needs to hear, flagging one-time vs. structural variances, connecting numbers to operational drivers. The agent handles the other 80%.

Also built:

• Multi-agent commodity intelligence

• Automated compliance scanning that runs continuously instead of quarterly

• ERP-to-AI orchestration connecting Sage/NetSuite/SAP to agent-driven workflows

Key insight that most AI vendors get wrong: finance teams don't adopt tools built by people who've never closed a quarter. The trust gap isn't about the technology. It's about who built it and whether they understand the workflow.

I designed every tool around the actual process I ran for 15 years. Not what an engineer imagined a CFO does.

The tools are open source (48 repos) for the benefit of the CFO community. No self-promotion here! Happy to go deep on architecture decisions, what worked, what failed, and what I'd do differently.


u/Key_Cook_9770 — 8 days ago

▲ 1 r/mlscaling

[P] CHP: Open-source Consensus Hardening Protocol for preventing sycophantic convergence in multi-agent LLM systems

Repo: https://codeberg.org/cubiczan/consensus-hardening-protocol

**Problem:**

Multi-agent LLM systems converge on false consensus in 1-2 deliberation rounds. Same-model agents are particularly susceptible — cosine similarity between outputs exceeds 0.95 almost immediately, regardless of information diversity. This is well-documented in the CONSENSAGENT literature (ACL 2025) and the GroupDebate paper, but there's no standard protocol for preventing it in production deployments.

The root cause: LLM agents are trained to be agreeable. When you put multiple agreeable agents in a deliberation loop, they don't debate — they ratify.

**CHP Architecture:**

Structured state machine:

EXPLORING → ADVISORY_LOCK → PROVISIONAL_LOCK → LOCKED
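A minimal sketch of how that progression could be enforced in Python. The state names come from the line above; the `Deliberation` class, the `ALLOWED` table, and the rule that a deliberation may reset to EXPLORING from either intermediate lock are my assumptions for illustration, not the repo's actual code.

```python
from dataclasses import dataclass, field
from enum import Enum


class State(Enum):
    EXPLORING = "EXPLORING"
    ADVISORY_LOCK = "ADVISORY_LOCK"
    PROVISIONAL_LOCK = "PROVISIONAL_LOCK"
    LOCKED = "LOCKED"


# Assumed transition table: one step forward, or a reset back to EXPLORING.
# LOCKED is terminal.
ALLOWED = {
    State.EXPLORING: {State.ADVISORY_LOCK},
    State.ADVISORY_LOCK: {State.PROVISIONAL_LOCK, State.EXPLORING},
    State.PROVISIONAL_LOCK: {State.LOCKED, State.EXPLORING},
    State.LOCKED: set(),
}


@dataclass
class Deliberation:
    """Hypothetical holder for the current state plus an append-only audit log."""
    state: State = State.EXPLORING
    audit_log: list = field(default_factory=list)

    def transition(self, target: State, reason: str) -> None:
        # Reject anything not in the table so skipped steps cannot happen silently.
        if target not in ALLOWED[self.state]:
            raise ValueError(
                f"illegal transition {self.state.name} -> {target.name}"
            )
        # Record (from, to, reason) before changing state.
        self.audit_log.append((self.state.name, target.name, reason))
        self.state = target
```

Because every accepted move is appended to `audit_log` with its reason, the path to LOCKED can be replayed step by step afterwards.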

Key mechanisms:

• Foundation disclosure — agents must commit to their reasoning chain before seeing other agents' outputs. Prevents anchoring bias and information cascading.

• Adversarial attack — structurally enforced contrarian roles with logical proof requirements. Not soft prompting ("please consider alternatives") but hard architectural constraint (the adversarial agent must produce a logically valid counter-argument or the round fails).

• R0 gate — quantitative convergence scoring. If inter-agent agreement exceeds threshold before adversarial round completes, the consensus is flagged as potentially sycophantic and the deliberation resets.

• Cross-model payload envelopes — each agent's reasoning, model identity, confidence score, and dissent log are packaged in an auditable envelope.
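To make the R0 gate concrete, here is a rough sketch under stated assumptions. It scores agreement as the mean pairwise cosine similarity over simple word-count vectors, which is a stand-in for whatever embedding the real implementation uses. The function names, the "RESET"/"CONTINUE" return values, and the 0.95 default (borrowed from the similarity figure in the problem statement) are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def cosine(a: str, b: str) -> float:
    """Cosine similarity between bag-of-words count vectors of two texts."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[t] * vb[t] for t in va)
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mean_agreement(outputs: list[str]) -> float:
    """Average similarity over all unordered pairs of agent outputs."""
    pairs = list(combinations(outputs, 2))
    if not pairs:
        return 0.0
    return sum(cosine(a, b) for a, b in pairs) / len(pairs)


def r0_gate(outputs: list[str], adversarial_done: bool,
            threshold: float = 0.95) -> str:
    """Flag consensus that appears before the adversarial round has finished."""
    if not adversarial_done and mean_agreement(outputs) > threshold:
        return "RESET"      # suspiciously early agreement: restart deliberation
    return "CONTINUE"
```

A production gate would presumably use sentence embeddings rather than word counts, and calibrating the threshold is exactly the open question raised further down.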

Anti-sycophancy mitigations:

• Heterogeneous base models in specialist clusters (GPT-4o + Claude + DeepSeek)

• Independent parallel initialization

• Optimal Weighting: per-agent accuracy tracking

• GroupDebate subgroup partitioning — 51.7% token cost reduction while preserving accuracy
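One simple reading of per-agent accuracy tracking, sketched below: keep a running hit rate per agent and let it weight that agent's vote. The `AccuracyTracker` class, the Laplace smoothing, and the `weighted_vote` helper are assumptions for illustration. The actual weighting scheme in CHP may differ.

```python
from collections import defaultdict


class AccuracyTracker:
    """Hypothetical running record of how often each agent was right."""

    def __init__(self) -> None:
        self.correct = defaultdict(int)
        self.total = defaultdict(int)

    def record(self, agent: str, was_correct: bool) -> None:
        self.total[agent] += 1
        self.correct[agent] += int(was_correct)

    def weight(self, agent: str) -> float:
        # Laplace smoothing: an agent with no history gets a neutral 0.5.
        return (self.correct[agent] + 1) / (self.total[agent] + 2)


def weighted_vote(votes: dict[str, str], tracker: AccuracyTracker) -> str:
    """Return the answer with the highest total weight across agents."""
    scores = defaultdict(float)
    for agent, answer in votes.items():
        scores[answer] += tracker.weight(agent)
    return max(scores, key=scores.get)
```

With this scheme one agent with a strong track record can outvote two weak ones, which is the point: agreement alone does not win.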

**Production deployment:**

CHP is running in production across finance AI tools:

• LLM-based CFO variance analysis (single-agent, CHP validates output quality)

• Multi-agent commodity intelligence across lithium/nickel/cobalt markets (multi-agent, CHP governs inter-agent consensus)

• CHP-hardened institutional research over AlphaVantage fundamentals + FRED macro panel

Not theoretical — shipped.

**Design decisions:**

I chose a state machine over a probabilistic framework because enterprise compliance teams need deterministic audit trails, not probability distributions. The state progression is inspectable: you can see exactly when each agent committed, what evidence the adversarial agent produced, and why the consensus was accepted or rejected.
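For readers who want to see what an inspectable record might look like, here is a sketch of an auditable envelope. The four payload fields follow the envelope description in the mechanisms list. The field names, the SHA-256 digest, and the `verify` helper are assumptions added for illustration and are not taken from the repo.

```python
import hashlib
import json
from dataclasses import asdict, dataclass, replace


@dataclass(frozen=True)
class Envelope:
    """Hypothetical immutable record of one agent's contribution."""
    agent_id: str
    model: str
    reasoning: str
    confidence: float
    dissent_log: tuple = ()

    def to_json(self) -> str:
        # sort_keys gives a stable serialization, so the digest is reproducible.
        return json.dumps(asdict(self), sort_keys=True)

    def digest(self) -> str:
        return hashlib.sha256(self.to_json().encode("utf-8")).hexdigest()


def verify(envelope: Envelope, expected_digest: str) -> bool:
    """True if the envelope still matches the digest recorded at commit time."""
    return envelope.digest() == expected_digest


if __name__ == "__main__":
    env = Envelope("analyst-1", "model-a", "Margin drop is one-time.", 0.8,
                   ("agent-2 disputes the freight assumption",))
    stamp = env.digest()
    tampered = replace(env, confidence=0.99)
    print(verify(env, stamp), verify(tampered, stamp))
```

Storing the digest alongside each state transition would let an auditor confirm that the reasoning on file is the reasoning the agent actually committed.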

Framework-agnostic. Integrates via standard chat-completion APIs.

Looking for feedback on the R0 gate calibration methodology and the adversarial role prompting architecture. Both are areas where I think the community could improve on what I've built.

u/Key_Cook_9770 — 8 days ago

▲ 2 r/AiAutomations+1 crossposts

Building an adversarial consensus protocol for multi-agent AI systems. The idea: instead of just averaging agent outputs (which invites groupthink), run them through attack/defense rounds where agents try to break each other's reasoning before reaching a hardened consensus. Includes foundation disclosure (what does each agent actually know?) and a gate that rejects early consensus to force deeper exploration.
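One way the foundation-disclosure step could be enforced is a commit-then-reveal round, sketched here. Each agent first submits a hash of its reasoning, and nothing is revealed until every agent has committed. The `DisclosureRound` class, the salt handling, and the error types are all hypothetical. The repo may implement this differently.

```python
import hashlib


def commit(reasoning: str, salt: str) -> str:
    """Digest an agent publishes before anyone's reasoning is visible."""
    return hashlib.sha256((salt + reasoning).encode("utf-8")).hexdigest()


class DisclosureRound:
    """Hypothetical round that blocks reveals until all agents have committed."""

    def __init__(self, agents):
        self.agents = set(agents)
        self.commitments = {}
        self.revealed = {}

    def submit_commitment(self, agent: str, digest: str) -> None:
        if agent not in self.agents:
            raise KeyError(agent)
        if agent in self.commitments:
            raise ValueError("agent already committed")
        self.commitments[agent] = digest

    def reveal(self, agent: str, reasoning: str, salt: str) -> None:
        # No agent sees another's reasoning until every commitment is in.
        if set(self.commitments) != self.agents:
            raise RuntimeError("all agents must commit before any reveal")
        # A changed story no longer matches the digest and is rejected.
        if commit(reasoning, salt) != self.commitments[agent]:
            raise ValueError("reveal does not match commitment")
        self.revealed[agent] = reasoning
```

The second check is what stops an agent from quietly rewriting its position after reading a peer's output.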

https://github.com/Cubiczan/consensus-hardening-protocol

Would love feedback from people building multi-agent systems.


u/Key_Cook_9770 — 21 days ago

▲ 0 r/CFO

If you're a VP of Finance/CFO dealing with multi-stakeholder decisions or managing finance ops at scale, I built a couple of tools you might find useful:

consensus-hardening-protocol

Decision governance layer for high-stakes CFO workflows where a single AI answer isn't good enough. It coordinates multiple specialized agents (finance, strategy, compliance) through a structured consensus process with:

• Foundation disclosure + adversarial attack: every recommendation gets stress-tested before it reaches you
• Cross-model validation: packets enforce payload integrity across different LLMs so you're not locked into one provider
• Auditable state progression: EXPLORING → PROVISIONAL_LOCK → LOCKED with third-party validation gates
• Built-in CFO workflow suite: variance analysis, 13-week cash forecast, SaaS model, board reporting, AP optimizer, all with mandatory verification floors

The framework runs locally, outputs Markdown/JSON/Excel artifacts, and enforces a 100% verification requirement for finance decisions.

MetaboCommand (https://github.com/zan-maker/metabocommand)

Multi-agent orchestration dashboard for eCommerce finance + ops teams. Twelve specialized agents across five "metabolic systems" (Capital Reflex, Revenue Velocity, Inventory Intelligence, Customer Lifetime, Operational Health) that surface anomalies and route decisions through role-scoped approval queues.

Real-time collaboration via Supabase — see who's reviewing what, watch approvals flow across tabs, get Slack notifications. Built on Next.js 16 + TypeScript with full RLS enforcement.

Both are MIT licensed, Python/TypeScript respectively, and designed to be LLM-agnostic so you can plug in whatever models you prefer.

Why I built these: running finance for a growth-stage, venture-backed company while overseeing M&A negotiations, I kept hitting the same wall. AI agents produce conclusions without showing their work, different models give conflicting advice, and there's no systematic way to harden decisions before they become capital commitments.

Would love feedback from other finance practitioners on what's missing or what workflows would be most valuable to add next.


u/Key_Cook_9770 — 23 days ago