Tried a two-agent workflow for an API integration spec this week. The result was about 80% ready to publish.
Tried something at work this week that worked better than I expected, and I want to share where it broke down because that's the more useful part.
Context: I'm a Tech BA on a payments API project at a Canadian bank. ISO 20022 backend, Kafka, AWS, the usual stack. New integration coming in, and I needed to produce a functional analysis page in Confluence that matched the team's reference format.
The setup was two agents in sequence. Agent 1 had access to the legacy services codebase and I asked it for the complete flow of the existing integration. Request mapping, validations, Kafka topics, error handling, the lot. Agent 2 had MCP access to Confluence with my PAT, plus the new API contract, plus a link to a reference page that represented "good" in our team. I gave it Agent 1's output, my own writeup of what the new service needs to do, and asked it to produce the new page.
The draft was honestly close to ship-quality on the technical mapping. Giving it the contract was the unlock there: it stopped hallucinating field names, and the request/response examples actually validated. Field by field, it was solid.
Where I spent my cleanup time was somewhere else entirely.
The agent treated the reference page as inspiration instead of as a contract. It added sections that weren't in the reference (an assumptions block, a risks block, a glossary it invented). It drifted on the color scheme for the Confluence panels and lozenges, which I think is because I gave it the rendered page instead of the storage format. And it put code snippets inside a functional analysis page, which on our team is a hard line: code lives in the technical analysis page, not this one. The agent doesn't know that's a team convention. It just knows "technical documentation usually has code blocks."
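All three kinds of drift are mechanical enough to check before reading the draft. A minimal sketch, assuming both the reference page and the draft are available as Confluence storage-format strings. The function names and the report keys are made up for illustration, and the macro and parameter names (`code`, `colour`) are what I understand the storage format to use, so verify them against an export from your own instance.

```python
import re

# Headings in storage format are plain XHTML <h1>..<h6> elements.
HEADING_RE = re.compile(r"<h([1-6])[^>]*>(.*?)</h\1>", re.IGNORECASE | re.DOTALL)
TAG_RE = re.compile(r"<[^>]+>")
# Assumption: code blocks are stored as a structured macro named "code".
CODE_MACRO_RE = re.compile(r'<ac:structured-macro\b[^>]*\bac:name="code"')
# Assumption: status lozenges carry their color in a "colour" parameter.
COLOUR_RE = re.compile(r'<ac:parameter ac:name="colou?r">\s*([^<]+?)\s*</ac:parameter>')


def headings(storage_xml: str) -> list:
    """Return heading texts in document order, with inline markup stripped."""
    return [TAG_RE.sub("", body).strip() for _level, body in HEADING_RE.findall(storage_xml)]


def lint_draft(reference_xml: str, draft_xml: str) -> dict:
    """Compare a draft against the reference and report structural drift."""
    ref, draft = headings(reference_xml), headings(draft_xml)
    ref_colours = set(COLOUR_RE.findall(reference_xml))
    draft_colours = set(COLOUR_RE.findall(draft_xml))
    return {
        "added_sections": [h for h in draft if h not in ref],
        "missing_sections": [h for h in ref if h not in draft],
        "code_macros": len(CODE_MACRO_RE.findall(draft_xml)),
        "new_status_colours": sorted(draft_colours - ref_colours),
    }
```

Anything non-empty in the report is a reason to send the draft back to the agent rather than fix it by hand. It only catches structure, not whether the content under each heading is right.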
Lesson I'm taking from this: the reference page is doing double duty as both the template and the example, and agents can't separate those two roles. Next time I'm going to maintain a stripped-down template with empty sections and inline notes like "field mapping table goes here", separate from a finished example for tone. Then the prompt becomes "fill the template, use the example for tone only, do not copy its structure."
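One way to get that stripped-down template without hand-editing is to derive it from the reference page itself: keep the headings, drop everything else, and put a placeholder note under each one. A rough sketch, assuming the reference page is available as a storage-format string. The function name `skeleton` and the default note text are my own placeholders.

```python
import re
from typing import Optional

HEADING_RE = re.compile(r"<h([1-6])[^>]*>(.*?)</h\1>", re.IGNORECASE | re.DOTALL)
TAG_RE = re.compile(r"<[^>]+>")


def skeleton(reference_xml: str, notes: Optional[dict] = None) -> str:
    """Keep only the reference page's headings, each followed by a placeholder note.

    `notes` maps a heading title to the inline note for that section;
    headings without an entry get a generic placeholder.
    """
    notes = notes or {}
    parts = []
    for level, body in HEADING_RE.findall(reference_xml):
        title = TAG_RE.sub("", body).strip()
        note = notes.get(title, "content goes here")
        parts.append(f"<h{level}>{title}</h{level}>")
        parts.append(f"<p>[{note}]</p>")
    return "\n".join(parts)


if __name__ == "__main__":
    reference = "<h1>Overview</h1><p>Long prose.</p><h2>Field Mapping</h2><table></table>"
    print(skeleton(reference, {"Field Mapping": "field mapping table goes here"}))
```

This only preserves section structure. Panels, lozenges, and table layouts that count as part of the format would still need to be copied into the template by hand.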
The other thing I'll change is the prompt order. Constraints first (no added sections, no code blocks, match these macros), then the task, then a one-line restatement of the constraints at the very end. Agents tend to weight recent instructions more heavily, so constraints that only appear above the task are the ones most likely to get lost during generation.
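A minimal sketch of that ordering as a prompt builder. The section labels, the function name `build_prompt`, and the optional `restate_constraints` flag are illustrative assumptions, not part of any particular agent framework. The flag appends a one-line reminder after the task, for cases where constraints stated only at the top get dropped.

```python
def build_prompt(constraints, template, example, task, restate_constraints=False):
    """Assemble the prompt with constraints up front and the task at the end."""
    sections = [
        "CONSTRAINTS (these override everything below):\n"
        + "\n".join(f"- {c}" for c in constraints),
        "TEMPLATE (fill this, keep its structure exactly):\n" + template,
        "EXAMPLE (tone only, do not copy its structure):\n" + example,
        "TASK:\n" + task,
    ]
    if restate_constraints:
        # Optional: repeat the constraints in one line after the task.
        sections.append("REMINDER: " + "; ".join(constraints))
    return "\n\n".join(sections)


if __name__ == "__main__":
    print(build_prompt(
        constraints=["no added sections", "no code blocks", "match the template's macros"],
        template="<h1>Overview</h1>\n<p>[content goes here]</p>",
        example="(finished example page goes here)",
        task="Fill the template for the new service.",
    ))
```

Keeping the template and the example under separate labels is the point: the prompt tells the agent which one is the structure and which one is only there for tone.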
PS: audit traceability is worth thinking about if you do this in a regulated shop. Because Agent 2 was using my PAT, every edit it made to Confluence shows up under my name in the page history. Fine for a draft, less fine if it's pushing directly to a page devs reference.
Curious if anyone else has run this kind of two-agent setup for spec work. Especially interested in how you handle the "follow the format exactly" problem, because that's where mine still leaks.