u/AdAffectionate7019

Why AI coding agents often fail at multi-app tasks — a small experiment

I've noticed that AI coding agents (Claude Code, Cursor, Codex, etc.) are quite good at local tasks, but they often struggle when a feature involves multiple parts of a project — frontend, backend, shared libraries, and so on.

I ran a small test to look into this.

Test Setup:

  • Same starting monorepo
  • Same prompt: “Implement a minimal login feature for this project.”
  • Two versions:
    • Plain monorepo (no extra context)
    • Workspace with a structured context bundle (manifest + guidance files)

Results:

  • Plain version: ~4m48s. Backend API worked, but browser login failed.
  • With context bundle: ~7m16s. Full flow worked — browser login, session persistence, and logout all succeeded.
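To make "full flow worked" concrete, the three checks (login, session persistence, logout) can be reported separately instead of as one pass/fail. The sketch below only shows the shape of such a check. `FakeApp` and `verify_login_flow` are hypothetical names, the app is an in-memory stand-in, and a real check would have to drive an actual browser against the running frontend, which this does not do.

```python
import secrets
from typing import Optional


class FakeApp:
    """In-memory stand-in for the real backend (hypothetical, for illustration only)."""

    def __init__(self, users: dict[str, str]):
        self.users = users                  # username -> password
        self.sessions: dict[str, str] = {}  # session token -> username

    def login(self, username: str, password: str) -> Optional[str]:
        """Return a new session token, or None if the credentials are wrong."""
        if self.users.get(username) != password:
            return None
        token = secrets.token_hex(16)
        self.sessions[token] = username
        return token

    def current_user(self, token: Optional[str]) -> Optional[str]:
        """Look up the user for a session token, as a later request would."""
        return self.sessions.get(token) if token else None

    def logout(self, token: str) -> None:
        """Invalidate the session token."""
        self.sessions.pop(token, None)


def verify_login_flow(app: FakeApp, username: str, password: str) -> dict[str, bool]:
    """Run the three checks and report each one separately."""
    results = {"login": False, "session_persistence": False, "logout": False}

    token = app.login(username, password)
    results["login"] = token is not None
    if token is None:
        return results

    # A separate lookup with the same token should still resolve to the user.
    results["session_persistence"] = app.current_user(token) == username

    # After logout, the same token must no longer resolve to anyone.
    app.logout(token)
    results["logout"] = app.current_user(token) is None
    return results
```

Reporting each step on its own makes a partial result visible, such as a login that succeeds while the session does not persist.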

Testing the two parts of the bundle separately showed:

  • The manifest helped the agent understand the project structure.
  • The guidance files (AGENTS.md / CLAUDE.md) helped with execution and verification.
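A manifest of this kind could look something like the sketch below. It is only an illustration, not the format used in the experiment: the `App` fields, the app names, the paths, and both helper functions are assumptions made up for this example. The idea is that the manifest records which apps exist and how they depend on each other, so an agent can see that a change to one part may touch others.

```python
from dataclasses import dataclass, field


@dataclass
class App:
    """One app or package in the monorepo (all names here are hypothetical)."""
    name: str
    path: str
    role: str
    depends_on: list[str] = field(default_factory=list)


# Hypothetical manifest: a frontend, a backend, and a shared library.
MANIFEST = [
    App("web", "apps/web", "frontend", depends_on=["api", "shared"]),
    App("api", "apps/api", "backend", depends_on=["shared"]),
    App("shared", "packages/shared", "shared library"),
]


def affected_apps(manifest: list[App], changed: str) -> list[str]:
    """Return the changed app plus every app that depends on it, directly or transitively."""
    affected = {changed}
    grew = True
    while grew:
        grew = False
        for app in manifest:
            if app.name not in affected and affected.intersection(app.depends_on):
                affected.add(app.name)
                grew = True
    # Keep manifest order so the output is stable.
    return [app.name for app in manifest if app.name in affected]


def render_context(manifest: list[App]) -> str:
    """Render the manifest as plain text that could be placed in an agent's context."""
    lines = []
    for app in manifest:
        deps = ", ".join(app.depends_on) or "none"
        lines.append(f"{app.name} ({app.role}) at {app.path}; depends on: {deps}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_context(MANIFEST))
    print(affected_apps(MANIFEST, "shared"))
```

In this sketch, a change to `shared` flags all three apps, which is the kind of cross-app link that is easy to miss when only one directory is in view.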

My friend and I originally built this system for our own internal use because we kept hitting this exact problem. We're now exploring whether it can help others too.

Has anyone else run into similar issues when using coding agents on features that span multiple apps in a monorepo? What solutions have worked for you?

Would appreciate any thoughts or similar experiences.

u/AdAffectionate7019 — 1 day ago