u/AccomplishedIce9767

I just gave Gemini Code Agent one of the easiest refactor tasks imaginable and it completely imploded.

Task:

  • I uploaded the relevant files into context
  • I gave explicit migration instructions
  • I asked it to migrate a React codebase from Material UI v5 to Material UI v6
  • I specified exactly how components should be rewritten
  • I told it to preserve existing interfaces and types

What happened:

  • It ignored parts of the instructions
  • It rewrote unrelated code
  • It changed interfaces that had absolutely nothing to do with the migration
  • It introduced type inconsistencies
  • It mixed old and new MUI patterns together
  • Some files looked like they were generated from pure guesswork

The craziest part is that this wasn’t even an architecture task. I wasn’t asking for system design, business logic, or complex reasoning. This was mostly structured framework migration work, with the context already provided.

I’m honestly amazed at how unreliable these “AI coding agents” still are once you move beyond toy demos.

People keep showing benchmark charts and cherry-picked examples, but in a real codebase:

  • preserving conventions matters
  • not touching unrelated interfaces matters
  • following scoped instructions matters
  • consistency across files matters
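
For the "not touching unrelated interfaces" point, here is a rough sketch of the kind of check that could be run on a file before and after an agent pass. It is not a real parser: it assumes interfaces are declared as `interface Name { ... }` and just counts braces, so braces inside strings or comments would confuse it. The helper names `extractInterfaces` and `changedInterfaces` are made up for illustration.

```typescript
// Hypothetical helper: collects `interface Name { ... }` declarations from source text.
// Uses naive brace counting, so it is only an approximation of real parsing.
function extractInterfaces(source: string): Map<string, string> {
  const result = new Map<string, string>();
  const header = /\binterface\s+([A-Za-z_$][\w$]*)[^{]*\{/g;
  let match: RegExpExecArray | null;
  while ((match = header.exec(source)) !== null) {
    let depth = 1;
    let i = header.lastIndex;
    // Walk forward until the brace opened by the header is closed.
    while (i < source.length && depth > 0) {
      if (source[i] === "{") depth++;
      else if (source[i] === "}") depth--;
      i++;
    }
    const name = match[1] ?? "";
    // Collapse whitespace so formatting-only changes are not reported.
    const text = source.slice(match.index, i).replace(/\s+/g, " ").trim();
    result.set(name, text);
    header.lastIndex = i;
  }
  return result;
}

// Hypothetical helper: names of interfaces that were modified or removed.
function changedInterfaces(before: string, after: string): string[] {
  const oldDecls = extractInterfaces(before);
  const newDecls = extractInterfaces(after);
  const changed: string[] = [];
  oldDecls.forEach((text, name) => {
    if (newDecls.get(name) !== text) changed.push(name);
  });
  return changed;
}
```

Running `changedInterfaces(originalSource, agentOutput)` per file and failing the review when the list is non-empty would at least catch this class of damage early, assuming the migration really is supposed to leave every interface alone.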

If an agent can’t safely execute a constrained migration with the exact files already provided, how are we pretending these tools are replacing engineers anytime soon?

Right now they feel more like:
“generate 80% of the work and spend 200% of the time reviewing the damage.”

Anyone else having similar experiences with Gemini Code Agent or other autonomous coding tools?

reddit.com
u/AccomplishedIce9767 — 3 days ago