So many problems here can be solved by stricter frameworks, linters and context engines.
The LLM writes spaghetti code? Its folder structure doesn't make sense? It forgets important stuff in long chats? It writes unnecessary code?
I feel like most people start from a completely empty project and just use an LLM with the current repo as context. Ofc it won't be as good, and the larger the project, the worse it gets.
I had none of those problems working with Codex when I told it to use frameworks and bundles over self-written implementations.