Does anyone have a solid QA protocol for vibe-coded software to cut down on production bugs? Built 100% with Codex in 5 weeks, with 80 features across desktop and mobile.
My software is 100% vibe-coded, and now I need to tighten up QA.
I’ve already set up around 200 automated Playwright tests for the software. I created the test cases using screenshots from both the mobile app and desktop version. I’m also using Check now to test workflows like invoicing, since that’s one of the features inside the app.
I’m curious what other vibe coders, or people working with mostly AI-generated codebases, are doing to reduce bugs and errors. Right now I have a staging environment running a new version of the software, and I’m unit testing each feature there using Codex, since that is my primary AI coding tool. I’ve used Claude Code, but I really prefer Codex.
I have some technical background: I completed a computer science degree at a southern private school, so I’m not clueless when it comes to the software development life cycle. For vibe coding specifically, though, I’m curious what protocols people have used to cut down on production bugs. To me it seems like AI lets you build much faster, but the testing process needs to be much more thorough than it used to be.
I catch bugs by looking at PostHog, and sometimes users tell me directly. But the users who onboarded self-serve, without talking to me, thought the software was broken. Part of that was because the invoicing feature requires connecting a Stripe or Square account, and I had been doing that part of the onboarding on the phone. I’ve now moved that step up in the onboarding process, but still, I think it goes to show that I really need to cut down on bugs.
I also get regressions with sign-in: I’ll fix a sign-in bug, and two days later people can’t log in with Google again. It’s extremely aggravating.
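To make the sign-in regression problem concrete, here is a minimal sketch of the kind of release gate I have in mind: nothing leaves staging unless every critical flow has a passing test. The flow names, the `FlowResult` shape, and the `releaseGate` function are all hypothetical placeholders I made up for illustration. They are not part of Playwright or Codex.

```typescript
// Hypothetical shape of one test outcome; not a Playwright type.
interface FlowResult {
  flow: string;    // e.g. "google-sign-in"
  passed: boolean; // true if the test for this flow passed
}

// Assumed list of flows that must be green before a build is promoted.
const CRITICAL_FLOWS: string[] = [
  "google-sign-in",
  "email-sign-in",
  "connect-stripe",
  "create-invoice",
];

// Returns ok=false with a list of blockers if any critical flow
// failed, or had no test run at all (a missing test counts as a blocker).
function releaseGate(
  results: FlowResult[],
  critical: string[] = CRITICAL_FLOWS
): { ok: boolean; blockers: string[] } {
  const blockers: string[] = [];
  for (const flow of critical) {
    const runs = results.filter((r) => r.flow === flow);
    if (runs.length === 0) {
      blockers.push(`${flow}: no test ran`);
    } else if (runs.some((r) => !r.passed)) {
      blockers.push(`${flow}: failed`);
    }
  }
  return { ok: blockers.length === 0, blockers };
}
```

The idea is that a sign-in fix cannot ship if the Google login test is red or missing, so a fix for one login path cannot quietly break another.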
Right now, I dogfood the product, do manual testing, and run automated Playwright tests every morning around 4 AM. Sometimes the tests give me useful insight, and sometimes I question their reliability, but overall I know I need much stronger QA to reduce production bugs.
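On the reliability doubt, one rough way to separate flaky tests from real failures is to rerun each test several times against the same build and look at the mix of outcomes. This is only a sketch under that assumption. The `Verdict` labels and the `classify` and `triage` helpers are names I invented, not anything built into Playwright.

```typescript
type Verdict = "stable" | "flaky" | "broken" | "no-data";

// history: pass/fail outcomes of ONE test, rerun against the SAME build.
// Mixed results on unchanged code suggest the test (or environment) is flaky.
function classify(history: boolean[]): Verdict {
  if (history.length === 0) return "no-data";
  const passes = history.filter(Boolean).length;
  if (passes === history.length) return "stable";
  if (passes === 0) return "broken";
  return "flaky";
}

// Groups test names by verdict so flaky tests can be fixed or quarantined
// separately from tests that point at a real bug.
function triage(runs: Record<string, boolean[]>): Record<Verdict, string[]> {
  const out: Record<Verdict, string[]> = {
    stable: [],
    flaky: [],
    broken: [],
    "no-data": [],
  };
  for (const name of Object.keys(runs)) {
    out[classify(runs[name])].push(name);
  }
  return out;
}
```

A test that lands in "broken" on an unchanged build is worth treating as a real bug report, while one in "flaky" tells me more about the test than about the app.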
The hard part is that the software is already pretty extensive. I have around 70–100 features, so there are a lot of edge cases. Even with dogfooding and manual testing, users still report bugs I didn’t catch.
Marketing is working pretty well through Meta ads, and people are ready to use the software. The problem is that a lot of trials don’t convert because users hit bugs. The UI is also confusing, but I’m not focused on fixing that yet. The bigger issue right now is literal software bugs breaking the experience.