u/ProofMight5229

How do you actually triage 1000+ regression tests without losing your mind?

I run a large Playwright regression suite (~1000 tests, TypeScript). I've invested a lot in making the suite itself solid — strict data-testid selectors only, no flimsy CSS/XPath locators, and I've built custom tooling with Claude Code (AI-assisted skills) that runs the tests and auto-generates detailed reports on Confluence so the whole team can review results without touching the codebase.

So the test infrastructure is pretty tight. My problem isn't writing or maintaining tests — it's what happens after they finish.

Every execution gives me a wall of results and I spend a lot of time figuring out what's actually going on. For each failure I have to determine: is this a flaky test? A real test defect? An actual product bug? A one-time environment issue (slow load, timeout, whatever)?

I end up re-running tests manually just to check if the failure reproduces. When it does, I still go back and forth with manual QA or product to confirm whether it's a known behavior or a real bug. That loop alone eats hours.
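Part of that rerun-and-compare loop can be mechanized. Below is a minimal sketch, assuming each test's attempts have already been collected into a simple array; the `Attempt` shape, the verdict names, and the environment error patterns are all hypothetical and would need tuning against real failure messages.

```typescript
// Hypothetical shape: one entry per attempt of a single test, in run order.
type Attempt = {
  status: "passed" | "failed" | "timedOut";
  errorMessage?: string;
};

type Verdict = "pass" | "flaky" | "environment-suspect" | "consistent-failure";

// Hypothetical patterns hinting at an environment problem rather than a product bug.
const ENV_PATTERNS: RegExp[] = [/net::ERR_/i, /ECONNREFUSED/i, /ECONNRESET/i];

function classify(attempts: Attempt[]): Verdict {
  if (attempts.length === 0) throw new Error("no attempts recorded");

  const failures = attempts.filter((a) => a.status !== "passed");
  if (failures.length === 0) return "pass";

  // Failed at least once but also passed at least once: not reliably reproducible.
  if (failures.length < attempts.length) return "flaky";

  // Every attempt failed: check whether all failures look like environment noise.
  const allEnvironmental = failures.every(
    (a) =>
      a.status === "timedOut" ||
      ENV_PATTERNS.some((p) => p.test(a.errorMessage ?? ""))
  );
  return allEnvironmental ? "environment-suspect" : "consistent-failure";
}
```

Only the `consistent-failure` bucket would then need the back-and-forth with manual QA or product. The sketch cannot tell a test defect from a product bug; that distinction still needs a human or more context.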

Everything runs locally for now — no CI/CD yet, no historical data on pass/fail trends. Just me going through the Confluence report after each run trying to make sense of it.

For those of you dealing with large Playwright suites:

  • How do you classify failures efficiently? Do you have a system for separating flaky from real, even without CI history?
  • How do you handle flaky tests — retries, quarantine, tagging? Playwright has built-in retries but I'm not sure how people actually use them in practice at this scale.
  • When you suspect a real bug, what's your process before escalating? Do you just file it and move on, or do you verify manually first?
  • Any techniques or workflows that helped with triage specifically? Even just how you organize your review process would be useful.
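For reference on the retries question, this is roughly what the built-in mechanism looks like in a config file. It is a sketch only: the retry count of 2 and the `results.json` file name are arbitrary choices, not recommendations.

```typescript
// playwright.config.ts
import { defineConfig } from "@playwright/test";

export default defineConfig({
  // Retry each failing test up to 2 times. A test that fails and then
  // passes on a retry is reported as "flaky" rather than "failed".
  retries: 2,
  use: {
    // Record a trace only when a test is retried, to keep artifacts small.
    trace: "on-first-retry",
  },
  // Keep the console output and also write machine-readable results to a file.
  reporter: [["list"], ["json", { outputFile: "results.json" }]],
});
```

For quarantine, one common pattern is to put a tag such as `@quarantine` (the tag name is an assumption) in the test title and exclude it from the main run with `npx playwright test --grep-invert @quarantine`.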

Would love to hear how other teams deal with this because right now it feels like I'm doing most of it in my head and it doesn't scale.

reddit.com
u/ProofMight5229 — 15 hours ago

OpenClaw for QA automation?

Hello everyone, I am a QA automation consultant in a big enterprise (300 to 400 employees). I am also responsible for releases and manage several manual QA testers.
I'm curious whether anyone uses OpenClaw in their development process. I know OpenClaw can produce really good results, but I'm concerned that it may also introduce security issues.

reddit.com
u/ProofMight5229 — 15 hours ago

Automated failure analysis after regression — anyone done it?

Hey everyone,

I'm a QA Automation Engineer at a mid-size company (~300-400 employees), and I own the entire automation effort. My main job is to build out automated regression coverage after every sprint.

The real goal is to cut down our release blocking time, which is a major pain point right now. Devs can be blocked for up to 48 hours waiting on regression results. My target is to cut that by 50%.

I'm making good progress on that front, but now I want to take it a step further. What I'm looking for is a way to automatically triage test failures once a regression run completes: something that can analyze a failure, determine whether it's a real bug or a false positive, classify its severity (critical, major, etc.), and then automatically create a Jira ticket assigned to the right person.
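To make the shape of such a pipeline concrete, here is a minimal sketch of the last two steps (severity, then ticket payload). Everything in it is an assumption: the `Failure` record, the tag-based severity rules, the `OWNERS` map, and the placeholder account ids are hypothetical. The returned object follows the general shape of a Jira create-issue request body, but the exact fields should be checked against your own Jira instance.

```typescript
type Verdict = "flaky" | "environment-suspect" | "consistent-failure";
type Severity = "critical" | "major" | "minor";

// Hypothetical failure record produced by an earlier classification step.
interface Failure {
  testTitle: string;
  tags: string[];
  area: string; // product area the test covers
  verdict: Verdict;
  errorMessage: string;
}

// Hypothetical ownership map: product area -> assignee account id (placeholders).
const OWNERS: Record<string, string> = {
  checkout: "account-id-checkout",
  search: "account-id-search",
};
const DEFAULT_OWNER = "account-id-qa";

// Hypothetical rule: severity is derived from test tags.
function severityOf(f: Failure): Severity {
  if (f.tags.includes("@smoke")) return "critical";
  if (f.tags.includes("@regression")) return "major";
  return "minor";
}

// Returns a create-issue body, or null when no ticket should be filed automatically.
function toJiraIssue(f: Failure, projectKey: string) {
  // Only failures that reproduce on every attempt are escalated.
  if (f.verdict !== "consistent-failure") return null;
  const severity = severityOf(f);
  return {
    fields: {
      project: { key: projectKey },
      issuetype: { name: "Bug" },
      summary: `[${severity}] ${f.testTitle}`,
      description: f.errorMessage,
      labels: ["auto-triage", f.area],
      assignee: { accountId: OWNERS[f.area] ?? DEFAULT_OWNER },
    },
  };
}
```

The sketch deliberately stops at building the payload and does no network call, so the rules can be reviewed and unit tested before anything is filed automatically.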

Has anyone actually implemented something like this? Would love to hear how you approached it and any advice you have.

reddit.com
u/ProofMight5229 — 8 days ago
