u/balal6

I'm a new-ish AI Engineer and I kept getting burned by the same freaking problem. I'd change what I thought was the tiniest thing and something would break...and it would take me like 3 days to notice, but of course only after :) our :) clients :) saw :) it :)

Got sick of it, so I built a testing framework for AI agents. I've been using it for a little while now and just put it out publicly a few days ago. It's been helping me out, but I'm wondering what I'm missing or could add, and what issues you generally run into when building AI.

Right now one command tests everything: agents, pipelines, ML models, vector stores. It has schema checks, latency thresholds, and LLM-as-judge quality scoring.
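To make those three check types concrete, here's a rough sketch in plain Python. To be clear, this is not Ryva's actual API or how it's implemented; every name here (`run_checks`, `CheckResult`, `fake_agent`, `fake_judge`) is made up for illustration, and the "judge" is a stub standing in for a real LLM call.

```python
import time
from dataclasses import dataclass
from typing import Callable


@dataclass
class CheckResult:
    name: str
    passed: bool
    detail: str


def check_schema(output: dict, required: dict) -> CheckResult:
    # Flag any key that is missing or has the wrong type.
    problems = [
        key for key, expected in required.items()
        if not isinstance(output.get(key), expected)
    ]
    return CheckResult("schema", not problems, f"bad or missing keys: {problems}")


def check_latency(elapsed_s: float, max_s: float) -> CheckResult:
    # Pass only if the call finished within the threshold.
    return CheckResult("latency", elapsed_s <= max_s, f"{elapsed_s:.3f}s (limit {max_s}s)")


def check_quality(output: dict, judge: Callable[[str], float], min_score: float) -> CheckResult:
    # `judge` is assumed to return a score in [0, 1]; a real one would call an LLM.
    score = judge(str(output.get("answer", "")))
    return CheckResult("quality", score >= min_score, f"score {score:.2f} (min {min_score})")


def run_checks(agent, prompt, judge, required, max_s=2.0, min_score=0.7):
    # Time a single agent call, then run all three checks on its output.
    start = time.perf_counter()
    output = agent(prompt)
    elapsed = time.perf_counter() - start
    return [
        check_schema(output, required),
        check_latency(elapsed, max_s),
        check_quality(output, judge, min_score),
    ]


# Hypothetical stand-ins so the sketch runs without any API keys.
def fake_agent(prompt: str) -> dict:
    return {"answer": f"echo: {prompt}", "sources": []}


def fake_judge(text: str) -> float:
    return 1.0 if text else 0.0


if __name__ == "__main__":
    required = {"answer": str, "sources": list}
    for result in run_checks(fake_agent, "hello", fake_judge, required):
        print(result.name, "PASS" if result.passed else "FAIL", result.detail)
```

The point of returning one `CheckResult` per check instead of raising on the first failure is that a single run can report a schema break and a latency regression together.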

What is it missing, and what would make you actually use something like this? I'd be happy to go into the technical decisions if anyone is curious! Not trying to self-promo too much, but I'm the only AI engineer on my team and would love to talk with people in similar positions lol.

GitHub.com/ryva-dev/ryva

Roast my project: a testing framework for AI agents