u/Kortopi-98

Automating root cause analysis for AI agent failures

I’m an SRE and have a high tolerance for pain. But for the last few months I’ve been babysitting LLM agents in production. This is a new kind of hell.

When an agent behaves unexpectedly and an alert fires, I open the logs. Then I have to go through fifty thousand lines to find the prompt or tool call that sent things off the rails. It feels more like archaeology than debugging. This is not sustainable.

The failure mode is rarely a crash; it’s more of a drift. The agent completes successfully by the metrics we’re tracking, but the output is wrong, which only becomes apparent downstream. Sometimes that’s hours later, so by the time I’m investigating I have to reverse-engineer intent from a log file. The log files were not designed for this.

I’ve tried dumping structured JSON logs with full prompt/response pairs, but that ends up becoming its own archaeological dig. Datadog gives me spans, latency, and token counts, none of which tells me what I need to know. Grafana gives me a dashboard of things that are not the problem. How do you all deal with this?
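For context, here is a minimal sketch of the kind of pass I mean over those structured logs: collapse one-record-per-line JSON into a short per-run timeline, so there is something to skim before opening the full prompts. The field names (`run_id`, `step`, `kind`, `content`) and the function name are made up for illustration; they are not from Datadog or any particular logging library.

```python
import json
from collections import defaultdict


def summarize_runs(lines, max_chars=60):
    """Group JSON-lines log records by run and build a compact step timeline.

    Assumes (hypothetically) that every record has `run_id`, `step`, and
    `kind` fields, plus an optional `content` field with the prompt,
    response, or tool call payload.
    """
    runs = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        record = json.loads(line)
        runs[record["run_id"]].append(record)

    timelines = {}
    for run_id, records in runs.items():
        # Order by step so the first suspicious prompt or tool call is easy to spot.
        records.sort(key=lambda r: r["step"])
        timelines[run_id] = [
            f'{r["step"]:>3} {r["kind"]:<10} {str(r.get("content", ""))[:max_chars]}'
            for r in records
        ]
    return timelines
```

Something like `summarize_runs(open("agent.jsonl"))["some-run-id"]` (file name and run id are placeholders) would then give one line per step instead of the full payloads. It does not find the drift for you, it only shrinks the haystack.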

reddit.com
u/Kortopi-98 — 6 hours ago

My client asked me to build an “account matrix” and I have no idea what that is

I recently got a client through a friend, and they asked if I could help build an “account matrix” system for their nutra brand. I said sure, when in fact I had no idea what that is LOL.
So I googled it immediately. Below is the summary from Gemini:

An account matrix (sometimes called an account farm or profile matrix) is a strategic setup where you manage a structured network of multiple accounts rather than relying on just one.

Instead of putting all your eggs in one basket, you spread your presence across a "matrix" of accounts to maximize reach, segment audiences, or mitigate the risk of bans.

So that means I need multiple TikTok accounts with a different strategy for each of them? I am thinking of building the official brand account (the main account) and then creating matrix accounts around lifestyle and authority.
I’m looking for advice from anyone who has experience with this strategy.

u/Kortopi-98 — 3 days ago

What tools are you using to grow your business on social media?

I’ve been trying to figure out what actually helps people stay consistent and grow, because there are so many tools out there it gets kind of overwhelming. Some people swear by scheduling apps, others focus more on analytics, and some just keep it super simple and do everything manually.

If you’re managing multiple accounts or running content for a business, what’s been your setup? Are you using anything for planning posts, tracking what performs well, or just organizing everything so it doesn’t get messy? Also curious, what’s one tool that actually made things easier for you, and what’s something you tried but didn’t really stick with?

u/Kortopi-98 — 3 days ago

Nobody uses EOS

Only leadership uses EOS. Everyone else checked out because the tool feels like homework. Anyone got their whole team to actually use it without dragging them into it?

u/Kortopi-98 — 14 days ago

My biggest gripe with robots is the quality of the cut. On paper, it automates everything. The path is set up, the timer is set up, the robot gets it done and goes back to charge. That should be the end of it. The thing is tho, most units use tiny, swinging razors that just flatten thick Bermuda grass instead of cutting it. I always find myself using the gas mower to clean up uneven patches, or a trimmer to finish the job. That doesn’t really live up to the promise of automating the work.

I’ve been trying a different setup recently with Yarbo and a straight blade. My first impression is that it actually cuts better than I expected. But it has a downside: because it’s a rigid straight blade and not swinging razors, it is much less "forgiving." If you have loose rocks or hidden debris, this blade will hit them hard. I already learned that the hard way. I should have cleared the debris off its path before I first tried it. Luckily, the blades weren’t chipped.

At the end of the day, I realize that "set it and forget it" is not something I can entirely hope for right now. Everything has its trade-off. With my old bot, I was out there finishing up the job with a manual mower; with this one, I’m spending that same time doing a "perimeter sweep" for my kid’s toys and loose stones so I don’t blow a motor. It’s definitely a reminder that these things are still just tools.

u/Kortopi-98 — 15 days ago

I spent like 2 weeks building a synthetic dataset using an LLM API. 5k examples, carefully prompted, checked a random sample manually and it looked clean. Trained on it, eval results were mid. Not terrible but not where I needed them to be.

My advisor was like, just try the 200 examples we annotated by hand and see what happens. I thought there was no way 200 would beat 5k but sure, whatever, let’s waste 40 minutes 🙄 I ran it on a 5090 I rented on hyperai cause our lab cluster was booked as usual.

The 200 hand-labeled ones outperformed the 5k synthetic set by a pretty embarrassing margin. I genuinely sat there staring at the eval output for a minute like... what.

After some digging I think what happened is the synthetic data had these subtle formatting patterns that the model was latching onto instead of learning the actual task. Like it wasn’t learning my classification labels, it was learning the LLM’s writing quirks lol. As soon as I mixed like 1k synthetic with the 200 real ones, things improved even more, which kinda confirmed the synthetic data wasn’t garbage, just not good enough on its own.
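A minimal sketch of that kind of mix, assuming each example is a plain dict. The names here (`mix_datasets`, the `source` tag) are made up for illustration and this is not the exact script used; it just subsamples the synthetic pool, keeps every hand-labeled example, and shuffles.

```python
import random


def mix_datasets(real, synthetic, n_synthetic=1000, seed=0):
    """Combine all real examples with a random subsample of synthetic ones.

    Each example is assumed to be a dict; a hypothetical `source` key is
    added so results can be broken down by origin later.
    """
    rng = random.Random(seed)  # fixed seed so the mix is reproducible
    k = min(n_synthetic, len(synthetic))
    sampled = rng.sample(synthetic, k)  # sample without replacement

    mixed = [dict(ex, source="real") for ex in real]
    mixed += [dict(ex, source="synthetic") for ex in sampled]
    rng.shuffle(mixed)  # avoid training on all real examples first
    return mixed
```

Tagging the origin makes it easy to check later whether errors cluster on one side, which is one way to see if the model is still picking up the generator's quirks.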

Most tutorials out there still tell people to just generate more data when results are bad. IMO, for domain stuff that’s genuinely terrible advice 😬

u/Kortopi-98 — 21 days ago