Evals for AWS AgentCore
Hey r/aws! I'm one of the maintainers of DeepEval, an open-source framework to evaluate AI agents (it's like Pytest for LLMs), and I wanted to share a recent integration we released with AgentCore that you might find useful.
Long story short, we found:
- AgentCore to be increasingly popular with our community, and
- No easy way exists to test these agents without being coupled to AWS's platform
So we made evals for AgentCore 100% open-source by integrating it into DeepEval. It's literally 2 lines of code:
That's literally it. Under the hood, "instrument_agentcore" traces your AgentCore agents, while "invoke" calls AgentCore so that DeepEval can capture the trace. Once we have the trace, you can simply run DeepEval's metrics on it for evals (in this code snippet, task completion).
You might also notice that we were able to use Pytest. That's because Pytest is what DeepEval wraps.
Anyway, hope this was helpful. Super curious to know whether you see yourself using this integration. Not going to drop a link here for obvious reasons, but LMK if you're interested!