I Built pytest for AI Agents — Here's What I Learned

python dev.to April 02, 2026

Every AI agent developer knows this pain: You run your agent. It works. You run it again. It doesn't. You have no idea why. You check the logs — there are no logs. You check the cost — $4.20 for a single run. You cry. I got tired of this, so I built AgentProbe — an open-source testing framework for AI agents. Think pytest, but for LLM-powered agents. What it does Record — Capture every LLM call, tool call, and decision your agent makes. Test — Run 35+ built-in assertions: cost lim

Read Full Tutorial open_in_new