pacabench
Benchmarking LLM agents shouldn't mean wrestling with brittle scripts and lost progress. pacabench is a local-first tool for reproducible, reliable benchmarks, with isolated execution, persistent state, and built-in metrics tracking. No SDK required.