eval-harness-kit
Build and run deterministic evaluation suites for agent workflows (single-turn or agentic). Use when you need reproducible eval runs with manifests, graders, metrics, and JSONL logs for capability or
Also installable via the skills CLI:
npx skills add aufrank/agent-skills/data/eval-harness-kit