ai-eval-design-and-iteration
Develop "quizzes" (evals) to measure model performance on specific tasks. Use these benchmarks to guide fine-tuning, determine product UX patterns, and track performance improvements over time. Use th
Also installable via skills CLI
npx skills add samarv/Shanon/.claude/skills/ai-eval-design-and-iteration