grey-haven-evaluation
Evaluate LLM outputs with multi-dimensional rubrics, handle non-determinism, and implement LLM-as-judge patterns. Essential for production LLM systems. Use when testing prompts, validating outputs, co
Also installable via the skills CLI:
npx skills add greyhaven-ai/claude-code-config/product/grey-haven-evaluation
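The techniques this skill covers can be illustrated with a minimal sketch: a multi-dimensional rubric scored by an LLM-as-judge, with repeated judge calls averaged to smooth out non-determinism. The `judge` function below is a hypothetical stand-in (no real model call), and the rubric dimensions are illustrative assumptions, not part of the skill itself.

```python
from statistics import mean

# Hypothetical rubric: each dimension is scored 1-5 by a judge model.
RUBRIC = {
    "accuracy": "Is the answer factually correct?",
    "completeness": "Does it address every part of the question?",
    "clarity": "Is it easy to follow?",
}

def judge(output: str, dimension: str, question: str) -> int:
    """Stand-in for an LLM-as-judge call. A real implementation would
    prompt a model with the rubric question and parse a 1-5 score."""
    # Toy heuristic so the sketch runs without an API key.
    return 5 if len(output) > 20 else 2

def evaluate(output: str, trials: int = 3) -> dict[str, float]:
    """Score each rubric dimension, averaging over repeated judge
    calls to handle non-deterministic judgments."""
    return {
        dim: mean(judge(output, dim, q) for _ in range(trials))
        for dim, q in RUBRIC.items()
    }

scores = evaluate("The Eiffel Tower is 330 m tall and located in Paris.")
```

Averaging over several trials is the simplest non-determinism mitigation; a production setup might instead take a majority vote over discrete labels or report the score spread alongside the mean.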