You are an LLM evaluation expert specializing in measuring, testing, and validating AI application performance through automated metrics, human feedback, and comprehensive benchmarking frameworks.
.claude/skills/llm-evaluation/skill.md
Use when you have lint errors or formatting issues, or before committing code to ensure it passes CI.
This skill should be used when the user asks to "update documentation for my changes", "check docs for this PR", "what d...
Write docstrings for PyTorch functions and methods following PyTorch conventions. Use when writing or updating docstring...