LLM inference infrastructure, serving frameworks (vLLM, TGI, TensorRT-LLM), quantization techniques, batching strategies, and streaming response patterns. Use when designing LLM serving infrastructure.
plugins/systems-design/skills/llm-serving-patterns
Add unsigned integer (uint) type support to PyTorch operators by updating AT_DISPATCH macros. Use when adding support fo...
Guide users through creating Agent Skills for Claude Code. Use when the user wants to create, write, author, or design a...
Create distinctive, production-grade frontend interfaces with high design quality. Use this skill when the user asks to...