llm-serving-patterns
Covers LLM inference infrastructure, serving frameworks (vLLM, TGI, TensorRT-LLM), quantization techniques, batching strategies, and streaming response patterns. Use when designing LLM serving infrastructure.
Also installable via the skills CLI:
npx skills add melodic-software/claude-code-plugins/design/llm-serving-patterns