llm-serving-patterns

LLM inference infrastructure, serving frameworks (vLLM, TGI, TensorRT-LLM), quantization techniques, batching strategies, and streaming response patterns. Use when designing LLM serving infrastructure.
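As a taste of one batching strategy the skill covers, here is a toy, framework-free simulation of continuous (iteration-level) batching, the scheduling approach popularized by vLLM: finished requests leave the batch after every decode step and waiting requests are admitted immediately, rather than waiting for the whole batch to drain. All request IDs and token counts below are invented for illustration.

```python
from collections import deque
from dataclasses import dataclass, field

@dataclass
class Request:
    rid: int
    remaining_tokens: int           # tokens left to generate
    output: list = field(default_factory=list)

def continuous_batching(requests, max_batch_size):
    """Simulate iteration-level scheduling: admit/retire requests
    between every decode step instead of between whole batches."""
    waiting = deque(requests)
    running, completed, step = [], [], 0
    while waiting or running:
        # Admit waiting requests up to the batch-size budget.
        while waiting and len(running) < max_batch_size:
            running.append(waiting.popleft())
        # One decode step produces one token per running request.
        for r in running:
            r.output.append(f"tok{step}")
            r.remaining_tokens -= 1
        # Retire finished requests immediately, freeing their slots.
        completed.extend(r for r in running if r.remaining_tokens == 0)
        running = [r for r in running if r.remaining_tokens > 0]
        step += 1
    return completed, step

reqs = [Request(0, 2), Request(1, 5), Request(2, 1), Request(3, 3)]
done, steps = continuous_batching(reqs, max_batch_size=2)
```

With static batching the same workload would take 8 decode steps (max(2, 5) + max(1, 3)); the continuous scheduler finishes in 6 because short requests free their slots mid-batch.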

by melodic-software · Repository · design
Also installable via skills CLI
npx skills add melodic-software/claude-code-plugins/design/llm-serving-patterns

Source

Path: design/llm-serving-patterns (main)
