vllm-skill

Serves LLMs with high throughput using vLLM's PagedAttention and continuous batching. Use when deploying production LLM APIs, optimizing inference latency/throughput, or serving models with limited GP
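As a rough illustration of the idea behind PagedAttention, the toy sketch below stores each sequence's KV cache in fixed-size blocks and tracks them through a per-sequence block table, so memory is claimed only as tokens arrive. It is not vLLM's implementation: `ToyPagedCache`, `BLOCK_SIZE`, and the method names are hypothetical, chosen only for this example.

```python
from dataclasses import dataclass, field

BLOCK_SIZE = 4  # tokens per KV-cache block (hypothetical value, for illustration)


@dataclass
class ToyPagedCache:
    """Toy block allocator: maps each sequence to a list of physical block ids."""
    num_blocks: int
    free: list = field(init=False)
    tables: dict = field(default_factory=dict)   # seq_id -> physical block ids
    lengths: dict = field(default_factory=dict)  # seq_id -> tokens stored so far

    def __post_init__(self):
        # Every physical block starts out free.
        self.free = list(range(self.num_blocks))

    def append_token(self, seq_id):
        """Reserve room for one more token, taking a new block only when needed."""
        n = self.lengths.get(seq_id, 0)
        table = self.tables.setdefault(seq_id, [])
        if n % BLOCK_SIZE == 0:  # no block yet, or the last block is full
            if not self.free:
                raise MemoryError("no free KV-cache blocks")
            table.append(self.free.pop(0))
        self.lengths[seq_id] = n + 1

    def release(self, seq_id):
        """Return a finished sequence's blocks to the free pool."""
        self.free.extend(self.tables.pop(seq_id, []))
        self.lengths.pop(seq_id, None)


if __name__ == "__main__":
    cache = ToyPagedCache(num_blocks=3)
    for _ in range(5):
        cache.append_token("a")   # 5 tokens need two blocks
    cache.append_token("b")       # a second sequence takes the remaining block
    print(cache.tables, cache.free)
    cache.release("a")            # blocks become reusable by other sequences
    print(cache.tables, cache.free)
```

Because blocks are handed out on demand and returned as soon as a sequence finishes, a scheduler can admit new requests whenever blocks free up, which is the property continuous batching relies on.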

by DoanNgocCuong · Repository · other
Also installable via skills CLI
npx skills add DoanNgocCuong/home/3 - THE ROAD/3.2 [STRUCTURES - B- MILESTONES]/3.2.2 MONEYGame/3.2.1.1 KIẾM TIỀN - SKILL/your_project/claude/skills/DataScienceAndAI/6 - Applications/1_LLMs/vllm-skill

Source

Path: 3 - THE ROAD/3.2 [STRUCTURES - B- MILESTONES]/3.2.2 MONEYGame/3.2.1.1 KIẾM TIỀN - SKILL/your_project/claude/skills/DataScienceAndAI/6 - Applications/1_LLMs/vllm-skill/SKILL.md (main)
