llm-inference-batching-scheduler
Guidance for optimizing LLM inference request batching and scheduling. This skill applies when designing batch schedulers that minimize serving cost while meeting latency targets and bounding padding waste.
Also installable via the skills CLI:
npx skills add letta-ai/skills/design/llm-inference-batching-scheduler
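To make the problem concrete, here is a minimal sketch of the kind of scheduler this skill covers: a greedy batcher that groups requests of similar prompt length so padding (to the longest prompt in each batch) stays under a budget, and caps batch size. All names, parameters, and the policy itself are illustrative assumptions, not part of the skill.

```python
# Hypothetical sketch of a length-aware greedy batcher.
# Assumption: padding waste is the fraction of batch tokens that are
# padding when every prompt is padded to the batch's longest prompt.
from dataclasses import dataclass

@dataclass
class Request:
    id: int
    prompt_len: int   # prompt length in tokens
    arrival_ms: float # arrival timestamp (a real scheduler would
                      # also flush batches on a latency timer)

def schedule_batches(requests, max_batch=8, max_pad_frac=0.25):
    """Sort by prompt length, then greedily grow each batch until
    adding the next request would exceed the batch-size cap or push
    the padding fraction over max_pad_frac."""
    batches, batch = [], []
    for r in sorted(requests, key=lambda r: r.prompt_len):
        if batch:
            longest = max(r.prompt_len, max(x.prompt_len for x in batch))
            total = longest * (len(batch) + 1)          # tokens incl. padding
            useful = r.prompt_len + sum(x.prompt_len for x in batch)
            pad_frac = 1.0 - useful / total
            if len(batch) >= max_batch or pad_frac > max_pad_frac:
                batches.append(batch)
                batch = []
        batch.append(r)
    if batch:
        batches.append(batch)
    return batches
```

For example, prompts of lengths [10, 12, 11, 100, 95] split into two batches, one short and one long, since mixing them would waste well over 25% of batch tokens on padding.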