llm-inference-batching-scheduler

Guidance for solving LLM inference request batching and scheduling optimization problems. This skill applies when designing batch schedulers that minimize cost while meeting latency and padding constraints, inv
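To make the padding constraint concrete, here is a minimal sketch of one common approach: sort requests by prompt length and greedily pack them into batches while the padding overhead stays under a cap. This is not taken from the skill itself. The names `Request`, `padding_ratio`, and `plan_batches`, along with the default limits, are hypothetical, and latency deadlines are left out for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    """Hypothetical request record; only the prompt length matters here."""
    request_id: str
    prompt_len: int  # prompt length in tokens


def padding_ratio(batch):
    """Fraction of the padded batch that is padding rather than real tokens."""
    longest = max(r.prompt_len for r in batch)
    padded = longest * len(batch)  # every row is padded to the longest prompt
    real = sum(r.prompt_len for r in batch)
    return (padded - real) / padded


def plan_batches(requests, max_batch_size=4, max_padding_ratio=0.2):
    """Greedy length-sorted batching (an assumed heuristic, not the skill's method).

    Requests are sorted by prompt length, then added to the current batch
    until the batch would exceed the size limit or the padding cap.
    """
    batches = []
    current = []
    for req in sorted(requests, key=lambda r: r.prompt_len):
        candidate = current + [req]
        too_big = len(candidate) > max_batch_size
        if current and (too_big or padding_ratio(candidate) > max_padding_ratio):
            batches.append(current)  # close the batch before it breaks a limit
            current = [req]
        else:
            current = candidate
    if current:
        batches.append(current)
    return batches
```

Sorting first keeps similar lengths adjacent, so each batch pads to a maximum close to its members' own lengths. A real scheduler would also have to weigh per-request latency deadlines and per-batch cost, which this sketch ignores.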

by letta-ai · Repository · data
Also installable via the skills CLI:
npx skills add letta-ai/skills/data/llm-inference-batching-scheduler

Source

Path: data/llm-inference-batching-scheduler (main)
