llm-inference-batching-scheduler

Guidance for optimizing LLM inference request batching and scheduling. This skill applies when designing batch schedulers that minimize cost while meeting latency and padding constraints.
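To make the problem concrete, here is a minimal sketch of one common approach: a greedy batcher that sorts pending requests by prompt length (so co-batched requests have similar sizes and waste little padding) and cuts a batch when the padded-token waste or batch size would exceed a cap. This is a hypothetical illustration, not the skill's actual implementation; the `Request` fields, `max_pad_frac` parameter, and deadline handling are assumptions for the sketch.

```python
from dataclasses import dataclass

@dataclass
class Request:
    id: int
    tokens: int      # prompt length in tokens
    deadline: float  # latest acceptable dispatch time

def schedule(requests, now, max_batch=8, max_pad_frac=0.5):
    """Greedy batcher: sort by length so co-batched requests are similar
    in size, then cut a batch when adding the next request would push
    the padded-waste fraction above max_pad_frac or exceed max_batch.
    Requests whose deadline has already passed dispatch alone."""
    late = [r for r in requests if r.deadline < now]
    ready = sorted((r for r in requests if r.deadline >= now),
                   key=lambda r: r.tokens)
    batches = [[r] for r in late]  # late requests go out immediately, solo
    cur = []
    for r in ready:
        cand = cur + [r]
        longest = cand[-1].tokens          # sorted, so last is longest
        padded = longest * len(cand)       # tokens after padding to longest
        waste = (padded - sum(q.tokens for q in cand)) / padded
        if len(cand) > max_batch or waste > max_pad_frac:
            batches.append(cur)            # close current batch
            cur = [r]
        else:
            cur = cand
    if cur:
        batches.append(cur)
    return batches
```

With a tight `max_pad_frac`, a long outlier request is split into its own batch rather than forcing every short request in the batch to be padded to its length; a real scheduler would also weigh queueing delay against the per-batch fixed cost.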

by letta-ai
Also installable via the skills CLI:
npx skills add letta-ai/skills/letta/benchmarks/trajectory-only/llm-inference-batching-scheduler

Source

Path: letta/benchmarks/trajectory-only/llm-inference-batching-scheduler (main)
