Accelerate LLM inference using speculative decoding, Medusa's multiple decoding heads, and lookahead decoding. Use when optimizing inference speed (1.5-3.6× speedup), reducing latency for real-time ap