inference
Fast inference with Unsloth and the vLLM backend. Covers model loading, fast_generate(), thinking-model output parsing, and memory management.
Also installable via the skills CLI:
npx skills add atrawog/bazzite-ai-plugins/bazzite-ai-jupyter/skills/inference
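
The description above mentions parsing the output of thinking models. As a minimal sketch of what that step might look like, the snippet below separates a reasoning block from the final answer. It assumes the model wraps its reasoning in `<think>...</think>` tags, a common convention but not one confirmed by this skill. The function name `split_thinking` is hypothetical and not part of Unsloth or vLLM.

```python
import re
from typing import Tuple

# Assumption: reasoning is delimited by <think>...</think> tags.
THINK_BLOCK = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_thinking(text: str) -> Tuple[str, str]:
    """Split generated text into (thinking, answer).

    Hypothetical helper for illustration only. If no closing tag is
    present, the whole text is returned as the answer.
    """
    match = THINK_BLOCK.search(text)
    if match is not None:
        # Everything after the first closed think block is the answer.
        return match.group(1).strip(), text[match.end():].strip()
    if "</think>" in text:
        # Some chat templates open the tag in the prompt, so only the
        # closing tag shows up in the generated text.
        thinking, _, answer = text.partition("</think>")
        return thinking.strip(), answer.strip()
    return "", text.strip()


if __name__ == "__main__":
    raw = "<think>\n2 + 2 is 4.\n</think>\n\nThe answer is 4."
    thinking, answer = split_thinking(raw)
    print("thinking:", thinking)
    print("answer:", answer)
```

In practice the input string would be the generated text returned by fast_generate(). With a vLLM-style result object that text is typically found at `outputs[0].outputs[0].text`, though this depends on the installed versions.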