text-generation-inference

Deploy LLMs with Hugging Face Text Generation Inference. Configure quantization, continuous batching, and tensor parallelism. Use for production LLM serving, high-throughput inference, and model deplo

by fgarofalo56 · Repository · other
Also installable via skills CLI
npx skills add fgarofalo56/Suppercharge_Microsoft_Fabric/.github/skills/text-generation-inference

Source

Path: .github/skills/text-generation-inference/SKILL.md (main)
