accelerate
Simplest distributed training API. 4 lines to add distributed support to any PyTorch script. Unified API for DeepSpeed/FSDP/Megatron/DDP. Automatic device placement, mixed precision (FP16/BF16/FP8). I
Also installable via the skills CLI:
npx skills add MesferAli/XCircle/.claude/skills/accelerate