sentencepiece
Language-independent tokenizer that treats text as a raw Unicode stream. Supports the BPE and Unigram algorithms. Fast (50k sentences/sec), lightweight (6MB memory), with a deterministic vocabulary. Used by T5, ALBERT, XLNe
Also installable via the skills CLI:
npx skills add ovachiever/droid-tings/skills/sentencepiece
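
To make the "raw Unicode" and BPE ideas in the description more concrete, here is a toy sketch in pure Python. It is not the sentencepiece library's code or API; the function names (`to_symbols`, `learn_bpe`, `apply_merge`, `encode`, `decode`) are hypothetical and exist only for this illustration. It assumes the input never contains the "▁" (U+2581) character, which is used here as a visible stand-in for a space so that decoding can restore the original text exactly.

```python
from collections import Counter

SPACE = "\u2581"  # visible marker that stands in for a space (assumed absent from input)


def to_symbols(text):
    """Split raw text into single-character symbols, keeping spaces as SPACE."""
    return list(text.replace(" ", SPACE))


def apply_merge(seq, pair):
    """Fuse every non-overlapping occurrence of `pair` in `seq`, left to right."""
    out, i = [], 0
    while i < len(seq):
        if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
            out.append(seq[i] + seq[i + 1])
            i += 2
        else:
            out.append(seq[i])
            i += 1
    return out


def learn_bpe(corpus, num_merges):
    """Learn up to `num_merges` merges by fusing the most frequent adjacent pair."""
    sequences = [to_symbols(line) for line in corpus]
    merges = []
    for _ in range(num_merges):
        pairs = Counter()
        for seq in sequences:
            pairs.update(zip(seq, seq[1:]))
        if not pairs:
            break
        # Break frequency ties by the pair itself so training is deterministic.
        best = min(pairs, key=lambda p: (-pairs[p], p))
        merges.append(best)
        sequences = [apply_merge(seq, best) for seq in sequences]
    return merges


def encode(text, merges):
    """Tokenize `text` by replaying the learned merges in order."""
    seq = to_symbols(text)
    for pair in merges:
        seq = apply_merge(seq, pair)
    return seq


def decode(pieces):
    """Concatenate pieces and turn SPACE markers back into spaces."""
    return "".join(pieces).replace(SPACE, " ")


if __name__ == "__main__":
    corpus = ["low lower lowest", "new newer newest"]
    merges = learn_bpe(corpus, num_merges=10)
    pieces = encode("lowest newer", merges)
    print(pieces)
    print(decode(pieces))
```

Because whitespace is kept as an ordinary symbol rather than used to pre-split words, the same sketch runs unchanged on text without spaces between words, and decoding is a plain concatenation. The real library layers more on top of this (Unicode normalization, the Unigram model, sampling), none of which is shown here.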