simpo

Simple Preference Optimization (SimPO) for LLM alignment. A reference-free alternative to DPO that reports better performance (up to +6.4 points on AlpacaEval 2.0). No reference model is needed, so it is more efficient than DPO. Use for

by ihatesea69 · Repository · other
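
To make the "reference-free" claim in the description concrete, here is a minimal sketch of the SimPO objective for a single preference pair. It is an illustration, not code taken from the skill. The function name `simpo_loss`, its argument names, and the default values of `beta` and `gamma` are assumptions chosen for this example. The reward is the length-normalized log-probability of a response under the policy itself, so no reference model appears anywhere in the computation.

```python
import math


def softplus(x: float) -> float:
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def simpo_loss(
    logp_chosen: float,    # summed token log-probs of the preferred response
    len_chosen: int,       # number of tokens in the preferred response
    logp_rejected: float,  # summed token log-probs of the dispreferred response
    len_rejected: int,     # number of tokens in the dispreferred response
    beta: float = 2.0,     # reward scale (illustrative default, an assumption)
    gamma: float = 1.0,    # target reward margin (illustrative default, an assumption)
) -> float:
    """SimPO loss for one preference pair.

    Implements -log(sigmoid(r_chosen - r_rejected - gamma)), where each
    reward is (beta / |y|) * log pi(y | x) under the policy being trained.
    """
    r_chosen = beta * logp_chosen / len_chosen
    r_rejected = beta * logp_rejected / len_rejected
    margin = r_chosen - r_rejected - gamma
    # -log(sigmoid(m)) equals softplus(-m)
    return softplus(-margin)


if __name__ == "__main__":
    # Hypothetical numbers: the preferred response has a higher
    # average log-probability per token than the dispreferred one.
    print(simpo_loss(-10.0, 10, -30.0, 10))
```

Compared with DPO, the two log-probabilities come from the policy alone rather than from a policy-to-reference ratio, which is where the memory and compute savings come from. In a real training loop the same expression would be computed on batched tensors and averaged.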

Also installable via skills CLI

npx skills add ihatesea69/HieuNghi-AI-Skills/airesearch_skills/06-post-training/simpo

Source

Repo: github.com/ihatesea69/HieuNghi-AI-Skills

Path: airesearch_skills/06-post-training/simpo/SKILL.md (main)
