attribution-patching
Gradient-based, first-order approximation to activation patching for scalable circuit analysis. Use it when activation patching, which needs a separate forward pass for every patched component, is too slow, or when you need to score many components at once.
Also installable via the skills CLI:
npx skills add ndif-team/skills/data/attribution-patching
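
To make the description above concrete, here is a minimal sketch of the idea on a toy model, written in plain Python so it runs without any dependencies. Everything in it (the weights `W` and `v`, the `hidden`, `metric`, and `metric_grad` helpers, and the inputs) is hypothetical and chosen for illustration; it is not the skill's implementation or any library's API. It assumes the common convention of patching clean activations into a corrupted run and taking the gradient on the corrupted run.

```python
import math

# Hypothetical toy model: hidden = tanh(W @ x), metric = v . hidden + (sum(hidden))**2
W = [[0.5, -0.3], [0.1, 0.8], [-0.6, 0.2]]
v = [1.0, -2.0, 0.5]


def hidden(x):
    """Hidden activations; each unit plays the role of one 'component'."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in W]


def metric(h):
    """Scalar metric computed from the hidden activations (nonlinear in h)."""
    return sum(vi * hi for vi, hi in zip(v, h)) + sum(h) ** 2


def metric_grad(h):
    """Analytic gradient: d(metric)/d(h_i) = v_i + 2 * sum(h)."""
    s = sum(h)
    return [vi + 2.0 * s for vi in v]


def activation_patching(h_corrupt, h_clean):
    """Exact effect per component: one metric re-evaluation per patched unit."""
    base = metric(h_corrupt)
    effects = []
    for i in range(len(h_corrupt)):
        patched = list(h_corrupt)
        patched[i] = h_clean[i]  # swap in the clean activation for unit i only
        effects.append(metric(patched) - base)
    return effects


def attribution_patching(h_corrupt, h_clean):
    """First-order estimate for all components from a single gradient."""
    grad = metric_grad(h_corrupt)
    return [(hc - h0) * g for hc, h0, g in zip(h_clean, h_corrupt, grad)]


h_clean = hidden([1.0, 0.5])      # activations on the clean input
h_corrupt = hidden([0.9, 0.6])    # activations on the corrupted input

exact = activation_patching(h_corrupt, h_clean)
approx = attribution_patching(h_corrupt, h_clean)

for i, (e, a) in enumerate(zip(exact, approx)):
    print(f"unit {i}: exact={e:+.6f} approx={a:+.6f} error={e - a:+.2e}")
```

In this toy metric the only nonlinearity is quadratic, so the gap between the two columns is exactly the squared activation difference for each unit. In a real network the error depends on how nonlinear the metric is between the two activations, so it can be worth spot-checking high-scoring components with exact activation patching.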