ai-multimodal

Analyze images, audio, and video with the Gemini API (better vision than Claude). Generate images with Imagen 4 and videos with Veo 3. Use for vision analysis, transcription, OCR, design extraction, and multimodal AI.

by JorgeZuloaga · Repository · other

Also installable via skills CLI

npx skills add JorgeZuloaga/audio-dev-mcps/mcp-rew/.opencode/skills/ai-multimodal

Source

Repo: github.com/JorgeZuloaga/audio-dev-mcps

Path: mcp-rew/.opencode/skills/ai-multimodal/SKILL.md (main)

Related in other

agent-memory-yamadashy-repomix

Use this skill when the user asks to save, remember, recall, or organize memories. Triggers on: 'remember this', 'save t...

by yamadashy

21,427

task-execution-engine

CLI tool for configuring and monitoring Claude Code

by davila7

18,218

qiuzhi

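Guides Claude to write job-hunting articles in the style of Er Ge (二哥), including company salary leaks, year-end bonus roundups, job-search guides, and advice on choosing between offers.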

by itwanger

16,619