volcengine-video-understanding
freestylefly/canghe-skills · updated Apr 8, 2026
Volcengine Video Understanding
Use ByteDance's Volcengine Ark video understanding API (models such as doubao-seed-2-0-pro-260215) for in-depth video understanding and analysis.
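Recommended approach: upload with the Files API, then analyze with the Responses API
- Supports video files up to 512MB
- Automatic video preprocessing (FPS sampling)
- Uploaded files can be reused (stored for 7 days)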
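Features
- Video upload: upload local videos through the Files API (recommended, up to 512MB)
- Content understanding: analyze scenes, people, actions, and emotions in the video
- Video Q&A: answer user questions based on the video content
- Video description: automatically generate video descriptions and summaries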
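Prerequisites
The ARK_API_KEY environment variable must be set.
Configuration file (recommended)
- Copy the configuration template:
cp .canghe-skills/.env.example .canghe-skills/.env
- Edit the .canghe-skills/.env file and fill in your API key:
ARK_API_KEY=your-actual-api-key-here
Or use an environment variable
export ARK_API_KEY="your-api-key"
Loading priority
- System environment variables (process.env)
- .canghe-skills/.env in the current directory
- ~/.canghe-skills/.env in the user's home directory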
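The loading priority above can be sketched in Python. This is a minimal illustration, not the skill's actual loader: the helper names `parse_env_file` and `resolve_api_key` are hypothetical, and the simple KEY=VALUE parsing is an assumption about the .env format.

```python
import os
from pathlib import Path


def parse_env_file(path):
    """Parse simple KEY=VALUE lines from a .env file; return {} if it is missing."""
    values = {}
    path = Path(path)
    if not path.is_file():
        return values
    for raw in path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        # Skip blank lines, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def resolve_api_key(environ=None, cwd=None, home=None):
    """Return ARK_API_KEY from the first source that defines it, else None.

    Order (hypothetical helper mirroring the documented priority):
    environment variable, then ./.canghe-skills/.env, then ~/.canghe-skills/.env.
    """
    environ = os.environ if environ is None else environ
    cwd = Path.cwd() if cwd is None else Path(cwd)
    home = Path.home() if home is None else Path(home)
    if environ.get("ARK_API_KEY"):
        return environ["ARK_API_KEY"]
    candidates = (
        cwd / ".canghe-skills" / ".env",
        home / ".canghe-skills" / ".env",
    )
    for env_path in candidates:
        key = parse_env_file(env_path).get("ARK_API_KEY")
        if key:
            return key
    return None
```

Calling `resolve_api_key()` with no arguments checks the real environment and directories; the optional parameters exist only to make the lookup easy to test.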
Usage
1. Basic video analysis (Files API, recommended)
cd ~/.openclaw/workspace/skills/volcengine-video-understanding
python3 scripts/video_understand.py /path/to/video.mp4 "Describe the content of this video"
2. Video Q&A
python3 scripts/video_understand.py /path/to/video.mp4 "Which people appear in the video?"
3. Emotion analysis
python3 scripts/video_understand.py /path/to/video.mp4 "Analyze how the emotions of the people in the video change"
4. Specify the model and frame rate
python3 scripts/video_understand.py /path/to/video.mp4 "Summarize the key points of the video" \
--model doubao-seed-2-0-pro-260215 \
--fps 2
5. Save the result to a file
python3 scripts/video_understand.py /path/to/video.mp4 "Describe the video" --output result.json
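Parameters
| Parameter | Default | Description |
|---|---|---|
| video_path | required | Path to the video file |
| instruction | required | Analysis instruction or question |
| --model | doubao-seed-2-0-pro-260215 | Model ID |
| --fps | 1 | Video sampling frame rate (preprocessing) |
| --output | - | Output file path for the result |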
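The parameter table maps naturally onto an `argparse` definition. The sketch below is an assumption about how such a command line could be declared, not the actual contents of scripts/video_understand.py; the function name `build_parser` and the choice of `float` for `--fps` are illustrative.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented parameters (illustrative only)."""
    parser = argparse.ArgumentParser(
        description="Analyze a video with a Volcengine Ark model"
    )
    parser.add_argument("video_path", help="Path to the video file")
    parser.add_argument("instruction", help="Analysis instruction or question")
    parser.add_argument("--model", default="doubao-seed-2-0-pro-260215", help="Model ID")
    # float is assumed here so that fractional rates such as 0.5 are accepted.
    parser.add_argument("--fps", type=float, default=1.0,
                        help="Video sampling frame rate (preprocessing)")
    parser.add_argument("--output", default=None,
                        help="Output file path for the result")
    return parser


args = build_parser().parse_args(
    ["clip.mp4", "Summarize the key points of the video", "--fps", "2"]
)
print(args.model, args.fps, args.output)
```

Parsing an explicit argument list, as in the last lines, is a convenient way to check the defaults without running the script from a shell.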
Supported models
- doubao-seed-2-0-pro-260215 (default)
- doubao-seed-2-0-lite-250728
- doubao-seed-1-6-251015
- Other Seed-series video understanding models
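Analysis examples
Example 1: Video content description
python3 scripts/video_understand.py ~/Desktop/video.mp4 "Describe the content of this video in detail, including scenes, people, and actions"
Example 2: Video summary
python3 scripts/video_understand.py ~/Desktop/video.mp4 "Summarize the key points of this video in 3 sentences"
Example 3: Action recognition
python3 scripts/video_understand.py ~/Desktop/video.mp4 "What actions are the people in the video performing? Describe them in chronological order"
Example 4: Scene analysis
python3 scripts/video_understand.py ~/Desktop/video.mp4 "Analyze the scene changes and environmental features in the video"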
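Technical details
Call flow
- Upload the video: upload the local video file through the Files API, specifying the FPS preprocessing configuration
- Wait for processing: wait until video preprocessing finishes (status changes to processed)
- Create the task: call the Responses API to run video understanding
- Get the result: return the analysis result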
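The "wait for processing" step in the flow above amounts to polling the file status until it reads processed. A minimal sketch follows, assuming the caller supplies a `get_status` callable that returns the current status string; the helper name `wait_until_processed`, the timeout values, and the failure state names are hypothetical and not taken from the Ark API.

```python
import time


def wait_until_processed(get_status, timeout=120.0, interval=2.0,
                         sleep=time.sleep, clock=time.monotonic):
    """Poll get_status() until it returns "processed" or the timeout elapses.

    get_status is any zero-argument callable returning the file's status string,
    for example a function that queries the Files API for one uploaded file.
    """
    deadline = clock() + timeout
    while True:
        status = get_status()
        if status == "processed":
            return status
        # "failed" and "error" are assumed failure states, not documented ones.
        if status in ("failed", "error"):
            raise RuntimeError(f"video preprocessing ended with status {status!r}")
        if clock() >= deadline:
            raise TimeoutError(f"still {status!r} after {timeout} seconds")
        sleep(interval)


# Usage with a stand-in status source instead of a real API call.
statuses = iter(["processing", "processing", "processed"])
print(wait_until_processed(lambda: next(statuses), interval=0))
```

Injecting `sleep` and `clock` keeps the loop testable without real delays; in practice the defaults are enough.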
API format
Files API upload:
curl https://ark.cn-beijing.volces.com/api/v3/files \
-H "Authorization: Bearer $ARK_API_KEY" \
-F 'purpose=user_data' \
-F 'file=@/path/to/video.mp4' \
-F 'preprocess_configs[video][fps]=1'
Responses API analysis:
{
  "model": "doubao-seed-2-0-pro-260215",
  "input": [
    {
      "role": "user",
      "content": [
        {
          "type": "input_video",
          "file_id": "file-xxxx"
        },
        {
          "type": "input_text",
          "text": "user instruction"
        }
      ]
    }
  ]
}
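The request body above can be assembled programmatically. The sketch below only builds and prints the JSON; it does not send it, and the function name `build_responses_payload` is a hypothetical helper rather than part of the skill or the Ark SDK.

```python
import json


def build_responses_payload(file_id, instruction,
                            model="doubao-seed-2-0-pro-260215"):
    """Build a Responses API body that pairs an uploaded video with a text instruction."""
    return {
        "model": model,
        "input": [
            {
                "role": "user",
                "content": [
                    {"type": "input_video", "file_id": file_id},
                    {"type": "input_text", "text": instruction},
                ],
            }
        ],
    }


payload = build_responses_payload("file-xxxx", "Describe the content of this video")
# ensure_ascii=False keeps non-ASCII instructions readable in the printed JSON.
print(json.dumps(payload, indent=2, ensure_ascii=False))
```

Because the uploaded file is referenced by `file_id`, the same video can be paired with different instructions without uploading it again while it remains stored.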
FPS recommendations
| FPS | Suitable for |
|---|---|
| 0.3-0.5 | Slow-paced videos, static scenes, saving tokens |
| 1 | General video analysis (default) |
| 2-3 | Fast action, detailed analysis |
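To see why a lower FPS saves tokens, it can help to estimate how many frames are sampled from a clip. The calculation below is a rough sketch that assumes the sampled frame count is simply duration times FPS; the service's real sampling and token accounting may differ, and `estimated_frames` is a hypothetical helper.

```python
import math


def estimated_frames(duration_seconds, fps):
    """Rough count of frames sampled from a clip at the given FPS (assumption: duration * fps)."""
    if duration_seconds <= 0 or fps <= 0:
        raise ValueError("duration_seconds and fps must be positive")
    return max(1, math.ceil(duration_seconds * fps))


# Compare the recommended settings for a 120-second clip.
for fps in (0.5, 1, 3):
    print(fps, estimated_frames(120, fps))
```

Under this assumption a 120-second clip yields 60, 120, and 360 frames at 0.5, 1, and 3 FPS respectively, so moving from 3 FPS to 0.5 FPS cuts the sampled frames by a factor of six.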
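Limitations
- Video formats: MP4 (recommended), MOV, AVI
- File size: up to 512MB (Files API approach)
- Storage time: uploaded files are stored for 7 days by default
- Processing time: usually 10-60 seconds, depending on video length and complexity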
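The format and size limits above can be checked locally before uploading. This is a minimal sketch, not part of the skill: `check_video` is a hypothetical helper, the format check looks only at the file extension, and 512MB is interpreted as 512 * 1024 * 1024 bytes, which is an assumption.

```python
from pathlib import Path

MAX_BYTES = 512 * 1024 * 1024          # assumed interpretation of the 512MB limit
ALLOWED_SUFFIXES = {".mp4", ".mov", ".avi"}


def check_video(path):
    """Return a list of problems with the file; an empty list means it passes."""
    path = Path(path)
    if not path.is_file():
        return [f"file not found: {path}"]
    problems = []
    if path.suffix.lower() not in ALLOWED_SUFFIXES:
        problems.append(f"unsupported format: {path.suffix or '(no extension)'}")
    size = path.stat().st_size
    if size > MAX_BYTES:
        problems.append(f"file too large: {size} bytes (limit {MAX_BYTES})")
    return problems
```

Running `check_video("/path/to/video.mp4")` before the upload step turns the most common upload failures into immediate, readable messages.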
Python API usage
from scripts.video_understand import analyze_video

result = analyze_video(
    file_path="/path/to/video.mp4",
    instruction="Describe the video content",
    model="doubao-seed-2-0-pro-260215",
    fps=1
)

# Extract the answer: take the output_text entry from each message item
text = ""
for item in result.get("output", []):
    if item.get("type") == "message":
        for content in item.get("content", []):
            if content.get("type") == "output_text":
                text = content.get("text", "")
                break
print(text)
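Error handling
Common errors and solutions:
| Error | Cause | Solution |
|---|---|---|
| Invalid API key | Key not set or incorrect | Check the ARK_API_KEY environment variable |
| File not found | Wrong path | Check the file path |
| Upload failed | File too large or unsupported format | Check the file size (<512MB) and format |
| Processing timeout | Video too long or complex | Shorten the video or lower the FPS |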
How to use volcengine-video-understanding on Cursor
AI-first code editor with Composer
Prerequisites
Before installing skills in Cursor, ensure your development environment meets these requirements:
- Cursor installed and configured on your development machine
- Node.js version 16.0+ with npm package manager (verify with node --version)
- Active project directory or workspace where you want to add volcengine-video-understanding
Execute installation command
Execute the skills CLI command in your project's root directory to begin installation:
The skills CLI fetches volcengine-video-understanding from GitHub repository freestylefly/canghe-skills and configures it for Cursor.
Select Cursor when prompted
The CLI will show a list of available agents. Use arrow keys to navigate and space to select Cursor:
Verify installation
Confirm successful installation by checking the skill directory location:
Reload or restart Cursor to activate volcengine-video-understanding. Access the skill through slash commands (e.g., /volcengine-video-understanding) or your agent's skill management interface.
Security & Verification Notice
We perform automated surface-level scans (Gen AI Scanner, Socket, Snyk) during installation. These checks detect common vulnerabilities but do not guarantee complete security. Always review skill source code and verify the publisher's reputation before production use.
Skills execute code in your development environment. Always verify the publisher's identity, review recent commits, and test in isolated environments before production deployment.