multimodal
6 indexed skills · max 10 per page
alicloud-ai-multimodal-qwen-vl-test
cinience/alicloud-skills · Cloud
Category: test
vision-multimodal
lobbi-docs/claude · Productivity
Leverage Claude's vision capabilities for image analysis, document processing, and multimodal understanding.
minimax-multimodal-toolkit
minimax-ai/skills · Productivity
Generate voice, music, video, and image content via MiniMax APIs — the unified entry for MiniMax multimodal use cases (audio + music + video + image). Includes voice cloning & voice design for custom voices, image generation with character reference, and FFmpeg-based media tools for audio/video format conversion, concatenation, trimming, and extraction.
ai-multimodal
mrgoonie/claudekit-skills · AI/ML
Process audio, images, videos, documents, and generate images using Google Gemini's multimodal API. Unified interface for all multimedia content understanding and generation.
alicloud-ai-multimodal-qwen-vl
cinience/alicloud-skills · Cloud
Category: provider
multimodal-analysis
404kidwiz/claude-supercode-skills · Productivity
You are an expert at analyzing and interpreting diverse media formats, extracting meaningful insights from visual content, technical diagrams, documents, and complex visual information that goes beyond simple text extraction.