videoagent
3 indexed skills · max 10 per page
videoagent-audio-studio
pexoai/pexo-skills · Video
Unified audio generation dispatcher routing TTS, music, sound effects, and voice cloning to optimal models.

Routes requests to ElevenLabs (TTS, voice cloning, SFX) or fal.ai (music) based on request type, with latencies ranging from <1s to ~15s
Supports five audio capabilities: multilingual text-to-speech with voice selection, low-latency turbo TTS, background music composition, sound effect generation (up to 22 seconds), and voice cloning from audio samples
Requires only ELEVEN
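The routing described in this entry can be pictured as a small lookup table. The sketch below is an illustration only, not the skill's actual code: the request-type keys, the `route_audio_request` function, and the `MAX_SFX_SECONDS` constant are hypothetical names, and placing turbo TTS on ElevenLabs is an assumption drawn from the listing's statement that TTS goes there.

```python
# Hypothetical routing table reconstructed from the listing text only.
# The keys are illustrative names, not identifiers used by the skill.
ROUTES = {
    "tts": "elevenlabs",
    "tts_turbo": "elevenlabs",   # assumption: turbo TTS is routed like regular TTS
    "voice_clone": "elevenlabs",
    "sfx": "elevenlabs",
    "music": "fal.ai",
}

# The listing states sound effects run up to 22 seconds.
MAX_SFX_SECONDS = 22


def route_audio_request(kind, duration_seconds=None):
    """Return the provider name for a request type, per the table above."""
    if kind not in ROUTES:
        raise ValueError(f"unknown audio request type: {kind!r}")
    if (
        kind == "sfx"
        and duration_seconds is not None
        and duration_seconds > MAX_SFX_SECONDS
    ):
        raise ValueError(f"sound effects are limited to {MAX_SFX_SECONDS} seconds")
    return ROUTES[kind]


print(route_audio_request("music"))    # fal.ai
print(route_audio_request("sfx", 10))  # elevenlabs
```

A plain dictionary keeps the mapping easy to audit; how the real dispatcher selects a model is not stated in the listing.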
videoagent-image-studio
pexoai/pexo-skills · Video
Unified access to 8 AI image generation models with automatic model selection and zero API key setup.

Supports Midjourney, Flux (Pro/Dev/Schnell), Ideogram, Recraft, SDXL, and Nano Banana with automatic model routing based on user intent
Handles Midjourney's async polling transparently; all models return consistent output format with image URLs
Includes Midjourney actions (upscale, variation, reroll) and reference image support for style consistency
All requests routed through hos
videoagent-video-studio
pexoai/pexo-skills · Video
Generate short AI videos from text or images using 7 backend models with zero API key setup.

Supports three generation modes: text-to-video, image-to-video, and reference-based generation for consistent output
Seven models available (minimax, kling, veo, hunyuan, grok, seedance, pixverse) with automatic selection or manual override via --model flag
Configurable duration (4–12 seconds), aspect ratios (16:9, 9:16, 1:1, 4:3, 3:4), and automatic prompt enhancement for better results
S
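The limits listed for this entry (seven models, 4 to 12 seconds, five aspect ratios) can be checked before a request is sent. The following is a minimal sketch under stated assumptions: `validate_video_request` and the `"auto"` placeholder for automatic selection are hypothetical and are not part of the skill's documented interface; only the model names, duration range, and aspect ratios come from the listing.

```python
# Values taken from the listing; the function and "auto" marker are hypothetical.
VIDEO_MODELS = ("minimax", "kling", "veo", "hunyuan", "grok", "seedance", "pixverse")
ASPECT_RATIOS = ("16:9", "9:16", "1:1", "4:3", "3:4")
MIN_DURATION_S, MAX_DURATION_S = 4, 12


def validate_video_request(duration, aspect_ratio, model=None):
    """Check a request against the listed limits and return normalized settings."""
    if not MIN_DURATION_S <= duration <= MAX_DURATION_S:
        raise ValueError(
            f"duration must be {MIN_DURATION_S}-{MAX_DURATION_S} seconds, got {duration}"
        )
    if aspect_ratio not in ASPECT_RATIOS:
        raise ValueError(f"unsupported aspect ratio: {aspect_ratio!r}")
    if model is not None and model not in VIDEO_MODELS:
        raise ValueError(f"unknown model: {model!r}")
    # model=None stands in for the listing's "automatic selection".
    return {
        "model": model or "auto",
        "duration": duration,
        "aspect_ratio": aspect_ratio,
    }


print(validate_video_request(8, "16:9"))
# {'model': 'auto', 'duration': 8, 'aspect_ratio': '16:9'}
print(validate_video_request(4, "9:16", model="kling"))
# {'model': 'kling', 'duration': 4, 'aspect_ratio': '9:16'}
```

Validating locally avoids spending a generation call on a request the backend would reject; whether the skill itself performs such checks is not stated in the listing.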