tag

tts

16 indexed skills · max 10 per page

skills (16)

alicloud-ai-audio-tts

cinience/alicloud-skills · Cloud

0

Category: provider

alicloud-ai-audio-tts-voice-clone

cinience/alicloud-skills · Cloud

0

Voice cloning and text-to-speech synthesis using Alibaba Cloud Qwen TTS VC models.
- Supports two model variants: standard batch processing (qwen3-tts-vc-2026-01-22) and real-time streaming (qwen3-tts-vc-realtime-2026-01-15)
- Accepts voice samples as file paths or raw bytes; generates cloned voice IDs for reuse across multiple synthesis requests
- Normalized interface handles text input, voice enrollment, optional streaming output, and returns audio URLs or PCM chunks
- Requires DASH

tts

marswaveai/skills · Productivity

0

Convert text to natural-sounding speech with single or multi-speaker audio generation.
- Two modes: Quick mode for instant single-voice MP3 output, and Script mode for multi-speaker dialogue with per-character voice assignment
- Automatic mode detection based on input structure; supports both plain text and structured scripts with character markers
- Built-in speaker selection with language support (Chinese and English) and preference saving to local config
- Configurable output modes: in

speak-tts

emzod/speak · Productivity

0

Real-time text-to-speech with voice cloning on Apple Silicon, entirely on-device.
- Supports multiple input sources (text files, markdown, stdin, web articles, PDFs) and output modes (streaming, file save, playback, or both)
- Voice cloning from 10–30 second WAV samples at 24000 Hz mono; includes emotion tags like [laugh], [sigh], and [gasp] for audible effects
- Batch processing with auto-chunking for long documents, concatenation utilities, and resume capability for interrupted generat

speakturbo-tts

emzod/speak-turbo · Productivity

0

Ultra-fast text-to-speech with ~90ms latency and 8 built-in voices.
- Delivers audio in approximately 90ms after daemon warmup, with first run taking 2-5 seconds for model initialization
- Includes 8 pre-configured voices (alba, marius, javert, jean, fantine, cosette, eponine, azelma) accessible via simple command-line flags
- Supports file output with configurable directory allowlisting, quiet mode, and UTF-8 text input including long-form content
- Auto-starting daemon with 1-hour idle

tts

noizai/skills · Productivity

0

Text-to-speech with dual backends, voice cloning, and timeline-accurate audio synthesis for dubbing and video narration.
- Supports two backends: Kokoro (local, offline) for simple speech synthesis, and Noiz (cloud) for voice cloning, emotion control, and precise segment timing
- Simple mode converts text, files, or URLs to audio with optional voice cloning from reference audio; timeline mode aligns speech to SRT subtitles with per-segment voice and emotion control
- Voice maps enable gran

page 2 / 2