audio
21 indexed skills · max 10 per page
alicloud-ai-audio-tts-realtime-test
cinience/alicloud-skills · Cloud
Category: test
alicloud-ai-audio-tts-voice-design
cinience/alicloud-skills · Cloud
Category: provider
alicloud-ai-audio-asr
cinience/alicloud-skills · Cloud
Category: provider
audiocraft-audio-generation
davila7/claude-code-templates · Productivity
Comprehensive guide to using Meta's AudioCraft for text-to-music and text-to-audio generation with MusicGen, AudioGen, and EnCodec.
alicloud-ai-audio-tts
cinience/alicloud-skills · Cloud
Category: provider
alicloud-ai-audio-tts-voice-clone
cinience/alicloud-skills · Cloud
Voice cloning and text-to-speech synthesis using Alibaba Cloud Qwen TTS VC models. Supports two model variants: standard batch processing (qwen3-tts-vc-2026-01-22) and real-time streaming (qwen3-tts-vc-realtime-2026-01-15). Accepts voice samples as file paths or raw bytes; generates cloned voice IDs for reuse across multiple synthesis requests. Normalized interface handles text input, voice enrollment, optional streaming output, and returns audio URLs or PCM chunks. Requires DASH
musickit-audio
dpearson2699/swift-ios-skills · Productivity
Search the Apple Music catalog, manage playback with ApplicationMusicPlayer, check subscriptions, and publish Now Playing metadata via MPNowPlayingInfoCenter and MPRemoteCommandCenter. Targets Swift 6.2 / iOS 26+.
audio-transcriber
sickn33/antigravity-awesome-skills · Productivity
Transcribe audio files to structured Markdown with intelligent meeting minutes and executive summaries. Supports MP3, WAV, M4A, OGG, FLAC, WEBM formats with automatic format detection and conversion via ffmpeg. Auto-detects and uses Faster-Whisper (4-5x faster) or OpenAI Whisper with zero configuration; offers one-click dependency installation. Extracts rich metadata (speakers, timestamps, language, duration, file size) and generates structured meeting minutes with topics, decisions, a
dialogue-audio
inferen-sh/skills · Productivity
Realistic multi-speaker dialogue audio generation with Dia TTS via inference.sh CLI. Supports two-speaker conversations with automatic voice assignment using [S1] and [S2] speaker tags. Emotion and pacing controlled through punctuation (., !, ?, ..., —) and parenthetical sound cues like (laughs), (sighs), and (whispers). Includes structured patterns for interviews, tutorials, debates, and conversational content with practical script-writing guidelines. Post-production support
web-audio-api
martinholovsky/claude-skills-generator · Backend
This skill provides Web Audio API expertise for creating audio feedback, voice processing, and sound effects in the JARVIS AI Assistant.