qwen-voice
ada20204/qwen-voice · updated Apr 8, 2026
Qwen Voice (ASR + TTS)
Use the bundled scripts. Configure DASHSCOPE_API_KEY in one of:
- ~/.config/qwen-voice/.env (recommended)
- <repo>/.qwen-voice/.env (dev/testing)
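As a rough illustration of how a key could be resolved from those two locations, here is a minimal sketch. The lookup order (process environment first, then the user-level file, then the repo-level file) and the helper names `parse_env_file` and `find_api_key` are assumptions for illustration; the bundled scripts may resolve the key differently.

```python
import os
from pathlib import Path
from typing import Dict, Optional


def parse_env_file(path: Path) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values: Dict[str, str] = {}
    for raw in path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def find_api_key(repo_root: Path, home: Optional[Path] = None) -> Optional[str]:
    """Return DASHSCOPE_API_KEY from the first place that defines it.

    Assumed order (not confirmed by the scripts): environment variable,
    then ~/.config/qwen-voice/.env, then <repo>/.qwen-voice/.env.
    """
    key = os.environ.get("DASHSCOPE_API_KEY")
    if key:
        return key
    home = home or Path.home()
    candidates = [
        home / ".config" / "qwen-voice" / ".env",
        repo_root / ".qwen-voice" / ".env",
    ]
    for candidate in candidates:
        if candidate.is_file():
            key = parse_env_file(candidate).get("DASHSCOPE_API_KEY")
            if key:
                return key
    return None
```

A `.env` file in this sketch is a plain text file with one line of the form `DASHSCOPE_API_KEY=<your key>`.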
ASR (speech → text)
Non-timestamp (default)
python3 skills/qwen-voice/scripts/qwen_asr.py --in /path/to/audio.ogg
With timestamps (chunk-based)
python3 skills/qwen-voice/scripts/qwen_asr.py --in /path/to/audio.ogg --timestamps --chunk-sec 3
Notes:
- Timestamps are generated by fixed-length chunking (not word-level alignment).
- Input audio is converted to mono 16kHz WAV before sending.
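To make the two notes above concrete, the sketch below shows what fixed-length chunking and a mono 16 kHz conversion might look like. The function names are hypothetical, and the use of ffmpeg for the conversion is an assumption; the source does not say which tool qwen_asr.py uses internally.

```python
from typing import List, Tuple


def chunk_windows(duration_sec: float, chunk_sec: float) -> List[Tuple[float, float]]:
    """Split [0, duration_sec) into consecutive fixed-length windows.

    Each window becomes one timestamped segment, so the timestamps mark
    chunk boundaries rather than word boundaries.
    """
    if chunk_sec <= 0:
        raise ValueError("chunk_sec must be positive")
    windows: List[Tuple[float, float]] = []
    start = 0.0
    while start < duration_sec:
        end = min(start + chunk_sec, duration_sec)  # last chunk may be shorter
        windows.append((start, end))
        start = end
    return windows


def ffmpeg_to_wav_args(src: str, dst: str) -> List[str]:
    """Build an ffmpeg command that writes mono (-ac 1), 16 kHz (-ar 16000) audio.

    Assumption: ffmpeg is the converter; the real script may use another tool.
    """
    return ["ffmpeg", "-y", "-i", src, "-ac", "1", "-ar", "16000", dst]
```

With `--chunk-sec 3`, a 7.5-second clip would yield three windows under this scheme: 0 to 3, 3 to 6, and 6 to 7.5 seconds. A word that straddles a boundary is cut at the boundary, which is why these timestamps are coarse.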
TTS (text → speech)
Preset voice (default: Cherry)
python3 skills/qwen-voice/scripts/qwen_tts.py --text '你好,我是 Pi。' --voice Cherry --out /tmp/out.ogg
Clone voice (create once, reuse)
- Create a voice profile from a sample audio:
python3 skills/qwen-voice/scripts/qwen_voice_clone.py --in ./voice_sample.ogg --name george --out work/qwen-voice/george.voice.json
- Use the cloned voice to synthesize:
python3 skills/qwen-voice/scripts/qwen_tts.py --text '你好,我是 George。' --voice-profile work/qwen-voice/george.voice.json --out /tmp/out.ogg
Notes:
- .ogg output is Opus, suitable for Telegram voice messages.
- Voice cloning uses DashScope customization endpoint + Qwen realtime TTS model.
- Scripts use a local venv at work/venv-dashscope (auto-created on first run).
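For readers curious about the "auto-created on first run" behaviour, a create-if-missing check can be sketched with the standard library `venv` module. This is an illustration only: the helper names are hypothetical, and the bundled scripts may create and populate work/venv-dashscope in a different way.

```python
import sys
import venv
from pathlib import Path


def venv_python(venv_dir: Path) -> Path:
    """Return the interpreter path inside a virtual environment."""
    if sys.platform == "win32":
        return venv_dir / "Scripts" / "python.exe"
    return venv_dir / "bin" / "python"


def ensure_venv(venv_dir: Path, with_pip: bool = True) -> Path:
    """Create the venv only if its interpreter is missing, then return it."""
    python = venv_python(venv_dir)
    if not python.exists():
        # First run: build the environment (pip included unless disabled).
        venv.create(venv_dir, with_pip=with_pip)
    return python
```

Calling `ensure_venv(Path("work/venv-dashscope"))` before each run makes the first invocation slower and later ones a no-op.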
Typical chat workflow
- When the user sends a voice message or audio file: run ASR and reply with the transcribed text.
- When the user explicitly asks for a voice reply: run TTS and send the generated .ogg as a voice note.
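The two rules above can be sketched as a small planner that builds the script invocations. The script paths and flags are taken from the commands earlier in this document; the function names and the `audio_path` / `wants_voice_reply` inputs are hypothetical stand-ins for whatever the chat integration provides.

```python
from typing import List, Optional

ASR_SCRIPT = "skills/qwen-voice/scripts/qwen_asr.py"
TTS_SCRIPT = "skills/qwen-voice/scripts/qwen_tts.py"


def asr_command(audio_path: str) -> List[str]:
    """Command for transcribing an incoming voice message (no timestamps)."""
    return ["python3", ASR_SCRIPT, "--in", audio_path]


def tts_command(text: str, out_path: str, voice: str = "Cherry") -> List[str]:
    """Command for synthesizing a reply with a preset voice."""
    return ["python3", TTS_SCRIPT, "--text", text, "--voice", voice, "--out", out_path]


def plan_commands(
    audio_path: Optional[str],
    wants_voice_reply: bool,
    reply_text: str,
    out_path: str = "/tmp/out.ogg",
) -> List[List[str]]:
    """Return the commands to run for one incoming message.

    ASR runs only when audio was received; TTS runs only when the user
    explicitly asked for a voice reply.
    """
    commands: List[List[str]] = []
    if audio_path:
        commands.append(asr_command(audio_path))
    if wants_voice_reply:
        commands.append(tts_command(reply_text, out_path))
    return commands
```

Each returned list can be passed to `subprocess.run(cmd, check=True)`. Building the command as a list rather than a shell string avoids quoting problems with user-supplied text.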
Ratings
4.5 ★★★★★ · 27 reviews
- ★★★★★Ganesh Mohane· Dec 12, 2024
qwen-voice fits our agent workflows well — practical, well scoped, and easy to wire into existing repos.
- ★★★★★Kofi Agarwal· Dec 4, 2024
We added qwen-voice from the explainx registry; install was straightforward and the SKILL.md answered most questions upfront.
- ★★★★★Yash Thakker· Nov 27, 2024
qwen-voice has been reliable in day-to-day use. Documentation quality is above average for community skills.
- ★★★★★Tariq Brown· Nov 19, 2024
qwen-voice reduced setup friction for our internal harness; good balance of opinion and flexibility.
- ★★★★★Kofi Chen· Nov 11, 2024
qwen-voice is among the better-maintained entries we tried; worth keeping pinned for repeat workflows.
- ★★★★★Sakshi Patil· Nov 3, 2024
Registry listing for qwen-voice matched our evaluation — installs cleanly and behaves as described in the markdown.
- ★★★★★Chaitanya Patil· Oct 22, 2024
qwen-voice reduced setup friction for our internal harness; good balance of opinion and flexibility.
- ★★★★★Dhruvi Jain· Oct 18, 2024
Solid pick for teams standardizing on skills: qwen-voice is focused, and the summary matches what you get after install.
- ★★★★★Zaid Reddy· Oct 10, 2024
Registry listing for qwen-voice matched our evaluation — installs cleanly and behaves as described in the markdown.
- ★★★★★Kofi Thompson· Oct 2, 2024
Keeps context tight: qwen-voice is the kind of skill you can hand to a new teammate without a long onboarding doc.