whisper-transcription
guia-matthieu/clawfu-skills · updated Apr 8, 2026
Whisper Transcription
Transcribe any audio or video to text using OpenAI's Whisper model - the same technology powering ChatGPT voice features.
When to Use This Skill
- Podcast repurposing - Convert episodes to blog posts, show notes, social snippets
- Video subtitles - Generate SRT/VTT files for YouTube, social media
- Interview extraction - Pull quotes and insights from recorded calls
- Content audit - Make audio/video libraries searchable
- Translation - Transcribe and translate foreign-language content
What Claude Does vs What You Decide
| Claude Does | You Decide |
|---|---|
| Structures production workflow | Final creative direction |
| Suggests technical approaches | Equipment and tool choices |
| Creates templates and checklists | Quality standards |
| Identifies best practices | Brand/voice decisions |
| Generates script outlines | Final script approval |
Dependencies
pip install openai-whisper torch ffmpeg-python click
# Also requires ffmpeg installed on system
# macOS: brew install ffmpeg
# Ubuntu: sudo apt install ffmpeg
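Before running the commands, it can help to confirm these prerequisites are in place. The following is a minimal sketch and not part of the skill's scripts: `check_environment` is a hypothetical helper that looks for the `ffmpeg` binary on `PATH` and checks that each Python package is importable. Note that `openai-whisper` is imported as `whisper` and `ffmpeg-python` as `ffmpeg`.

```python
import importlib.util
import shutil


def check_environment(modules=("whisper", "torch", "ffmpeg", "click")):
    """Report which prerequisites are available (hypothetical helper)."""
    # The ffmpeg binary must be on PATH; Whisper shells out to it to decode audio.
    report = {"ffmpeg_binary": shutil.which("ffmpeg") is not None}
    for name in modules:
        # find_spec returns None when a top-level module cannot be located.
        report[name] = importlib.util.find_spec(name) is not None
    return report
```

Calling `check_environment()` returns a dictionary of booleans, so any entry that is `False` points at the install step to repeat.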
Commands
Transcribe Single File
python scripts/main.py transcribe audio.mp3 --model medium --output transcript.txt
python scripts/main.py transcribe video.mp4 --format srt --output subtitles.srt
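The contents of `scripts/main.py` are not shown here, so the following is only a sketch of what a single-file transcription might look like with the `openai-whisper` Python API. `whisper.load_model` and `model.transcribe` are real library calls; `default_output_path` and `transcribe_file` are hypothetical names, and the rule of swapping the file extension is an assumption based on the output names in the examples.

```python
from pathlib import Path


def default_output_path(input_path, fmt="txt"):
    """Swap the input's extension for the output format (assumed naming rule)."""
    return Path(input_path).with_suffix(f".{fmt}")


def transcribe_file(input_path, model_name="medium"):
    """Transcribe one file with the openai-whisper Python API and return the text."""
    import whisper  # imported lazily so the path helper works without the package

    model = whisper.load_model(model_name)      # downloads weights on first use
    result = model.transcribe(str(input_path))  # ffmpeg decodes audio or video input
    return result["text"].strip()
```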
Batch Transcription
python scripts/main.py batch ./recordings/ --format txt --output ./transcripts/
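A batch run amounts to pairing every media file in the input folder with an output path. The sketch below shows one way to plan that work. `plan_batch` and the `MEDIA_EXTENSIONS` set are assumptions for illustration; the actual script may accept other extensions or recurse into subfolders.

```python
from pathlib import Path

# Assumed set of extensions; the actual script may accept others.
MEDIA_EXTENSIONS = {".mp3", ".wav", ".m4a", ".mp4", ".mov"}


def plan_batch(input_dir, output_dir, fmt="txt"):
    """Pair each media file in input_dir with an output path in output_dir."""
    input_dir, output_dir = Path(input_dir), Path(output_dir)
    jobs = []
    for src in sorted(input_dir.iterdir()):
        # Skip subfolders and files that are not recognized media types.
        if src.is_file() and src.suffix.lower() in MEDIA_EXTENSIONS:
            jobs.append((src, output_dir / f"{src.stem}.{fmt}"))
    return jobs
```

Planning the jobs first makes it easy to print a dry-run list before spending time on transcription.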
Transcribe + Translate
python scripts/main.py translate foreign-audio.mp3 --to en
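Whisper's built-in translation task produces English output only, which matches `--to en`; any other target language would need a separate translation step. As a sketch under that constraint, the hypothetical helper `translation_options` builds the keyword arguments that `model.transcribe` accepts (`task` and `language` are real options), and `translate_file` is an assumed wrapper rather than the skill's own code.

```python
def translation_options(target="en", source_language=None):
    """Build keyword arguments for model.transcribe() that request translation."""
    if target != "en":
        # Whisper's translate task only outputs English.
        raise ValueError(f"Whisper can only translate into English, not {target!r}")
    options = {"task": "translate"}
    if source_language is not None:
        options["language"] = source_language  # skip auto-detection if known
    return options


def translate_file(input_path, model_name="medium", source_language=None):
    """Return an English translation of the speech in input_path."""
    import whisper  # lazy import; requires openai-whisper to be installed

    model = whisper.load_model(model_name)
    options = translation_options("en", source_language)
    result = model.transcribe(str(input_path), **options)
    return result["text"].strip()
```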
Extract Timestamps
python scripts/main.py timestamps podcast.mp3 --format json
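The result returned by `model.transcribe` includes a `segments` list whose entries carry `start`, `end`, and `text` fields. The sketch below shows one plausible way to turn those into JSON. The output layout and the name `segments_to_json` are assumptions, since the script's actual JSON schema is not documented here.

```python
import json


def segments_to_json(segments):
    """Serialize start/end/text for each segment (assumed output layout)."""
    rows = [
        {
            "start": round(seg["start"], 2),  # seconds from the start of the file
            "end": round(seg["end"], 2),
            "text": seg["text"].strip(),
        }
        for seg in segments
    ]
    return json.dumps(rows, indent=2, ensure_ascii=False)
```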
Examples
Example 1: Podcast to Blog Post
# Transcribe 1-hour podcast
python scripts/main.py transcribe episode-42.mp3 --model medium
# Output: episode-42.txt (full transcript with timestamps)
# Processing time: ~5 min for 1 hour audio on M1 Mac
Example 2: YouTube Subtitles
# Generate SRT for video upload
python scripts/main.py transcribe marketing-video.mp4 --format srt
# Output: marketing-video.srt
# Upload directly to YouTube/Vimeo
Example 3: Batch Process Interview Library
# Transcribe all recordings in folder
python scripts/main.py batch ./customer-interviews/ --model small --format txt
# Output: ./customer-interviews/*.txt (one per audio file)
Model Selection Guide
| Model | Speed | Accuracy | VRAM | Best For |
|---|---|---|---|---|
| tiny | Fastest | ~70% | 1GB | Quick drafts, short clips |
| base | Fast | ~80% | 1GB | Social media clips |
| small | Medium | ~85% | 2GB | Podcasts, interviews |
| medium | Slow | ~90% | 5GB | Professional transcripts |
| large | Slowest | ~95% | 10GB | Critical accuracy needs |
Recommendation: Start with small for most marketing content. Use medium for client deliverables.
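If the choice is driven by available GPU memory, the VRAM column above can be turned into a simple lookup. This is a sketch only: `largest_model_that_fits` is a hypothetical helper, and the figures are the approximate values from the table, not measured requirements.

```python
# (model name, approximate VRAM in GB), smallest to largest, from the table above.
MODEL_VRAM_GB = [("tiny", 1), ("base", 1), ("small", 2), ("medium", 5), ("large", 10)]


def largest_model_that_fits(available_vram_gb):
    """Return the largest model whose listed VRAM fits, or None if none does."""
    choice = None
    for name, needed in MODEL_VRAM_GB:
        if needed <= available_vram_gb:
            choice = name  # later entries are larger, so keep overwriting
    return choice
```

For example, a GPU with 8 GB of memory would map to `medium`, which lines up with the recommendation for client deliverables.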
Output Formats
| Format | Extension | Use Case |
|---|---|---|
| txt | .txt | Blog posts, analysis |
| srt | .srt | Video subtitles (YouTube) |
| vtt | .vtt | Web video subtitles |
| json | .json | Programmatic access |
| tsv | .tsv | Spreadsheet analysis |
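The two subtitle formats differ mainly in their header and timestamp punctuation: SRT numbers each cue and writes `HH:MM:SS,mmm`, while WebVTT starts with a `WEBVTT` line and writes `HH:MM:SS.mmm`. The sketch below renders both from a list of segments with `start`, `end`, and `text` fields. The function names are illustrative and not taken from the skill's scripts.

```python
def format_timestamp(seconds, decimal_mark):
    """Format seconds as HH:MM:SS<mark>mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}{decimal_mark}{ms:03d}"


def to_srt(segments):
    """Render numbered SRT cues; SRT uses a comma before the milliseconds."""
    cues = []
    for index, seg in enumerate(segments, start=1):
        start = format_timestamp(seg["start"], ",")
        end = format_timestamp(seg["end"], ",")
        cues.append(f"{index}\n{start} --> {end}\n{seg['text'].strip()}\n")
    return "\n".join(cues)


def to_vtt(segments):
    """Render WebVTT cues; VTT starts with a WEBVTT header and uses a period."""
    cues = ["WEBVTT\n"]
    for seg in segments:
        start = format_timestamp(seg["start"], ".")
        end = format_timestamp(seg["end"], ".")
        cues.append(f"{start} --> {end}\n{seg['text'].strip()}\n")
    return "\n".join(cues)
```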
Performance Tips
- GPU acceleration - Roughly 10x faster on a CUDA GPU than on CPU
- Audio extraction - Script auto-extracts audio from video
- Chunking - Long files auto-split for memory efficiency
- Language detection - Automatic, or specify with --language
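Two of the tips above can be sketched in a few lines. How the script actually selects a device or splits long files is not documented here, so both helpers are hypothetical: `pick_device` relies on the real `torch.cuda.is_available()` call, and `chunk_spans` uses an assumed default window of 600 seconds.

```python
def pick_device():
    """Return "cuda" when PyTorch sees a CUDA GPU, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


def chunk_spans(duration_s, chunk_s=600):
    """Split [0, duration_s) into consecutive (start, end) windows in seconds."""
    if chunk_s <= 0:
        raise ValueError("chunk_s must be positive")
    spans = []
    start = 0
    while start < duration_s:
        end = min(start + chunk_s, duration_s)  # the last window may be shorter
        spans.append((start, end))
        start = end
    return spans
```

The device string can be passed straight to the library, for example `whisper.load_model("small", device=pick_device())`.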
Skill Boundaries
What This Skill Does Well
- Structuring audio production workflows
- Providing technical guidance
- Creating quality checklists
- Suggesting creative approaches
What This Skill Cannot Do
- Replace audio engineering expertise
- Make subjective creative decisions
- Access or edit audio files directly
- Guarantee commercial success
Related Skills
- video-processing - Extract audio from video
- youtube-downloader - Download videos to transcribe
- content-repurposer - Transform transcripts to content
- podcast-production - Create podcasts
Skill Metadata
- Mode: cyborg
- category: automation
- subcategory: audio-processing
- dependencies: [openai-whisper, torch, ffmpeg-python]
- difficulty: beginner
- time_saved: 10+ hours/week