whisper-transcription

guia-matthieu/clawfu-skills · updated Apr 8, 2026

$ npx skills add https://github.com/guia-matthieu/clawfu-skills --skill whisper-transcription

Transcribe any audio or video to text using OpenAI's Whisper model - the same technology powering ChatGPT voice features.

Whisper Transcription

When to Use This Skill

  • Podcast repurposing - Convert episodes to blog posts, show notes, social snippets
  • Video subtitles - Generate SRT/VTT files for YouTube, social media
  • Interview extraction - Pull quotes and insights from recorded calls
  • Content audit - Make audio/video libraries searchable
  • Translation - Transcribe and translate foreign language content

What Claude Does vs What You Decide

| Claude Does                      | You Decide                 |
|----------------------------------|----------------------------|
| Structures production workflow   | Final creative direction   |
| Suggests technical approaches    | Equipment and tool choices |
| Creates templates and checklists | Quality standards          |
| Identifies best practices        | Brand/voice decisions      |
| Generates script outlines        | Final script approval      |

Dependencies

pip install openai-whisper torch ffmpeg-python click
# Also requires ffmpeg installed on system
# macOS: brew install ffmpeg
# Ubuntu: sudo apt install ffmpeg

Commands

Transcribe Single File

python scripts/main.py transcribe audio.mp3 --model medium --output transcript.txt
python scripts/main.py transcribe video.mp4 --format srt --output subtitles.srt
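
For readers curious what a `transcribe` command like this might do internally, here is a minimal sketch built on the `openai-whisper` Python package. The contents of `scripts/main.py` are not shown in this skill, so `default_output_path` and `transcribe_file` are hypothetical names; only `whisper.load_model` and `model.transcribe` come from the package itself.

```python
from pathlib import Path


def default_output_path(input_path: str, fmt: str = "txt") -> Path:
    """Hypothetical helper: swap the media extension for the output format."""
    return Path(input_path).with_suffix(f".{fmt}")


def transcribe_file(input_path: str, model_name: str = "small") -> str:
    """Transcribe one audio or video file and return the plain-text transcript."""
    # Imported lazily so the path helper works without the package installed.
    import whisper  # requires `pip install openai-whisper` and system ffmpeg

    model = whisper.load_model(model_name)  # downloads weights on first use
    result = model.transcribe(input_path)   # returns a dict with "text" and "segments"
    return result["text"].strip()


# Example use (assumes audio.mp3 exists in the working directory):
#   text = transcribe_file("audio.mp3", "medium")
#   default_output_path("audio.mp3").write_text(text, encoding="utf-8")
```

The lazy import is a design choice for illustration: it keeps the file-naming logic usable and testable on machines where the model is not installed.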

Batch Transcription

python scripts/main.py batch ./recordings/ --format txt --output ./transcripts/

Transcribe + Translate

python scripts/main.py translate foreign-audio.mp3 --to en
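
Whisper's built-in translation task produces English output only, so `--to en` is the case the model handles natively. How the script treats other target languages is not documented here. The sketch below shows one plausible mapping from the `--to` value onto Whisper's `task` argument; `whisper_task_for` and `translate_file` are hypothetical names.

```python
def whisper_task_for(target_lang: str) -> str:
    """Map a hypothetical --to value onto a Whisper task name."""
    if target_lang.lower() in ("en", "english"):
        return "translate"
    raise ValueError(
        f"Whisper's built-in translation targets English only, not {target_lang!r}"
    )


def translate_file(input_path: str, target_lang: str = "en",
                   model_name: str = "medium") -> str:
    """Transcribe foreign-language audio and return an English translation."""
    task = whisper_task_for(target_lang)  # fail early, before loading the model
    import whisper  # lazy import; requires `pip install openai-whisper`

    model = whisper.load_model(model_name)
    result = model.transcribe(input_path, task=task)
    return result["text"].strip()
```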

Extract Timestamps

python scripts/main.py timestamps podcast.mp3 --format json
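
Whisper returns a list of segments, each a dict with `start`, `end`, and `text` keys. A JSON timestamp export could be as simple as the sketch below. The exact schema the script writes is not documented in this skill, so the field layout and the name `segments_to_json` are assumptions.

```python
import json


def segments_to_json(segments) -> str:
    """Serialize Whisper-style segments to a JSON array (assumed schema)."""
    rows = [
        {
            "start": round(seg["start"], 2),  # seconds from start of audio
            "end": round(seg["end"], 2),
            "text": seg["text"].strip(),
        }
        for seg in segments
    ]
    return json.dumps(rows, ensure_ascii=False, indent=2)


# Example with hand-written segments in the shape Whisper produces:
sample = [
    {"start": 0.0, "end": 4.2, "text": " Welcome back to the show."},
    {"start": 4.2, "end": 9.75, "text": " Today we talk about pricing."},
]
print(segments_to_json(sample))
```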

Examples

Example 1: Podcast to Blog Post

# Transcribe 1-hour podcast
python scripts/main.py transcribe episode-42.mp3 --model medium

# Output: episode-42.txt (full transcript with timestamps)
# Processing time: ~5 min for 1 hour audio on M1 Mac

Example 2: YouTube Subtitles

# Generate SRT for video upload
python scripts/main.py transcribe marketing-video.mp4 --format srt

# Output: marketing-video.srt
# Upload directly to YouTube/Vimeo

Example 3: Batch Process Interview Library

# Transcribe all recordings in folder
python scripts/main.py batch ./customer-interviews/ --model small --format txt

# Output: ./customer-interviews/*.txt (one per audio file)
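
The batch step amounts to pairing each media file with an output path. A minimal sketch of that planning logic follows. The list of recognized extensions and the name `plan_batch` are assumptions, since the script's own file discovery is not shown.

```python
from pathlib import Path

# Assumed set of media extensions; the real script may accept more or fewer.
MEDIA_EXTENSIONS = {".mp3", ".wav", ".m4a", ".mp4", ".mov"}


def plan_batch(paths, fmt="txt", output_dir=None):
    """Return (input, output) path pairs for every media file in `paths`."""
    jobs = []
    for path in sorted(Path(p) for p in paths):
        if path.suffix.lower() not in MEDIA_EXTENSIONS:
            continue  # skip notes, images, and other non-media files
        target_dir = Path(output_dir) if output_dir else path.parent
        jobs.append((path, target_dir / f"{path.stem}.{fmt}"))
    return jobs


# Example use against a real folder:
#   jobs = plan_batch(Path("./customer-interviews").iterdir(), fmt="txt")
```

When no output directory is given, each transcript lands next to its source file, which matches the output location shown in Example 3.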

Model Selection Guide

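| Model  | Speed   | Accuracy | VRAM  | Best For                  |
|--------|---------|----------|-------|---------------------------|
| tiny   | Fastest | ~70%     | 1 GB  | Quick drafts, short clips |
| base   | Fast    | ~80%     | 1 GB  | Social media clips        |
| small  | Medium  | ~85%     | 2 GB  | Podcasts, interviews      |
| medium | Slow    | ~90%     | 5 GB  | Professional transcripts  |
| large  | Slowest | ~95%     | 10 GB | Critical accuracy needs   |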

Recommendation: Start with small for most marketing content. Use medium for client deliverables.
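
If the deciding factor is GPU memory rather than content type, the VRAM column of the table above can drive the choice. The sketch below picks the largest model that fits. The function name is hypothetical and the figures are the approximate ones from the table, so leave some headroom in practice.

```python
# (model name, approximate VRAM in GB) from the table above, smallest first.
MODELS = [("tiny", 1), ("base", 1), ("small", 2), ("medium", 5), ("large", 10)]


def largest_model_that_fits(vram_gb: float) -> str:
    """Return the largest Whisper model whose VRAM estimate fits the budget."""
    fitting = [name for name, need_gb in MODELS if need_gb <= vram_gb]
    if not fitting:
        raise ValueError(f"No model fits in {vram_gb} GB of VRAM")
    return fitting[-1]  # list is ordered smallest to largest


print(largest_model_that_fits(8))
```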

Output Formats

| Format | Extension | Use Case                  |
|--------|-----------|---------------------------|
| txt    | .txt      | Blog posts, analysis      |
| srt    | .srt      | Video subtitles (YouTube) |
| vtt    | .vtt      | Web video subtitles       |
| json   | .json     | Programmatic access       |
| tsv    | .tsv      | Spreadsheet analysis      |
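
To make the subtitle formats concrete, here is a minimal sketch that turns Whisper-style segments (dicts with `start`, `end`, and `text`) into SRT text. The function names are illustrative and this is not necessarily how the script's own writer is implemented.

```python
def srt_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp: HH:MM:SS,mmm."""
    ms = round(seconds * 1000)
    hours, ms = divmod(ms, 3_600_000)
    minutes, ms = divmod(ms, 60_000)
    secs, ms = divmod(ms, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments) -> str:
    """Render segments as numbered SRT cues separated by blank lines."""
    cues = []
    for index, seg in enumerate(segments, start=1):
        cues.append(
            f"{index}\n"
            f"{srt_timestamp(seg['start'])} --> {srt_timestamp(seg['end'])}\n"
            f"{seg['text'].strip()}\n"
        )
    return "\n".join(cues)


print(segments_to_srt([{"start": 0.0, "end": 2.5, "text": " Hello and welcome."}]))
```

VTT output differs mainly in starting with a `WEBVTT` header line and using a period instead of a comma before the milliseconds.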

Performance Tips

  1. GPU acceleration - roughly 10x faster on a CUDA GPU than on CPU
  2. Audio extraction - Script auto-extracts audio from video
  3. Chunking - Long files auto-split for memory efficiency
  4. Language detection - Automatic, or specify with --language
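
The chunking tip can be pictured as cutting a long recording into overlapping time windows that are transcribed one at a time. How the script actually splits files is not documented, so the 10-minute window, the 5-second overlap, and the name `chunk_windows` are all assumptions made for illustration.

```python
def chunk_windows(duration_s: float, chunk_s: float = 600.0, overlap_s: float = 5.0):
    """Return (start, end) windows in seconds covering the whole recording."""
    if chunk_s <= overlap_s:
        raise ValueError("chunk_s must be larger than overlap_s")
    windows = []
    start = 0.0
    while start < duration_s:
        end = min(start + chunk_s, duration_s)
        windows.append((start, end))
        if end >= duration_s:
            break
        start = end - overlap_s  # back up slightly so no words are cut off
    return windows


# A 25-minute file splits into three windows:
print(chunk_windows(1500))
```

The small overlap is there so a word that straddles a boundary appears whole in at least one window; merging the overlapping text afterwards is left out of this sketch.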

Skill Boundaries

What This Skill Does Well

  • Structuring audio production workflows
  • Providing technical guidance
  • Creating quality checklists
  • Suggesting creative approaches

What This Skill Cannot Do

  • Replace audio engineering expertise
  • Make subjective creative decisions
  • Access or edit audio files directly
  • Guarantee commercial success

Skill Metadata

  • Mode: cyborg
  • category: automation
  • subcategory: audio-processing
  • dependencies: [openai-whisper, torch, ffmpeg-python]
  • difficulty: beginner
  • time_saved: 10+ hours/week

Ratings

4.7 (71 reviews)
  • Benjamin Smith· Dec 28, 2024

    whisper-transcription fits our agent workflows well — practical, well scoped, and easy to wire into existing repos.

  • Alexander Kapoor· Dec 28, 2024

    We added whisper-transcription from the explainx registry; install was straightforward and the SKILL.md answered most questions upfront.

  • Kabir Dixit· Dec 24, 2024

    whisper-transcription reduced setup friction for our internal harness; good balance of opinion and flexibility.

  • Charlotte Nasser· Dec 24, 2024

    whisper-transcription is among the better-maintained entries we tried; worth keeping pinned for repeat workflows.

  • Dhruvi Jain· Dec 20, 2024

    Registry listing for whisper-transcription matched our evaluation — installs cleanly and behaves as described in the markdown.

  • Diya White· Dec 16, 2024

    Keeps context tight: whisper-transcription is the kind of skill you can hand to a new teammate without a long onboarding doc.

  • Henry Haddad· Dec 12, 2024

    whisper-transcription has been reliable in day-to-day use. Documentation quality is above average for community skills.

  • Hana Kim· Dec 12, 2024

    whisper-transcription reduced setup friction for our internal harness; good balance of opinion and flexibility.

  • Charlotte Khanna· Dec 8, 2024

    whisper-transcription fits our agent workflows well — practical, well scoped, and easy to wire into existing repos.

  • Dev Khanna· Nov 19, 2024

    whisper-transcription reduced setup friction for our internal harness; good balance of opinion and flexibility.
