baoyu-youtube-transcript
jimliu/baoyu-skills · updated Apr 8, 2026
YouTube Transcript
Downloads transcripts (subtitles/captions) from YouTube videos. Works with both manually created and auto-generated transcripts. No API key or browser required — uses YouTube's InnerTube API directly and automatically falls back to yt-dlp when YouTube blocks the direct API path.
Fetches video metadata and the cover image on first run, and caches the raw data for fast re-formatting.
Script Directory
Scripts live in the `scripts/` subdirectory. `{baseDir}` is the directory that contains this SKILL.md. Resolve the `${BUN_X}` runtime as follows: if `bun` is installed, use `bun`; otherwise, if `npx` is available, use `npx -y bun`; otherwise suggest installing bun. Replace `{baseDir}` and `${BUN_X}` with the actual values before running any command.
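The resolution order above can be sketched as a small helper. This is an illustration only: `resolveBunX` and `fillCommand` are hypothetical names, and the two booleans stand in for whatever check the agent actually uses to detect `bun` and `npx`.

```typescript
// Hypothetical helper: picks the command prefix to substitute for ${BUN_X}.
// hasBun / hasNpx stand in for real availability checks (e.g. `command -v bun`).
function resolveBunX(hasBun: boolean, hasNpx: boolean): string | null {
  if (hasBun) return "bun";        // bun is installed: use it directly
  if (hasNpx) return "npx -y bun"; // no bun, but npx can fetch it on demand
  return null;                     // neither: the caller should suggest installing bun
}

// Hypothetical helper: substitutes both placeholders in a command template.
// Note the double quotes: "${BUN_X}" is a literal string here, not a template.
function fillCommand(template: string, baseDir: string, bunX: string): string {
  return template.split("${BUN_X}").join(bunX).split("{baseDir}").join(baseDir);
}
```

For example, `fillCommand("${BUN_X} {baseDir}/scripts/main.ts --list", "/skills/yt", "bun")` yields `bun /skills/yt/scripts/main.ts --list`, where `/skills/yt` is a made-up path.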
| Script | Purpose |
|---|---|
| `scripts/main.ts` | Transcript download CLI |
Usage
# Default: markdown with timestamps (English)
${BUN_X} {baseDir}/scripts/main.ts <youtube-url-or-id>
# Specify languages (priority order)
${BUN_X} {baseDir}/scripts/main.ts <url> --languages zh,en,ja
# Without timestamps
${BUN_X} {baseDir}/scripts/main.ts <url> --no-timestamps
# With chapter segmentation
${BUN_X} {baseDir}/scripts/main.ts <url> --chapters
# With speaker identification (requires AI post-processing)
${BUN_X} {baseDir}/scripts/main.ts <url> --speakers
# SRT subtitle file
${BUN_X} {baseDir}/scripts/main.ts <url> --format srt
# Translate transcript
${BUN_X} {baseDir}/scripts/main.ts <url> --translate zh-Hans
# List available transcripts
${BUN_X} {baseDir}/scripts/main.ts <url> --list
# Force re-fetch (ignore cache)
${BUN_X} {baseDir}/scripts/main.ts <url> --refresh
Options
| Option | Description | Default |
|---|---|---|
| `<url-or-id>` | YouTube URL or video ID (multiple allowed) | Required |
| `--languages <codes>` | Language codes, comma-separated, in priority order | `en` |
| `--format <fmt>` | Output format: `text`, `srt` | `text` |
| `--translate <code>` | Translate to specified language code | |
| `--list` | List available transcripts instead of fetching | |
| `--timestamps` | Include `[HH:MM:SS → HH:MM:SS]` timestamps per paragraph | on |
| `--no-timestamps` | Disable timestamps | |
| `--chapters` | Chapter segmentation from video description | |
| `--speakers` | Raw transcript with metadata for speaker identification | |
| `--exclude-generated` | Skip auto-generated transcripts | |
| `--exclude-manually-created` | Skip manually created transcripts | |
| `--refresh` | Force re-fetch, ignore cached data | |
| `-o, --output <path>` | Save to specific file path | auto-generated |
| `--output-dir <dir>` | Base output directory | `youtube-transcript` |
Optional Environment Variables
| Variable | Description |
|---|---|
| `YOUTUBE_TRANSCRIPT_COOKIES_FROM_BROWSER` | Passed to `yt-dlp --cookies-from-browser` during fallback, e.g. `chrome`, `safari`, `firefox`, or `chrome:Profile 1` |
Input Formats
Accepts any of these as video input:
- Full URL: `https://www.youtube.com/watch?v=dQw4w9WgXcQ`
- Short URL: `https://youtu.be/dQw4w9WgXcQ`
- Embed URL: `https://www.youtube.com/embed/dQw4w9WgXcQ`
- Shorts URL: `https://www.youtube.com/shorts/dQw4w9WgXcQ`
- Video ID: `dQw4w9WgXcQ`
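A minimal sketch of how these five input forms could be normalized to a video ID. The function name `extractVideoId`, the regular expressions, and the assumption that IDs are 11 characters from `[A-Za-z0-9_-]` are illustrative choices based on the examples above; the script's actual parsing may differ.

```typescript
// Assumption: a video ID is 11 characters drawn from [A-Za-z0-9_-].
const VIDEO_ID = /^[A-Za-z0-9_-]{11}$/;

// Hypothetical helper: returns the video ID, or null if the input matches
// none of the documented forms.
function extractVideoId(input: string): string | null {
  const trimmed = input.trim();
  if (VIDEO_ID.test(trimmed)) return trimmed; // bare video ID

  const patterns: RegExp[] = [
    /[?&]v=([A-Za-z0-9_-]{11})(?:[&#]|$)/,                             // watch?v=ID
    /youtu\.be\/([A-Za-z0-9_-]{11})(?:[?&#/]|$)/,                      // youtu.be/ID
    /youtube\.com\/(?:embed|shorts)\/([A-Za-z0-9_-]{11})(?:[?&#/]|$)/, // embed/ID, shorts/ID
  ];
  for (const pattern of patterns) {
    const match = pattern.exec(trimmed);
    if (match) return match[1];
  }
  return null; // not a recognised form
}
```

Each of the five example inputs listed above resolves to `dQw4w9WgXcQ` with this sketch.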
Output Formats
| Format | Extension | Description |
|---|---|---|
| `text` | `.md` | Markdown with frontmatter (incl. description), title heading, summary, optional TOC/cover/timestamps/chapters/speakers |
| `srt` | `.srt` | SubRip subtitle format for video players |
Output Directory
youtube-transcript/
├── .index.json                      # Video ID → directory path mapping (for cache lookup)
└── {channel-slug}/{title-full-slug}/
    ├── meta.json                    # Video metadata (title, channel, description, duration, chapters, etc.)
    ├── transcript-raw.json          # Raw transcript snippets from YouTube API (cached)
    ├── transcript-sentences.json    # Sentence-segmented transcript (split by punctuation, merged across snippets)
    ├── imgs/
    │   └── cover.jpg                # Video thumbnail
    ├── transcript.md                # Markdown transcript (generated from sentences)
    └── transcript.srt               # SRT subtitle (generated from raw snippets, if --format srt)
- `{channel-slug}`: Channel name in kebab-case
- `{title-full-slug}`: Full video title in kebab-case
The --list mode outputs to stdout only (no file saved).
Caching
On first fetch, the script saves:
- `meta.json`: video metadata, chapters, cover image path, language info
- `transcript-raw.json`: raw transcript snippets from YouTube API (`{ text, start, duration }[]`)
- `transcript-sentences.json`: sentence-segmented transcript (`{ text, start: "HH:mm:ss", end: "HH:mm:ss" }[]`), split by sentence-ending punctuation (`.?!…。?!` etc.), timestamps proportionally allocated by character length, CJK-aware text merging
- `imgs/cover.jpg`: video thumbnail
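To illustrate what "timestamps proportionally allocated by character length" can mean, here is a simplified sketch. It assumes a single block of text with a known start and end in seconds, keeps times as numbers rather than `HH:mm:ss` strings, and ignores the CJK-aware merging step. `splitProportionally` and the exact punctuation set (including the fullwidth `?` and `!`) are assumptions, not the script's code.

```typescript
interface Sentence { text: string; start: number; end: number; } // times in seconds

// Hypothetical helper: split `text`, which spans [start, end], into sentences
// and give each one a share of the span proportional to its character count.
// Limitation of this sketch: a decimal such as "3.5" is also treated as a boundary.
function splitProportionally(text: string, start: number, end: number): Sentence[] {
  // A sentence is a run of non-terminators followed by any trailing terminators.
  const parts = text.match(/[^.?!…。?!]+[.?!…。?!]*/g) || [];
  const sentences = parts.map((p) => p.trim()).filter((p) => p.length > 0);
  const totalChars = sentences.reduce((sum, s) => sum + s.length, 0);

  const result: Sentence[] = [];
  let cursor = start;
  for (const s of sentences) {
    const span = ((end - start) * s.length) / totalChars; // this sentence's share
    result.push({ text: s, start: cursor, end: cursor + span });
    cursor += span;
  }
  return result;
}
```

With this sketch, `splitProportionally("Hi. Hello there!", 0, 10)` assigns 0 to 2 seconds to "Hi." (3 of 15 characters) and 2 to 10 seconds to "Hello there!" (12 of 15 characters).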
Subsequent runs for the same video use cached data (no network calls). Use --refresh to force re-fetch. If a different language is requested, the cache is automatically refreshed.
When YouTube returns anti-bot / blocked responses on the direct InnerTube path, the script retries with alternate client identities and then falls back to yt-dlp if available. If fallback is needed but yt-dlp is unavailable, the agent should decide how to make yt-dlp available and continue rather than pushing the installation decision to the user.
SRT output (--format srt) is generated from transcript-raw.json. Text/markdown output uses transcript-sentences.json for natural sentence boundaries.
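As a rough illustration of the raw-snippet-to-SRT step, the sketch below converts `{ text, start, duration }` entries into SubRip cues. It assumes `start` and `duration` are in seconds; `srtTime` and `toSrt` are hypothetical names and not taken from the script.

```typescript
interface Snippet { text: string; start: number; duration: number; } // assumed to be seconds

// Left-pad a non-negative integer with zeros to the given width.
function pad(value: number, width: number): string {
  let s = String(value);
  while (s.length < width) s = "0" + s;
  return s;
}

// Format seconds as an SRT timestamp: HH:MM:SS,mmm.
function srtTime(seconds: number): string {
  const totalMs = Math.round(seconds * 1000);
  const ms = totalMs % 1000;
  const totalSec = Math.floor(totalMs / 1000);
  const h = Math.floor(totalSec / 3600);
  const m = Math.floor((totalSec % 3600) / 60);
  const s = totalSec % 60;
  return pad(h, 2) + ":" + pad(m, 2) + ":" + pad(s, 2) + "," + pad(ms, 3);
}

// One numbered cue per snippet, cues separated by a blank line.
function toSrt(snippets: Snippet[]): string {
  return snippets
    .map((snip, i) =>
      [
        String(i + 1),
        srtTime(snip.start) + " --> " + srtTime(snip.start + snip.duration),
        snip.text,
      ].join("\n"))
    .join("\n\n") + "\n";
}
```

Working from the raw snippets keeps each cue aligned with the timing YouTube supplied, which is presumably why SRT output bypasses the sentence-merged file.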
Workflow
When the user provides a YouTube URL and wants the transcript:
- Run with `--list` first if the user hasn't specified a language, to show available options
- Always single-quote the URL when running the script. zsh treats `?` as a glob wildcard, so an unquoted YouTube URL causes "no matches found". Use `'https://www.youtube.com/watch?v=ID'`
- Default: run with `--chapters --speakers` for the richest output (chapters + speaker identification)
- The script auto-saves cached data + output file and prints the file path
- For `--speakers` mode: after the script saves the raw file, follow the speaker identification workflow below to post-process with speaker labels
When the user only wants the cover image or metadata, run the script with any option: every run also caches meta.json and imgs/cover.jpg.
When re-formatting the same video (e.g., first text then SRT), the cached data is reused — no re-fetch needed.
Chapter & Speaker Workflow
Chapters (--chapters)
The script parses chapter timestamps from the video description (e.g., 0:00 Introduction), segments the transcript by chapter boundaries, groups snippets into readable paragraphs, and saves as .md with a Table of Contents. No further processing needed.
If no chapter timestamps exist in the description, the transcript is output as grouped paragraphs without chapter headings.
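A minimal sketch of the chapter-parsing idea, assuming chapter lines look like `0:00 Introduction` or `1:02:03 Deep dive` at the start of a description line. `parseChapters` and `chapterIndexAt` are hypothetical names; the script's real matching rules are not documented here and may be stricter or looser.

```typescript
interface Chapter { title: string; start: number; } // start in seconds

// Hypothetical helper: collect "M:SS Title" / "H:MM:SS Title" lines from a description.
function parseChapters(description: string): Chapter[] {
  const chapters: Chapter[] = [];
  const linePattern = /^\s*(?:(\d{1,2}):)?(\d{1,2}):(\d{2})\s+(.+?)\s*$/;
  for (const line of description.split("\n")) {
    const m = linePattern.exec(line);
    if (!m) continue; // not a chapter line
    const hours = m[1] ? parseInt(m[1], 10) : 0;
    const minutes = parseInt(m[2], 10);
    const seconds = parseInt(m[3], 10);
    chapters.push({ title: m[4], start: hours * 3600 + minutes * 60 + seconds });
  }
  return chapters;
}

// Index of the chapter that contains time `t` (seconds), or -1 if `t`
// falls before the first chapter. Assumes chapters are in ascending order.
function chapterIndexAt(chapters: Chapter[], t: number): number {
  let index = -1;
  for (let i = 0; i < chapters.length; i++) {
    if (chapters[i].start <= t) index = i;
  }
  return index;
}
```

An empty result from `parseChapters` corresponds to the fallback described above: grouped paragraphs with no chapter headings.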
Speaker Identification (--speakers)
Speaker identification requires AI processing. The script outputs a raw .md file containing:
- YAML frontmatter with video metadata (title, channel, date, cover, description, language)
- Video description (for speaker name extraction)
- Chapter list from description (if available)
- Raw transcript in SRT format (pre-computed start/end timestamps, token-efficient)
After the script saves the raw file, spawn a sub-agent (use a cheaper model like Sonnet for cost efficiency) to process speaker identification:
- Read the saved `.md` file
- Read the prompt template at `{baseDir}/prompts/speaker-transcript.md`
- Process the raw transcript following the prompt:
  - Identify speakers using video metadata (title → guest, channel → host, description → names)
  - Detect speaker turns from conversation flow, question-answer patterns, and contextual cues
  - Segment into chapters (use description chapters if available, else create from topic shifts)
  - Format with `**Speaker Name:**` labels, paragraph grouping (2-4 sentences), and `[HH:MM:SS → HH:MM:SS]` timestamps
- Overwrite the `.md` file with the processed transcript (keep the YAML frontmatter)
When --speakers is used, --chapters is implied — the processed output always includes chapter segmentation.
Error Cases
| Error | Meaning |
|---|---|
| Transcripts disabled | Video has no captions at all |
| No transcript found | Requested language not available |
| Video unavailable | Video deleted, private, or region-locked |
| IP blocked | Too many requests, try again later |
| Age restricted | Video requires login for age verification |
| bot detected | The script retries with alternate clients and then yt-dlp. If the fallback tooling is missing, the agent should resolve that itself. If it still fails, try `YOUTUBE_TRANSCRIPT_COOKIES_FROM_BROWSER=safari` (or your browser) |