podcast▌
marswaveai/skills · updated Apr 8, 2026
Generate podcast episodes with 1-2 AI speakers discussing a topic. Supports quick overviews, deep analysis, and debate formats. Input can be a topic description, URL(s), or text. Output is a full audio episode with transcript.
When to Use
- User wants to create a podcast episode on any topic
- User provides a URL or text and wants it turned into a podcast discussion
- User asks for a "debate", "dialogue", or "discussion" format
- User says "podcast", "播客", or "录一期节目"
When NOT to Use
- User wants text-to-speech reading (use
/speech) - User wants an explainer video with visuals (use
/explainer) - User wants to generate an image (use
/image-gen) - User only wants to extract content from a URL without generating audio (use
/content-parser)
Purpose
Generate podcast episodes with 1-2 AI speakers discussing a topic. Supports quick overviews, deep analysis, and debate formats. Input can be a topic description, URL(s), or text. Output is a full audio episode with transcript.
Hard Constraints
- No shell scripts. Construct curl commands from the API reference files listed in Resources
- Always read
shared/authentication.mdfor API key and headers - Follow
shared/common-patterns.mdfor polling, errors, and interaction patterns - Never hardcode speaker IDs in API calls — use built-in defaults from
shared/speaker-selection.mdas fallback only; fetch from the speakers API when the user wants to change voice - Never fabricate API endpoints or parameters
- Always read config following
shared/config-pattern.mdbefore any interaction - Always follow
shared/speaker-selection.mdfor speaker selection (text table + free-text input) - Never save files to
~/Downloads/or.listenhub/— save artifacts to the current working directory with friendly topic-based names (seeshared/config-pattern.md§ Artifact Naming)
Step -1: API Key Check
Follow shared/config-pattern.md § API Key Check. If the key is missing, stop immediately.
Step 0: Config Setup
Follow shared/config-pattern.md Step 0 (Zero-Question Boot).
If file doesn't exist — silently create with defaults and proceed:
mkdir -p ".listenhub/podcast"
echo '{"outputMode":"inline","language":null,"defaultMode":"quick","defaultMethod":"one-step","defaultSpeakers":{}}' > ".listenhub/podcast/config.json"
CONFIG_PATH=".listenhub/podcast/config.json"
CONFIG=$(cat "$CONFIG_PATH")
Do NOT ask any setup questions. Proceed directly to the Interaction Flow.
If file exists — read config silently and proceed:
CONFIG_PATH=".listenhub/podcast/config.json"
[ ! -f "$CONFIG_PATH" ] && CONFIG_PATH="$HOME/.listenhub/podcast/config.json"
CONFIG=$(cat "$CONFIG_PATH")
Setup Flow (user-initiated reconfigure only)
Only run when the user explicitly asks to reconfigure. Display current settings:
当前配置 (podcast):
输出方式:{inline / download / both}
语言偏好:{zh / en / 未设置}
默认模式:{quick / deep / debate / 未设置}
默认生成方式:{one-step / two-step}
默认主播:{speakerName(s) / 使用内置默认}
Then ask these questions in order and save:
-
outputMode: Follow
shared/output-mode.md§ Setup Flow Question. -
Language (optional): "默认语言?"
- "中文 (zh)"
- "English (en)"
- "每次手动选择" → keep
null
-
Mode (optional): "默认播客模式?"
- "Quick — 简短概述"
- "Deep — 深度分析"
- "Debate — 辩论对话"
- "每次手动选择" → keep
null
-
Method (optional): "默认生成方式?"
- "一步生成(推荐)" →
defaultMethod: "one-step" - "两步生成(先预览文本)" →
defaultMethod: "two-step" - "每次手动选择" → keep
null
- "一步生成(推荐)" →
After collecting answers, save immediately:
NEW_CONFIG=$(echo "$CONFIG" | jq --arg m "$OUTPUT_MODE" '. + {"outputMode": $m}')
# Save language if user chose one (not "每次手动选择")
if [ "$LANGUAGE" != "null" ]; then
NEW_CONFIG=$(echo "$NEW_CONFIG" | jq --arg lang "$LANGUAGE" '. + {"language": $lang}')
fi
# Save mode if user chose one
if [ "$MODE" != "null" ]; then
NEW_CONFIG=$(echo "$NEW_CONFIG" | jq --arg mode "$MODE" '. + {"defaultMode": $mode}')
fi
# Save method if user chose one
if [ "$METHOD" != "null" ]; then
NEW_CONFIG=$(echo "$NEW_CONFIG" | jq --arg method "$METHOD" '. + {"defaultMethod": $method}')
fi
echo "$NEW_CONFIG" > "$CONFIG_PATH"
CONFIG=$(cat "$CONFIG_PATH")
Interaction Flow
Step 1: Topic + Reference Materials
Ask topic and optional reference materials together in a single question using AskUserQuestion with two sub-questions, or a single free-text prompt:
What topic would you like to turn into a podcast? If you have reference materials (URLs or text), include them here too.
Accept: topic description, URL(s), pasted text, or any combination.
Examples of valid input:
- "AI developments in 2026"
- "https://example.com/article — discuss this"
- "The pros and cons of remote work. Reference: https://study.com/remote-work-2026"
Step 2: Mode
Default: "quick" — skip this question unless:
config.defaultModeis set to something else → use that value silently- User explicitly mentioned a mode keyword in Step 1 (e.g. "deep dive", "debate", "in depth") → infer mode from intent
Only ask this question if the user's intent is ambiguous AND no default is configured. In most cases, just use "quick".
Step 3: Language
Default: match the user's interaction language. Detect from the language the user used in Step 1:
- If the user wrote in Chinese →
zh - If the user wrote in English →
en - If
config.languageis set → use that value
Never ask this question. Always infer silently. Show in the confirmation summary so the user can override if needed.
Step 4: Speaker Count
Default: 2 speakers (dialogue) — the most common and engaging format.
Skip this question. Debate mode requires 2 speakers. For quick/deep, default to 2 speakers as well.
Only use 1 speaker if the user explicitly requests a monologue or solo format.
Step 5: Speaker Selection
Follow shared/speaker-selection.md:
- If
config.defaultSpeakers.{language}is set → use saved speakers silently - If not set → use built-in defaults from
shared/speaker-selection.md(no question asked) - Show the speaker(s) in the confirmation summary — user can change from there if desired
- Only show the full speaker list if the user explicitly asks to change voices
For 2-speaker mode (dialogue/debate): use Primary + Secondary defaults for the language.
Step 6: Generation Method
Default: "one-step" — skip this question unless:
config.defaultMethodis set → use that value silently- User explicitly asks to review text first → use "two-step"
Step 7: Confirm & Generate
Summarize all choices:
Ready to generate podcast:
Topic: {topic}
Mode: {mode}
Language: {language}
Speakers: {speaker name(s)}
References: {yes/no + brief description}
Method: {one-step/two-step}
Proceed?
Wait for explicit confirmation before calling any API. The user can adjust any parameter here before confirming.
Workflow
One-Step Generation
-
Submit (foreground):
POST /podcast/episodeswith collected parameters → extractepisodeId -
Tell the user the task is submitted
-
Poll (background): Run the following exact bash command with
run_in_background: trueandtimeout: 600000. Do NOT use python3, awk, or any other JSON parser — usejqas shown:EPISODE_ID="<id-from-step-1>" for i in $(seq 1 30); do RESULT=$(curl -sS "https://api.marswave.ai/openapi/v1/podcast/episodes/$EPISODE_ID" \ -H "Authorization: Bearer $LISTENHUB_API_KEY" \ -H "X-Source: skills" 2>/dev/null) STATUS=$(echo "$RESULT" | tr -d '\000-\037\177' | jq -r '.data.processStatus // "pending"') case "$STATUS" in success|completed) echo "$RESULT"; exit 0 ;; failed|error) echo "FAILED: $RESULT" >&2; exit 1 ;; *) sleep 10 ;; esac done echo "TIMEOUT" >&2; exit 2 -
When notified of completion, Step 6: Present result
Read
OUTPUT_MODEfrom config. Followshared/output-mode.mdfor behavior.inlineorboth: DisplayaudioUrlas a clickable link.Present:
播客已生成! 在线收听:{audioUrl} 字幕:{subtitlesUrl}(如有) 时长:{audioDuration / 1000}s 消耗积分:{credits}downloadorboth: Also download the file. Generate a topic slug followingshared/config-pattern.md§ Artifact Naming.SLUG="{topic-slug}" # e.g. "ai-developments" NAME="${SLUG}-podcast.mp3" # Dedup: if file exists, append -2, -3, etc. BASE="${NAME%.*}"; EXT="${NAME##*.}"; i=2 while [ -e "$NAME" ]; do NAME="${BASE}-${i}.${EXT}"; i=$((i+1)); done curl -sS -o "$NAME" "{audioUrl}"Present:
已保存到当前目录: {NAME} -
Offer to show transcript or provide download URL on request
Two-Step Generation
- Step 1 — Submit text (foreground):
POST /podcast/episodes/text-content→ extractepisodeId - Poll text (background): Use the exact
jq-based polling loop above (substitute endpointpodcast/episodes/text-content/{episodeId}if needed), withrun_in_background: trueandtimeout: 600000 - When notified, save draft to a topic-based folder in cwd:
- Generate a topic slug following
shared/config-pattern.md§ Artifact Naming - Create
{slug}-podcast/folder (dedup if exists) - Write
draft.md(human-readable:**{speakerName}**: {content}per line) - Write
draft.json(rawscriptsarray) - Present the draft location and content preview
- Generate a topic slug following
- STOP: Present the draft and wait for explicit user approval
- Step 2 — Submit audio (foreground, after approval):
- No changes:
POST /podcast/episodes/{episodeId}/audiowith{} - With edits:
POST /podcast/episodes/{episodeId}/audiowith modified{scripts: [...]}
- No changes:
- Poll audio (background): Same exact
jq-based loop,run_in_background: true,timeout: 600000 - When notified, download audio to the same folder:
curl -sS -o {slug}-podcast/podcast.mp3 {audioUrl}- Present final result (same format as one-step, folder now has draft + final files)
After Successful Generation
Update config with the choices made this session:
NEW_CONFIG=$(echo "$CONFIG" | jq \
--arg lang "{language}" \
--arg mode "{mode}" \
--arg method "{one-step/two-step}" \
--argjson speakers '{"{language}": ["{speakerId}"]}' \
'. + {"language": $lang, "defaultMode": $mode, "defaultMethod": $method, "defaultSpeakers": (.defaultSpeakers + $speakers)}')
echo "$NEW_CONFIG" > "$CONFIG_PATH"
API Reference
- Speaker list:
shared/api-speakers.md - Speaker selection guide:
shared/speaker-selection.md - Episode creation:
shared/api-podcast.md - Polling:
shared/common-patterns.md§ Async Polling - Config pattern:
shared/config-pattern.md
Composability
- Invokes: speakers API (for speaker selection)
- Invoked by: content-planner (Phase 3)
Example
User: "Make a podcast about the latest AI developments"
Agent workflow:
- Detect: podcast request, topic = "latest AI developments", no references
- Infer: mode = "quick" (default), language = "en" (user wrote in English), 2 speakers (default), one-step (default)
- Show confirmation summary → user confirms
curl -sS -X POST "https://api.marswave.ai/openapi/v1/podcast/episodes" \
-H "Authorization: Bearer $LISTENHUB_API_KEY" \
-H "Content-Type: application/json" \
-H "X-Source: skills" \
-d '{
"sources": [{"type": "text", "content": "The latest AI developments"}],
"speakers": [{"speakerId": "cozy-man-english"}],
"language": "en",
"mode": "deep"
}'
Poll until complete, then present the result with title and listen link.
Ratings
4.5★★★★★61 reviews- ★★★★★Ren Torres· Dec 20, 2024
podcast fits our agent workflows well — practical, well scoped, and easy to wire into existing repos.
- ★★★★★Ren Flores· Dec 12, 2024
podcast reduced setup friction for our internal harness; good balance of opinion and flexibility.
- ★★★★★Camila Gupta· Dec 4, 2024
podcast has been reliable in day-to-day use. Documentation quality is above average for community skills.
- ★★★★★Emma Khan· Nov 27, 2024
podcast has been reliable in day-to-day use. Documentation quality is above average for community skills.
- ★★★★★Rahul Santra· Nov 3, 2024
podcast fits our agent workflows well — practical, well scoped, and easy to wire into existing repos.
- ★★★★★Ren Yang· Nov 3, 2024
Registry listing for podcast matched our evaluation — installs cleanly and behaves as described in the markdown.
- ★★★★★Anika Agarwal· Nov 3, 2024
podcast fits our agent workflows well — practical, well scoped, and easy to wire into existing repos.
- ★★★★★Sakura Ramirez· Oct 22, 2024
Useful defaults in podcast — fewer surprises than typical one-off scripts, and it plays nicely with `npx skills` flows.
- ★★★★★Anika Thompson· Oct 22, 2024
podcast has been reliable in day-to-day use. Documentation quality is above average for community skills.
- ★★★★★Pratham Ware· Oct 18, 2024
podcast has been reliable in day-to-day use. Documentation quality is above average for community skills.
showing 1-10 of 61