Productivity

podcast

marswaveai/skills · updated Apr 8, 2026

$npx skills add https://github.com/marswaveai/skills --skill podcast
summary

Generate podcast episodes with 1-2 AI speakers discussing a topic. Supports quick overviews, deep analysis, and debate formats. Input can be a topic description, URL(s), or text. Output is a full audio episode with transcript.

skill.md

When to Use

  • User wants to create a podcast episode on any topic
  • User provides a URL or text and wants it turned into a podcast discussion
  • User asks for a "debate", "dialogue", or "discussion" format
  • User says "podcast", "播客", or "录一期节目"

When NOT to Use

  • User wants text-to-speech reading (use /speech)
  • User wants an explainer video with visuals (use /explainer)
  • User wants to generate an image (use /image-gen)
  • User only wants to extract content from a URL without generating audio (use /content-parser)

Purpose

Generate podcast episodes with 1-2 AI speakers discussing a topic. Supports quick overviews, deep analysis, and debate formats. Input can be a topic description, URL(s), or text. Output is a full audio episode with transcript.

Hard Constraints

  • No shell scripts. Construct curl commands from the API reference files listed in Resources
  • Always read shared/authentication.md for API key and headers
  • Follow shared/common-patterns.md for polling, errors, and interaction patterns
  • Never hardcode speaker IDs in API calls — use built-in defaults from shared/speaker-selection.md as fallback only; fetch from the speakers API when the user wants to change voice
  • Never fabricate API endpoints or parameters
  • Always read config following shared/config-pattern.md before any interaction
  • Always follow shared/speaker-selection.md for speaker selection (text table + free-text input)
  • Never save files to ~/Downloads/ or .listenhub/ — save artifacts to the current working directory with friendly topic-based names (see shared/config-pattern.md § Artifact Naming)

Step -1: API Key Check

Follow shared/config-pattern.md § API Key Check. If the key is missing, stop immediately.

Step 0: Config Setup

Follow shared/config-pattern.md Step 0 (Zero-Question Boot).

If file doesn't exist — silently create with defaults and proceed:

mkdir -p ".listenhub/podcast"
echo '{"outputMode":"inline","language":null,"defaultMode":"quick","defaultMethod":"one-step","defaultSpeakers":{}}' > ".listenhub/podcast/config.json"
CONFIG_PATH=".listenhub/podcast/config.json"
CONFIG=$(cat "$CONFIG_PATH")

Do NOT ask any setup questions. Proceed directly to the Interaction Flow.

If file exists — read config silently and proceed:

CONFIG_PATH=".listenhub/podcast/config.json"
[ ! -f "$CONFIG_PATH" ] && CONFIG_PATH="$HOME/.listenhub/podcast/config.json"
CONFIG=$(cat "$CONFIG_PATH")

Setup Flow (user-initiated reconfigure only)

Only run when the user explicitly asks to reconfigure. Display current settings:

当前配置 (podcast):
  输出方式:{inline / download / both}
  语言偏好:{zh / en / 未设置}
  默认模式:{quick / deep / debate / 未设置}
  默认生成方式:{one-step / two-step}
  默认主播:{speakerName(s) / 使用内置默认}

Then ask these questions in order and save:

  1. outputMode: Follow shared/output-mode.md § Setup Flow Question.

  2. Language (optional): "默认语言?"

    • "中文 (zh)"
    • "English (en)"
    • "每次手动选择" → keep null
  3. Mode (optional): "默认播客模式?"

    • "Quick — 简短概述"
    • "Deep — 深度分析"
    • "Debate — 辩论对话"
    • "每次手动选择" → keep null
  4. Method (optional): "默认生成方式?"

    • "一步生成(推荐)" → defaultMethod: "one-step"
    • "两步生成(先预览文本)" → defaultMethod: "two-step"
    • "每次手动选择" → keep null

After collecting answers, save immediately:

NEW_CONFIG=$(echo "$CONFIG" | jq --arg m "$OUTPUT_MODE" '. + {"outputMode": $m}')
# Save language if user chose one (not "每次手动选择")
if [ "$LANGUAGE" != "null" ]; then
  NEW_CONFIG=$(echo "$NEW_CONFIG" | jq --arg lang "$LANGUAGE" '. + {"language": $lang}')
fi
# Save mode if user chose one
if [ "$MODE" != "null" ]; then
  NEW_CONFIG=$(echo "$NEW_CONFIG" | jq --arg mode "$MODE" '. + {"defaultMode": $mode}')
fi
# Save method if user chose one
if [ "$METHOD" != "null" ]; then
  NEW_CONFIG=$(echo "$NEW_CONFIG" | jq --arg method "$METHOD" '. + {"defaultMethod": $method}')
fi
echo "$NEW_CONFIG" > "$CONFIG_PATH"
CONFIG=$(cat "$CONFIG_PATH")

Interaction Flow

Step 1: Topic + Reference Materials

Ask topic and optional reference materials together in a single question using AskUserQuestion with two sub-questions, or a single free-text prompt:

What topic would you like to turn into a podcast? If you have reference materials (URLs or text), include them here too.

Accept: topic description, URL(s), pasted text, or any combination.

Examples of valid input:

Step 2: Mode

Default: "quick" — skip this question unless:

  • config.defaultMode is set to something else → use that value silently
  • User explicitly mentioned a mode keyword in Step 1 (e.g. "deep dive", "debate", "in depth") → infer mode from intent

Only ask this question if the user's intent is ambiguous AND no default is configured. In most cases, just use "quick".

Step 3: Language

Default: match the user's interaction language. Detect from the language the user used in Step 1:

  • If the user wrote in Chinese → zh
  • If the user wrote in English → en
  • If config.language is set → use that value

Never ask this question. Always infer silently. Show in the confirmation summary so the user can override if needed.

Step 4: Speaker Count

Default: 2 speakers (dialogue) — the most common and engaging format.

Skip this question. Debate mode requires 2 speakers. For quick/deep, default to 2 speakers as well.

Only use 1 speaker if the user explicitly requests a monologue or solo format.

Step 5: Speaker Selection

Follow shared/speaker-selection.md:

  • If config.defaultSpeakers.{language} is set → use saved speakers silently
  • If not set → use built-in defaults from shared/speaker-selection.md (no question asked)
  • Show the speaker(s) in the confirmation summary — user can change from there if desired
  • Only show the full speaker list if the user explicitly asks to change voices

For 2-speaker mode (dialogue/debate): use Primary + Secondary defaults for the language.

Step 6: Generation Method

Default: "one-step" — skip this question unless:

  • config.defaultMethod is set → use that value silently
  • User explicitly asks to review text first → use "two-step"

Step 7: Confirm & Generate

Summarize all choices:

Ready to generate podcast:

  Topic: {topic}
  Mode: {mode}
  Language: {language}
  Speakers: {speaker name(s)}
  References: {yes/no + brief description}
  Method: {one-step/two-step}

  Proceed?

Wait for explicit confirmation before calling any API. The user can adjust any parameter here before confirming.

Workflow

One-Step Generation

  1. Submit (foreground): POST /podcast/episodes with collected parameters → extract episodeId

  2. Tell the user the task is submitted

  3. Poll (background): Run the following exact bash command with run_in_background: true and timeout: 600000. Do NOT use python3, awk, or any other JSON parser — use jq as shown:

    EPISODE_ID="<id-from-step-1>"
    for i in $(seq 1 30); do
      RESULT=$(curl -sS "https://api.marswave.ai/openapi/v1/podcast/episodes/$EPISODE_ID" \
        -H "Authorization: Bearer $LISTENHUB_API_KEY" \
        -H "X-Source: skills" 2>/dev/null)
      STATUS=$(echo "$RESULT" | tr -d '\000-\037\177' | jq -r '.data.processStatus // "pending"')
      case "$STATUS" in
        success|completed) echo "$RESULT"; exit 0 ;;
        failed|error) echo "FAILED: $RESULT" >&2; exit 1 ;;
        *) sleep 10 ;;
      esac
    done
    echo "TIMEOUT" >&2; exit 2
    
  4. When notified of completion, Step 6: Present result

    Read OUTPUT_MODE from config. Follow shared/output-mode.md for behavior.

    inline or both: Display audioUrl as a clickable link.

    Present:

    播客已生成!
    
    在线收听:{audioUrl}
    字幕:{subtitlesUrl}(如有)
    时长:{audioDuration / 1000}s
    消耗积分:{credits}
    

    download or both: Also download the file. Generate a topic slug following shared/config-pattern.md § Artifact Naming.

    SLUG="{topic-slug}"  # e.g. "ai-developments"
    NAME="${SLUG}-podcast.mp3"
    # Dedup: if file exists, append -2, -3, etc.
    BASE="${NAME%.*}"; EXT="${NAME##*.}"; i=2
    while [ -e "$NAME" ]; do NAME="${BASE}-${i}.${EXT}"; i=$((i+1)); done
    curl -sS -o "$NAME" "{audioUrl}"
    

    Present:

    已保存到当前目录:
      {NAME}
    
  5. Offer to show transcript or provide download URL on request

Two-Step Generation

  1. Step 1 — Submit text (foreground): POST /podcast/episodes/text-content → extract episodeId
  2. Poll text (background): Use the exact jq-based polling loop above (substitute endpoint podcast/episodes/text-content/{episodeId} if needed), with run_in_background: true and timeout: 600000
  3. When notified, save draft to a topic-based folder in cwd:
    • Generate a topic slug following shared/config-pattern.md § Artifact Naming
    • Create {slug}-podcast/ folder (dedup if exists)
    • Write draft.md (human-readable: **{speakerName}**: {content} per line)
    • Write draft.json (raw scripts array)
    • Present the draft location and content preview
  4. STOP: Present the draft and wait for explicit user approval
  5. Step 2 — Submit audio (foreground, after approval):
    • No changes: POST /podcast/episodes/{episodeId}/audio with {}
    • With edits: POST /podcast/episodes/{episodeId}/audio with modified {scripts: [...]}
  6. Poll audio (background): Same exact jq-based loop, run_in_background: true, timeout: 600000
  7. When notified, download audio to the same folder:
    • curl -sS -o {slug}-podcast/podcast.mp3 {audioUrl}
    • Present final result (same format as one-step, folder now has draft + final files)

After Successful Generation

Update config with the choices made this session:

NEW_CONFIG=$(echo "$CONFIG" | jq \
  --arg lang "{language}" \
  --arg mode "{mode}" \
  --arg method "{one-step/two-step}" \
  --argjson speakers '{"{language}": ["{speakerId}"]}' \
  '. + {"language": $lang, "defaultMode": $mode, "defaultMethod": $method, "defaultSpeakers": (.defaultSpeakers + $speakers)}')
echo "$NEW_CONFIG" > "$CONFIG_PATH"

API Reference

  • Speaker list: shared/api-speakers.md
  • Speaker selection guide: shared/speaker-selection.md
  • Episode creation: shared/api-podcast.md
  • Polling: shared/common-patterns.md § Async Polling
  • Config pattern: shared/config-pattern.md

Composability

  • Invokes: speakers API (for speaker selection)
  • Invoked by: content-planner (Phase 3)

Example

User: "Make a podcast about the latest AI developments"

Agent workflow:

  1. Detect: podcast request, topic = "latest AI developments", no references
  2. Infer: mode = "quick" (default), language = "en" (user wrote in English), 2 speakers (default), one-step (default)
  3. Show confirmation summary → user confirms
curl -sS -X POST "https://api.marswave.ai/openapi/v1/podcast/episodes" \
  -H "Authorization: Bearer $LISTENHUB_API_KEY" \
  -H "Content-Type: application/json" \
  -H "X-Source: skills" \
  -d '{
    "sources": [{"type": "text", "content": "The latest AI developments"}],
    "speakers": [{"speakerId": "cozy-man-english"}],
    "language": "en",
    "mode": "deep"
  }'

Poll until complete, then present the result with title and listen link.

general reviews

Ratings

4.561 reviews
  • Ren Torres· Dec 20, 2024

    podcast fits our agent workflows well — practical, well scoped, and easy to wire into existing repos.

  • Ren Flores· Dec 12, 2024

    podcast reduced setup friction for our internal harness; good balance of opinion and flexibility.

  • Camila Gupta· Dec 4, 2024

    podcast has been reliable in day-to-day use. Documentation quality is above average for community skills.

  • Emma Khan· Nov 27, 2024

    podcast has been reliable in day-to-day use. Documentation quality is above average for community skills.

  • Rahul Santra· Nov 3, 2024

    podcast fits our agent workflows well — practical, well scoped, and easy to wire into existing repos.

  • Ren Yang· Nov 3, 2024

    Registry listing for podcast matched our evaluation — installs cleanly and behaves as described in the markdown.

  • Anika Agarwal· Nov 3, 2024

    podcast fits our agent workflows well — practical, well scoped, and easy to wire into existing repos.

  • Sakura Ramirez· Oct 22, 2024

    Useful defaults in podcast — fewer surprises than typical one-off scripts, and it plays nicely with `npx skills` flows.

  • Anika Thompson· Oct 22, 2024

    podcast has been reliable in day-to-day use. Documentation quality is above average for community skills.

  • Pratham Ware· Oct 18, 2024

    podcast has been reliable in day-to-day use. Documentation quality is above average for community skills.

showing 1-10 of 61

1 / 7