defuddle

joeseesun/defuddle-skill · updated Apr 8, 2026

npx skills add https://github.com/joeseesun/defuddle-skill --skill defuddle
summary

Extract clean article content from web pages, removing ads and clutter to return readable Markdown with metadata.

  • Parses URLs or local HTML files and outputs clean Markdown with frontmatter (title, author, publication date, word count)
  • Supports JSON metadata extraction including featured images, domain, favicon, and parse timing
  • Includes a guided workflow: extract content, preview summary, save to user-specified directory, and confirm file location
  • Works best on article-style pages
skill.md

Defuddle - Web Content Extraction

Extract main article content from web pages, removing ads, sidebars, navigation, and other clutter. Output clean Markdown with metadata.

Prerequisites

Before first use, check whether defuddle is installed, and install it together with jsdom if it is missing:

command -v defuddle >/dev/null 2>&1 || npm install -g defuddle jsdom

Default Workflow

When the user provides a URL, follow this workflow:

Step 1: Extract content as Markdown + JSON metadata

Always use both -m and -j flags to get markdown content with full metadata:

defuddle parse "<url>" -m -j
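
The Step 1 command can be wrapped in a small shell function so the JSON output is captured in a file for the later steps. This is a minimal sketch, assuming defuddle is already on the PATH. The name `defuddle_extract` and its output-file argument are hypothetical, not part of the CLI, and the sketch redirects stdout rather than using `-o`.

```shell
# Hypothetical helper (not part of defuddle): run the Step 1 command
# and capture its output in a file.
# Usage: defuddle_extract <url-or-html-file> <output.json>
defuddle_extract() {
  # -m converts the content to Markdown; -j emits JSON with full metadata.
  # The function's exit status is that of the defuddle call.
  defuddle parse "$1" -m -j > "$2"
}
```

For example, `defuddle_extract "https://example.com/post" /tmp/article.json` would leave the JSON in `/tmp/article.json`, where both the URL and the path are placeholders.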

Step 2: Present a summary to the user

Show the user:

  • Title: from JSON title field
  • Author: from JSON author field
  • Source: from JSON domain field
  • Word count: from JSON wordCount field
  • A brief preview (first 2-3 sentences)

Step 3: Ask where to save

If this is the first time using defuddle in this conversation, ask the user:

"Save to which directory? (e.g. ~/Documents, ~/Desktop, or a custom path)"

Remember the user's chosen directory for subsequent uses in the same conversation.

Step 4: Save as Markdown file

Write the file with frontmatter + full content:

---
title: {title}
author: {author}
source: {url}
date: {published or "Unknown"}
clipped: {today's date YYYY-MM-DD}
wordCount: {wordCount}
---

# {title}

{markdown content}
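
The template above could be filled in with a shell function along these lines. It is a sketch under stated assumptions: the name `write_clip` and its argument order are hypothetical, and it expects the metadata values and a file holding the Markdown body to have been extracted already.

```shell
# Hypothetical helper: write a clipped article using the template above.
# Usage: write_clip <dest.md> <title> <author> <url> <published> <wordCount> <body.md>
write_clip() {
  {
    printf '%s\n' '---'
    printf 'title: %s\n' "$2"
    printf 'author: %s\n' "$3"
    printf 'source: %s\n' "$4"
    # Fall back to "Unknown" when no publication date was extracted.
    printf 'date: %s\n' "${5:-Unknown}"
    printf 'clipped: %s\n' "$(date +%Y-%m-%d)"
    printf 'wordCount: %s\n' "$6"
    printf '%s\n\n' '---'
    printf '# %s\n\n' "$2"
    cat "$7"
  } > "$1" && printf '%s\n' "$1"   # print the saved path on success
}
```

On success the function prints the destination path, which is the value Step 5 reports to the user. Values are written unquoted, as in the template, so a title containing `: ` may need quoting if the frontmatter is read by a strict YAML parser.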

File naming: Use the article title as the filename, sanitized for the filesystem:

  • Replace special characters with spaces
  • Trim whitespace
  • Example: The Shape of the Essay Field.md
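
One way to apply these naming rules in the shell is sketched below. The function name `sanitize_title` is hypothetical, and the definition of "special" is an assumption: anything other than ASCII letters, digits, space, dot, underscore, and hyphen. Collapsing repeated spaces is an extra step not stated in the rules, and titles with non-ASCII characters would need a different rule.

```shell
# Hypothetical helper: turn an article title into a filesystem-safe name.
# Usage: sanitize_title "<title>"
sanitize_title() {
  printf '%s' "$1" |
    # Replace every byte outside the allowed set with a space.
    LC_ALL=C tr -c 'A-Za-z0-9 ._-' ' ' |
    # Collapse runs of spaces, then trim the leading and trailing space.
    sed -e 's/  */ /g' -e 's/^ //' -e 's/ $//'
}
```

A destination path could then be built as `"$dir/$(sanitize_title "$title").md"`, where `dir` holds the directory chosen in Step 3.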

Step 5: Confirm to user

Tell the user the file path where it was saved.

CLI Reference

defuddle parse <source> [options]

Arguments:

  • <source> — URL (https://...) or local HTML file path

Options:

Flag                    Description
-m, --markdown          Convert content to Markdown
-j, --json              Output as JSON with full metadata
-o, --output <file>     Write to file instead of stdout
-p, --property <name>   Extract a single property (title, description, domain, author, published, wordCount, content)
--debug                 Verbose logging
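
When only a few fields are needed, the `-p` option can be called once per property. The sketch below assumes defuddle is on the PATH. The function name `defuddle_brief` is hypothetical. Each call is a separate defuddle invocation, so for a URL this approach is presumably slower than a single `-m -j` call and is better suited to a local HTML file.

```shell
# Hypothetical helper: print selected properties, one per line.
# Usage: defuddle_brief <url-or-html-file>
defuddle_brief() {
  for prop in title author domain wordCount; do
    # -p extracts a single named property from the parsed source.
    printf '%s: %s\n' "$prop" "$(defuddle parse "$1" -p "$prop")"
  done
}
```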

JSON Response Fields

When using -j, the response includes:

  • title — Article title
  • author — Author name
  • published — Publication date
  • description — Meta description
  • content — Extracted Markdown (when -m used)
  • domain — Source domain
  • favicon — Favicon URL
  • image — Featured image URL
  • site — Site name
  • wordCount — Word count
  • parseTime — Processing time in ms

Notes

  • Requires Node.js and npm
  • jsdom is required as a peer dependency
  • Works best with article-style pages (blogs, news, documentation)
  • Not designed for SPAs or JavaScript-heavy pages (e.g. WeChat articles need browser rendering)
