web-scraper

zephyrwang6/myskill · updated Apr 8, 2026

$ npx skills add https://github.com/zephyrwang6/myskill --skill web-scraper
summary

Fetch web page content and convert to clean markdown format.

skill.md

Web Scraper

Fetch web page content and convert to clean markdown format.

Usage

Run the fetch script to get web content:

python3 scripts/fetch_url.py <url> [options]

Options

  • --timeout <seconds>: Request timeout (default: 30)
  • --max-length <chars>: Maximum output length (default: 100000)
  • --raw: Output raw HTML instead of markdown
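
As a rough illustration of how the documented options fit together, the sketch below declares them with `argparse`. It is not the actual source of `fetch_url.py`: the function names `build_parser` and `truncate`, the choice of `float` for the timeout, and the plain character cut for `--max-length` are all assumptions made for this example.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser that mirrors the options documented above."""
    parser = argparse.ArgumentParser(prog="fetch_url.py")
    parser.add_argument("url", help="Page to fetch")
    # Assumption: the timeout is a number of seconds; float is used here.
    parser.add_argument("--timeout", type=float, default=30,
                        help="Request timeout in seconds (default: 30)")
    parser.add_argument("--max-length", type=int, default=100000,
                        help="Maximum output length in characters (default: 100000)")
    parser.add_argument("--raw", action="store_true",
                        help="Output raw HTML instead of markdown")
    return parser


def truncate(text: str, max_length: int) -> str:
    """Assumed behavior of --max-length: keep at most max_length characters."""
    return text[:max_length]


if __name__ == "__main__":
    args = build_parser().parse_args(
        ["https://example.com/article", "--timeout", "60"]
    )
    print(args.url, args.timeout, args.max_length, args.raw)
```

Options left off the command line fall back to the documented defaults, so only the values that differ need to be passed.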

Examples

Fetch single URL:

python3 scripts/fetch_url.py "https://example.com/article"

Fetch with custom timeout:

python3 scripts/fetch_url.py "https://example.com/article" --timeout 60

Fetch multiple URLs in parallel:

for url in "https://url1.com" "https://url2.com"; do
  python3 scripts/fetch_url.py "$url" &
done
wait
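
The same parallel pattern can be driven from Python when the output of each fetch needs to be collected. The sketch below is one possible approach using a thread pool and `subprocess`. The helper names `fetch` and `fetch_all` are hypothetical, and it assumes the script writes its result to standard output and exits with a non-zero code on failure, which the documentation above does not confirm.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor

# Command prefix taken from the usage line above; the URL is appended per call.
DEFAULT_COMMAND = ("python3", "scripts/fetch_url.py")


def fetch(url, command=DEFAULT_COMMAND):
    """Run one fetch command and return (url, exit code, captured stdout)."""
    result = subprocess.run([*command, url], capture_output=True, text=True)
    return url, result.returncode, result.stdout


def fetch_all(urls, command=DEFAULT_COMMAND, max_workers=4):
    """Fetch several URLs concurrently; results keep the input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(lambda u: fetch(u, command), urls))
```

Calling `fetch_all(["https://url1.com", "https://url2.com"])` returns one tuple per URL. Threads are sufficient here because each worker only waits on a child process.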

Workflow

  1. Single URL: Run fetch_url.py with the URL
  2. Multiple URLs: Run multiple fetch commands in parallel using background processes
  3. Handle errors: If a URL fails, check:
    • Network connectivity
    • URL validity
    • Whether the website blocks automated requests (try a different User-Agent or use browser automation)
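
Two of the checks above, URL validity and a different User-Agent, can be sketched with the standard library alone. This is independent of `fetch_url.py`, which documents no User-Agent option. The names `is_valid_url`, `build_request`, and `fetch_html`, as well as the User-Agent string, are hypothetical and chosen only for illustration.

```python
import urllib.request
from urllib.parse import urlparse

# Hypothetical User-Agent value, for illustration only.
EXAMPLE_USER_AGENT = "Mozilla/5.0 (compatible; web-scraper-example)"


def is_valid_url(url: str) -> bool:
    """Accept only absolute http(s) URLs that include a host."""
    parts = urlparse(url)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def build_request(url: str, user_agent: str = EXAMPLE_USER_AGENT) -> urllib.request.Request:
    """Validate the URL, then attach a custom User-Agent header."""
    if not is_valid_url(url):
        raise ValueError(f"Not a valid http(s) URL: {url!r}")
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch_html(url: str, timeout: float = 30, user_agent: str = EXAMPLE_USER_AGENT) -> str:
    """Download the page body as text; network errors propagate to the caller."""
    with urllib.request.urlopen(build_request(url, user_agent), timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")
```

A site that still answers with 403 after the header change is likely filtering on something other than the User-Agent, and browser automation becomes the remaining option.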

Output Format

The script converts HTML to clean markdown:

  • Headings → #, ##, ###, etc.
  • Lists → - for unordered, 1. for ordered
  • Bold/Italic → **bold**, *italic*
  • Code blocks preserved
  • Navigation, footer, and ads removed
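
To make the mapping above concrete, here is a simplified converter built on the standard library's `html.parser`. It is an illustrative sketch, not the converter used by `fetch_url.py`: the class name `MarkdownSketch` and the function `html_to_markdown` are hypothetical, and ad removal, links, tables, and nested-list indentation are left out.

```python
import re
from html.parser import HTMLParser


class MarkdownSketch(HTMLParser):
    """Simplified HTML-to-markdown converter (illustrative only)."""

    SKIP = {"nav", "footer", "script", "style"}   # content dropped entirely
    HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}
    INLINE = {"strong": "**", "b": "**", "em": "*", "i": "*"}

    def __init__(self):
        super().__init__()
        self.out = []          # output fragments, joined at the end
        self.skip_depth = 0    # > 0 while inside a skipped element
        self.lists = []        # stack of [list tag, item counter]
        self.in_pre = False

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self.skip_depth += 1
        elif self.skip_depth:
            return
        elif tag in self.HEADINGS:
            self.out.append("\n" + "#" * int(tag[1]) + " ")
        elif tag in ("ul", "ol"):
            self.lists.append([tag, 0])
        elif tag == "li" and self.lists:
            self.lists[-1][1] += 1
            kind, count = self.lists[-1]
            self.out.append("\n" + ("- " if kind == "ul" else f"{count}. "))
        elif tag in self.INLINE:
            self.out.append(self.INLINE[tag])
        elif tag == "pre":
            self.in_pre = True
            self.out.append("\n```\n")
        elif tag == "p":
            self.out.append("\n")

    def handle_endtag(self, tag):
        if tag in self.SKIP:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif self.skip_depth:
            return
        elif tag in self.HEADINGS or tag == "p":
            self.out.append("\n")
        elif tag in ("ul", "ol") and self.lists:
            self.lists.pop()
            self.out.append("\n")
        elif tag in self.INLINE:
            self.out.append(self.INLINE[tag])
        elif tag == "pre":
            self.in_pre = False
            self.out.append("\n```\n")

    def handle_data(self, data):
        if self.skip_depth:
            return
        # Keep code verbatim; collapse whitespace runs everywhere else.
        self.out.append(data if self.in_pre else re.sub(r"\s+", " ", data))


def html_to_markdown(html: str) -> str:
    parser = MarkdownSketch()
    parser.feed(html)
    parser.close()
    text = "".join(parser.out)
    text = re.sub(r"[ \t]+\n", "\n", text)   # drop trailing spaces
    # Collapse runs of blank lines (this also applies inside code blocks).
    return re.sub(r"\n{3,}", "\n\n", text).strip()


if __name__ == "__main__":
    sample = ("<nav>Menu</nav><h2>Title</h2>"
              "<p>Some <strong>bold</strong> text.</p>"
              "<ul><li>one</li><li>two</li></ul>")
    print(html_to_markdown(sample))
```

A dedicated HTML-to-markdown library would handle far more markup; the point here is only to show how each rule in the list corresponds to a tag handler.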

Troubleshooting

403 Forbidden: The website blocks automated requests. Consider:

  • The site may require JavaScript rendering, which this script does not support; browser automation may be needed
  • Try accessing the site from a different network

Timeout errors: Increase the timeout, for example with --timeout 60

Empty content: The website may require JavaScript to render its content, which this script does not support

Ratings

4.7 · 58 reviews
  • Xiao Chawla · Dec 24, 2024

    web-scraper has been reliable in day-to-day use. Documentation quality is above average for community skills.

  • Chaitanya Patil · Dec 16, 2024

    web-scraper fits our agent workflows well — practical, well scoped, and easy to wire into existing repos.

  • Li Abebe · Dec 16, 2024

    Registry listing for web-scraper matched our evaluation — installs cleanly and behaves as described in the markdown.

  • Aditi Haddad · Dec 16, 2024

    Solid pick for teams standardizing on skills: web-scraper is focused, and the summary matches what you get after install.

  • Kofi Martin · Dec 4, 2024

    web-scraper fits our agent workflows well — practical, well scoped, and easy to wire into existing repos.

  • Chinedu Yang · Nov 23, 2024

    web-scraper is among the better-maintained entries we tried; worth keeping pinned for repeat workflows.

  • Piyush G · Nov 7, 2024

    web-scraper is among the better-maintained entries we tried; worth keeping pinned for repeat workflows.

  • Li Yang · Nov 7, 2024

    Useful defaults in web-scraper — fewer surprises than typical one-off scripts, and it plays nicely with `npx skills` flows.

  • Shikha Mishra · Oct 26, 2024

    Keeps context tight: web-scraper is the kind of skill you can hand to a new teammate without a long onboarding doc.

  • Chen Kim · Oct 26, 2024

    I recommend web-scraper for anyone iterating fast on agent tooling; clear intent and a small, reviewable surface area.
