tag

scraping

9 indexed skills · max 10 per page

skills (9)

scrapy-web-scraping

mindrally/skills · Backend

Expert guidance for building scalable web scrapers and crawlers using Scrapy with best practices for spider development, data extraction, and pipeline management.
Covers spider architecture, CSS/XPath data extraction, Item Pipelines, and middleware development for request/response handling
Includes strategies for rate limiting, User-Agent rotation, proxy management, and handling JavaScript-rendered content with Scrapy-Splash or Scrapy-Playwright
Provides error handling patterns, perf

web-scraping

mindrally/skills · Backend

Web scraping and data extraction using Python tools for static, dynamic, and large-scale content.
Supports static sites via requests and BeautifulSoup, dynamic content via Selenium and Playwright, and large-scale extraction via Scrapy and firecrawl
Includes specialized tools for AI-powered extraction (jina), structured queries (agentQL), and complex automation workflows (multion)
Built-in guidance on rate limiting, robots.txt compliance, error handling, session management, and pagina

web-scraping

jamditis/claude-skills-journalism · Backend

Reliable web scraping with cascading fallbacks, anti-bot bypass, and poison pill detection.
Implements a scraping cascade architecture with four strategies: trafilatura for fast article extraction, requests with rotating user agents, Playwright with stealth mode for JavaScript-heavy sites, and async Playwright for Jupyter notebooks
Includes poison pill detection to identify paywalls, CAPTCHAs, rate limits, Cloudflare blocks, and login walls using pattern matching and status code analysi
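The cascade-with-detection idea in the description above can be illustrated with a minimal sketch. This is not the skill's actual code: the function names `detect_poison_pill` and `scrape_with_cascade`, the regex patterns, and the `(status, body)` fetcher signature are all assumptions made for illustration. Real strategies (trafilatura, requests, Playwright) would be wrapped as fetchers.

```python
import re
from typing import Callable, List, Optional, Tuple

# Hypothetical patterns; a real detector would tune these per site.
POISON_PATTERNS = {
    "captcha": re.compile(r"captcha|verify you are human", re.I),
    "paywall": re.compile(r"subscribe to continue|subscribers only", re.I),
    "login_wall": re.compile(r"sign in to continue|log in to view", re.I),
    "cloudflare": re.compile(r"checking your browser", re.I),
}

# A fetcher takes a URL and returns (HTTP status, response body).
Fetcher = Callable[[str], Tuple[int, str]]


def detect_poison_pill(status: int, body: str) -> Optional[str]:
    """Return a label if the response looks blocked, else None."""
    if status == 429:
        return "rate_limit"
    for label, pattern in POISON_PATTERNS.items():
        if pattern.search(body):
            return label
    if status in (401, 403):
        return "blocked"
    return None


def scrape_with_cascade(
    url: str, strategies: List[Tuple[str, Fetcher]]
) -> Optional[Tuple[str, str]]:
    """Try each strategy in order; return (strategy name, body) for the
    first response that is non-empty and not flagged as a poison pill."""
    for name, fetch in strategies:
        try:
            status, body = fetch(url)
        except Exception:
            continue  # a failed strategy falls through to the next one
        if body.strip() and detect_poison_pill(status, body) is None:
            return name, body
    return None
```

Ordering the strategies from cheapest to most expensive means a headless browser is only started when the lighter fetchers fail or return a blocked page.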

extract-listings

realtor.com/extract-listings-9v0y4r · real-estate

Search Realtor.com (for-sale, for-rent, sold, new-construction, foreclosure, pending) from a free-form location or pre-filtered URL and return structured listing JSON. Honors the full filter surface (price, beds/baths, sqft, lot size, year built, days-on-market, HOA, features, school rating, pets/furnished, sort, pagination). Read-only.

find-agent-contact-details

edgeprop.my/find-agent-contact-details-hvhaqj · real-estate

Extract a Malaysian real-estate agent's display name, full mobile phone (E.164), and email from any EdgeProp.my agent profile or listing page by parsing the inlined Next.js __NEXT_DATA__ payload — bypasses the UI's masked phone and email-form gateway.
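As a rough illustration of the `__NEXT_DATA__` technique named above: Next.js pages commonly inline their page state as JSON in a `<script id="__NEXT_DATA__">` tag, which can be read without rendering the page. The sketch below uses only the standard library. The helper names `extract_next_data` and `dig` are hypothetical, and the `agent`/`name` keys in the usage example are placeholders, not the site's real schema.

```python
import json
import re
from typing import Any, Optional

# Matches the inlined JSON payload of a Next.js page.
NEXT_DATA_RE = re.compile(
    r'<script[^>]*\bid="__NEXT_DATA__"[^>]*>(.*?)</script>',
    re.DOTALL,
)


def extract_next_data(html: str) -> Optional[dict]:
    """Return the parsed __NEXT_DATA__ payload, or None if absent."""
    match = NEXT_DATA_RE.search(html)
    if match is None:
        return None
    return json.loads(match.group(1))


def dig(data: Any, *keys: str) -> Any:
    """Walk nested dicts by key; return None if any key is missing."""
    for key in keys:
        if not isinstance(data, dict) or key not in data:
            return None
        data = data[key]
    return data
```

Usage would look like `dig(extract_next_data(html), "props", "pageProps", "agent", "name")`, where the keys after `pageProps` depend entirely on the page being parsed.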

fetch

browserbasehq/sdk · data

Use this skill to retrieve a URL without a full browser session, fetching HTML or JSON from static pages and inspecting headers.

firecrawl-scraping

casper-studios/casper-marketplace · Backend

Scrape individual web pages and convert them to clean, LLM-ready markdown. Handles JavaScript rendering, anti-bot protection, and dynamic content.

web-scraping-automation

aaaaqwq/claude-code-skills · Backend

This skill is dedicated to automating website data crawling and API calls, including:

web-scraping

yfe404/web-scraper · Backend

web-scraping