scraping
6 indexed skills · max 10 per page
web-scraping
jamditis/claude-skills-journalism · Backend
Reliable web scraping with cascading fallbacks, anti-bot bypass, and poison pill detection.
- Implements a scraping cascade architecture with four strategies: trafilatura for fast article extraction, requests with rotating user agents, Playwright with stealth mode for JavaScript-heavy sites, and async Playwright for Jupyter notebooks
- Includes poison pill detection to identify paywalls, CAPTCHAs, rate limits, Cloudflare blocks, and login walls using pattern matching and status code analysi
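The cascade and poison pill ideas described above can be illustrated with a minimal sketch. It is not the skill's actual code: the strategies are plain callables standing in for trafilatura, requests, or Playwright, and `FetchResult`, `detect_poison`, `scrape_cascade`, the regex patterns, and the status-code table are all hypothetical names and values chosen for illustration.

```python
import re
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

# Hypothetical patterns; a real detector would likely carry a longer, tuned list.
POISON_PATTERNS = {
    "paywall": re.compile(r"subscribe to (continue|read)", re.I),
    "captcha": re.compile(r"captcha|verify you are human", re.I),
    "cloudflare": re.compile(r"checking your browser|cf-ray", re.I),
    "login_wall": re.compile(r"(sign|log) in to continue", re.I),
}
# Assumed mapping from HTTP status codes to block reasons.
POISON_STATUS = {403: "blocked", 429: "rate_limit"}


@dataclass
class FetchResult:
    status: int
    text: str


def detect_poison(result: FetchResult) -> Optional[str]:
    """Return a label if the response looks like a block page, else None."""
    if result.status in POISON_STATUS:
        return POISON_STATUS[result.status]
    for label, pattern in POISON_PATTERNS.items():
        if pattern.search(result.text):
            return label
    return None


Strategy = Tuple[str, Callable[[str], FetchResult]]


def scrape_cascade(url: str, strategies: List[Strategy]):
    """Try each strategy in order; return the first clean result.

    Returns (strategy_name, result, failures). On total failure the
    first two items are None and failures lists why each step was skipped.
    """
    failures = []
    for name, fetch in strategies:
        try:
            result = fetch(url)
        except Exception as exc:  # any fetch error falls through to the next strategy
            failures.append((name, f"error: {exc}"))
            continue
        reason = detect_poison(result)
        if reason is None:
            return name, result, failures
        failures.append((name, reason))
    return None, None, failures
```

Ordering the strategies from cheapest to most expensive means a headless browser is only started when the lighter fetchers hit a block page or raise.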
web-scraping
mindrally/skills · Backend
Web scraping and data extraction using Python tools for static, dynamic, and large-scale content.
- Supports static sites via requests and BeautifulSoup, dynamic content via Selenium and Playwright, and large-scale extraction via Scrapy and firecrawl
- Includes specialized tools for AI-powered extraction (jina), structured queries (agentQL), and complex automation workflows (multion)
- Built-in guidance on rate limiting, robots.txt compliance, error handling, session management, and pagina
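As a rough illustration of the rate limiting and robots.txt compliance mentioned above, the sketch below uses only the standard library. The `build_robots` helper and the `RateLimiter` class are hypothetical names, not part of the listed skill; in real use the parser would be populated from the site's robots.txt via `set_url()` and `read()` rather than from an inline string.

```python
import time
from urllib.robotparser import RobotFileParser


def build_robots(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt content that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


class RateLimiter:
    """Enforce a minimum interval between consecutive requests.

    clock and sleep are injectable so the limiter can be tested
    without actually waiting.
    """

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last = None

    def wait(self) -> float:
        """Block until the next request is allowed; return seconds slept."""
        waited = 0.0
        if self._last is not None:
            remaining = self.min_interval - (self._clock() - self._last)
            if remaining > 0:
                self._sleep(remaining)
                waited = remaining
        self._last = self._clock()
        return waited


# Example usage with an inline robots.txt (assumed content).
robots = build_robots("User-agent: *\nDisallow: /private/\nCrawl-delay: 2")
delay = robots.crawl_delay("example-bot") or 1.0  # fall back to 1s if unset
limiter = RateLimiter(min_interval=delay)
```

A crawler would call `robots.can_fetch("example-bot", url)` before each request and `limiter.wait()` just before sending it.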
firecrawl-scraping
casper-studios/casper-marketplace · Backend
Scrape individual web pages and convert them to clean, LLM-ready markdown. Handles JavaScript rendering, anti-bot protection, and dynamic content.
web-scraping-automation
aaaaqwq/claude-code-skills · Backend
This skill is dedicated to automating website data scraping and API calls, including:
scrapy-web-scraping
mindrally/skills · Backend
Expert guidance for building scalable web scrapers and crawlers using Scrapy with best practices for spider development, data extraction, and pipeline management.
- Covers spider architecture, CSS/XPath data extraction, Item Pipelines, and middleware development for request/response handling
- Includes strategies for rate limiting, User-Agent rotation, proxy management, and handling JavaScript-rendered content with Scrapy-Splash or Scrapy-Playwright
- Provides error handling patterns, perf
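User-Agent rotation of the kind mentioned above is commonly done in Scrapy with a downloader middleware. The following is a minimal sketch under stated assumptions, not the skill's own code: the class name, the `USER_AGENT_POOL` setting, and the `myproject.middlewares` path in the comment are hypothetical, while `process_request` and `from_crawler` are the standard middleware hooks.

```python
import random


class RotatingUserAgentMiddleware:
    """Downloader middleware sketch: pick a User-Agent per request.

    Assumed to be enabled in settings.py with something like:
        DOWNLOADER_MIDDLEWARES = {
            "myproject.middlewares.RotatingUserAgentMiddleware": 400,
        }
        USER_AGENT_POOL = ["agent-a", "agent-b"]  # hypothetical setting name
    """

    def __init__(self, user_agents, rng=None):
        if not user_agents:
            raise ValueError("user_agents must not be empty")
        self.user_agents = list(user_agents)
        # An injectable RNG keeps the choice reproducible in tests.
        self._rng = rng or random.Random()

    @classmethod
    def from_crawler(cls, crawler):
        # Scrapy calls this hook to build the middleware from project settings.
        return cls(crawler.settings.getlist("USER_AGENT_POOL"))

    def process_request(self, request, spider):
        # Overwrite the header; returning None lets Scrapy continue processing.
        request.headers["User-Agent"] = self._rng.choice(self.user_agents)
        return None
```

The middleware itself imports nothing from Scrapy, so it can be exercised with a stub request object that exposes a dict-like `headers` attribute.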
web-scraping
yfe404/web-scraper · Backend
web-scraping