agent-browser-automation

aradotso/trending-skills · updated Apr 8, 2026

$npx skills add https://github.com/aradotso/trending-skills --skill agent-browser-automation
0 commentsdiscussion
summary

Skill by ara.so — Daily 2026 Skills collection.

skill.md

agent-browser

Skill by ara.so — Daily 2026 Skills collection.

agent-browser is a headless browser automation CLI built in Rust, designed for AI agents. It wraps Chrome via the Chrome DevTools Protocol (CDP) and exposes a fast, ergonomic command-line interface for navigation, interaction, accessibility snapshots, screenshots, network interception, and more — with no Node.js or Playwright runtime required.

Installation

Recommended (npm global)

npm install -g agent-browser
agent-browser install  # Download Chrome for Testing (first time only)

macOS (Homebrew)

brew install agent-browser
agent-browser install

Rust / Cargo

cargo install agent-browser
agent-browser install

Local project dependency

npm install agent-browser
# Add to package.json scripts or invoke via npx

Linux (with system dependencies)

agent-browser install --with-deps

Quick Start

agent-browser open https://example.com
agent-browser snapshot                        # Accessibility tree with @refs (best for AI)
agent-browser click @e2                       # Click by ref from snapshot
agent-browser fill @e3 "hello@example.com"   # Fill by ref
agent-browser get text @e1                    # Get text content
agent-browser screenshot page.png
agent-browser close

Core Commands

Navigation

agent-browser open <url>           # Navigate (aliases: goto, navigate)
agent-browser get url              # Get current URL
agent-browser get title            # Get page title
agent-browser close                # Close browser (aliases: quit, exit)

Accessibility Snapshot (recommended for AI agents)

agent-browser snapshot             # Returns accessibility tree with @ref IDs
agent-browser snapshot -i          # Interactive / compact mode

Snapshot output includes @eN refs you can use directly:

@e1 [button] "Submit"
@e2 [textbox] "Email" value=""
@e3 [link] "Sign in"

Then act on them:

agent-browser fill @e2 "user@example.com"
agent-browser click @e1

Interaction

agent-browser click <sel>                     # Click element
agent-browser dblclick <sel>                  # Double-click
agent-browser fill <sel> <text>               # Clear and fill input
agent-browser type <sel> <text>               # Type into element
agent-browser press <key>                     # Press key (Enter, Tab, Control+a)
agent-browser keyboard type <text>            # Type at current focus (real keystrokes)
agent-browser keyboard inserttext <text>      # Insert text without key events
agent-browser hover <sel>                     # Hover element
agent-browser select <sel> <value>            # Select dropdown option
agent-browser check <sel>                     # Check checkbox
agent-browser uncheck <sel>                   # Uncheck checkbox
agent-browser scroll down 500                 # Scroll (up/down/left/right, optional px)
agent-browser scroll down --selector "#feed"  # Scroll within element
agent-browser scrollintoview <sel>            # Scroll element into view
agent-browser drag <src> <target>             # Drag and drop
agent-browser upload <sel> /path/file.pdf     # Upload file

Screenshots & PDF

agent-browser screenshot                          # Save to temp dir, print path
agent-browser screenshot page.png                 # Save to path
agent-browser screenshot --full page.png          # Full-page screenshot
agent-browser screenshot --annotate               # Numbered element labels overlay
agent-browser screenshot --screenshot-dir ./shots # Custom output directory
agent-browser screenshot --screenshot-format jpeg --screenshot-quality 80
agent-browser pdf output.pdf                      # Save page as PDF

Getting Element Info

agent-browser get text <sel>           # Text content
agent-browser get html <sel>           # innerHTML
agent-browser get value <sel>          # Input value
agent-browser get attr <sel> <attr>    # Attribute value
agent-browser get count <sel>          # Count matching elements
agent-browser get box <sel>            # Bounding box
agent-browser get styles <sel>         # Computed styles
agent-browser get cdp-url              # CDP WebSocket URL

State Checks

agent-browser is visible <sel>
agent-browser is enabled <sel>
agent-browser is checked <sel>

Semantic Locators (find)

agent-browser find role button click --name "Submit"
agent-browser find text "Sign In" click
agent-browser find label "Email" fill "test@example.com"
agent-browser find placeholder "Search..." fill "rust"
agent-browser find testid "login-btn" click
agent-browser find first ".item" click
agent-browser find nth 2 "a" text
agent-browser find role textbox fill "hello" --name "Username"

Actions: click, fill, type, hover, focus, check, uncheck, text

Waiting

agent-browser wait "#modal"                          # Wait for element visible
agent-browser wait 2000                              # Wait N milliseconds
agent-browser wait --text "Welcome back"             # Wait for text
agent-browser wait --url "**/dashboard"              # Wait for URL pattern
agent-browser wait --load networkidle                # Wait for load state
agent-browser wait --fn "window.appReady === true"   # Wait for JS condition
agent-browser wait "#spinner" --state hidden         # Wait for element to disappear

Load states: load, domcontentloaded, networkidle

JavaScript Eval

agent-browser eval "document.title"
agent-browser eval "JSON.stringify(window.__STATE__)"
agent-browser eval -b "BASE64_ENCODED_JS"
echo "return document.body.innerHTML" | agent-browser eval --stdin

Batch Execution (efficient multi-step)

echo '[
  ["open", "https://example.com"],
  ["snapshot", "-i"],
  ["fill", "@e2", "user@example.com"],
  ["click", "@e1"],
  ["screenshot", "result.png"]
]' | agent-browser batch --json

# Stop on first failure
agent-browser batch --bail < commands.json

Tabs & Frames

agent-browser tab                    # List tabs
agent-browser tab new https://...    # New tab with URL
agent-browser tab 2                  # Switch to tab 2
agent-browser tab close              # Close current tab
agent-browser frame "#my-iframe"     # Switch into iframe
agent-browser frame main             # Return to main frame

Cookies & Storage

agent-browser cookies
agent-browser cookies set session_id "abc123"
agent-browser cookies clear

agent-browser storage local
agent-browser storage local set theme dark
agent-browser storage local clear
agent-browser storage session set cart '{"items":[]}'

Network

agent-browser network route "**/api/users" --body '{"users":[]}'  # Mock response
agent-browser network route "**/ads/**" --abort                    # Block requests
agent-browser network unroute                                       # Remove all routes
agent-browser network requests --filter api                        # View requests
agent-browser network har start
agent-browser network har stop recording.har

Browser Settings

agent-browser set viewport 1280 800
agent-browser set viewport 375 812 2        # With device pixel ratio (retina)
agent-browser set device "iPhone 14"
agent-browser set geo 37.7749 -122.4194
agent-browser set offline on
agent-browser set headers '{"X-Custom":"value"}'
agent-browser set credentials admin secret
agent-browser set media dark

Auth State

agent-browser state save ./auth.json    # Save cookies + localStorage
agent-browser state load ./auth.json    # Restore auth state
agent-browser state list                # List saved states
agent-browser state show auth.json      # Summary of saved state

Dialogs

agent-browser dialog accept             # Accept alert/confirm/prompt
agent-browser dialog accept "My input"  # Accept prompt with text
agent-browser dialog dismiss

Clipboard

agent-browser clipboard read
agent-browser clipboard write "Hello, World!"
agent-browser 

Discussion

Product Hunt–style comments (not star reviews)
  • No comments yet — start the thread.
general reviews

Ratings

4.528 reviews
  • Soo Bansal· Dec 16, 2024

    We added agent-browser-automation from the explainx registry; install was straightforward and the SKILL.md answered most questions upfront.

  • Hana Haddad· Nov 7, 2024

    agent-browser-automation fits our agent workflows well — practical, well scoped, and easy to wire into existing repos.

  • Aarav Mehta· Oct 26, 2024

    agent-browser-automation has been reliable in day-to-day use. Documentation quality is above average for community skills.

  • Xiao Sharma· Sep 5, 2024

    agent-browser-automation is among the better-maintained entries we tried; worth keeping pinned for repeat workflows.

  • Yash Thakker· Sep 1, 2024

    Keeps context tight: agent-browser-automation is the kind of skill you can hand to a new teammate without a long onboarding doc.

  • Arya Rahman· Sep 1, 2024

    Registry listing for agent-browser-automation matched our evaluation — installs cleanly and behaves as described in the markdown.

  • Sophia Diallo· Aug 24, 2024

    Solid pick for teams standardizing on skills: agent-browser-automation is focused, and the summary matches what you get after install.

  • Dhruvi Jain· Aug 20, 2024

    I recommend agent-browser-automation for anyone iterating fast on agent tooling; clear intent and a small, reviewable surface area.

  • Henry Menon· Aug 20, 2024

    Useful defaults in agent-browser-automation — fewer surprises than typical one-off scripts, and it plays nicely with `npx skills` flows.

  • Mia Lopez· Jul 15, 2024

    I recommend agent-browser-automation for anyone iterating fast on agent tooling; clear intent and a small, reviewable surface area.

showing 1-10 of 28

1 / 3