Browser Automation with agent-browser
The CLI uses Chrome/Chromium via CDP directly. Install via npm i -g agent-browser, brew install agent-browser, or cargo install agent-browser. Run agent-browser install to download Chrome. Run agent-browser upgrade to update to the latest version.
Core Workflow
Every browser automation follows this pattern:
- Navigate:
agent-browser open <url>
- Snapshot:
agent-browser snapshot -i (get element refs like @e1, @e2)
- Interact: Use refs to click, fill, select
- Re-snapshot: After navigation or DOM changes, get fresh refs
agent-browser open https://example.com/form
agent-browser snapshot -i
agent-browser fill @e1 "[email protected]"
agent-browser fill @e2 "password123"
agent-browser click @e3
agent-browser wait --load networkidle
agent-browser snapshot -i
Command Chaining
Commands can be chained with && in a single shell invocation. The browser persists between commands via a background daemon, so chaining is safe and more efficient than separate calls.
agent-browser open https://example.com && agent-browser wait --load networkidle && agent-browser snapshot -i
agent-browser fill @e1 "[email protected]" && agent-browser fill @e2 "password123" && agent-browser click @e3
agent-browser open https://example.com && agent-browser wait --load networkidle && agent-browser screenshot page.png
When to chain: Use && when you don't need to read the output of an intermediate command before proceeding (e.g., open + wait + screenshot). Run commands separately when you need to parse the output first (e.g., snapshot to discover refs, then interact using those refs).
Handling Authentication
When automating a site that requires login, choose the approach that fits:
Option 1: Import auth from the user's browser (fastest for one-off tasks)
agent-browser --auto-connect state save ./auth.json
agent-browser --state ./auth.json open https://app.example.com/dashboard
State files contain session tokens in plaintext -- add to .gitignore and delete when no longer needed. Set AGENT_BROWSER_ENCRYPTION_KEY for encryption at rest.
Option 2: Persistent profile (simplest for recurring tasks)
agent-browser --profile ~/.myapp open https://app.example.com/login
agent-browser --profile ~/.myapp open https://app.example.com/dashboard
Option 3: Session name (auto-save/restore cookies + localStorage)
agent-browser --session-name myapp open https://app.example.com/login
agent-browser close
agent-browser --session-name myapp open https://app.example.com/dashboard
Option 4: Auth vault (credentials stored encrypted, login by name)
echo "$PASSWORD" | agent-browser auth save myapp --url https://app.example.com/login --username user --password-stdin
agent-browser auth login myapp
auth login navigates with load and then waits for login form selectors to appear before filling/clicking, which is more reliable on delayed SPA login screens.
Option 5: State file (manual save/load)
agent-browser state save ./auth.json
agent-browser state load ./auth.json
agent-browser open https://app.example.com/dashboard
See references/authentication.md for OAuth, 2FA, cookie-based auth, and token refresh patterns.
Essential Commands
agent-browser open <url>
agent-browser close
agent-browser snapshot -i
agent-browser snapshot -i -C
agent-browser snapshot -s "#selector"
agent-browser click @e1
agent-browser click @e1 --new-tab
agent-browser fill @e2 "text"
agent-browser type @e2 "text"
agent-browser select @e1 "option"
agent-browser check @e1
agent-browser press Enter
agent-browser keyboard type "text"
agent-browser keyboard inserttext "text"
agent-browser scroll down 500
agent-browser scroll down 500 --selector "div.content"
agent-browser get text @e1
agent-browser get url
agent-browser get title
agent-browser get cdp-url
agent-browser wait @e1
agent-browser wait --load networkidle
agent-browser wait --url "**/page"
agent-browser wait 2000
agent-browser wait --text "Welcome"
agent-browser wait --fn "!document.body.innerText.includes('Loading...')"
agent-browser wait "#spinner" --state hidden
agent-browser download @e1 ./file.pdf
agent-browser wait --download ./output.zip
agent-browser --download-path ./downloads open <url>
agent-browser network requests
agent-browser network route "**/api/*" --abort
agent-browser network har start
agent-browser network har stop ./capture.har
agent-browser set viewport 1920 1080
agent-browser set viewport 1920 1080 2
agent-browser set device "iPhone 14"
agent-browser screenshot
agent-browser screenshot --full
agent-browser screenshot --annotate
agent-browser screenshot --screenshot-dir ./shots
agent-browser screenshot --screenshot-format jpeg --screenshot-quality 80
agent-browser pdf output.pdf
agent-browser clipboard read
agent-browser clipboard write "Hello, World!"
agent-browser clipboard copy
agent-browser clipboard paste
agent-browser diff snapshot
agent-browser diff snapshot --baseline before.txt
agent-browser diff screenshot --baseline before.png
agent-browser diff url <url1> <url2>
agent-browser diff url <url1> <url2> --wait-until networkidle
agent-browser diff url <url1> <url2> --selector "#main"
Batch Execution
Execute multiple commands in a single invocation by piping a JSON array of string arrays to batch. This avoids per-command process startup overhead when running multi-step workflows.
echo '[
["open", "https://example.com"],
["snapshot", "-i"],
["click", "@e1"],
["screenshot", "result.png"]
]' | agent-browser batch --json
agent-browser batch --bail < commands.json
Use batch when you have a known sequence of commands that don't depend on intermediate output. Use separate commands or && chaining when you need to parse output between steps (e.g., snapshot to discover refs, then interact).
Common Patterns
Form Submission
agent-browser open https://example.com/signup
agent-browser snapshot -i
agent-browser fill @e1 "Jane Doe"
agent-browser fill @e2 "[email protected]"
agent-browser select @e3 "California"
agent-browser check @e4
agent-browser click @e5
agent-browser wait --load networkidle
Authentication with Auth Vault (Recommended)
echo "pass" | agent-browser auth save github --url https://github.com/login --username user --password-stdin
agent-browser auth login github
agent-browser auth list
agent-browser auth show github
agent-browser auth delete github