agent-browser
supercent-io/skills-template · updated Apr 8, 2026
Deterministic browser automation for AI agents with snapshot-based element references and multi-session support.
- Interact with web pages using stable element refs (@e1, @e2, etc.) generated from snapshots, enabling reliable automation across DOM changes
- Core commands cover navigation, form filling, clicking, waiting, screenshots, PDFs, and visual regression testing via baseline comparison
- Supports parallel isolated sessions, network-aware waits (networkidle), and selector-based targeting
agent-browser - Browser Automation for AI Agents
When to use this skill
- Open websites and automate UI actions
- Fill forms, click controls, and verify outcomes
- Capture screenshots/PDFs or extract content
- Run deterministic web checks with accessibility refs
- Execute parallel browser tasks via isolated sessions
Core workflow
Always use the deterministic ref loop:
1. agent-browser open <url>
2. agent-browser snapshot -i
3. interact with refs (@e1, @e2, ...)
4. agent-browser snapshot -i again after page/DOM changes
agent-browser open https://example.com/form
agent-browser wait --load networkidle
agent-browser snapshot -i
agent-browser fill @e1 "user@example.com"
agent-browser click @e2
agent-browser snapshot -i
Command patterns
Use && chaining when intermediate output is not needed.
# Good chaining: open -> wait -> snapshot
agent-browser open https://example.com && agent-browser wait --load networkidle && agent-browser snapshot -i
# Separate calls when output is needed first
agent-browser snapshot -i
# parse refs
agent-browser click @e2
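The `# parse refs` step above can be sketched as a small filter. This assumes snapshot output tags each element with a `[ref=eN]` token (for example `- button "Submit" [ref=e2]`); that line format and the helper name `refs_for` are assumptions for illustration, so check real `snapshot -i` output before relying on it.

```shell
# Hypothetical helper: print the @ref of every snapshot line containing a string.
# Assumes snapshot lines look like:  - button "Submit" [ref=e2]
refs_for() {
  # $1 = fixed string to look for; snapshot text is read from stdin
  grep -F -- "$1" | sed -n 's/.*\[ref=\(e[0-9][0-9]*\)\].*/@\1/p'
}

# Usage (not run here):
#   agent-browser snapshot -i | refs_for 'Submit'
```

If several lines match, one ref is printed per line, so it is safer to narrow the search string until a single ref comes back before passing it to `click` or `fill`.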
High-value commands:
- Navigation: open, close
- Snapshot: snapshot -i, snapshot -i -C, snapshot -s "#selector"
- Interaction: click, fill, type, select, check, press
- Verification: diff snapshot, diff screenshot --baseline <file>
- Capture: screenshot, screenshot --annotate, pdf
- Wait: wait --load networkidle, wait <selector|@ref|ms>
Verification patterns
Use explicit evidence after actions.
# Baseline -> action -> verify structure
agent-browser snapshot -i
agent-browser click @e3
agent-browser diff snapshot
# Visual regression
agent-browser screenshot baseline.png
agent-browser click @e5
agent-browser diff screenshot --baseline baseline.png
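The structural pattern above can be wrapped so that every click leaves evidence behind. This is a minimal sketch using only commands shown in this document; the function name `click_and_diff` and the `AB` override variable are assumptions introduced here (the override allows a dry run with `AB=echo`).

```shell
# AB selects the binary to run; AB=echo prints the commands instead (hypothetical convention).
AB="${AB:-agent-browser}"

# click_and_diff <ref>: take a baseline snapshot, click, then show the structural diff.
click_and_diff() {
  "$AB" snapshot -i >/dev/null || return 1   # baseline; its output is not needed here
  "$AB" click "$1" || return 1               # the action under test
  "$AB" diff snapshot                        # evidence of what changed
}

# Usage (not run here):
#   click_and_diff @e3
```

Stopping at the first failing step keeps a failed click from being reported as "no changes".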
Safety and reliability
- Refs are invalid after navigation or significant DOM updates; re-snapshot before next action.
- Prefer wait --load networkidle or selector/ref waits over fixed sleeps.
- For multi-step JS, use eval --stdin (or base64) to avoid shell escaping breakage.
- For concurrent tasks, isolate with --session <name>.
- Use output controls in long pages to reduce context flooding.
- Optional hardening in sensitive flows: domain allowlist and action policies.
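Two of the points above, sketched in shell. The `--session` flag and `eval --stdin` come from this document; placing `--session` before the subcommand, the session names, the function names, the JavaScript body, and the `AB` override are assumptions for illustration.

```shell
# AB selects the binary to run; AB=echo gives a dry run (hypothetical convention).
AB="${AB:-agent-browser}"

# open_in_session <name> <url>: open, wait, snapshot in one isolated session, then close it.
open_in_session() {
  "$AB" --session "$1" open "$2" &&
  "$AB" --session "$1" wait --load networkidle &&
  "$AB" --session "$1" snapshot -i
  status=$?
  "$AB" --session "$1" close   # release the session even if a step failed
  return "$status"
}

# page_title: send multi-line JS on stdin; the quoted heredoc stops the shell
# from expanding or re-quoting anything inside the script.
page_title() {
  "$AB" eval --stdin <<'EOF'
const title = document.title;
title.trim();
EOF
}

# Usage (not run here): two sessions in parallel, each with a unique name
#   open_in_session a https://example.com &
#   open_in_session b https://example.com/form &
#   wait
```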
Optional hardening examples:
# Wrap page content with boundaries to reduce prompt-injection risk
export AGENT_BROWSER_CONTENT_BOUNDARIES=1
# Limit output volume for long pages
export AGENT_BROWSER_MAX_OUTPUT=50000
# Restrict navigation and network to trusted domains
export AGENT_BROWSER_ALLOWED_DOMAINS="example.com,*.example.com"
# Restrict allowed action types
export AGENT_BROWSER_ACTION_POLICY=./policy.json
Example policy.json:
{"default":"deny","allow":["navigate","snapshot","click","fill","scroll","wait","get"],"deny":["eval","download","upload","network","state"]}
CLI-flag equivalent:
agent-browser --content-boundaries --max-output 50000 --allowed-domains "example.com,*.example.com" --action-policy ./policy.json open https://example.com
Troubleshooting
- command not found: install and run agent-browser install.
- Wrong element clicked: run snapshot -i again and use fresh refs.
- Dynamic SPA content missing: wait with --load networkidle or targeted wait selector.
- Session collisions: assign unique --session names and close each session.
- Large output pressure: narrow snapshots (-i, -c, -d, -s) and extract only needed text.
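The first item can be caught before a workflow starts. A minimal preflight sketch; the function name `require_cli` is hypothetical, and the hint text simply repeats the advice above.

```shell
# require_cli [name]: succeed if the command is on PATH, otherwise print a hint.
require_cli() {
  cmd="${1:-agent-browser}"
  if command -v "$cmd" >/dev/null 2>&1; then
    return 0
  fi
  echo "$cmd: command not found; install it, then run: agent-browser install" >&2
  return 1
}

# Usage (not run here):
#   require_cli && agent-browser open https://example.com
```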
References
Ready templates:
- ./templates/form-automation.sh
- ./templates/capture-workflow.sh
Metadata
- Version: 1.1.0
- Last updated: 2026-02-26
- Scope: deterministic browser automation for agent workflows