How do I install agent-browser?

Run `npx skills add https://github.com/inference-sh/skills --skill agent-browser` in your terminal. You need to have run `npx skills init` once in your project first.

Which agent frameworks does agent-browser support?

agent-browser works with any agent framework supported by the Skills registry, including Claude Code, Cursor, GitHub Copilot, Cline, Codex, and Gemini CLI.

Is agent-browser free to use?

Yes. agent-browser is free to install and use. It is published by inference-sh and available from the open explainx.ai skill registry.

Where can I read ratings and reviews for agent-browser?

Community ratings and review text appear on this explainx.ai skill page below the description. Reviews use a 1–5 scale and may include short written feedback from signed-in members.

Productivity

agent-browser

inference-sh/skills · updated Apr 8, 2026

$ npx skills add https://github.com/inference-sh/skills --skill agent-browser


summary

Playwright-based browser automation with element refs and session persistence for AI agents.

› Provides 6 core functions: open (navigate + configure), snapshot (refresh element refs), interact (click/fill/drag/upload/scroll), screenshot, execute (JavaScript), and close
› Uses the @e ref system for element targeting; refs are invalidated after navigation and require a re-snapshot to refresh
› Supports video recording with an optional cursor indicator, proxy routing, file uploads, and drag-and-drop interactions

skill.md

Agentic Browser

Browser automation for AI agents via inference.sh. Uses Playwright under the hood with a simple @e ref system for element interaction.

Quick Start

Requires the inference.sh CLI (infsh); see its install instructions.

infsh login

# Open a page and get interactive elements
infsh app run agent-browser --function open --input '{"url": "https://example.com"}' --session new

Core Workflow

Every browser automation follows this pattern:

1. Open - Navigate to URL, get @e refs for elements
2. Interact - Use refs to click, fill, drag, etc.
3. Re-snapshot - After navigation/changes, get fresh refs
4. Close - End session (returns video if recording)

# 1. Start session
RESULT=$(infsh app run agent-browser --function open --session new --input '{
  "url": "https://example.com/login"
}')
# Quote the variable so the JSON is passed to jq unmodified
SESSION_ID=$(printf '%s' "$RESULT" | jq -r '.session_id')
# Elements: @e1 [input] "Email", @e2 [input] "Password", @e3 [button] "Sign In"

# 2. Fill and submit
infsh app run agent-browser --function interact --session "$SESSION_ID" --input '{
  "action": "fill", "ref": "@e1", "text": "user@example.com"
}'
infsh app run agent-browser --function interact --session "$SESSION_ID" --input '{
  "action": "fill", "ref": "@e2", "text": "password123"
}'
infsh app run agent-browser --function interact --session "$SESSION_ID" --input '{
  "action": "click", "ref": "@e3"
}'

# 3. Re-snapshot after navigation
infsh app run agent-browser --function snapshot --session "$SESSION_ID" --input '{}'

# 4. Close when done
infsh app run agent-browser --function close --session "$SESSION_ID" --input '{}'
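
The repeated invocation above can be wrapped in a small helper. This is a minimal sketch, not part of the skill: the names `ab` and `require_session_id` are hypothetical, and the wrapper assumes only the flags shown in this document (`--function`, `--session`, `--input`). The guard relies on the fact that `jq -r` prints the literal string `null` when a key is missing.

```shell
# Hypothetical helper: ab <function> <session> [json-input]
# Wraps the invocation pattern used throughout this document.
# The input defaults to an empty JSON object when omitted.
ab() {
  local fn="$1" session="$2" input="${3:-}"
  [ -n "$input" ] || input='{}'
  infsh app run agent-browser --function "$fn" --session "$session" --input "$input"
}

# Hypothetical guard: fail early if open yielded no usable session ID.
# jq -r prints "null" when .session_id is absent from the response.
require_session_id() {
  case "$1" in
    '' | null)
      echo "open did not return a session_id" >&2
      return 1
      ;;
  esac
}

# Usage (commented out so that sourcing this file has no side effects):
# SESSION_ID=$(ab open new '{"url": "https://example.com"}' | jq -r '.session_id')
# require_session_id "$SESSION_ID" || exit 1
# ab snapshot "$SESSION_ID"
# ab close "$SESSION_ID"
```

Failing on an empty or `null` ID right after `open` avoids sending later calls to a session that was never created.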

Functions

Function	Description
`open`	Navigate to URL, configure browser (viewport, proxy, video recording)
`snapshot`	Re-fetch page state with `@e` refs after DOM changes
`interact`	Perform actions using `@e` refs (click, fill, drag, upload, etc.)
`screenshot`	Take page screenshot (viewport or full page)
`execute`	Run JavaScript code on the page
`close`	Close session, returns video if recording was enabled

Interact Actions

Action	Description	Required Fields
`click`	Click element	`ref`
`dblclick`	Double-click element	`ref`
`fill`	Clear and type text	`ref`, `text`
`type`	Type text (no clear)	`text`
`press`	Press key (Enter, Tab, etc.)	`text`
`select`	Select dropdown option	`ref`, `text`
`hover`	Hover over element	`ref`
`check`	Check checkbox	`ref`
`uncheck`	Uncheck checkbox	`ref`
`drag`	Drag and drop	`ref`, `target_ref`
`upload`	Upload file(s)	`ref`, `file_paths`
`scroll`	Scroll page	`direction` (up/down/left/right), `scroll_amount`
`back`	Go back in history	-
`wait`	Wait milliseconds	`wait_ms`
`goto`	Navigate to URL	`url`
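
The table above can be encoded as a lookup so a script can check an action before sending it. A minimal sketch, assuming the table is complete; the function name `required_fields` is hypothetical and the skill itself may validate inputs differently.

```shell
# Hypothetical lookup: print the required fields for an interact action,
# transcribed from the table above. Returns non-zero for unknown actions.
required_fields() {
  case "$1" in
    click | dblclick | hover | check | uncheck) echo "ref" ;;
    fill | select) echo "ref text" ;;
    type | press) echo "text" ;;
    drag) echo "ref target_ref" ;;
    upload) echo "ref file_paths" ;;
    scroll) echo "direction scroll_amount" ;;
    wait) echo "wait_ms" ;;
    goto) echo "url" ;;
    back) echo "" ;;
    *)
      echo "unknown action: $1" >&2
      return 1
      ;;
  esac
}

required_fields drag   # prints: ref target_ref
```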

Element Refs

Elements are returned with @e refs:

@e1 [a] "Home" href="/"
@e2 [input type="text"] placeholder="Search"
@e3 [button] "Submit"
@e4 [select] "Choose option"
@e5 [input type="checkbox"] name="agree"
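
Hard-coding a ref such as @e3 is fragile because numbering depends on the page. The sketch below looks up a ref by its quoted label instead. It assumes the element listing is already available as plain text in the format shown above; how that text is extracted from the function output is not covered here, and the name `find_ref` is hypothetical.

```shell
# Hypothetical helper: find_ref <element-listing> <label>
# Prints the ref of the first line containing "label" in double quotes.
# Note: this also matches quoted attribute values such as placeholder="Search".
find_ref() {
  printf '%s\n' "$1" | grep -F -- "\"$2\"" | head -n 1 | cut -d ' ' -f 1
}

# Sample listing copied from the example above.
ELEMENTS='@e1 [a] "Home" href="/"
@e2 [input type="text"] placeholder="Search"
@e3 [button] "Submit"
@e4 [select] "Choose option"
@e5 [input type="checkbox"] name="agree"'

find_ref "$ELEMENTS" "Submit"   # prints: @e3
```

An empty result means no element matched, which is a useful signal to re-snapshot or stop.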

Important: Refs are invalidated after navigation and should be treated as stale after any DOM change. Always re-snapshot after:

Clicking links/buttons that navigate
Form submissions
Dynamic content loading
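
One way to make the re-snapshot rule hard to forget is to pair the two calls in a helper. A minimal sketch using only the `interact` and `snapshot` functions documented here; the name `click_and_snapshot` is hypothetical. The ref is interpolated directly into the JSON, which is only safe because refs like @e3 contain no characters that need JSON escaping.

```shell
# Hypothetical helper: click_and_snapshot <session> <ref>
# Clicks the ref, then immediately re-snapshots so later steps use fresh refs.
# The snapshot runs only if the click succeeded.
click_and_snapshot() {
  local session="$1" ref="$2"
  infsh app run agent-browser --function interact --session "$session" \
    --input "{\"action\": \"click\", \"ref\": \"$ref\"}" &&
    infsh app run agent-browser --function snapshot --session "$session" --input '{}'
}

# Usage (commented out; requires a live session):
# click_and_snapshot "$SESSION_ID" "@e3"
```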

Features

Video Recording

Record browser sessions for debugging or documentation:

# Start with recording enabled (optionally show cursor indicator)
SESSION=$(infsh app run agent-browser --function open --session new --input '{
  "url": "https://example.com",
  "record_video": true,
  "show_cursor": true
}' | jq -r '.session_id')

# ... perform actions ...

# Close to get the video file
infsh app run agent-browser --function close --session "$SESSION" --input '{}'
# Returns: {"success": true, "video": <File>}
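
Because the video is returned by `close`, a script that aborts midway would skip that call. The sketch below always closes the session after running a command and preserves the command's exit status. The name `run_then_close` is hypothetical, and whether a session left open is cleaned up automatically is not stated in this document.

```shell
# Hypothetical helper: run_then_close <session> <command> [args...]
# Runs the command, always closes the session afterwards, and returns the
# command's exit status so callers can still detect a failure.
run_then_close() {
  local session="$1" status=0
  shift
  "$@" || status=$?
  infsh app run agent-browser --function close --session "$session" --input '{}'
  return "$status"
}

# Usage (commented out; my_steps is a placeholder for your own function):
# run_then_close "$SESSION" my_steps "$SESSION"
```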

Cursor Indicator

Show a visible cursor in screenshots and video (useful for demos):

infsh app run agent-browser --function open --session new --input '{
  "url": "https://example.com",
  "show_cursor": true,
  "record_video": true
}'

The cursor appears as a red dot that follows mouse movements and shows click feedback.

Proxy Support

Route traffic through a proxy server:

infsh app run agent-browser --function open --session new --input '{
  "url": "https://example.com",
  "proxy_url": "http://proxy.example.com:8080",
  "proxy_username": "user",
  "proxy_password": "pass"
}'

File Upload

Upload files to file inputs:

infsh app run agent-browser --function interact --session "$SESSION" --input '{
  "action": "upload",
  "ref": "@e5",
  "file_paths": ["/path/to/file.pdf"]
}'

Drag and Drop

Drag elements to targets:

infsh app run agent-browser --function interact --session "$SESSION" --input '{
  "action": "drag",
  "ref": "@e1",
  "target_ref": "@e2"
}'

JavaScript Execution

Run custom JavaScript:

infsh app run agent-browser --function execute --session "$SESSION" --input '{
  "code": "document.querySelectorAll(\"h2\").length"
}'
# Returns: {"result": "5", "screenshot": <File>}
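
Escaping quotes by hand, as in the `code` value above, gets error-prone for longer snippets. A minimal sketch that lets jq do the JSON escaping; the name `build_execute_input` is hypothetical, and only the `code` field shown above is assumed.

```shell
# Hypothetical helper: build_execute_input <javascript>
# Wraps a JavaScript snippet in the {"code": ...} input object.
# jq escapes quotes, backslashes, and newlines in the snippet.
build_execute_input() {
  jq -cn --arg code "$1" '{code: $code}'
}

# Usage (commented out; requires a live session):
# infsh app run agent-browser --function execute --session "$SESSION" \
#   --input "$(build_execute_input 'document.querySelectorAll("h2").length')"
```

The JavaScript can then be written in single quotes without any backslashes.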

Deep-Dive Documentation

Reference	Description
references/commands.md	Full function reference with all options
references/snapshot-refs.md	Ref lifecycle, invalidation rules, troubleshooting
references/session-management.md	Session persistence, parallel sessions
references/authentication.md	Login flows, OAuth, 2FA handling
references/video-recording.md	Recording workflows for debugging
references/proxy-support.md	Proxy configuration, geo-testing

Ready-to-Use Templates

Template	Description
templates/form-automation.sh	Form filling with validation
templates/authenticated-session.sh	Login once, reuse session
templates/capture-workflow.sh	Content extraction with screenshots

Examples

Form Submission

SESSION=$(infsh app run agent-browser --function open --session new --input '{
  "url": "https://example.com/contact"
}' | jq -r '.session_id')

# Get elements: @e1 [input] "Name", @e2 [input] "Email", @e3 [textarea], @e4 [button] "Send"

infsh app run agent-browser --function interact --session "$SESSION" --input '{"action": "fill", "ref": "@e1", "text": "John Doe"}'
infsh app run agent-browser --function interact --session "$SESSION" --input '{"action": "fill", "ref": "@e2", "text": "john@example.com"}'
infsh app run agent-browser --function interact --session "$SESSION" --input '{"action": "fill", "ref": "@e3", "text": "Hello!"}'
infsh app run agent-browser --function interact --session "$SESSION" --input '{"action": "click", "ref": "@e4"}'

infsh app run agent-browser --function snapshot --session "$SESSION" --input '{}'
infsh app run agent-browser --function close --session "$SESSION" --input '{}'

Search and Extract

SESSION=$(infsh app run agent-browser --function open --session new --input '{
  "url": "https://google.com"
}' | jq -r '.session_id')

# Assumes @e1 is the search box; confirm against the elements returned by open
infsh app run agent-browser --function interact --session "$SESSION" --input '{"action": "fill", "ref": "@e1", "text": "weather today"}'
infsh app run agent-browser --function interact --session "$SESSION" --input '{"action": "press", "text": "Enter"}'
infsh app run agent-browser --function interact --session "$SESSION" --input '{"action": "wait", "wait_ms": 2000}'

# Refs from the first page are stale after the search navigates
infsh app run agent-browser --function snapshot --session "$SESSION" --input '{}'
infsh app run age

general reviews

`Ratings`

4.4 ★★★★★ · 35 reviews

★★★★★ Sakura Rao · Dec 28, 2024
agent-browser reduced setup friction for our internal harness; good balance of opinion and flexibility.
★★★★★ Xiao Harris · Dec 20, 2024
We added agent-browser from the explainx registry; install was straightforward and the SKILL.md answered most questions upfront.
★★★★★ Rahul Santra · Nov 15, 2024
agent-browser reduced setup friction for our internal harness; good balance of opinion and flexibility.
★★★★★ Min Anderson · Nov 11, 2024
agent-browser fits our agent workflows well — practical, well scoped, and easy to wire into existing repos.
★★★★★ Pratham Ware · Oct 6, 2024
agent-browser is among the better-maintained entries we tried; worth keeping pinned for repeat workflows.
★★★★★ Kabir Abebe · Oct 2, 2024
agent-browser has been reliable in day-to-day use. Documentation quality is above average for community skills.
★★★★★ Aditi Farah · Oct 2, 2024
Solid pick for teams standardizing on skills: agent-browser is focused, and the summary matches what you get after install.
★★★★★ Chinedu Ghosh · Sep 17, 2024
I recommend agent-browser for anyone iterating fast on agent tooling; clear intent and a small, reviewable surface area.
★★★★★ Sakshi Patil · Sep 9, 2024
Useful defaults in agent-browser — fewer surprises than typical one-off scripts, and it plays nicely with `npx skills` flows.
★★★★★ Hiroshi Ndlovu · Sep 9, 2024
agent-browser has been reliable in day-to-day use. Documentation quality is above average for community skills.
