gemini-computer-use
am-will/codex-skills · updated Apr 8, 2026
Gemini 2.5 Computer Use browser automation with Playwright-based agent loops and safety confirmations.
- Implements a screenshot-to-action cycle: capture screen, send to Gemini, parse function calls, execute in Playwright, return results until task completion or turn limit
- Supports multiple browser options: bundled Chromium (default), Chrome/Edge channels via COMPUTER_USE_BROWSER_CHANNEL, or custom executables like Brave
- Includes safety confirmation workflow that prompts users before
Gemini Computer Use
Quick start
- Copy the env template, set your API key, and source the file:
    cp env.example env.sh
    $EDITOR env.sh
    source env.sh
- Create a virtual environment and install dependencies:
    python -m venv .venv
    source .venv/bin/activate
    pip install google-genai playwright
    playwright install chromium
- Run the agent script with a prompt:
    python scripts/computer_use_agent.py \
      --prompt "Find the latest blog post title on example.com" \
      --start-url "https://example.com" \
      --turn-limit 6
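If you want to launch the agent from another Python program rather than a shell, the quick-start command can be assembled as an argument list. This is a minimal sketch: the helper name `build_agent_command` is hypothetical, and it uses only the three flags shown above.

```python
import sys


def build_agent_command(prompt, start_url, turn_limit=6,
                        script="scripts/computer_use_agent.py"):
    """Build the argv list for the quick-start invocation.

    Hypothetical helper: it only mirrors the flags shown in the quick start
    (--prompt, --start-url, --turn-limit) and assumes nothing else about the script.
    """
    return [
        sys.executable,  # reuse the interpreter of the active virtual environment
        script,
        "--prompt", prompt,
        "--start-url", start_url,
        "--turn-limit", str(turn_limit),
    ]


# Usage (run from the skill directory, after sourcing env.sh):
#   import subprocess
#   subprocess.run(build_agent_command("Find the latest blog post title on example.com",
#                                      "https://example.com"), check=True)
```

Passing a list to `subprocess.run` avoids shell quoting problems when the prompt contains quotes or spaces.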
Browser selection
- Default: Playwright's bundled Chromium (no env vars required).
- Choose a channel (Chrome/Edge) with COMPUTER_USE_BROWSER_CHANNEL.
- Use a custom Chromium-based executable (e.g., Brave) with COMPUTER_USE_BROWSER_EXECUTABLE.
If both are set, COMPUTER_USE_BROWSER_EXECUTABLE takes precedence.
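The precedence rule above can be sketched as a small function that turns the two environment variables into keyword arguments for Playwright's `chromium.launch()`. The function name `resolve_launch_kwargs` is hypothetical and this is not necessarily how the script does it; `channel` and `executable_path` are existing Playwright launch options.

```python
import os


def resolve_launch_kwargs(env=None):
    """Map the browser env vars to chromium.launch() keyword arguments.

    Illustrative sketch, not the script's actual code. An explicit
    executable wins over a channel; with neither set, Playwright's
    bundled Chromium is used.
    """
    env = os.environ if env is None else env
    executable = env.get("COMPUTER_USE_BROWSER_EXECUTABLE", "").strip()
    channel = env.get("COMPUTER_USE_BROWSER_CHANNEL", "").strip()

    if executable:
        return {"executable_path": executable}
    if channel:
        return {"channel": channel}
    return {}


# Usage with Playwright's sync API:
#   from playwright.sync_api import sync_playwright
#   with sync_playwright() as p:
#       browser = p.chromium.launch(**resolve_launch_kwargs())
```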
Core workflow (agent loop)
- Capture a screenshot and send the user goal + screenshot to the model.
- Parse function_call actions in the response.
- Execute each action in Playwright.
- If a safety_decision is require_confirmation, prompt the user before executing.
- Send function_response objects containing the latest URL + screenshot.
- Repeat until the model returns only text (no actions) or you hit the turn limit.
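The loop above can be sketched without tying it to a specific SDK by injecting the model call, the browser executor, and the confirmation prompt as plain callables. Everything here is illustrative: `Action`, `ModelReply`, and `run_agent_loop` are hypothetical names, and stopping the run when the user declines a confirmation is an assumption rather than documented behavior of the script.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Action:
    name: str                              # function_call name proposed by the model
    args: dict = field(default_factory=dict)
    safety_decision: Optional[str] = None  # e.g. "require_confirmation"


@dataclass
class ModelReply:
    text: str = ""
    actions: list = field(default_factory=list)


def run_agent_loop(goal, ask_model, execute, observe, confirm, turn_limit=6):
    """Run the screenshot-to-action cycle until the model stops proposing actions.

    ask_model(goal, feedback) -> ModelReply
    execute(action)           -> performs the action in the browser
    observe()                 -> dict with the current "url" and "screenshot"
    confirm(action)           -> True if the user approves a flagged action
    Returns the model's final text, or None on turn limit or user refusal.
    """
    feedback = [observe()]  # the first turn sends the goal plus an initial screenshot
    for _ in range(turn_limit):
        reply = ask_model(goal, feedback)
        if not reply.actions:  # text only: the model considers the task done
            return reply.text
        feedback = []
        for action in reply.actions:
            needs_ok = action.safety_decision == "require_confirmation"
            if needs_ok and not confirm(action):
                return None  # assumption: a refusal ends the run
            execute(action)
            # Each function_response carries the latest URL and screenshot.
            feedback.append({"name": action.name, **observe()})
    return None  # turn limit reached
```

Keeping the callables separate makes the loop easy to exercise with fakes before wiring in a real model client and a Playwright page.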
Operational guidance
- Run in a sandboxed browser profile or container.
- Use --exclude to block risky actions you do not want the model to take.
- Keep the viewport at 1440x900 unless you have a reason to change it.
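As a defensive complement to exclusions, an agent loop can also refuse to execute any proposed action whose name is on the blocked list. The sketch below shows that client-side check only; it does not describe how `--exclude` is parsed or applied by the script, and `split_excluded` and the dict-shaped actions are hypothetical.

```python
def split_excluded(actions, excluded):
    """Partition proposed actions into (allowed, blocked) by action name.

    Illustrative guard: `actions` is a list of dicts with a "name" key,
    `excluded` is an iterable of action names to refuse. Matching is
    case-insensitive and ignores surrounding whitespace in `excluded`.
    """
    blocked_names = {name.strip().lower() for name in excluded if name.strip()}
    allowed, blocked = [], []
    for action in actions:
        if action["name"].lower() in blocked_names:
            blocked.append(action)
        else:
            allowed.append(action)
    return allowed, blocked
```

Blocked actions can then be logged or reported back to the model instead of being run in the browser.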
Resources
- Script: scripts/computer_use_agent.py
- Reference notes: references/google-computer-use.md
- Env template: env.example
Ratings
4.5 ★★★★★ 28 reviews
- ★★★★★Anaya Shah· Dec 8, 2024
gemini-computer-use reduced setup friction for our internal harness; good balance of opinion and flexibility.
- ★★★★★Ama Tandon· Dec 4, 2024
gemini-computer-use has been reliable in day-to-day use. Documentation quality is above average for community skills.
- ★★★★★Ava Flores· Nov 27, 2024
I recommend gemini-computer-use for anyone iterating fast on agent tooling; clear intent and a small, reviewable surface area.
- ★★★★★Maya Desai· Nov 23, 2024
gemini-computer-use fits our agent workflows well — practical, well scoped, and easy to wire into existing repos.
- ★★★★★Advait Sanchez· Oct 18, 2024
Useful defaults in gemini-computer-use — fewer surprises than typical one-off scripts, and it plays nicely with `npx skills` flows.
- ★★★★★Maya Okafor· Oct 14, 2024
We added gemini-computer-use from the explainx registry; install was straightforward and the SKILL.md answered most questions upfront.
- ★★★★★Yash Thakker· Sep 1, 2024
gemini-computer-use has been reliable in day-to-day use. Documentation quality is above average for community skills.
- ★★★★★Dhruvi Jain· Aug 20, 2024
Solid pick for teams standardizing on skills: gemini-computer-use is focused, and the summary matches what you get after install.
- ★★★★★Rahul Santra· Jul 19, 2024
gemini-computer-use is among the better-maintained entries we tried; worth keeping pinned for repeat workflows.
- ★★★★★Hiroshi Menon· Jul 19, 2024
Keeps context tight: gemini-computer-use is the kind of skill you can hand to a new teammate without a long onboarding doc.