browser-automation

Selenium

by angiejones

Automate web browser actions efficiently using Selenium WebDriver for robust testing with Selenium on Python and seamles

Automates web browser actions with Selenium WebDriver.

github stars

373

Works with major browsers including SafariNo manual scripting required - just tell the AI what to do10+ browser interaction tools

best for

  • / AI agents performing web-based tasks and workflows
  • / Automated testing of web applications
  • / Web scraping and data extraction from interactive sites
  • / Browser-based automation without manual scripting

capabilities

  • / Launch Chrome, Firefox, Edge, or Safari browsers
  • / Navigate to URLs and click elements on web pages
  • / Fill forms and type text into input fields
  • / Extract text content from web page elements
  • / Perform drag-and-drop and hover interactions
  • / Execute right-clicks and double-clicks on elements

what it does

Automates web browsers through Selenium WebDriver, allowing AI agents to click buttons, fill forms, navigate pages, and interact with websites programmatically.

about

Selenium is a community-built MCP server published by angiejones that provides AI assistants with tools and capabilities via the Model Context Protocol. Automate web browser actions efficiently using Selenium WebDriver for robust testing with Selenium on Python and seamles It is categorized under browser automation. This server exposes 14 tools that AI clients can invoke during conversations and coding sessions.

how to install

You can install Selenium in your AI client of choice. Use the install panel on this page to get one-click setup for Cursor, Claude Desktop, VS Code, and other MCP-compatible clients. This server runs locally on your machine via the stdio transport.

license

MIT

Selenium is released under the MIT license. This is a permissive open-source license, meaning you can freely use, modify, and distribute the software.

readme

MCP Selenium Server

A Model Context Protocol (MCP) server for Selenium WebDriver — browser automation for AI agents.

Watch the video

<a href="https://glama.ai/mcp/servers/@angiejones/mcp-selenium"> <img width="380" height="200" src="https://glama.ai/mcp/servers/@angiejones/mcp-selenium/badge" alt="Selenium MCP server" /> </a>

Setup

<details open> <summary><strong>Goose (Desktop)</strong></summary>

Paste into your browser address bar:

goose://extension?cmd=npx&arg=-y&arg=%40angiejones%2Fmcp-selenium%40latest&id=selenium-mcp&name=Selenium%20MCP&description=automates%20browser%20interactions
</details> <details> <summary><strong>Goose (CLI)</strong></summary>
goose session --with-extension "npx -y @angiejones/mcp-selenium@latest"
</details> <details> <summary><strong>Claude Code</strong></summary>
claude mcp add selenium -- npx -y @angiejones/mcp-selenium@latest
</details> <details> <summary><strong>Cursor / Windsurf / other MCP clients</strong></summary>
{
  "mcpServers": {
    "selenium": {
      "command": "npx",
      "args": ["-y", "@angiejones/mcp-selenium@latest"]
    }
  }
}
</details>

Example Usage

Tell the AI agent of your choice:

Open Chrome, go to github.com/angiejones, and take a screenshot.

The agent will call Selenium's APIs to start_browser, navigate, and take_screenshot. No manual scripting or explicit directions needed.

Supported Browsers

Chrome, Firefox, Edge, and Safari.

Safari note: Requires macOS. Run sudo safaridriver --enable once and enable "Allow Remote Automation" in Safari → Settings → Developer. No headless mode.


<details> <summary><strong>Tools</strong></summary>

start_browser

Launches a browser session.

ParameterTypeRequiredDescription
browserstringYeschrome, firefox, edge, or safari
optionsobjectNo{ headless: boolean, arguments: string[] }

navigate

Navigates to a URL.

ParameterTypeRequiredDescription
urlstringYesURL to navigate to

interact

Performs a mouse action on an element.

ParameterTypeRequiredDescription
actionstringYesclick, doubleclick, rightclick, or hover
bystringYesLocator strategy: id, css, xpath, name, tag, class
valuestringYesValue for the locator strategy
timeoutnumberNoMax wait in ms (default: 10000)

send_keys

Types text into an element. Clears the field first.

ParameterTypeRequiredDescription
bystringYesLocator strategy
valuestringYesLocator value
textstringYesText to enter
timeoutnumberNoMax wait in ms (default: 10000)

get_element_text

Gets the text content of an element.

ParameterTypeRequiredDescription
bystringYesLocator strategy
valuestringYesLocator value
timeoutnumberNoMax wait in ms (default: 10000)

get_element_attribute

Gets an attribute value from an element.

ParameterTypeRequiredDescription
bystringYesLocator strategy
valuestringYesLocator value
attributestringYesAttribute name (e.g., href, value, class)
timeoutnumberNoMax wait in ms (default: 10000)

press_key

Presses a keyboard key.

ParameterTypeRequiredDescription
keystringYesKey to press (e.g., Enter, Tab, a)

upload_file

Uploads a file via a file input element.

ParameterTypeRequiredDescription
bystringYesLocator strategy
valuestringYesLocator value
filePathstringYesAbsolute path to the file
timeoutnumberNoMax wait in ms (default: 10000)

take_screenshot

Captures a screenshot of the current page.

ParameterTypeRequiredDescription
outputPathstringNoSave path. If omitted, returns base64 image data.

close_session

Closes the current browser session. No parameters.

execute_script

Executes JavaScript in the browser. Use for advanced interactions not covered by other tools (e.g., drag and drop, scrolling, reading computed styles, DOM manipulation).

ParameterTypeRequiredDescription
scriptstringYesJavaScript code to execute
argsarrayNoArguments accessible via arguments[0], etc.

window

Manages browser windows and tabs.

ParameterTypeRequiredDescription
actionstringYeslist, switch, switch_latest, or close
handlestringNoWindow handle (required for switch)

frame

Switches focus to a frame or back to the main page.

ParameterTypeRequiredDescription
actionstringYesswitch or default
bystringNoLocator strategy (for switch)
valuestringNoLocator value (for switch)
indexnumberNoFrame index, 0-based (for switch)
timeoutnumberNoMax wait in ms (default: 10000)

alert

Handles browser alert, confirm, or prompt dialogs.

ParameterTypeRequiredDescription
actionstringYesaccept, dismiss, get_text, or send_text
textstringNoText to send (required for send_text)
timeoutnumberNoMax wait in ms (default: 5000)

add_cookie

Adds a cookie. Browser must be on a page from the cookie's domain.

ParameterTypeRequiredDescription
namestringYesCookie name
valuestringYesCookie value
domainstringNoCookie domain
pathstringNoCookie path
securebooleanNoSecure flag
httpOnlybooleanNoHTTP-only flag
expirynumberNoUnix timestamp

get_cookies

Gets cookies. Returns all or a specific one by name.

ParameterTypeRequiredDescription
namestringNoCookie name. Omit for all cookies.

delete_cookie

Deletes cookies. Deletes all or a specific one by name.

ParameterTypeRequiredDescription
namestringNoCookie name. Omit to delete all.

diagnostics

Gets browser diagnostics captured via WebDriver BiDi (auto-enabled when supported).

ParameterTypeRequiredDescription
typestringYesconsole, errors, or network
clearbooleanNoClear buffer after returning (default: false)
</details> <details> <summary><strong>Resources</strong></summary>

MCP resources provide read-only data that clients can access without calling a tool.

browser-status://current

Returns the current browser session status (active session ID or "no active session").

PropertyValue
MIME typetext/plain
Requires browserNo

accessibility://current

Returns an accessibility tree snapshot of the current page — a compact, structured JSON representation of interactive elements and text content. Much smaller than full HTML. Useful for understanding page layout and finding elements to interact with.

PropertyValue
MIME typeapplication/json
Requires browserYes
</details>
<details> <summary><strong>Development</strong></summary>

Setup

git clone https://github.com/angiejones/mcp-selenium.git
cd mcp-selenium
npm install

Run Tests

npm test

Requires Chrome + chromedriver on PATH. Tests run headless.

Install via Smithery

npx -y @smithery/cli install @angiejones/mcp-selenium --client claude

Install globally

npm install -g @angiejones/mcp-selenium
mcp-selenium
</details>

License

MIT