browser-automation, developer-tools

WebDriverIO MCP Server

by webdriverio

Enables Claude Desktop to automate web browsers and mobile applications (iOS/Android) using WebDriverIO. Supports browser automation, mobile app testing, touch gestures, app lifecycle management, and hybrid app context switching through natural language.

github stars

15

  • Supports all major browsers and mobile platforms
  • Natural language automation commands
  • Unified interface for web and mobile testing

best for

  • QA engineers automating test scenarios
  • Developers testing web and mobile applications
  • Teams doing cross-platform automation testing

capabilities

  • Automate web browser interactions and clicks
  • Control mobile app testing on iOS and Android
  • Execute touch gestures and app lifecycle management
  • Switch between native and web contexts in hybrid apps
  • Take screenshots and capture automation results
  • Run cross-platform tests through a unified interface

what it does

Automates web browsers (Chrome, Firefox, Safari, Edge) and mobile apps (iOS/Android) through natural language commands using WebDriverIO.

about

WebDriverIO MCP Server is an official MCP server published by webdriverio that provides AI assistants with tools and capabilities via the Model Context Protocol. It enables Claude Desktop to automate browsers and iOS/Android apps with WebDriverIO. It is categorized under browser automation and developer tools.

how to install

You can install WebDriverIO MCP Server in your AI client of choice. Use the install panel on this page to get one-click setup for Cursor, Claude Desktop, VS Code, and other MCP-compatible clients. This server runs locally on your machine via the stdio transport.

license

MIT

WebDriverIO MCP Server is released under the MIT license. This is a permissive open-source license, meaning you can freely use, modify, and distribute the software.

readme

WebDriverIO MCP Server

A Model Context Protocol (MCP) server that enables Claude Desktop to interact with web browsers and mobile applications using WebDriverIO. Automate Chrome, Firefox, Edge, and Safari browsers plus iOS and Android apps—all through a unified interface.

Installation

Setup

Option 1: Configure Claude Desktop or Claude Code (Recommended)

Add the following configuration to your Claude MCP settings:

{
  "mcpServers": {
    "wdio-mcp": {
      "command": "npx",
      "args": [
        "-y",
        "@wdio/mcp"
      ]
    }
  }
}

Option 2: Global Installation

npm i -g @wdio/mcp

Then configure MCP:

{
  "mcpServers": {
    "wdio-mcp": {
      "command": "wdio-mcp"
    }
  }
}

Note: The npm package is @wdio/mcp, but the executable binary is wdio-mcp.
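If your Claude MCP settings already list other servers, the entry has to be merged in rather than pasted over them. The following is a minimal sketch of that merge, assuming the settings file has already been parsed into an object. `addWdioMcpServer` and the `other-server` entry are hypothetical names used for illustration; they are not part of @wdio/mcp.

```javascript
// Hypothetical helper: merge the wdio-mcp entry into an already-parsed
// MCP settings object without dropping servers that are already configured.
function addWdioMcpServer(config, useGlobalBinary = false) {
  const entry = useGlobalBinary
    ? { command: "wdio-mcp" }                         // Option 2: global install
    : { command: "npx", args: ["-y", "@wdio/mcp"] };  // Option 1: run via npx
  return {
    ...config,
    mcpServers: { ...(config.mcpServers ?? {}), "wdio-mcp": entry },
  };
}

// Example: an existing config with one unrelated, made-up server.
const existing = { mcpServers: { "other-server": { command: "other" } } };
const merged = addWdioMcpServer(existing);
console.log(JSON.stringify(merged, null, 2));
```

Returning a new object instead of mutating the input keeps the original settings available if the result needs to be discarded.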

Restart Claude Desktop

⚠️ You may need to fully restart Claude Desktop. On Windows, use Task Manager to ensure it's completely closed before restarting.

📖 Need help? Read the official MCP configuration guide

Prerequisites For Mobile App Automation

  • Appium Server: Install globally with npm install -g appium
  • Platform Drivers:
    • iOS: appium driver install xcuitest (requires Xcode on macOS)
    • Android: appium driver install uiautomator2 (requires Android Studio)
  • Devices/Emulators:
    • iOS Simulator (macOS) or physical device
    • Android Emulator or physical device
  • For iOS Real Devices: You'll need the device's UDID (Unique Device Identifier)
    • Find UDID on macOS: Connect device → Open Finder → Select device → Click device name/model to reveal UDID
    • Find UDID on Windows: Connect device → iTunes or Apple Devices app → Click device icon → Click "Serial Number" to reveal UDID
    • Xcode method: Window → Devices and Simulators → Select device → UDID shown as "Identifier"

Start the Appium server before using mobile features:

appium
# Server runs at http://127.0.0.1:4723 by default
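Before starting a mobile session it can help to confirm that the Appium server is actually listening. The sketch below is one way to do that, assuming Node.js 18+ (for the global `fetch`) and an Appium 2 server on its default base path, where the WebDriver `/status` endpoint lives at the server root. `appiumStatusUrl` and `isAppiumReachable` are hypothetical helper names, not part of Appium or @wdio/mcp.

```javascript
// Build the WebDriver status URL for an Appium server.
// Defaults mirror the address shown above (127.0.0.1:4723).
function appiumStatusUrl(host = "127.0.0.1", port = 4723) {
  return `http://${host}:${port}/status`;
}

// Resolve to true if the server answers the status request, false otherwise.
// `fetchImpl` is injectable so the check can be exercised without a server.
async function isAppiumReachable(url = appiumStatusUrl(), fetchImpl = globalThis.fetch) {
  try {
    const response = await fetchImpl(url);
    return response.ok;
  } catch {
    return false; // connection refused, no fetch available, and similar failures
  }
}

isAppiumReachable().then((up) =>
  console.log(up ? "Appium is reachable" : "Appium is not reachable; run `appium` first")
);
```

If your server was started with a custom base path or port, adjust the URL accordingly.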

Features

Browser Automation

  • Session Management: Start and close browser sessions (Chrome, Firefox, Edge, Safari) with headless/headed modes
  • Navigation & Interaction: Navigate URLs, click elements, fill forms, and retrieve content
  • Page Analysis: Get visible elements, accessibility trees, take screenshots
  • Cookie Management: Get, set, and delete cookies
  • Scrolling: Smooth scrolling with configurable distances

Mobile App Automation (iOS/Android)

  • Native App Testing: Test iOS (.app/.ipa) and Android (.apk) apps via Appium
  • Touch Gestures: Tap, swipe, long-press, drag-and-drop
  • App Lifecycle: Launch, background, terminate, check app state
  • Context Switching: Seamlessly switch between native and webview contexts for hybrid apps
  • Device Control: Rotate, lock/unlock, geolocation, keyboard control, notifications
  • Cross-Platform Selectors: Accessibility IDs, XPath, UiAutomator (Android), Predicates (iOS)
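The selector strategies listed above each have a string form in WebdriverIO's selector syntax. The sketch below builds those strings; `buildSelector` is a hypothetical helper, and whether each tool in this server accepts these strings verbatim is an assumption to check against the tool descriptions.

```javascript
// Build WebdriverIO-style selector strings for the strategies listed above.
// The prefixes follow WebdriverIO's documented mobile selector syntax.
function buildSelector(strategy, value) {
  switch (strategy) {
    case "accessibilityId":
      return `~${value}`;                       // works on iOS and Android
    case "xpath":
      return value;                             // e.g. //android.widget.Button
    case "uiautomator":
      return `android=${value}`;                // Android only
    case "predicate":
      return `-ios predicate string:${value}`;  // iOS only
    default:
      throw new Error(`Unknown selector strategy: ${strategy}`);
  }
}

// The element names below are illustrative only.
console.log(buildSelector("accessibilityId", "login-button"));
console.log(buildSelector("uiautomator", 'new UiSelector().text("Skip")'));
console.log(buildSelector("predicate", "label == 'Skip'"));
```

Accessibility IDs are usually the most portable choice because the same selector can work on both platforms.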

Available Tools

Session Management

| Tool | Description |
| --- | --- |
| start_browser | Start a browser session (Chrome, Firefox, Edge, Safari; headless/headed, custom dimensions) |
| start_app_session | Start an iOS or Android app session via Appium (supports state preservation via noReset) |
| close_session | Close or detach from the current browser or app session (supports detach mode) |

Navigation & Page Interaction (Web & Mobile)

| Tool | Description |
| --- | --- |
| navigate | Navigate to a URL |
| get_visible_elements | Get visible, interactable elements on the page. Supports inViewportOnly (default: true) to filter viewport elements, and includeContainers (default: false) to include layout containers on mobile |
| get_accessibility | Get accessibility tree with semantic element information |
| scroll | Scroll in a direction (up/down) by specified pixels |
| take_screenshot | Capture a screenshot |

Element Interaction (Web & Mobile)

| Tool | Description |
| --- | --- |
| click_element | Click an element |
| set_value | Type text into input fields |

Cookie Management (Web)

| Tool | Description |
| --- | --- |
| get_cookies | Get all cookies or a specific cookie by name |
| set_cookie | Set a cookie with name, value, and optional attributes |
| delete_cookies | Delete all cookies or a specific cookie |

Mobile Gestures (iOS/Android)

| Tool | Description |
| --- | --- |
| tap_element | Tap an element by selector or coordinates |
| swipe | Swipe in a direction (up/down/left/right) |
| drag_and_drop | Drag from one location to another |

App Lifecycle (iOS/Android)

| Tool | Description |
| --- | --- |
| get_app_state | Check app state (installed, running, background, foreground) |

Context Switching (Hybrid Apps)

| Tool | Description |
| --- | --- |
| get_contexts | List available contexts (NATIVE_APP, WEBVIEW_*) |
| get_current_context | Show the currently active context |
| switch_context | Switch between native and webview contexts |
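A typical hybrid-app flow lists the contexts, picks a webview, and then switches to it. The sketch below covers only the middle step, assuming the context list arrives as plain strings in the NATIVE_APP / WEBVIEW_* form shown above. `pickWebviewContext` and the example context name are hypothetical.

```javascript
// Given the names returned by a get_contexts call, pick the first webview.
// Returns null when the app currently exposes only the native context.
function pickWebviewContext(contexts) {
  return contexts.find((name) => name.startsWith("WEBVIEW_")) ?? null;
}

// Illustrative context names; real webview names depend on the app under test.
const contexts = ["NATIVE_APP", "WEBVIEW_com.example.app"];
console.log(pickWebviewContext(contexts));       // WEBVIEW_com.example.app
console.log(pickWebviewContext(["NATIVE_APP"])); // null
```

A null result usually means the webview has not loaded yet, so listing the contexts again after a short wait is a reasonable next step.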

Device Control (iOS/Android)

| Tool | Description |
| --- | --- |
| rotate_device | Rotate to portrait or landscape |
| hide_keyboard | Hide on-screen keyboard |
| get_geolocation / set_geolocation | Get or set device GPS location |

Usage Examples

Real-World Test Cases

Example 1: Testing Demo Android App (Book Scanning)

Test the Demo Android app at C:\Users\demo-liveApiGbRegionNonMinifiedRelease-3018788.apk on emulator-5554:
1. Start the app with auto-grant permissions
2. Get visible elements on the onboarding screen
3. Tap "Skip" to bypass onboarding
4. Verify main screen loads
5. Take a screenshot

Example 2: Testing World of Books E-commerce Site

You are a Testing expert, and want to assess the basic workflows of worldofbooks.com:
- Open World of Books (accept all cookies)
- Get visible elements to see navigation structure
- Search for a fiction book
- Choose one and validate if there are NEW and used book options
- Report your findings at the end

Browser Automation

Basic web testing prompt:

You are a Testing expert, and want to assess the basic workflows of a web application:
- Open World of Books (accept all cookies)
- Search for a fiction book
- Choose one and validate if there are NEW and used book options
- Report your findings at the end

Browser configuration options:

// Default settings (headed mode, 1280x1080)
start_browser()

// Firefox
start_browser({browser: 'firefox'})

// Edge
start_browser({browser: 'edge'})

// Safari (headed only; requires macOS)
start_browser({browser: 'safari'})

// Headless mode
start_browser({headless: true})

// Custom dimensions
start_browser({windowWidth: 1920, windowHeight: 1080})

// Headless with custom dimensions
start_browser({headless: true, windowWidth: 1920, windowHeight: 1080})

// Pass custom capabilities (e.g. Chrome extensions, profile, prefs)
start_browser({
  headless: false,
  capabilities: {
    'goog:chromeOptions': {
      args: ['--user-data-dir=/tmp/wdio-mcp-profile', '--load-extension=/path/to/unpacked-extension']
    }
  }
})
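The calls above are shorthand for tool arguments. On the wire, an MCP client invokes a tool with a JSON-RPC 2.0 `tools/call` request. The sketch below shows roughly what such a request could look like for `start_browser`; `toolCallRequest` is a hypothetical helper, the `id` value is arbitrary, and in practice Claude Desktop builds this message for you.

```javascript
// Build the JSON-RPC 2.0 message an MCP client sends to invoke a tool.
function toolCallRequest(id, name, args = {}) {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

// Headless Firefox with custom dimensions, using the options documented above.
const request = toolCallRequest(1, "start_browser", {
  browser: "firefox",
  headless: true,
  windowWidth: 1920,
  windowHeight: 1080,
});
console.log(JSON.stringify(request, null, 2));
```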

Mobile App Automation

Testing an iOS app on simulator:

Test my iOS app located at /path/to/MyApp.app on iPhone 15 Pro simulator:
1. Start the app session
2. Tap the login button
3. Enter "testuser" in the username field
4. Take a screenshot of the home screen
5. Close the session

Preserving app state between sessions:

Test my Android app without resetting data:
1. Start app session with noReset: true and fullReset: false
2. App launches with existing login state and user data preserved
3. Run test scenarios
4. Close session (app remains installed with data intact)
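For readers who think in Appium terms, the noReset and fullReset options in the prompt above correspond to Appium capabilities of the same names. The sketch below shows what a matching capability set might look like for Android. How this server builds its capabilities internally is an assumption here, and `statePreservingCapabilities`, the app path, and the device name are illustrative.

```javascript
// Sketch: map the session options from the prompt above onto Appium
// capabilities. Non-standard capabilities carry the "appium:" vendor prefix
// in W3C WebDriver sessions.
function statePreservingCapabilities(appPath, deviceName) {
  return {
    platformName: "Android",
    "appium:automationName": "UiAutomator2",
    "appium:deviceName": deviceName,
    "appium:app": appPath,
    "appium:noReset": true,    // keep app data between sessions
    "appium:fullReset": false, // do not uninstall the app around the session
  };
}

// Placeholder path and device name for illustration.
console.log(statePreservingCapabilities("/path/to/app.apk", "emulator-5554"));
```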

Testing an iOS app on real device:

Test my iOS

---