Playwright Visual Testing & Browser Automation
A comprehensive skill for browser automation and visual testing through the Playwright MCP server. It supports rapid UI testing, visual regression detection, automated browser interactions, and cross-browser validation for modern web applications.
When to Use This Skill
Use this skill when:
- Testing web applications across multiple browsers (Chromium, Firefox, WebKit)
- Implementing visual regression testing for UI changes
- Automating user interactions for QA and testing
- Validating responsive designs across different viewports
- Taking screenshots for documentation or bug reports
- Testing form submissions and user workflows
- Verifying accessibility of web interfaces
- Debugging browser-specific issues
- Creating automated E2E test suites
- Validating web applications before deployment
- Testing PWAs and single-page applications
- Capturing visual states for design reviews
Core Concepts
Playwright Browser Automation Philosophy
Playwright provides reliable end-to-end testing for modern web apps:
- Auto-wait: Automatically waits for elements to be actionable before interacting
- Web-first assertions: Retry assertions until they pass or time out
- Cross-browser: Test on Chromium, Firefox, and WebKit with a single API
- Accessibility snapshots: Navigate pages using semantic structure, not visual rendering
- Visual testing: Compare screenshots to detect visual regressions
- Network control: Intercept and mock network requests
- Multi-context: Test multiple scenarios in isolated browser contexts
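As a rough illustration of the visual-testing idea above, the sketch below compares a current screenshot against a baseline using only the Python standard library. `file_digest` and `screenshots_match` are hypothetical helpers, not part of Playwright, and the comparison is a strict byte-for-byte match. Real visual-regression tools usually apply a pixel tolerance instead, so treat this as the simplest possible baseline check.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def screenshots_match(baseline: Path, current: Path) -> bool:
    """Exact-match comparison: True only if both files are byte-identical."""
    return file_digest(baseline) == file_digest(current)
```

An exact match is brittle (anti-aliasing or a blinking cursor will break it), which is why it suits a quick smoke check more than a full regression suite.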
Key Playwright Entities
- Browser: The browser instance (Chromium, Firefox, WebKit)
- Page: A single page/tab in the browser
- Locator: A handle for finding an element; with the MCP server, elements are addressed by refs taken from the accessibility snapshot
- Snapshot: Accessibility tree representation of page state
- Screenshot: Visual capture of page or element
- Network Request: HTTP requests made by the page
- Console Messages: Browser console output
- Dialog: Browser prompts, alerts, confirms
Visual Testing Workflow
- Navigate to the target page
- Wait for page to stabilize (animations, loading)
- Capture accessibility snapshot for context
- Take screenshot of page or specific elements
- Compare against baseline (optional)
- Validate visual appearance and functionality
- Document results and issues
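The workflow above can be sketched as an ordered list of tool calls. This is a minimal sketch, not the server's API: `visual_check_plan` is a hypothetical helper, and the `name`/`arguments` dictionary shape is an assumption loosely modeled on how MCP tool calls are usually described. The stabilization step is left out because how to wait depends on the page under test.

```python
from typing import Any

ToolCall = dict[str, Any]


def visual_check_plan(url: str, filename: str, full_page: bool = False) -> list[ToolCall]:
    """Return the ordered tool calls for one visual check of `url`."""
    return [
        # 1. Navigate to the target page.
        {"name": "browser_navigate", "arguments": {"url": url}},
        # 2. Capture the accessibility snapshot for context.
        {"name": "browser_snapshot", "arguments": {}},
        # 3. Take the screenshot that will be compared or reviewed.
        {"name": "browser_take_screenshot",
         "arguments": {"filename": filename, "fullPage": full_page}},
    ]


plan = visual_check_plan("https://example.com", "homepage-full.png", full_page=True)
for step in plan:
    print(step["name"], step["arguments"])
```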
Playwright MCP Server Tools Reference
Browser Lifecycle Management
browser_navigate
Navigate to a URL in the current page.
Parameters:
url: The URL to navigate to (required)
Example:
url: "https://example.com"
Best Practices:
- Use full URLs including protocol (https://)
- Wait for navigation to complete before taking actions
- Handle redirects and page transitions
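The first best practice can be enforced before the tool is ever called. The sketch below uses `urllib.parse` from the standard library; `navigate_args` is a hypothetical helper, and restricting the scheme to `http`/`https` is an assumption made for this example (other schemes may be valid in your setup).

```python
from urllib.parse import urlparse


def navigate_args(url: str) -> dict[str, str]:
    """Build browser_navigate arguments, rejecting URLs without an http(s) scheme and host."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"expected a full http(s) URL, got {url!r}")
    return {"url": url}


print(navigate_args("https://example.com"))
```

A bare host such as `example.com` is rejected, which surfaces the missing protocol early instead of as a confusing navigation failure.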
browser_navigate_back
Navigate back to the previous page in history.
Parameters: None
Example:
(no arguments required)
Use Cases:
- Testing navigation flows
- Verifying back button behavior
- Multi-step form navigation
browser_close
Close the current browser page.
Parameters: None
When to Use:
- Clean up after testing
- Free system resources
- Reset browser state
browser_resize
Resize the browser viewport.
Parameters:
width: Width in pixels (required)
height: Height in pixels (required)
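Common Viewports:
- Small phone (e.g. iPhone SE): width: 375, height: 667
- Large phone (e.g. iPhone XR/11): width: 414, height: 896
- Tablet (e.g. iPad, portrait): width: 768, height: 1024
- Desktop (HD): width: 1280, height: 720
- Desktop (Full HD): width: 1920, height: 1080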
Example:
width: 375
height: 667
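For responsive checks, the resize call is typically paired with a screenshot at each size. The sketch below generates that sequence for the common viewports listed above. The viewport labels, the `responsive_plan` helper, and the `name`/`arguments` call shape are illustrative assumptions, not part of the server.

```python
from typing import Any

# Labels are illustrative; the sizes are the common viewports listed above.
VIEWPORTS: dict[str, tuple[int, int]] = {
    "phone-small": (375, 667),
    "phone-large": (414, 896),
    "tablet": (768, 1024),
    "desktop-hd": (1280, 720),
    "desktop-full-hd": (1920, 1080),
}


def responsive_plan(page_name: str) -> list[dict[str, Any]]:
    """Alternate resize and screenshot calls, one pair per viewport."""
    calls: list[dict[str, Any]] = []
    for label, (width, height) in VIEWPORTS.items():
        calls.append({"name": "browser_resize",
                      "arguments": {"width": width, "height": height}})
        calls.append({"name": "browser_take_screenshot",
                      "arguments": {"filename": f"{page_name}-{label}-{width}x{height}.png"}})
    return calls


for call in responsive_plan("homepage"):
    print(call)
```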
Page Inspection & Snapshots
browser_snapshot
Capture an accessibility snapshot of the current page.
Parameters: None
Returns:
- Accessibility tree with semantic structure
- Element references (ref) for interactions
- Text content and roles
- Interactive elements and states
Why Use Snapshots:
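- More reliable than screenshots for driving automation, because elements are identified by role and name rather than by pixels
- Semantic understanding of page structure
- Element references for precise interactions
- Faster to process than parsing a screenshot visually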
- Works without visual rendering
Example Snapshot Structure:
heading "Welcome" [ref=123]
text "to our site"
button "Sign In" [ref=456]
textbox "Email" [ref=789]
value: ""
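To show how a ref from the snapshot feeds later interactions, the sketch below looks up an element's ref by role and accessible name. It targets the simplified structure shown above; the real server output may be formatted differently, so the regular expression and the `find_ref` helper are assumptions for illustration only.

```python
import re
from typing import Optional

SNAPSHOT = '''heading "Welcome" [ref=123]
text "to our site"
button "Sign In" [ref=456]
textbox "Email" [ref=789]
value: ""'''

# role, quoted accessible name, then a [ref=...] marker somewhere later on the line
LINE = re.compile(r'^\s*(?:-\s*)?(?P<role>\w+)\s+"(?P<name>[^"]*)".*\[ref=(?P<ref>[^\]]+)\]')


def find_ref(snapshot: str, role: str, name: str) -> Optional[str]:
    """Return the ref of the first element matching role and name, or None."""
    for line in snapshot.splitlines():
        match = LINE.match(line)
        if match and match["role"] == role and match["name"] == name:
            return match["ref"]
    return None


print(find_ref(SNAPSHOT, "button", "Sign In"))  # 456
```

Lines without a ref, such as the `text` and `value` entries, are skipped because they cannot be targeted by an interaction.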
browser_take_screenshot
Take a screenshot of the current page or element.
Parameters:
filename: Output filename (optional, defaults to page-{timestamp}.png)
type: Image format - "png" or "jpeg" (default: png)
fullPage: Capture full scrollable page (default: false)
element: Human-readable element description (optional)
ref: Element reference from snapshot (optional, requires element)
Screenshot Types:
- Viewport Screenshot (default):
filename: "homepage-viewport.png"
- Full Page Screenshot:
filename: "homepage-full.png"
fullPage: true
- Element Screenshot:
filename: "header.png"
element: "main header navigation"
ref: "123"
Best Practices:
- Use descriptive filenames with context
- PNG for UI elements (lossless)
- JPEG for photos/images (smaller size)
- Full page for documentation
- Element screenshots for focused testing
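The parameter rules and format guidance above can be folded into a small argument builder. This is a sketch under stated assumptions: `screenshot_args` is a hypothetical helper, and inferring the image type from the file extension is a convenience chosen for this example rather than documented server behavior.

```python
from typing import Any, Optional


def screenshot_args(filename: str, *, full_page: bool = False,
                    element: Optional[str] = None,
                    ref: Optional[str] = None) -> dict[str, Any]:
    """Build browser_take_screenshot arguments from the documented parameters."""
    if ref is not None and element is None:
        # The parameter list states that ref requires element.
        raise ValueError("ref requires element")
    # Assumption: pick JPEG for .jpg/.jpeg filenames, PNG otherwise.
    image_type = "jpeg" if filename.lower().endswith((".jpg", ".jpeg")) else "png"
    args: dict[str, Any] = {"filename": filename, "type": image_type}
    if full_page:
        args["fullPage"] = True
    if element is not None:
        args["element"] = element
    if ref is not None:
        args["ref"] = ref
    return args


print(screenshot_args("homepage-full.png", full_page=True))
print(screenshot_args("header.png", element="main header navigation", ref="123"))
```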
Browser Interaction
browser_click
Click an element on the page.
Parameters:
element: Human-readable element description (required)
ref: Element reference from snapshot (required)
button: "left", "right", or "middle" (default: left)
doubleClick: true for double-click (default: false)
modifiers: Array of modifier keys to hold during the click (optional): ["Alt", "Control", "ControlOrMeta", "Meta", "Shift"]
Examples:
- Basic Click:
element: "Submit button"
ref: "456"
- Right Click:
element: "Context menu trigger"
ref: "789"
button: "right"
- Click with Modifier:
element: "Link to open in new tab"
ref: "123"
modifiers: ["ControlOrMeta"]
- Double Click:
element: "Word to select"
ref: "321"
doubleClick: true
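Because `button` and `modifiers` accept only a fixed set of values, a typo such as "Ctrl" is easy to catch before the call is made. The sketch below validates against the values listed above; `click_args` is a hypothetical helper, and omitting defaulted fields from the result is a design choice of this example.

```python
from typing import Any, Sequence

VALID_BUTTONS = {"left", "right", "middle"}
VALID_MODIFIERS = {"Alt", "Control", "ControlOrMeta", "Meta", "Shift"}


def click_args(element: str, ref: str, *, button: str = "left",
               double_click: bool = False,
               modifiers: Sequence[str] = ()) -> dict[str, Any]:
    """Build browser_click arguments, validating button and modifier names."""
    if button not in VALID_BUTTONS:
        raise ValueError(f"unknown button: {button!r}")
    unknown = [m for m in modifiers if m not in VALID_MODIFIERS]
    if unknown:
        raise ValueError(f"unknown modifiers: {unknown}")
    args: dict[str, Any] = {"element": element, "ref": ref}
    # Only include optional fields when they differ from the defaults.
    if button != "left":
        args["button"] = button
    if double_click:
        args["doubleClick"] = True
    if modifiers:
        args["modifiers"] = list(modifiers)
    return args


print(click_args("Link to open in new tab", "123", modifiers=["ControlOrMeta"]))
```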
browser_type
Type text into an editable element.
Parameters:
element: Human-readable element description (required)
ref: Element reference from snapshot (required)
text: Text to type (required)
slowly: Type one character at a time (default: false)
submit: Press Enter after typing (default: false)
Examples:
- Form Input:
element: "Email textbox"
ref: "123"
text: "user@example.com"
- Search with Submit:
element: "Search field"
ref: "456"
text: "playwright testing"
submit: true
- Character-by-Character (triggers key handlers):
element: "Auto-complete input"
ref: "789"
text: "New York"
slowly: true
browser_press_key
Press a keyboard key.
Parameters:
key: Key name or character (required)
Common Keys:
ArrowLeft, ArrowRight, ArrowUp, ArrowDown
Enter, Escape, Tab, Backspace, Delete
Home, End, PageUp, PageDown
F1-F12
Control, Alt, Shift, Meta
Examples:
key: "ArrowDown"
key: "Enter"
key: "Escape"
key: "Tab"
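Single key presses are often chained, for example to move focus with Tab and then activate the focused control. The sketch below builds such a sequence; `key_presses` and `tab_then_enter` are hypothetical helpers, and the `name`/`arguments` call shape is an assumption used for illustration.

```python
from typing import Any


def key_presses(*keys: str) -> list[dict[str, Any]]:
    """Return one browser_press_key call per key, in order."""
    return [{"name": "browser_press_key", "arguments": {"key": key}} for key in keys]


def tab_then_enter(tab_count: int) -> list[dict[str, Any]]:
    """Press Tab `tab_count` times, then Enter to activate the focused element."""
    keys = ["Tab"] * tab_count + ["Enter"]
    return key_presses(*keys)


for call in tab_then_enter(2):
    print(call)
```

A sequence like this is a simple way to exercise keyboard-only navigation, which ties in with the accessibility checks this skill is meant for.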
browser_fill_form
Fill multiple form fields at once.
Parameters:
fields: Array of field objects (required)
- name: Human-readable field name
- type: "textbox", "checkbox", "radio", "combobox", "slider"
- ref: Element reference from snapshot
- value: Value to set, passed as a string; use "true" or "false" for checkboxes
Example:
fields: [
  {
    name: "Username",
    type: "textbox",
    ref: "123",
    value: "john_doe"
  },
  {
    name: "Password",
    type: "textbox",
    ref: "456",
    value: "secretpass123"
  },
  {
    name: "Remember me",
    type: "checkbox",
    ref: "789",
    value: "true"
  }
]
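Since checkbox values are passed as the strings "true" and "false", it can help to convert booleans in one place. The sketch below builds the `fields` array from Python values; `form_field` is a hypothetical helper, and the set of field types is taken from the parameter list above.

```python
import json
from typing import Any, Union

FIELD_TYPES = {"textbox", "checkbox", "radio", "combobox", "slider"}


def form_field(name: str, field_type: str, ref: str,
               value: Union[str, bool, int, float]) -> dict[str, str]:
    """Build one browser_fill_form field, converting the value to a string."""
    if field_type not in FIELD_TYPES:
        raise ValueError(f"unknown field type: {field_type!r}")
    if isinstance(value, bool):
        # Booleans become the lowercase strings the tool expects.
        value = "true" if value else "false"
    return {"name": name, "type": field_type, "ref": ref, "value": str(value)}


payload: dict[str, Any] = {
    "fields": [
        form_field("Username", "textbox", "123", "john_doe"),
        form_field("Password", "textbox", "456", "secretpass123"),
        form_field("Remember me", "checkbox", "789", True),
    ]
}
print(json.dumps(payload, indent=2))
```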
browser_select_option
Select one or more options in a dropdown.
Parameters:
element: Human-readable element description (required)
ref: Element reference from snapshot (required)
values: Array of values to select (required)