QA Testing (Playwright)
High-signal, cost-aware E2E testing for web applications.
Defaults (2026)
- Keep E2E thin: protect critical user journeys only; push coverage down (unit/integration/contract).
- Locator priority: getByRole → getByLabel/getByText → getByTestId (fallback).
- Waiting: rely on Playwright auto-wait + web-first assertions; no sleeps/time-based waits.
- Isolation: tests must run alone, in parallel, and in any order; eliminate shared mutable state.
- Flake posture: retries are a debugging tool; treat rerun-pass as a failure signal and fix root cause.
- CI posture: smoke gate on PRs; shard/parallelize regression on schedule; always keep artifacts (trace/video/screenshot).
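The defaults above could be expressed in a config along these lines. This is a minimal sketch, not a canonical setup: the `./tests/e2e` directory, the `npm run dev` command, and port 3001 are assumptions for illustration, and `failOnFlakyTests` assumes a recent Playwright release (on older versions, check whether the `--fail-on-flaky-tests` CLI flag is available instead).

```typescript
// playwright.config.ts (sketch; paths, port, and commands are assumptions)
import { defineConfig, devices } from '@playwright/test';

const isCI = !!process.env.CI;

export default defineConfig({
  testDir: './tests/e2e',          // assumed spec location
  fullyParallel: true,             // tests must be safe to run in parallel
  forbidOnly: isCI,                // a stray test.only should fail CI
  retries: isCI ? 1 : 0,           // one retry, used only to collect a trace
  failOnFlakyTests: isCI,          // treat rerun-pass as a failure signal
  reporter: [['html', { open: 'never' }]],
  use: {
    baseURL: 'http://127.0.0.1:3001',
    trace: 'on-first-retry',       // keep a trace for any test that needed a retry
    screenshot: 'only-on-failure',
    video: 'retain-on-failure',
  },
  projects: [
    { name: 'chromium', use: { ...devices['Desktop Chrome'] } },
  ],
  webServer: {
    command: 'npm run dev',        // hypothetical dev-server command
    url: 'http://127.0.0.1:3001',
    reuseExistingServer: !isCI,
  },
});
```

For scheduled regression runs, the `--shard` CLI option (for example `--shard=1/4`) splits the suite across CI jobs without changing this file.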
Quick Start
| Command | Purpose |
|---------|---------|
| npm init playwright@latest | Initialize Playwright |
| npx playwright test | Run all tests |
| npx playwright test --grep @smoke | Run smoke tests |
| npx playwright test --project=chromium | Run a single project |
| npx playwright test --ui | Debug with UI mode |
| npx playwright test --debug | Step through a test |
| npx playwright show-trace trace.zip | Inspect trace artifacts |
| npx playwright show-report | Inspect HTML report |
When to Use
- E2E tests for web applications
- Test user authentication flows
- Verify form submissions
- Test responsive designs
- Automate browser interactions
- Set up Playwright in CI/CD
When NOT to Use
| Scenario | Use Instead |
|----------|-------------|
| Unit testing | Jest, Vitest, pytest |
| API contracts | qa-api-testing-contracts |
| Load testing | k6, Locust, Artillery |
| Mobile native | Appium |
Authoring Rules
Locator Strategy
await page.getByRole('button', { name: 'Sign in' }).click();
await page.getByLabel('Email').fill('user@example.com');
await page.getByTestId('user-avatar').click();
Flake Control
- Avoid sleeps; use Playwright auto-wait
- Use retries as signal, not a crutch
- Capture trace/screenshot/video on failure
- Prefer user-like interactions; avoid force: true
Workflow
- Write the smallest test that proves the user outcome (intent + oracle).
- Stabilize locators and assertions before adding more steps.
- Make state explicit: seed per test/worker, clean up deterministically, mock third-party boundaries.
- In CI: shard/parallelize, capture artifacts, and fail fast on rerun-pass flakes.
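One way to "fail fast on rerun-pass flakes" is to classify each test by its attempt history rather than its final status. The sketch below assumes a hypothetical result shape (`TestOutcome`, one status per attempt); it is not Playwright's reporter format, so a real pipeline would first map reporter output into it.

```typescript
// Hypothetical result shape: one entry per test, one status per attempt.
type AttemptStatus = 'passed' | 'failed';
interface TestOutcome {
  title: string;
  attempts: AttemptStatus[];
}
type Verdict = 'passed' | 'flaky' | 'failed';

// A test that passed only after a failed attempt is flaky, not green.
function classify(outcome: TestOutcome): Verdict {
  const last = outcome.attempts[outcome.attempts.length - 1];
  if (last !== 'passed') return 'failed';
  return outcome.attempts.includes('failed') ? 'flaky' : 'passed';
}

// Gate a run: any flaky or failed test yields a non-zero exit code.
function gate(outcomes: TestOutcome[]): { exitCode: 0 | 1; flaky: string[]; failed: string[] } {
  const flaky = outcomes.filter((o) => classify(o) === 'flaky').map((o) => o.title);
  const failed = outcomes.filter((o) => classify(o) === 'failed').map((o) => o.title);
  return { exitCode: flaky.length + failed.length > 0 ? 1 : 0, flaky, failed };
}
```

Reporting the flaky titles separately from hard failures keeps the root-cause queue visible instead of hiding it behind a green rerun.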
Debugging Checklist
If something is flaky:
- Open trace first; identify whether it is selector ambiguity, missing wait, or state leakage.
- Replace brittle selectors with semantic locators; replace sleeps with expect(...) or a targeted wait.
- Reduce global timeouts; add scoped timeouts only when the product truly needs them.
- If it only fails in CI, look for concurrency, cold-start, CPU starvation, and environment differences.
Do / Avoid
Do:
- Make tests independent and deterministic
- Use network mocking for third-party deps
- Run smoke E2E on PRs; full regression on schedule
Avoid:
- "Test everything E2E" as default
- Weakening assertions to "fix" flakes
- Auto-healing that weakens assertions
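Network mocking for a third-party dependency might look like the following sketch. It is typed against a minimal structural subset of Playwright's `page.route` and `route.fulfill` (the `PageLike` and `RouteLike` interfaces are stand-ins so the example compiles on its own; in a real suite you would import `Page` from `@playwright/test`). The URL pattern and fixture payload are hypothetical.

```typescript
// Stand-in types covering only the calls used here; a real Playwright
// `Page` is intended to satisfy PageLike.
interface RouteLike {
  fulfill(response: { status?: number; contentType?: string; body?: string }): Promise<void>;
}
interface PageLike {
  route(url: string, handler: (route: RouteLike) => Promise<void>): Promise<void>;
}

// Hypothetical third-party endpoint and canned response.
const RATES_URL = '**/api.rates.example/v1/latest';
const RATES_FIXTURE = { base: 'USD', rates: { EUR: 0.9 } };

// Register the stub before navigating so the first request is already intercepted.
async function mockRates(page: PageLike): Promise<void> {
  await page.route(RATES_URL, async (route) => {
    await route.fulfill({
      status: 200,
      contentType: 'application/json',
      body: JSON.stringify(RATES_FIXTURE),
    });
  });
}
```

Stubbing only the third-party boundary keeps the test deterministic while the application's own backend is still exercised for real.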
Execution Preflight (High ROI)
Run this preflight before expensive E2E runs to prevent avoidable failures.
Preflight Checklist
- Repository shape:
  - Confirm the working directory and expected app root exist.
  - Verify spec paths before execution (rg --files tests/e2e | rg <target>).
- Port/process hygiene:
  - Check and clear any stale dev server port before the run (example: lsof -i :3001).
  - Avoid parallel local servers colliding with Playwright webServer.
- Command validity:
  - Validate CLI flags for current tool versions before batch runs.
  - Prefer exact spec paths or --grep over broad globs during triage.
- Artifact expectations:
  - Confirm result artifact paths exist before reading (test -f <error-context.md>).
  - If the artifact path is missing, inspect the latest test-results index first.
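The path checks in this list can be scripted so they run before any browser launches. The following is a minimal sketch using Node's `fs` module; the `preflight` function and its return shape are hypothetical, and it covers only the "does it exist" checks, not port or flag validation.

```typescript
import { existsSync } from 'node:fs';

// Return the subset of paths that do not exist on disk.
function missingPaths(paths: string[]): string[] {
  return paths.filter((p) => !existsSync(p));
}

// Hypothetical preflight: report every missing input at once so a run
// fails in milliseconds instead of after a server and browser start.
function preflight(appRoot: string, specs: string[]): { ok: boolean; problems: string[] } {
  const problems = missingPaths([appRoot, ...specs]).map((p) => `missing path: ${p}`);
  return { ok: problems.length === 0, problems };
}
```

A wrapper script could call `preflight` with the app root and the targeted spec paths, print `problems`, and exit non-zero when `ok` is false.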
Mandatory Sandbox/Port Decisions
Before running Playwright in constrained environments (sandboxed terminals, CI containers, shared dev hosts), decide and document:
- Bind host/port: confirm whether the app server must use 127.0.0.1 or 0.0.0.0, and verify the selected port is free.
- Escalation path: if bind attempts fail with EPERM/EACCES, escalate immediately instead of entering retry loops.
- Long-flow timeout budget: set an explicit per-test timeout for API-heavy flows (generation/checkout/report) instead of inflating the global timeout.
- Build lock hygiene: clear stale .next/lock and terminate stale build/dev PIDs before rerun.
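The bind and escalation decisions above can be sketched as a single-attempt port probe using Node's `net` module. The `BindAction` names and both functions are hypothetical; the point is that a busy port and a permission error lead to different actions, and neither is retried in a loop.

```typescript
import { createServer } from 'node:net';

type BindAction = 'port-busy' | 'escalate' | 'investigate';

// Map a bind error code to the next step. Permission errors are never retried.
function classifyBindError(code: string | undefined): BindAction {
  if (code === 'EADDRINUSE') return 'port-busy';
  if (code === 'EPERM' || code === 'EACCES') return 'escalate';
  return 'investigate';
}

// Try to bind once on the given host and port, then release the port immediately.
function probePort(port: number, host = '127.0.0.1'): Promise<'free' | BindAction> {
  return new Promise((resolve) => {
    const probe = createServer();
    probe.once('error', (err: NodeJS.ErrnoException) => resolve(classifyBindError(err.code)));
    probe.once('listening', () => probe.close(() => resolve('free')));
    probe.listen(port, host);
  });
}
```

Passing `'0.0.0.0'` as `host` probes the all-interfaces bind instead of loopback, which matters when the app server and the browser run in different network namespaces.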
Triage Sequence (Fastest Signal)
- Reproduce one failing test with --workers=1.
- Capture trace/video/screenshot for that single failure.
- Fix determinism root cause.
- Re-run targeted suite.
- Only then run broad regression.
Failure Patterns to Treat as Environment, Not Product Bugs
- EADDRINUSE on Playwright web server port
- Missing spec/result paths from stale assumptions
- Shell glob expansion failures for bracketed route segments
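Bracketed route segments such as `app/[id]/` are a common source of the glob failures listed above, because an unquoted `[id]` is a character class to the shell. The helpers below are a sketch of two defensive steps: single-quoting an argument for a POSIX shell, and escaping regex metacharacters for tools that treat the argument as a pattern. Whether the second step is needed for your Playwright version's file filter is an assumption to verify against its CLI documentation.

```typescript
// Wrap a value in single quotes for a POSIX shell, escaping embedded quotes.
function shellQuote(value: string): string {
  return "'" + value.replace(/'/g, "'\\''") + "'";
}

// Escape regex metacharacters so the value matches itself literally.
function escapeRegex(value: string): string {
  return value.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Example path with a bracketed route segment (hypothetical).
const spec = 'tests/e2e/app/[id]/page.spec.ts';
const shellSafe = shellQuote(spec);
const patternSafe = shellQuote(escapeRegex(spec));
```

Here `shellSafe` is suitable when the receiving tool takes a literal path, and `patternSafe` when it interprets the argument as a regular expression.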
Fact-Checking
- Use web search/web fetch to verify current external facts, versions, pricing, deadlines, regulations, or platform behavior before final answers.
- Prefer primary sources; report source links and dates for volatile information.
- If web access is unavailable, state the limitation and mark guidance as unverified.