QA Testing Strategy (Jan 2026)
Risk-based quality engineering strategy for modern software delivery.
Core references: curated links in data/sources.json (SLOs/error budgets, contracts, E2E, OpenTelemetry). Start with references/operational-playbook.md for a compact, navigable overview.
Scope
- Create or update a risk-based test strategy (what to test, where, and why)
- Define quality gates and release criteria (merge vs deploy)
- Select the smallest effective layer (unit → integration → contract → E2E)
- Make failures diagnosable (artifacts, logs/traces, ownership)
- Operationalize reliability (flake SLO, quarantines, suite budgets)
Use Instead
Quick Reference
| Test Type | Goal | Typical Use |
|---|---|---|
| Unit | Prove logic and invariants fast | Pure functions, core business rules |
| Component | Validate UI behavior in isolation | UI components and state transitions |
| Integration | Validate boundaries with real deps | API + DB, queues, external adapters |
| Contract | Prevent breaking changes cross-team | OpenAPI/AsyncAPI/JSON Schema/Protobuf |
| E2E | Validate critical user journeys | 1-2 "money paths" per product area |
| Performance | Enforce budgets and capacity | Load, stress, soak, regression trends |
| Visual | Catch UI regressions | Layout/visual diffs on stable pages |
| Accessibility | Automate WCAG checks | axe smoke + targeted manual audits |
| Security | Catch common web vulns early | DAST smoke + critical checks in CI |
Default Workflow
- Clarify scope and risk: critical journeys, failure modes, and non-functional risks (latency, data loss, auth).
- Define quality signals: SLOs/error budgets, contract/schema checks, and what blocks merge vs blocks deploy.
- Choose the smallest effective layer (unit → integration → contract → E2E).
- Make failures diagnosable: artifacts + correlation IDs (logs/traces/screenshots), clear ownership, deflake runbook.
- Operationalize: flake SLO, quarantine with expiry, suite budgets (PR gate vs scheduled), dashboards.
Test Pyramid
        /\
       /E2E\            5-10%  - Critical journeys
      /------\
     / Integr.\         15-25% - API, DB, queues
    /----------\
   / Component  \       20-30% - UI modules
  /--------------\
 /      Unit      \     40-60% - Logic and invariants
/------------------\
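The pyramid's target shares can be checked mechanically against a suite's actual test counts. The following is a minimal sketch; the names `PYRAMID_RANGES` and `pyramidDeviations` are hypothetical, and the ranges are simply the percentages from the pyramid above.

```typescript
// Target share (percent of all tests) per layer, taken from the pyramid above.
const PYRAMID_RANGES = {
  unit: [40, 60],
  component: [20, 30],
  integration: [15, 25],
  e2e: [5, 10],
} as const;

type PyramidLayer = keyof typeof PYRAMID_RANGES;

// Returns one message per layer whose share falls outside its target range.
function pyramidDeviations(counts: Record<PyramidLayer, number>): string[] {
  const layers: PyramidLayer[] = ['unit', 'component', 'integration', 'e2e'];
  const total = layers.reduce((sum, layer) => sum + counts[layer], 0);
  if (total === 0) return ['no tests counted'];

  const deviations: string[] = [];
  for (const layer of layers) {
    const [min, max] = PYRAMID_RANGES[layer];
    const share = (counts[layer] * 100) / total;
    if (share < min || share > max) {
      deviations.push(`${layer}: ${share.toFixed(1)}% (target ${min}-${max}%)`);
    }
  }
  return deviations;
}
```

Treat the result as a prompt for discussion rather than a hard gate: the ranges are guidance, and a risk-based strategy may justify deviating from them.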
Decision Tree: Test Strategy
Need to test: [Feature Type]
│
├─ Pure business logic/invariants? → Unit tests (mock boundaries)
│
├─ UI component/state transitions? → Component tests
│   └─ Cross-page user journey? → E2E tests
│
├─ API Endpoint?
│   ├─ Single service boundary? → Integration tests (real DB/deps)
│   └─ Cross-service compatibility? → Contract tests (schema/versioning)
│
├─ Event-driven/API schema evolution? → Contract + backward-compat tests
│
└─ Performance-critical? → k6 load testing
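The decision tree can also be expressed as a small lookup function, which is handy in planning scripts or PR templates. This is a sketch only: `FeatureTraits`, `TestLayer`, and `recommendLayers` are hypothetical names, and the boolean flags are one possible encoding of the tree's questions.

```typescript
// Hypothetical flags, one per question in the decision tree.
interface FeatureTraits {
  pureLogic?: boolean;
  uiComponent?: boolean;
  crossPageJourney?: boolean;
  apiEndpoint?: boolean;
  singleServiceBoundary?: boolean;
  crossService?: boolean;
  schemaEvolution?: boolean;
  performanceCritical?: boolean;
}

type TestLayer =
  | 'unit' | 'component' | 'e2e' | 'integration'
  | 'contract' | 'backward-compat' | 'load';

// Walks the same branches as the tree and collects every matching layer.
function recommendLayers(t: FeatureTraits): TestLayer[] {
  const layers = new Set<TestLayer>();
  if (t.pureLogic) layers.add('unit');
  if (t.uiComponent) {
    layers.add('component');
    // The E2E branch is nested under the UI branch, as in the tree.
    if (t.crossPageJourney) layers.add('e2e');
  }
  if (t.apiEndpoint) {
    if (t.singleServiceBoundary) layers.add('integration');
    if (t.crossService) layers.add('contract');
  }
  if (t.schemaEvolution) {
    layers.add('contract');
    layers.add('backward-compat');
  }
  if (t.performanceCritical) layers.add('load');
  return Array.from(layers); // insertion order, duplicates removed
}
```

A feature can match several branches at once, which is why the function returns a list rather than a single layer.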
Core QA Principles
Definition of Done
- Strategy is risk-based: critical journeys + failure modes explicit
- Test portfolio is layered: fast checks catch most defects
- CI is economical: fast pre-merge gates, heavy suites scheduled
- Failures are diagnosable: actionable artifacts (logs/trace/screenshots)
- Flakes are managed: a flake SLO and a deflake runbook are in place
Shift-Left Gates (Pre-Merge)
- Contracts: OpenAPI/AsyncAPI/JSON Schema validation
- Static checks: lint, typecheck, secret scanning
- Fast tests: unit + key integration (avoid full E2E as PR gate)
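As an illustration of what a pre-merge contract gate checks, here is a minimal backward-compatibility sketch. It assumes a deliberately simplified schema shape (`SimpleSchema`, a field-to-type map plus a required list) rather than full OpenAPI or JSON Schema, and `findBreakingChanges` is a hypothetical helper; a real pipeline would more likely rely on a dedicated schema-diff or contract-testing tool.

```typescript
// Simplified stand-in for a real schema: field name -> type name, plus required fields.
interface SimpleSchema {
  fields: Record<string, string>;
  required: string[];
}

// Conservative rule set: removed fields, changed types, and newly required
// fields are all reported as breaking.
function findBreakingChanges(prev: SimpleSchema, next: SimpleSchema): string[] {
  const problems: string[] = [];
  for (const name of Object.keys(prev.fields)) {
    const before = prev.fields[name];
    const after = next.fields[name];
    if (after === undefined) {
      problems.push(`removed field: ${name}`);
    } else if (after !== before) {
      problems.push(`type changed: ${name} (${before} -> ${after})`);
    }
  }
  for (const name of next.required) {
    if (prev.required.indexOf(name) === -1) {
      problems.push(`newly required: ${name}`);
    }
  }
  return problems;
}
```

A PR gate would fail when the returned list is non-empty, unless the change ships under a new contract version.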
Shift-Right (Post-Deploy)
- Synthetic checks for critical paths (monitoring-as-tests)
- Canary analysis: compare SLO signals and key metrics before ramping
- Feature flags for safe rollouts and fast rollback
- Convert incidents into regression tests (prefer lower layers first)
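The canary comparison can be sketched as a pure function over two observation windows. All names here (`WindowStats`, `CanaryLimits`, `analyzeCanary`) and the threshold values used in the example are assumptions for illustration; real thresholds should be derived from the service's SLOs.

```typescript
// Metrics observed over the same time window for baseline and canary (hypothetical shape).
interface WindowStats {
  requests: number;
  errors: number;
  p95LatencyMs: number;
}

// Assumed limits: absolute error-rate increase and relative p95 latency increase.
interface CanaryLimits {
  maxErrorRateDelta: number; // e.g. 0.01 = one percentage point
  maxLatencyRatio: number;   // e.g. 1.2 = 20% slower than baseline
}

function errorRate(s: WindowStats): number {
  return s.requests === 0 ? 0 : s.errors / s.requests;
}

// Returns whether it looks safe to ramp, plus the reasons when it does not.
function analyzeCanary(
  baseline: WindowStats,
  canary: WindowStats,
  limits: CanaryLimits,
): { safeToRamp: boolean; reasons: string[] } {
  const reasons: string[] = [];
  if (canary.requests === 0) {
    reasons.push('canary received no traffic');
  }
  if (errorRate(canary) - errorRate(baseline) > limits.maxErrorRateDelta) {
    reasons.push('error rate regression');
  }
  if (
    baseline.p95LatencyMs > 0 &&
    canary.p95LatencyMs / baseline.p95LatencyMs > limits.maxLatencyRatio
  ) {
    reasons.push('p95 latency regression');
  }
  return { safeToRamp: reasons.length === 0, reasons };
}
```

Keeping the check pure makes it easy to unit test the ramp decision separately from the metrics backend.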
CI Economics
| Budget | Target |
|---|---|
| PR gate | p50 ≤ 10 min, p95 ≤ 20 min |
| Mainline health | ≥ 99% green builds/day |
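The PR gate budget can be checked from a list of recent pipeline durations. The sketch below uses the nearest-rank percentile method, which is one of several common definitions; `percentile` and `checkPrGateBudget` are hypothetical helper names, and the default limits are the targets from the table above.

```typescript
// Nearest-rank percentile over an ascending-sorted, non-empty array.
function percentile(sorted: number[], p: number): number {
  const rank = Math.max(1, Math.ceil((p / 100) * sorted.length));
  const value = sorted[rank - 1];
  if (value === undefined) throw new Error('no samples');
  return value;
}

// Compares PR gate durations (in minutes) against the p50/p95 budget.
function checkPrGateBudget(
  durationsMin: number[],
  p50Limit = 10,
  p95Limit = 20,
): { p50: number; p95: number; withinBudget: boolean } {
  const sorted = [...durationsMin].sort((a, b) => a - b);
  const p50 = percentile(sorted, 50);
  const p95 = percentile(sorted, 95);
  return { p50, p95, withinBudget: p50 <= p50Limit && p95 <= p95Limit };
}
```

With only a handful of samples the p95 is effectively the slowest run, so a budget check like this is more meaningful over a week of PR builds than over a single day.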
Flake Management
- Define a flake: a test that fails without any product change and passes on rerun
- Track weekly: flaky_failures / total_test_executions (where flaky_failure = fail_then_pass_on_rerun)
- SLO: Suite flake rate โค 1% weekly
- Quarantine policy with owner and expiry
- Use the deflake runbook: template-flaky-test-triage-deflake-runbook.md
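The flake-rate formula and the quarantine expiry rule can be sketched as follows. The record shapes (`TestExecution`, `QuarantineEntry`) and function names are hypothetical; the 1% default comes from the SLO above.

```typescript
// One record per test execution in CI (hypothetical shape).
interface TestExecution {
  testId: string;
  failedFirstAttempt: boolean;
  passedOnRerun: boolean;
}

// flaky_failures / total_test_executions, where a flaky failure is
// a first-attempt failure that passes on rerun.
function flakeRate(executions: TestExecution[]): number {
  if (executions.length === 0) return 0;
  const flaky = executions.filter((e) => e.failedFirstAttempt && e.passedOnRerun).length;
  return flaky / executions.length;
}

// Weekly SLO check; the default of 0.01 mirrors the 1% target.
function meetsFlakeSlo(executions: TestExecution[], slo = 0.01): boolean {
  return flakeRate(executions) <= slo;
}

// A quarantined test with an owner and an ISO-8601 expiry date (hypothetical shape).
interface QuarantineEntry {
  testId: string;
  owner: string;
  expires: string;
}

// Lists quarantines past their expiry, which should be fixed or deleted, not extended silently.
function expiredQuarantines(entries: QuarantineEntry[], now: Date): QuarantineEntry[] {
  return entries.filter((e) => new Date(e.expires).getTime() < now.getTime());
}
```

Note that a test that fails and also fails on rerun is counted as a real failure here, not a flake, which matches the definition above.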
Common Patterns
AAA Pattern
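it('should apply discount', () => {
  // Arrange: build the input (calculateDiscount is the function under test, defined elsewhere)
  const order = { total: 150 };
  // Act: call the unit under test exactly once
  const result = calculateDiscount(order);
  // Assert: verify the observable outcome
  expect(result.discount).toBe(15);
});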
Page Object Model (E2E)
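// Assumes Playwright; `Page` is its browser page type.
import type { Page } from '@playwright/test';

class LoginPage {
  // The page is injected so each test controls its own browser context.
  constructor(private readonly page: Page) {}

  async login(email: string, password: string) {
    await this.page.fill('[data-testid="email"]', email);
    await this.page.fill('[data-testid="password"]', password);
    await this.page.click('[data-testid="submit"]');
  }
}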
Anti-Patterns
| Anti-Pattern | Problem | Solution |
|---|---|---|
| Testing implementation | Breaks on refactor | Test behavior |
| Shared mutable state | Flaky tests | Isolate test data |
| sleep() in tests | Slow, unreliable | Use proper waits |
| Everything E2E | Slow, expensive | Use test pyramid |
| Ignoring flaky tests | False confidence | Fix or quarantine |
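One way to address the shared-mutable-state row is a factory that builds fresh test data for every test instead of reusing a module-level fixture. This is a minimal sketch; the `Order` shape and `makeOrder` helper are hypothetical.

```typescript
// Hypothetical domain object used by the tests.
interface Order {
  total: number;
  items: string[];
}

// Returns a brand-new object (and a brand-new items array) on every call,
// so mutations in one test cannot leak into another.
function makeOrder(overrides: Partial<Order> = {}): Order {
  return { total: 150, items: ['widget'], ...overrides };
}

// Each test arranges its own data:
const first = makeOrder();
const second = makeOrder({ total: 20 });
first.items.push('gadget'); // does not affect `second`
```

The same idea applies to database rows and external resources: create them per test (or per worker) and clean them up, rather than sharing a long-lived seeded record.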
Do / Avoid
Do
- Write tests against stable contracts and user-visible behavior
- Treat flaky tests as P1 reliability work
- Make "how to debug this failure" part of every suite
Avoid
- "Everything E2E" as default
- Sleeps/time-based waits (use event-based waits instead)
- Coverage % as primary quality KPI
Feature Matrix vs Test Matrix Gate (Release Blocking)
Before release, run a coverage audit that maps product features/backlog IDs to direct test evidence.
Gate Rules
- Every release-scoped feature must map to at least one direct automated test, or an explicit waiver with owner/date.
- Evidence must include file path and test identifier (suite/spec/case).
- "Covered indirectly" is not accepted without written rationale and risk acknowledgment.
- If critical features have no direct evidence, release is blocked.
Minimal Audit Output
- feature/backlog id
- coverage status (direct, indirect, none)
- evidence reference
- risk level
- owner and due date for gaps
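The gate rules and audit fields can be sketched as a typed record plus an evaluation function. The field names, the `AuditRow` shape, and `evaluateReleaseGate` are assumptions for illustration, and the sketch treats any rule violation as release-blocking.

```typescript
type Coverage = 'direct' | 'indirect' | 'none';
type Risk = 'critical' | 'high' | 'medium' | 'low';

// One row of the audit output (hypothetical field names).
interface AuditRow {
  featureId: string;
  coverage: Coverage;
  evidence?: { file: string; testId: string }; // file path + suite/spec/case identifier
  risk: Risk;
  waiver?: { owner: string; date: string; rationale?: string };
}

// Applies the gate rules and lists every violation found.
function evaluateReleaseGate(rows: AuditRow[]): { blocked: boolean; violations: string[] } {
  const violations: string[] = [];
  for (const row of rows) {
    if (row.coverage === 'direct') {
      if (!row.evidence || !row.evidence.file || !row.evidence.testId) {
        violations.push(`${row.featureId}: direct coverage claimed without file path and test identifier`);
      }
      continue;
    }
    if (row.risk === 'critical') {
      // Critical features need direct evidence; a waiver does not unblock them.
      violations.push(`${row.featureId}: critical feature has no direct evidence`);
    } else if (!row.waiver) {
      violations.push(`${row.featureId}: no direct test and no waiver`);
    } else if (row.coverage === 'indirect' && !row.waiver.rationale) {
      violations.push(`${row.featureId}: indirect coverage lacks written rationale`);
    }
  }
  return { blocked: violations.length > 0, violations };
}
```

Emitting the violations list alongside the boolean keeps the gate diagnosable: the release owner sees which feature IDs need tests or waivers.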
Resources
| Resource | Purpose |
|---|---|
| comprehensive-testing-guide.md | End-to-end playbook across layers |
| operational-playbook.md | Testing pyramid, BDD, CI gates |
| shift-left-testing.md | Contract-first, BDD, continuous testing |
| test-automation-patterns.md | Reliable patterns and anti-patterns |
| playwright-webapp-testing.md | Playwright patterns |
| chaos-resilience-testing.md | Chaos engineering |
| observability-driven-testing.md | OpenTelemetry, trace-based |
| contract-testing-2026.md | Pact, Specmatic |
| synthetic-test-data.md | Privacy-safe, ephemeral test data |
| test-environment-management.md | Environment provisioning and lifecycle |
| quality-metrics-dashboard.md | Quality metrics and dashboards |
| compliance-testing.md | SOC2, HIPAA, GDPR, PCI-DSS testing |
| feature-matrix-vs-test-matrix-gate.md | Release-blocking feature-to-test coverage audit |
Templates
Data
Related Skills
Ops Gate: Release-Safe Verification Sequence
Use this sequence for feature branches that touch user flows, pricing, localization, or analytics.
npm run lint
npm run typecheck
npm run test:unit
npm run test:e2e -- --grep "@critical"
npm run test:analytics-gate
npm run build
If a Gate Fails
- Capture the exact failing command and its first error line.
- Classify: environment issue, baseline known failure, or regression.
- After the fix, re-run only the failed gate, and only once.
- Do not continue to later gates while earlier required gates are red.
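The stop-at-first-red rule can be sketched as a small runner. The command list is the sequence above; `runGates` and `GateReport` are hypothetical names, and the command executor is injected so the logic can be tested without spawning processes.

```typescript
// The release-safe sequence, in required order.
const GATES: readonly string[] = [
  'npm run lint',
  'npm run typecheck',
  'npm run test:unit',
  'npm run test:e2e -- --grep "@critical"',
  'npm run test:analytics-gate',
  'npm run build',
];

interface GateReport {
  passed: string[];
  failed?: string;   // exact failing command, if any
  skipped: string[]; // later gates that were not run
}

// Runs gates in order and stops at the first non-zero exit code.
// `run` executes one command and returns its exit code.
function runGates(gates: readonly string[], run: (cmd: string) => number): GateReport {
  // findIndex stops calling `run` as soon as one gate fails.
  const failedAt = gates.findIndex((cmd) => run(cmd) !== 0);
  if (failedAt === -1) {
    return { passed: gates.slice(), skipped: [] };
  }
  return {
    passed: gates.slice(0, failedAt),
    failed: gates[failedAt],
    skipped: gates.slice(failedAt + 1),
  };
}
```

In a real script, `run` could wrap Node's `child_process.spawnSync` and return its `status`; the report's `failed` field then gives the exact command to capture and classify.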
Agent Output Contract for QA Handoff
Always report: