Boris Cherny from Anthropic reveals how engineers ship 8x more code by building iterative loops instead of single prompts. Learn harness engineering, the approach behind Claude authoring 80%+ of production code at Anthropic.
TL;DR: Single prompts are obsolete for serious software engineering. Anthropic's Boris Cherny explains that production AI coding requires harness engineering—building systems that run iterative loops where Claude observes, plans, acts, and reflects over hours or days. This approach helped Anthropic engineers ship 8x more code daily, with Claude authoring over 80% of merged production code by May 2026.
The Paradigm Shift: From Prompts to Loops
"You're not supposed to prompt Claude. You're supposed to build a system that prompts itself."
This statement from Boris Cherny, engineer at Anthropic, has sparked a fundamental rethinking of how developers should use AI coding assistants.
Single prompts fall short in three ways:
No iteration: The AI can't learn from mistakes or refine its approach
Context loss: Each prompt starts fresh, without accumulated knowledge
Human bottleneck: The developer must manually orchestrate every step
The solution: harness engineering—systems that autonomously prompt AI agents in iterative loops.
What is Harness Engineering?
Harness engineering is the practice of building frameworks that orchestrate AI agents through repeated observe-plan-act-reflect cycles.
The Core Loop
1. OBSERVE → Analyze current codebase, test results, error logs
2. PLAN → Determine next action based on observations
3. ACT → Execute code changes, run tests, make commits
4. REFLECT → Evaluate results, identify gaps, adjust strategy
5. REPEAT → Loop until task complete or timeout
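The two exit conditions in step 5 (task complete or timeout) can be made explicit in code. The following is a minimal sketch, not Anthropic's implementation: `run_loop`, `step`, and the turn and time budgets are hypothetical names chosen for illustration, and `step` stands in for one full observe-plan-act-reflect pass.

```python
import time
from typing import Callable


def run_loop(
    step: Callable[[int], bool],
    max_turns: int = 50,
    timeout_s: float = 3600.0,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Run `step` until it reports completion or a budget runs out.

    `step(turn)` is a hypothetical callable that performs one
    observe-plan-act-reflect pass and returns True when the task is done.
    """
    deadline = clock() + timeout_s
    for turn in range(max_turns):
        # Check the wall-clock budget before starting another pass.
        if clock() >= deadline:
            return "timeout"
        if step(turn):
            return "complete"
    return "max_turns"
```

Returning the exit reason, rather than a bare boolean, lets the caller tell a finished task apart from one that simply ran out of budget and needs human attention.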
This isn't a single prompt like "Add user authentication." It's a system that:
Breaks down complex tasks into sub-tasks
Executes each step autonomously
Validates results before proceeding
Adapts strategy based on outcomes
Runs for hours or days without human intervention
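The "validate before proceeding" and "adapt based on outcomes" properties can be sketched as a sub-task runner with a validation gate. This is an illustration under stated assumptions, not a published API: `SubTask`, `run_subtasks`, `execute`, and `validate` are hypothetical names. In a real harness, `execute` would call the model and apply its changes, and `validate` would run tests or linters.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class SubTask:
    name: str
    attempts: int = 0
    done: bool = False


def run_subtasks(
    subtasks: list[SubTask],
    execute: Callable[[SubTask, str], str],
    validate: Callable[[SubTask, str], tuple[bool, str]],
    max_attempts: int = 3,
) -> list[str]:
    """Run sub-tasks in order; only move on once a result validates.

    `execute` and `validate` are hypothetical hooks. Validation feedback
    is passed into the next attempt so the strategy can adapt.
    """
    log: list[str] = []
    for sub in subtasks:
        feedback = ""
        while sub.attempts < max_attempts and not sub.done:
            sub.attempts += 1
            result = execute(sub, feedback)
            sub.done, feedback = validate(sub, result)
            status = "ok" if sub.done else "failed"
            log.append(f"{sub.name}: attempt {sub.attempts} {status}")
        if not sub.done:
            log.append(f"{sub.name}: giving up, needs human review")
            break  # do not build on an unvalidated step
    return log
```

Stopping at the first sub-task that never validates is a deliberate choice in this sketch: later steps would otherwise build on work that is known to be broken.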
Example: Adding Authentication (Traditional vs Loop-Based)
Traditional Single Prompt:
User: "Add JWT authentication to the API"
Claude: [Generates auth code in one file]
User: [Realizes it needs database migration, middleware, tests, docs]
User: "Now add the migration"
Claude: [Generates migration]
User: "Add middleware"
... 15 more manual prompts ...
Loop-Based Harness Engineering:
# Simplified harness pseudocode: `claude` and `CodebaseContext` are placeholders
task = "Add JWT authentication to the API"
max_turns = 50
context = CodebaseContext()

for turn in range(max_turns):
    # OBSERVE
    status = context.analyze_codebase()
    test_results = context.run_tests()

    # PLAN
    plan = claude.plan_next_action(task, status, test_results)

    # ACT
    if plan.action == "modify_file":
        context.edit_file(plan.file_path, plan.changes)
    elif plan.action == "run_migration":
        context.execute_migration(plan.migration_file)
    elif plan.action == "write_tests":
        context.create_test_file(plan.test_code)

    # REFLECT
    if plan.task_complete:
        break

    # Update context for next iteration
    context.commit_changes(plan.commit_message)
Result: Claude autonomously completes the following:
Adds JWT library dependencies
Creates auth middleware
Writes database migration for user tokens
Updates API routes to use auth
Writes integration tests
Updates documentation
Runs tests and fixes failing cases
Commits with proper messages
All of this happens without human intervention beyond the initial task specification.
The Anthropic Results: 8x Productivity, 80% AI-Authored Code
By May 2026, Anthropic engineers using harness engineering had reached:
8x daily code output compared to traditional development
80%+ of merged production code authored by Claude
Hours to days of autonomous execution per task
76% success rate on open-ended software tasks
What Changed?
Before (Single Prompts - Q1 2025):
Engineer writes detailed spec
Claude generates code
Engineer manually integrates, tests, debugs
Repeat 10-20 times per feature
Result: 60% AI-generated code, 40% human
After (Harness Engineering - Q2 2026):
Engineer specifies high-level goal
Harness loop runs autonomously
Claude observes, plans, acts, reflects
Human reviews and approves final PR
Result: 80% AI-generated code, 20% human (architecture + review)
How to Build Your Own Harness (Practical Guide)
Level 1: Simple Loop (1 hour implementation)
Start with a basic observe-act loop for repetitive tasks:
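One way such a loop might look is sketched below. It is written under stated assumptions rather than taken from the talk: `simple_loop` and `propose_fix` are hypothetical names, and `propose_fix` stands in for whatever code sends the failing output to the model and applies the edit it returns. The only real API used is `subprocess.run` from the Python standard library.

```python
import subprocess
from typing import Callable, Sequence


def run_tests(cmd: Sequence[str]) -> tuple[bool, str]:
    """OBSERVE: run the test command and capture its combined output."""
    proc = subprocess.run(cmd, capture_output=True, text=True)
    return proc.returncode == 0, proc.stdout + proc.stderr


def simple_loop(
    test_cmd: Sequence[str],
    propose_fix: Callable[[str], None],
    max_turns: int = 10,
    observe: Callable[[Sequence[str]], tuple[bool, str]] = run_tests,
) -> bool:
    """Alternate between observing test output and acting on it.

    `propose_fix` is a hypothetical hook: it would pass the failure
    output to the model and apply whatever edit comes back.
    """
    for _ in range(max_turns):
        passed, output = observe(test_cmd)
        if passed:
            return True
        propose_fix(output)  # ACT
    # Check once more so the last fix is not left unverified.
    return observe(test_cmd)[0]
```

A call such as `simple_loop(["pnpm", "test"], propose_fix=my_fix)` would keep going until the test suite passes or ten fixes have been tried. The `observe` parameter exists so the loop can be exercised with a fake test runner before pointing it at a real repository.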
Use cases: Full feature development, complex refactors, multi-service changes
The 14% CLAUDE.md Tax and How to Fix It
Boris Cherny highlighted a critical insight: 14% of developer productivity is lost to poorly structured CLAUDE.md files (or equivalent project context files).
The Problem
Bad CLAUDE.md:
# My Project
This is a web app for users.
## Stack
- React
- Node
- Postgres
## Instructions
Be helpful!
Result: Claude wastes turns asking basic questions about:
Project structure
Code style preferences
Testing approach
Deployment process
Business logic context
The Solution: Structured Context
Good CLAUDE.md for harness engineering:
# Project Context for AI Agents

## Architecture Map
- `/app/*` - Next.js App Router (React Server Components)
- `/lib/db/*` - Prisma ORM, PostgreSQL schemas
- `/lib/api/*` - tRPC API routes
- `/components/*` - React components (shadcn/ui + Tailwind)

## Code Style (CRITICAL - Follow Exactly)
- Server components by default; 'use client' only when needed
- Prefer server actions over API routes for mutations
- Database queries only in server components or server actions
- All async functions must handle errors with try-catch
- Use Zod for all input validation

## Testing Strategy
- Unit tests: Vitest for pure functions
- Integration tests: Playwright for user flows
- Run `pnpm test` before any commit
- Coverage requirement: 70%+

## Common Patterns

### Adding a new API endpoint
1. Define Zod schema in `/lib/schemas`
2. Create tRPC procedure in `/lib/api/routers`
3. Write integration test in `__tests__/api`
4. Update OpenAPI docs if public endpoint

### Database changes
1. Modify schema in `prisma/schema.prisma`
2. Run `pnpm db:migrate:dev` to create migration
3. Update seed data if needed
4. Test migration rollback works

## Deployment
- Production: Vercel (auto-deploy on main branch)
- Staging: Railway (auto-deploy on develop branch)
- Never commit secrets - use `.env.local` and Vercel env vars

## Business Context
- Users are B2B SaaS companies (SMB to mid-market)
- Average deal size: $50K-200K/year
- Security/compliance critical: SOC2, GDPR
- Performance target: p95 page load < 2s
Impact: Reduces wasted turns by 60% and lets Claude make informed decisions without asking.
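A context file can be checked for the sections shown in the good example before a harness run starts. The sketch below is one possible check, not an official tool: `missing_sections` and `REQUIRED_SECTIONS` are hypothetical names, and the required list simply mirrors the `## ` headings of the example above, so it should be adjusted per project.

```python
import re

# Section names taken from the example file above; adjust to your project.
REQUIRED_SECTIONS = [
    "Architecture Map",
    "Code Style",
    "Testing Strategy",
    "Common Patterns",
    "Deployment",
    "Business Context",
]


def missing_sections(
    markdown: str, required: list[str] = REQUIRED_SECTIONS
) -> list[str]:
    """Return the required sections that have no matching `## ` heading."""
    # Only second-level headings count; `#` and `###` lines are ignored.
    headings = [
        m.group(1).strip().lower()
        for m in re.finditer(r"^##\s+(.+)$", markdown, flags=re.MULTILINE)
    ]
    return [
        name
        for name in required
        if not any(h.startswith(name.lower()) for h in headings)
    ]
```

Matching on the start of each heading means a title such as `## Code Style (CRITICAL - Follow Exactly)` still counts as the "Code Style" section. Run against the bad example, every required section would be reported as missing.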
Real-World Success Stories
1. Developer Reports 76% Success Rate
Early adopters of harness engineering on Twitter report:
76% task completion on open-ended software projects
3-5x faster than manual development for complex features
Reduced context-switching: Set task, review final PR hours later
2. Tutorials Going Viral
The community has created extensive guides:
24-minute workshop on harness engineering fundamentals
Last updated: June 8, 2026 | Research sources: Boris Cherny (Anthropic), developer community reports, harness engineering tutorials, production deployment data