Context vs Prompt vs Loop vs Harness Engineering: The Four-Layer Agent Stack
Prompt, context, loop, and harness engineering are four different layers, not synonyms. This guide maps the stack, shows where each lever lives, and links to deep dives on Claude Code, MCP, and production agent design.
Every week in mid-2026, a new term lands on Hacker News (context engineering, loop engineering, harness engineering), and teams treat them as interchangeable upgrades to "prompt engineering."
They are not interchangeable. They are four layers of the same stack, each with different units of work, failure modes, and tools.
Prompt engineering asks: How do I word this message?
Context engineering asks: What does the model see on this call?
Loop engineering asks: What autonomous workflow repeats until a goal is met?
Harness engineering asks: What code runs the loop, tools, and verification reliably?
Confusing them is expensive. A team that rewrites prompts when the harness has no verification step will never fix silent failure loops. A team that builds a sophisticated harness with vague goals will burn tokens forever.
This guide maps the full stack, with diagrams, diagnostics, and links to explainx.ai's deeper guides on each layer.
TL;DR: the four layers at a glance
| Layer | Unit of work | You design… | Typical artifacts | When it dominates |
| --- | --- | --- | --- | --- |
| Prompt | One message | Wording, format, CoT, few-shot | System prompt text, user template | Single-turn tasks, prototyping |
| Context | One model call | Full context package | CLAUDE.md, RAG chunks, tool list, history prune rules | |
Unit of work: One message pair (system + user) or one turn in a chat.
Example: prompt-level fix
# Before (vague)
Summarize this doc.
# After (prompt-engineered)
You are a staff engineer writing release notes for developers.
Output: 3 bullets, each ≤25 words, past tense, no marketing language.
If the doc lacks version numbers, say "Version unclear"; do not invent one.
That improves a single call when the right document is already in context.
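In code, a prompt-level fix like this usually lives in a template. The following is a minimal sketch using Python's standard `string.Template`; the names `RELEASE_NOTES_PROMPT` and `build_release_notes_prompt` are hypothetical, and the wording simply mirrors the "After" prompt above.

```python
from string import Template

# Hypothetical template; the text mirrors the "After" prompt above.
# "\u2264" is the "less than or equal" sign.
RELEASE_NOTES_PROMPT = Template(
    "You are a staff engineer writing release notes for developers.\n"
    "Output: $bullets bullets, each \u2264$max_words words, past tense, "
    "no marketing language.\n"
    'If the doc lacks version numbers, say "Version unclear"; '
    "do not invent one.\n\n"
    "Document:\n$document"
)


def build_release_notes_prompt(document: str, bullets: int = 3, max_words: int = 25) -> str:
    """Fill the template with a document that has already been selected.

    Choosing which document to pass in is a context-layer decision;
    this function only controls wording and output format.
    """
    return RELEASE_NOTES_PROMPT.substitute(
        bullets=bullets, max_words=max_words, document=document
    )
```

Note what the function cannot do: if `document` is the wrong file, no change to the template will help.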
When prompt engineering is enough
One-shot translation, classification, or formatting
Early prototyping before you know the workflow shape
Failures are clearly about misunderstood instructions, not missing files
When it is not enough
The model "doesn't know" your codebase → context problem
The task needs 40 tool calls → loop problem
Tool calls hang or duplicate writes → harness problem
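The two lists above amount to a small lookup from symptom to layer. As a rough sketch, assuming hypothetical symptom labels that a team might attach during incident review (none of these names come from a real tool):

```python
from enum import Enum
from typing import Iterable, List


class Layer(Enum):
    PROMPT = "prompt"
    CONTEXT = "context"
    LOOP = "loop"
    HARNESS = "harness"


# Hypothetical symptom labels, paraphrasing the lists above.
SYMPTOM_TO_LAYER = {
    "misunderstood_instructions": Layer.PROMPT,
    "missing_codebase_knowledge": Layer.CONTEXT,
    "needs_many_tool_calls": Layer.LOOP,
    "tool_calls_hang": Layer.HARNESS,
    "duplicate_writes": Layer.HARNESS,
}


def diagnose(symptoms: Iterable[str]) -> List[Layer]:
    """Return the distinct layers implicated by the symptoms, in stack order.

    Unknown symptom labels are ignored rather than guessed at.
    """
    hit = {SYMPTOM_TO_LAYER[s] for s in symptoms if s in SYMPTOM_TO_LAYER}
    return [layer for layer in Layer if layer in hit]
```

For example, `diagnose(["duplicate_writes", "missing_codebase_knowledge"])` points at the context and harness layers, which means prompt edits alone are unlikely to help.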
Definition: Designing the autonomous workflow that decides when to call the model, what goal must be satisfied, and how to know the run is done, without you typing each turn.
Addy Osmani popularized loop engineering in June 2026, building on Boris Cherny at Anthropic:
"I don't prompt Claude anymore. I have loops that are running."
That quote is about layer 3, not layer 1. The loop still contains prompts; something generates them each iteration. You stop being that something.
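A minimal sketch of that idea, assuming three caller-supplied functions: `goal_met`, `build_prompt`, and `call_model` are hypothetical placeholders, not any vendor's API. The point is only that the prompt is produced by code on every iteration and that the loop owns the exit condition.

```python
from typing import Callable, Tuple


def run_loop(
    goal_met: Callable[[str], bool],
    build_prompt: Callable[[str, int], str],
    call_model: Callable[[str], str],
    max_iterations: int = 10,
) -> Tuple[str, int]:
    """Repeat model calls until goal_met(state) is true or the budget runs out.

    Returns the final state and the number of iterations used.
    Raises RuntimeError if the goal is never met, so failure is not silent.
    """
    state = ""
    for iteration in range(1, max_iterations + 1):
        # The loop writes the prompt each turn; no human types it.
        prompt = build_prompt(state, iteration)
        state = call_model(prompt)
        if goal_met(state):
            return state, iteration
    raise RuntimeError(f"goal not met after {max_iterations} iterations")
```

The two design choices worth noticing are the explicit `goal_met` check, which stands in for the goal spec, and the iteration budget, which guards against the vague-goal case where a run would otherwise burn tokens indefinitely.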
Definition: Building or configuring the orchestration code that executes loops: parsing model output, calling tools safely, managing retries, assembling context each turn, and enforcing exit conditions.
Boris Cherny and Anthropic engineers use harness engineering for the systems that prompt Claude iteratively (observe, plan, act, reflect) over hours.
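Two of the harness duties named in the definition, parsing model output and retrying tool calls, can be sketched as follows. This assumes the model emits tool calls as JSON of the form `{"tool": ..., "args": {...}}`; that wire format and both function names are illustrative assumptions, not how any particular harness works.

```python
import json
import time
from typing import Any, Callable, Dict


def parse_tool_call(raw: str) -> Dict[str, Any]:
    """Parse model output assumed to look like {"tool": "...", "args": {...}}.

    Raises ValueError (json.JSONDecodeError is a subclass) on anything else,
    so malformed output is rejected instead of being executed.
    """
    call = json.loads(raw)
    if not isinstance(call, dict) or "tool" not in call:
        raise ValueError("model output is not a tool call")
    call.setdefault("args", {})
    return call


def call_tool_with_retry(
    tools: Dict[str, Callable[..., Any]],
    call: Dict[str, Any],
    max_attempts: int = 3,
    backoff_seconds: float = 0.0,
) -> Any:
    """Run a registered tool, retrying on any exception up to max_attempts."""
    name = call["tool"]
    if name not in tools:
        raise KeyError(f"unknown tool: {name}")
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return tools[name](**call["args"])
        except Exception as exc:  # broad on purpose in this sketch
            last_error = exc
            time.sleep(backoff_seconds * attempt)  # linear backoff
    raise RuntimeError(f"tool {name!r} failed after {max_attempts} attempts") from last_error
```

Blind retries are only safe for idempotent tools. Retrying a write that actually succeeded before raising is one way to get the duplicate writes mentioned earlier, so a real harness would likely need idempotency keys or per-tool retry policies on top of this.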
Prompt, context, loop, and harness engineering are not four names for the same job. They are four layers of one stack:
Prompt → message wording
Context → per-call assembly
Loop → autonomous workflow
Harness → reliable execution
Karpathy named the context crisis. Cherny named the loop shift. Production teams name the harness when benchmarks move without new models.
When someone says "we need better prompts" on a long-running agent, ask: Which layer is actually failing? The answer determines whether you edit a paragraph, redesign retrieval, rewrite the goal spec, or fix retry logic.