What is the difference between prompt, context, loop, and harness engineering?

Prompt engineering is the wording of individual messages. Context engineering assembles everything the model sees on each call: system prompt, history, retrieved documents (RAG), and tool definitions. Loop engineering designs autonomous workflows: triggers, goals, and verification across many turns. Harness engineering builds the runtime code that executes loops, runs tools, manages retries, and feeds context back to the model. They nest: the harness implements loops, each loop iteration needs context, and context includes prompts.

Which layer should I fix when my agent fails?

Wrong intent or tone in one reply → prompt. Missing docs, wrong tools, or bloated history → context. Never finishes, repeats work, or stops too early → loop. Crashes, timeouts, bad retries, no sandbox → harness. Long sessions often fail at context even when prompts look fine.

Is prompt engineering dead in 2026?

No — it is the innermost layer. Loop and harness engineering sit above it. Boris Cherny's "don't prompt Claude, build loops" quote means developers should not manually drive every turn; it does not mean wording stops mattering. Bad prompts inside a great harness still produce bad steps.

How does context engineering relate to CLAUDE.md and MCP?

CLAUDE.md, SKILL.md, and MCP are context-engineering surfaces — static rules, on-demand skills, and live connectors that populate what the model sees. See the three-layer agent stack guide. The harness (Claude Code) loads them each turn according to your loop design.

What is harness engineering?

Harness engineering is building or configuring the orchestration runtime: tool execution, loop control, verification, memory, checkpoints, and human gates. Claude Code, LangGraph, and custom orchestrators are harnesses. LangChain reported Terminal-Bench gains from harness changes alone, with the same underlying model.

Where should I start learning the stack?

Start with prompt basics, then context assembly for your use case. When tasks span more than a few turns, add loop design (trigger, goal, verify). When shipping to production, invest in harness hardening — retries, checkpoints, observability. explainx.ai pathways cover context and loop engineering in depth.

Context vs prompt vs loop vs harness engineering | explainx.ai Blog


Every week in mid-2026, a new term lands on Hacker News — context engineering, loop engineering, harness engineering — and teams treat them as interchangeable upgrades to "prompt engineering."

They are not interchangeable. They are four layers of the same stack, each with different units of work, failure modes, and tools.

Prompt engineering asks: How do I word this message?
Context engineering asks: What does the model see on this call?
Loop engineering asks: What autonomous workflow repeats until a goal is met?
Harness engineering asks: What code runs the loop, tools, and verification reliably?

Confusing them is expensive. A team that rewrites prompts when the harness has no verification step will never fix silent failure loops. A team that builds a sophisticated harness with vague goals will burn tokens forever.

This guide maps the full stack — with diagrams, diagnostics, and links to explainx.ai's deeper guides on each layer.

TL;DR — the four layers at a glance

Layer	Unit of work	You design…	Typical artifacts	When it dominates
Prompt	One message

```text
┌─────────────────────────────────────────────────────────────┐
│  HARNESS ENGINEERING                                        │
│  Runtime: tool exec, sandbox, retries, checkpoints, logs    │
│  ┌───────────────────────────────────────────────────────┐  │
│  │  LOOP ENGINEERING                                     │  │
│  │  Workflow: trigger, goal, actions, verify, memory     │  │
│  │  ┌─────────────────────────────────────────────────┐  │  │
│  │  │  CONTEXT ENGINEERING (per iteration)            │  │  │
│  │  │  Assembly: history, RAG, tools, CLAUDE.md, MCP  │  │  │
│  │  │  ┌───────────────────────────────────────────┐  │  │  │
│  │  │  │  PROMPT ENGINEERING (messages inside)     │  │  │  │
│  │  │  │  Wording: role, format, constraints, CoT  │  │  │  │
│  │  │  └───────────────────────────────────────────┘  │  │  │
│  │  └─────────────────────────────────────────────────┘  │  │
│  └───────────────────────────────────────────────────────┘  │
└─────────────────────────────────────────────────────────────┘
         ▲                              ▲
         │                              │
    Boris Cherny:                  Andrej Karpathy:
    "build loops"                  "context engineering"
    (loop + harness)               (context + prompt)
```

| Agent stack | Web analogy |
| --- | --- |
| Prompt | Copy on a single button label |
| Context | Full page layout + data fetched for this view |
| Loop | User journey / multi-step checkout flow |
| Harness | Browser, server, DB, auth, error handling |

```markdown
# Before (vague)
Summarize this doc.

# After (prompt-engineered)
You are a staff engineer writing release notes for developers.
Output: 3 bullets, each ≤25 words, past tense, no marketing language.
If the doc lacks version numbers, say "Version unclear"; do not invent one.
```

| Component | Prompt decision? | Context decision |
| --- | --- | --- |
| System prompt | Yes (wording) | Also placement: constraints at top |
| Conversation history | No | Keep, summarize, or drop turns |
| Retrieved docs | No | Which chunks, how many tokens |
| Tool definitions | No | Expose 3 tools or 30? |
| Tool outputs | No | Full stdout vs truncated summary |
| CLAUDE.md | No | Always-loaded project rules |
| SKILL.md | No | Load on trigger only |
| MCP results | No | Live data injected at query time |

```text
[SYSTEM - always loaded]
Project: explainx.ai monorepo. Package manager: pnpm. Tests: pnpm test --filter web.
Never edit apps/mobile without explicit ask.

[RETRIEVED - grep hit, apps/web/lib/pathway-data.ts, lines 40-95]
{relevant_snippet_only}

[TOOLS - this task only]
Read, Edit, Bash(pnpm test:*)

[USER]
Fix the failing pathway progress test.
```
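The context package above can be assembled in code. The following is a minimal Python sketch, not how Claude Code or any particular harness does it: `Section`, `estimate_tokens`, and `assemble_context` are hypothetical names, and the four-characters-per-token estimate is a rough assumption standing in for a real tokenizer. It keeps required sections (system rules, the user message) unconditionally and admits optional ones (retrieved snippets) only while the budget allows.

```python
from dataclasses import dataclass


@dataclass
class Section:
    label: str      # e.g. "SYSTEM", "RETRIEVED", "TOOLS", "USER"
    text: str
    required: bool  # required sections are never dropped


def estimate_tokens(text: str) -> int:
    # Assumption: roughly 4 characters per token; swap in a real tokenizer.
    return max(1, len(text) // 4)


def assemble_context(sections: list[Section], budget: int) -> str:
    """Keep every required section, then add optional ones in order while they fit."""
    used = sum(estimate_tokens(s.text) for s in sections if s.required)
    if used > budget:
        raise ValueError("required sections alone exceed the token budget")
    kept = []
    for s in sections:
        if s.required:
            kept.append(s)
        elif used + estimate_tokens(s.text) <= budget:
            kept.append(s)
            used += estimate_tokens(s.text)
    # Original ordering is preserved so constraints stay at the top.
    return "\n\n".join(f"[{s.label}]\n{s.text}" for s in kept)
```

Dropping whole optional sections is the simplest policy; summarizing or truncating them instead is an equally valid choice that this sketch does not attempt.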

| Surface | Layer | Loads when |
| --- | --- | --- |
| `~/.claude/CLAUDE.md` | Context | Every session |
| `./CLAUDE.md` | Context | Project session |
| `SKILL.md` | Context | Task trigger |
| MCP servers | Context + Harness | Tool call time |
| Context Mode / sandbox MCP | Context + Harness | Isolated file reads |

| Component | Question it answers | Bad design symptom |
| --- | --- | --- |
| Trigger | What starts the run? | You still paste prompts manually |
| Goal | What verifiable state ends it? | Agent "finishes" but tests fail |
| Actions | What tools can it use? | Agent can't reach GitHub/DB |
| Verification | How do we check progress? | Infinite loops or premature stop |
| Memory | What persists across steps? | Re-reads same files, repeats edits |

```yaml
name: morning_p1_triage
trigger: cron "0 8 * * 1-5"
goal: zero open GitHub issues labeled P1 without assignee
actions: [github_mcp.list_issues, github_mcp.comment, github_mcp.assign]
verify: script checks assignee field on all P1 issues
memory: log file of triaged issue IDs this week
max_iterations: 20
human_gate: none  # read/write on issues only
```
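To make the spec concrete, here is a minimal Python sketch of a controller that could execute such a loop. It is an illustration under stated assumptions, not the runtime of any real product: `LoopSpec` and `run_loop` are hypothetical names, the goal check and the step are plain callables, and the GitHub calls are replaced by an in-memory set of issue numbers.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class LoopSpec:
    name: str
    goal_met: Callable[[], bool]      # verify step: True once the goal state holds
    step: Callable[[list[str]], str]  # one action; receives memory, returns a note
    max_iterations: int = 20
    memory: list[str] = field(default_factory=list)


def run_loop(spec: LoopSpec) -> str:
    """Repeat step() until goal_met() is True or the iteration cap is hit."""
    for i in range(spec.max_iterations):
        if spec.goal_met():
            return f"goal met after {i} steps"
        spec.memory.append(spec.step(spec.memory))
    if spec.goal_met():
        return f"goal met after {spec.max_iterations} steps"
    return "stopped: max_iterations reached"


# Stand-in for unassigned P1 issues; a real loop would query the tracker.
unassigned = {101, 102, 103}


def assign_one(memory: list[str]) -> str:
    issue = min(unassigned)
    unassigned.discard(issue)
    return f"assigned #{issue}"


spec = LoopSpec(name="morning_p1_triage", goal_met=lambda: not unassigned, step=assign_one)
print(run_loop(spec))  # prints "goal met after 3 steps"
```

The goal is checked before every step, so a run that starts in the goal state does no work, and the cap guarantees termination even when the goal is unreachable.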

| Dimension | Prompt | Loop |
| --- | --- | --- |
| Who drives turns | You | System |
| Duration | Seconds | Minutes to hours |
| Output | Text | Verified outcome |
| Leverage | 1× | 10–100× |
| Primary skill | Phrasing | Systems design |

| Symptom | Loop fix |
| --- | --- |
| Runs forever | Add `max_iterations` + no-progress detector |
| Stops after one file | Tighten goal; add test verification |
| Does wrong work confidently | Goal too vague; use verifiable criteria |
| Repeats same edit | Memory / checkpoint missing |
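A no-progress detector can be as small as a hash comparison between iterations. The sketch below is one possible approach, assuming the loop can serialize its observable state (for example a diff or a test summary) to a string; `NoProgressDetector` and `patience` are hypothetical names introduced for illustration.

```python
import hashlib


class NoProgressDetector:
    """Flags a stalled loop when the observed state stops changing."""

    def __init__(self, patience: int = 3):
        # Number of consecutive unchanged snapshots that counts as a stall.
        self.patience = patience
        self._last_digest = None
        self._repeats = 0

    def stalled(self, state: str) -> bool:
        """Record one snapshot and report whether the loop looks stuck."""
        digest = hashlib.sha256(state.encode("utf-8")).hexdigest()
        if digest == self._last_digest:
            self._repeats += 1
        else:
            self._last_digest = digest
            self._repeats = 0
        return self._repeats >= self.patience
```

Calling `stalled()` once per iteration, next to the `max_iterations` check, lets the loop exit early instead of burning its whole budget on the same edit. It only catches exact repeats; a loop that cycles between two states would need a longer history.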

| Component | What it does | Loop vs harness |
| --- | --- | --- |
| Task definition | Converts goal to first prompt | Loop designs; harness encodes |
| Context manager | Prunes history, injects memory | Context rules; harness implements |
| Tool executor | Sandboxed bash, MCP, file I/O | Harness |
| Loop controller | Iteration limits, exit signals | Harness |
| Verification | Runs tests, scripts, diff checks | Loop specifies; harness runs |
| Retry / checkpoint | Idempotent retries, resume | Harness |
| Observability | Logs, traces, cost meters | Harness |
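As an illustration of the tool executor and retry rows, the following Python sketch wraps a subprocess call with a hard timeout and a bounded retry. It is a simplified assumption of what a harness might do, not a description of any named product: `run_tool` and its parameters are hypothetical, and it provides no sandboxing at all.

```python
import subprocess
import time


def run_tool(cmd: list[str], timeout_s: float = 30.0, retries: int = 2,
             backoff_s: float = 0.5) -> dict:
    """Run a command with a hard timeout; retry on timeout or non-zero exit."""
    result: dict = {}
    for attempt in range(1, retries + 2):
        try:
            proc = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout_s)
            result = {
                "ok": proc.returncode == 0,
                "attempt": attempt,
                "stdout": proc.stdout[-2000:],  # truncate before it reaches the context
                "error": proc.stderr[-2000:],
            }
        except subprocess.TimeoutExpired:
            # subprocess.run kills the child process before raising.
            result = {"ok": False, "attempt": attempt, "stdout": "",
                      "error": f"timeout after {timeout_s}s"}
        if result["ok"]:
            break
        if attempt <= retries:
            time.sleep(backoff_s * attempt)  # linear backoff between attempts
    return result
```

Blind retries are only safe for commands that are idempotent, such as running tests or reading files. Commands with side effects need a deduplication or checkpoint mechanism on top.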

| Product | Harness features |
| --- | --- |
| Claude Code | Tools, hooks, permission modes, sessions, subagents |
| Cursor / Codex | IDE integration, model routing, agent modes |
| LangGraph | Stateful graphs, checkpoints, human-in-the-loop |
| OpenCode | Open-source coding agent harness |
| Custom | Your retry logic, your verification scripts |

```yaml
trigger: developer runs /goal
goal: pnpm test --filter web exits 0 AND auth routes use Clerk SDK
verify: test command + grep for legacy auth imports
max_iterations: 50
memory: PROGRESS.md updated each checkpoint
```
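The `verify` line in this spec combines two checks: a test command that must exit 0 and a search for leftover legacy imports. A minimal Python sketch of such a check follows. The import pattern `@/lib/legacy-auth` is a made-up placeholder, since the source does not name the legacy module, and `find_legacy_imports` and `verify` are hypothetical helpers rather than part of any tool.

```python
import re
import subprocess
from pathlib import Path

# Hypothetical pattern for the legacy auth import; replace with the real module name.
LEGACY_IMPORT = re.compile(r"from\s+['\"]@/lib/legacy-auth['\"]")


def find_legacy_imports(root: Path, pattern: re.Pattern = LEGACY_IMPORT) -> list[Path]:
    """Return TypeScript files under root that still match the legacy pattern."""
    hits = []
    for path in sorted(root.rglob("*")):
        if path.suffix in {".ts", ".tsx"} and path.is_file():
            if pattern.search(path.read_text(encoding="utf-8", errors="ignore")):
                hits.append(path)
    return hits


def verify(root: Path, test_cmd: list[str]) -> bool:
    """The goal holds only if the tests exit 0 AND no legacy imports remain."""
    tests_pass = subprocess.run(test_cmd, cwd=root).returncode == 0
    return tests_pass and not find_legacy_imports(root)
```

In the spec above, `test_cmd` would be `["pnpm", "test", "--filter", "web"]`. Keeping both conditions in one function means the loop cannot declare success on green tests alone while legacy imports are still present.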

| Symptom | Likely layer | First fix |
| --- | --- | --- |
| Model misunderstands instruction wording | Prompt | Rewrite system prompt; add few-shot |
| Model lacks facts not in prompt text | Context | Add RAG, CLAUDE.md, or file read |
| Model ignores constraints mid-session | Context | Move constraints to top; repeat before user msg |
| Wrong tool selected | Context | Reduce tool surface; improve schemas |
| Quality degrades after turn 15 | Context | Summarize/prune history |
| Never completes task | Loop | Add verifiable goal + test verification |
| Completes but wrongly | Loop | Strengthen verify step |
| Repeats same action | Loop | Add memory + no-progress detector |
| Duplicate emails / double writes | Harness | Idempotent retries, checkpoints |
| Hangs on subprocess | Harness | Timeouts, kill switches |
| Can't debug what happened | Harness | Structured logging, traces |
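For the duplicate-writes row, one common remedy is to record each completed side effect under a deterministic key and skip it on retry or resume. The Python sketch below illustrates the idea with a JSON file as the checkpoint store; `IdempotencyLog` and `run_once` are hypothetical names, and a production harness would likely use a database or the target API's own idempotency keys instead.

```python
import hashlib
import json
from pathlib import Path


class IdempotencyLog:
    """Records completed side effects on disk so a retry or resume skips them."""

    def __init__(self, path: Path):
        self.path = path
        self.done = set(json.loads(path.read_text())) if path.exists() else set()

    def run_once(self, action: str, payload: dict, effect) -> bool:
        """Run effect(payload) unless this exact action and payload already completed."""
        raw = json.dumps([action, payload], sort_keys=True).encode("utf-8")
        key = hashlib.sha256(raw).hexdigest()
        if key in self.done:
            return False  # already performed; skip the duplicate
        effect(payload)
        self.done.add(key)
        self.path.write_text(json.dumps(sorted(self.done)))
        return True
```

This sketch still allows one duplicate if the process crashes after `effect` runs but before the key is written. Closing that gap requires the downstream system to deduplicate as well.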

| Role focus | Primary layers | Secondary |
| --- | --- | --- |
| Content / marketing AI | Prompt, Context | None |
| Support bot with KB | Context, Prompt | Loop (escalation) |
| Internal coding assistant | Context, Loop | Harness (CI integration) |
| Autonomous coding agent | Loop, Harness | Context |
| Platform / agent infra | Harness | Loop, Context |

Context vs Prompt vs Loop vs Harness Engineering: The Four-Layer Agent Stack

TL;DR — the four layers at a glance

Related posts

Loop Engineering Is Now the Most-Discussed AI Skill on Developer Twitter

Fable 5 Advisor + Sonnet 5 Executor: Claude Code Setup, Prompts, and When to Consult

Claude Code Loops Official Guide: Turn-Based, /goal, /loop, and /schedule (July 2026)

The stack diagram

Layer 1 — Prompt engineering (innermost)

Example — prompt-level fix

When prompt engineering is enough

When it is not enough

Layer 2 — Context engineering (per call)

What lives in the context package

The four context levers

Example — context-level fix (same user message)

Context engineering surfaces in Claude Code

When context engineering dominates

Layer 3 — Loop engineering (workflow)

The five loop components

Example — loop spec (not a prompt)

Loop engineering vs prompt engineering

When loops fail (and it's not the prompt)

Layer 4 — Harness engineering (runtime)

Harness components

Why harness beats model upgrades on benchmarks

Products as harnesses

How the layers interact on one real task

Prompt layer (insufficient alone)

+ Context layer

+ Loop layer

+ Harness layer

Diagnostic — which layer is broken?

The 2026 career map

Practical learning path

Bottom line


Related reading