Purpose
Guide product managers through diagnosing whether they're doing context stuffing (jamming volume without intent) or context engineering (shaping structure for attention). Use this to identify context boundaries, fix "Context Hoarding Disorder," and implement tactical practices like bounded domains, episodic retrieval, and the Research→Plan→Reset→Implement cycle.
Key Distinction: Context stuffing assumes volume = quality ("paste the entire PRD"). Context engineering treats AI attention as a scarce resource and allocates it deliberately.
This is not about prompt writing; it's about designing the information architecture that grounds AI in reality without overwhelming it with noise.
Key Concepts
The Paradigm Shift: Parametric → Contextual Intelligence
The Fundamental Problem:
- LLMs have parametric knowledge (encoded during training) = static, outdated, non-attributable
- When asked about proprietary data, real-time info, or user preferences → the model is forced to hallucinate or admit ignorance
- Context engineering bridges the gap between static training and dynamic reality
PM's Role Shift: From feature builder → architect of informational ecosystems that ground AI in reality
Context Stuffing vs. Context Engineering
| Dimension | Context Stuffing | Context Engineering |
| --- | --- | --- |
| Mindset | Volume = quality | Structure = quality |
| Approach | "Add everything just in case" | "What decision am I making?" |
| Persistence | Persist all context | Retrieve with intent |
| Agent Chains | Share everything between agents | Bounded context per agent |
| Failure Response | Retry until it works | Fix the structure |
| Economic Model | Context as storage | Context as attention (scarce resource) |
Critical Metaphor: Context stuffing is like bringing your entire file cabinet to a meeting. Context engineering is bringing only the 3 documents relevant to today's decision.
The Anti-Pattern: Context Stuffing
Five Markers of Context Stuffing:
- Reflexively expanding context windows: "Just add more tokens!"
- Persisting everything "just in case": No clear retention criteria
- Chaining agents without boundaries: Agent A passes everything to Agent B to Agent C
- Adding evaluations to mask inconsistency: "We'll just retry until it's right"
- Normalized retries: "It works if you run it 3 times" becomes acceptable
Why It Fails:
- Reasoning Noise: Thousands of irrelevant files compete for attention, degrading multi-hop logic
- Context Rot: Dead ends, past errors, irrelevant data accumulate → goal drift
- Lost in the Middle: Models prioritize beginning (primacy) and end (recency), ignore middle
- Economic Waste: Every query becomes expensive without accuracy gains
- Quantitative Degradation: Accuracy can drop below 20% once context exceeds ~32k tokens
The Hidden Costs:
- Escalating token consumption
- Diluted attention across irrelevant material
- Reduced output confidence
- Cascading retries that waste time and money
Real Context Engineering: Core Principles
Five Foundational Principles:
- Context without shape becomes noise
- Structure > Volume
- Retrieve with intent, not completeness
- Small working contexts (like short-term memory)
- Context Compaction: Maximize density of relevant information per token
Quantitative Framework:
Efficiency = (Accuracy × Coherence) / (Tokens × Latency)
Key Finding: Using RAG with 25% of available tokens preserves 95% accuracy while significantly reducing latency and cost.
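To make the efficiency formula concrete, here is a minimal Python sketch that compares two setups. The function name `context_efficiency` and all measurement values are hypothetical, and the source does not say how accuracy or coherence are scored, so treat the result as a relative comparison within one team's own measurements, not an absolute metric.

```python
def context_efficiency(accuracy: float, coherence: float,
                       tokens: int, latency_s: float) -> float:
    """Efficiency = (Accuracy × Coherence) / (Tokens × Latency)."""
    if tokens <= 0 or latency_s <= 0:
        raise ValueError("tokens and latency_s must be positive")
    return (accuracy * coherence) / (tokens * latency_s)


# Hypothetical measurements for the same task (illustrative numbers only).
stuffed = context_efficiency(accuracy=0.80, coherence=0.70,
                             tokens=100_000, latency_s=12.0)
engineered = context_efficiency(accuracy=0.78, coherence=0.85,
                                tokens=25_000, latency_s=4.0)

# Only the ratio is meaningful; the raw numbers have no natural unit.
print(f"engineered is {engineered / stuffed:.1f}x more efficient")
```

Because tokens and latency sit in the denominator, a small accuracy loss can still yield a large efficiency gain when the context shrinks substantially.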
The 5 Diagnostic Questions (Detect Context Hoarding Disorder)
Ask these to identify context stuffing:
- What specific decision does this support? → If you can't answer, you don't need it
- Can retrieval replace persistence? → Just-in-time beats always-available
- Who owns the context boundary? → If no one, it'll grow forever
- What fails if we exclude this? → If nothing breaks, delete it
- Are we fixing structure or avoiding it? → Stuffing context often masks bad information architecture
Memory Architecture: Two-Layer System
Short-Term (Conversational) Memory:
- Immediate interaction history for follow-up questions
- Challenge: Space management → older parts summarized or truncated
- Lifespan: Single session
Long-Term (Persistent) Memory:
- User preferences, key facts across sessions → deep personalization
- Implemented via vector database (semantic retrieval)
- Two types:
  - Declarative Memory: Facts ("I'm vegan")
  - Procedural Memory: Behavioral patterns ("I debug by checking logs first")
- Lifespan: Persistent across sessions
LLM-Powered ETL: Models generate their own memories by identifying signals, consolidating with existing data, updating database automatically.
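The two layers can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a reference design: `ShortTermMemory`, `LongTermMemory`, and `rough_tokens` are hypothetical names, a whitespace word count stands in for a real tokenizer, truncation stands in for summarization, and plain dictionaries stand in for the vector database mentioned above.

```python
from collections import deque
from dataclasses import dataclass, field


def rough_tokens(text: str) -> int:
    # Assumption: word count as a crude stand-in for a real tokenizer.
    return len(text.split())


@dataclass
class ShortTermMemory:
    """Session-scoped history that stays within a token budget."""
    budget: int
    turns: deque = field(default_factory=deque)

    def add(self, turn: str) -> None:
        self.turns.append(turn)
        # Drop the oldest turns until the history fits the budget.
        # A real system might summarize them instead of discarding.
        while len(self.turns) > 1 and sum(map(rough_tokens, self.turns)) > self.budget:
            self.turns.popleft()


@dataclass
class LongTermMemory:
    """Cross-session store split into declarative and procedural memory."""
    declarative: dict = field(default_factory=dict)  # facts
    procedural: dict = field(default_factory=dict)   # behavioral patterns

    def remember(self, kind: str, key: str, value: str) -> None:
        store = {"declarative": self.declarative, "procedural": self.procedural}[kind]
        store[key] = value  # writing the same key again overwrites the old value


stm = ShortTermMemory(budget=8)
stm.add("what are the launch constraints")  # 5 rough tokens
stm.add("summarize the pricing risks")      # 4 more: over budget, oldest turn dropped

ltm = LongTermMemory()
ltm.remember("declarative", "diet", "I'm vegan")
ltm.remember("procedural", "debugging", "I debug by checking logs first")
```

The short-term layer forgets by design, while the long-term layer only changes when something is explicitly written to it.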
The Research → Plan → Reset → Implement Cycle
The Context Rot Solution:
- Research: Agent gathers data → large, chaotic context window (noise + dead ends)
- Plan: Agent synthesizes into high-density SPEC.md or PLAN.md (Source of Truth)
- Reset: Clear entire context window (prevents context rot)
- Implement: Fresh session using only the high-density plan as context
Why This Works: Context rot is eliminated; agent starts clean with compressed, high-signal context.
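The four steps can be traced in a small Python sketch. Everything here is hypothetical: `run_cycle`, `is_relevant`, and `fake_implement` are illustrative names, and a simple filter stands in for the model-driven synthesis that would normally produce SPEC.md or PLAN.md.

```python
from typing import Callable


def run_cycle(raw_notes: list[str],
              is_relevant: Callable[[str], bool],
              implement: Callable[[list[str]], str]) -> str:
    # Research: the working context is large and noisy.
    context = list(raw_notes)

    # Plan: compress into a high-density plan (stand-in for SPEC.md / PLAN.md).
    plan = "\n".join(f"- {note}" for note in context if is_relevant(note))

    # Reset: discard the research context entirely.
    context.clear()

    # Implement: the fresh session sees only the plan.
    return implement([plan])


notes = [
    "DEAD END: tried caching layer",
    "Constraint: must support SSO",
    "Goal: cut onboarding to 3 steps",
]
seen: list[str] = []


def fake_implement(ctx: list[str]) -> str:
    seen.extend(ctx)  # record what the implementation step actually received
    return f"implemented from {len(ctx)} context item(s)"


result = run_cycle(notes, lambda n: not n.startswith("DEAD END"), fake_implement)
print(result)
```

The key property is that the implementation step never receives the dead ends; it only ever sees the compressed plan.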
Anti-Patterns (What This Is NOT)
- Not about choosing AI tools: Claude vs. ChatGPT doesn't matter; architecture matters
- Not about writing better prompts: This is systems design, not copywriting
- Not about adding more tokens: "Infinite context" narratives are marketing, not engineering reality
- Not about replacing human judgment: Context engineering amplifies judgment, doesn't eliminate it
When to Use This Skill
✅ Use this when:
- You're pasting entire PRDs/codebases into AI and getting vague responses
- AI outputs are inconsistent ("works sometimes, not others")
- You're burning tokens without seeing accuracy improvements
- You suspect you're "context stuffing" but don't know how to fix it
- You need to design context architecture for an AI product feature
❌ Don't use this when:
- You're just getting started with AI (start with basic prompts first)
- You're looking for tool recommendations (this is about architecture, not tooling)
- Your AI usage is working well (if it ain't broke, don't fix it)
Facilitation Source of Truth
Use workshop-facilitation as the default interaction protocol for this skill.
It defines:
- session heads-up + entry mode (Guided, Context dump, Best guess)
- one-question turns with plain-language prompts
- progress labels (for example, Context Qx/8 and Scoring Qx/5)
- interruption handling and pause/resume behavior
- numbered recommendations at decision points
- quick-select numbered response options for regular questions (include Other (specify) when useful)
This file defines the domain-specific assessment content. If there is a conflict, follow this file's domain logic.
Application
This interactive skill uses adaptive questioning to diagnose context stuffing, identify boundaries, and provide tactical implementation guidance.
Step 0: Gather Context
Agent asks:
Before we diagnose your context practices, let's gather information:
Current AI Usage:
- What AI tools/systems do you use? (ChatGPT, Claude, custom agents, etc.)
- What PM tasks do you use AI for? (PRD writing, user research synthesis, discovery, etc.)
- How do you provide context? (paste docs, reference files, use projects/memory)
Symptoms:
- Are AI outputs inconsistent? (works sometimes, not others)
- Are you retrying prompts multiple times to get good results?
- Are responses vague or hedged despite providing "all the context"?
- Are token costs escalating without accuracy improvements?
System Architecture (if applicable):
- Do you have custom AI agents or workflows?
- How is context shared between agents?
- Do you use RAG, vector databases, or memory systems?
You can describe briefly or paste examples.
Step 1: Diagnose Context Stuffing Symptoms
Agent asks:
Let's assess whether you're experiencing context stuffing. Which of these symptoms do you recognize?
Select all that apply:
- "I paste entire documents into AI": Full PRDs, complete user interview transcripts, entire codebases
- "AI gives vague, hedged responses despite having 'all the context'": Responses like "it depends," "consider these options," non-committal
- "I have to retry prompts 3+ times to get usable output": Inconsistency is normalized
- "Token costs are escalating but accuracy isn't improving": Spending more, getting same or worse results
- "I keep adding more context hoping it'll help": Reflexive expansion without strategy
- "My agents pass everything to each other": Agent A → Agent B → Agent C with full context chain
- "I don't have clear criteria for what to include/exclude": No context boundary definitions
- "None of these; my AI usage is working well": Skip to advanced optimization
User response: [Select symptoms]
Agent analyzes:
Based on your selections:
- 0-1 symptoms: Healthy context practices; proceed to optimization
- 2-3 symptoms: Early context stuffing; address before it scales
- 4+ symptoms: Active Context Hoarding Disorder; immediate intervention needed
Agent proceeds to diagnostic questions.
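The scoring bands above map directly to a small function. This is a minimal sketch; the name `classify_symptoms` is hypothetical, and the input is the number of symptoms selected, not counting the "None of these" option.

```python
def classify_symptoms(count: int) -> str:
    """Map the number of selected symptoms to the assessment bands."""
    if count < 0:
        raise ValueError("count cannot be negative")
    if count <= 1:
        return "Healthy context practices; proceed to optimization"
    if count <= 3:
        return "Early context stuffing; address before it scales"
    return "Active Context Hoarding Disorder; immediate intervention needed"


for n in (0, 2, 5):
    print(n, "->", classify_symptoms(n))
```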
Step 2: Diagnostic Question 1: What Specific Decision Does This Support?
Agent asks:
Let's start with the most critical question: For each piece of context you're including, what specific decision does it support?
Example Context Analysis:
Imagine you're asking AI to "help with discovery planning." You're providing:
- Entire PRD (20 pages)
- 50 user interview transcripts (full)
- Competitive analysis doc (15 pages)
- Team meeting notes from last 3 months
Question: What decision are you making right now?
Offer 3 scenarios:
- "I'm deciding which user segment to interview first"
  - Context needed: User segments from PRD (2 paragraphs), prior interview themes (1 page synthesis), not full transcripts
  - Context NOT needed: Meeting notes, full competitive analysis, full PRD
- "I'm deciding which discovery questions to ask in interviews"
  - Context needed: Research objectives (from PRD), past interview insights (synthesis), Jobs-to-be-Done framework
  - Context NOT needed: Full competitive analysis, full meeting notes
- "I'm not sure what decision I'm making; I just want AI to 'understand my product'"
  - Problem: No specific decision = context stuffing trap
  - Fix: Define the decision first, then select context
Agent recommends:
Best Practice: Before adding context, complete this sentence:
"I need this context because I'm deciding [specific decision], and without [specific information], I can't make that decision."
If you can't complete that sentence, you don't need the context.
User response: [Describe their decision + context]
Agent validates: Does the context directly support the stated decision? If not, recommend trimming.
Step 3: Diagnostic Question 2: Can Retrieval Replace Persistence?
Agent asks:
Second question: Is this information you always need, or something you can retrieve just-in-time?
The Distinction:
Always-Needed (Persist):
- Core product constraints (technical, regulatory, strategic)
- User preferences that apply to every interaction
- Critical definitions (operational glossary)
- Non-negotiable rules
Episodic (Retrieve on-demand):
- Project-specific details (this epic, this sprint)
- Historical data (past PRDs, old interview transcripts)
- Contextual facts (competitive analysis, market research)
- Temporary decisions
Key Insight: Just-in-time retrieval beats always-available. Don't persist what you can retrieve.
Offer 3 options:
- "Most of my context is always-needed (core constraints, user prefs)"
  - Assessment: Good instinct; verify with Question 4 (what fails if excluded?)
  - Recommendation: Build constraints registry and operational glossary (persist these)
- "Most of my context is episodic (project details, historical data)"
  - Assessment: Perfect candidate for RAG or retrieval
  - Recommendation: Implement semantic search; retrieve only relevant chunks for each query
- "I'm not sure which is which; I persist everything to be safe"
  - Assessment: Classic Context Hoarding Disorder symptom
  - Fix: Apply Question 4 test to each piece of context
Agent recommends:
Rule of Thumb:
- Persist: Information referenced in 80%+ of interactions
- Retrieve: Information referenced in <20% of interactions
- Gray zone (20-80%): Depends on retrieval latency vs. context window cost
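The rule of thumb can be written as a threshold check. In this sketch the function name `placement` is hypothetical, and `reference_rate` is assumed to be the fraction of interactions in which a piece of context is actually referenced, which is something a team would have to measure for itself.

```python
def placement(reference_rate: float) -> str:
    """Suggest where a piece of context belongs, given how often it is used."""
    if not 0.0 <= reference_rate <= 1.0:
        raise ValueError("reference_rate must be between 0 and 1")
    if reference_rate >= 0.80:
        return "persist"
    if reference_rate < 0.20:
        return "retrieve"
    # 20-80%: weigh retrieval latency against context window cost.
    return "gray zone"


for rate in (0.95, 0.50, 0.05):
    print(f"{rate:.0%} -> {placement(rate)}")
```

The gray zone is deliberately left as a judgment call, since the trade-off depends on the retrieval system and the model's pricing.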
User response: [Categorize their context]
Agent provides: Specific recommendations on what to persist vs. retrieve.
Step 4: Diagnostic Question 3: Who Owns the Context Boundary?
Agent asks:
Third question: Who is responsible for defining what belongs in vs. out of your AI's context?
The Ownership Problem:
If no one owns the context boundary, it will grow indefinitely. Every PM will add "just one more thing," and six months later, you're stuffing 100k tokens per query.
Offer 3 options:
- "I own the boundary (solo PM or small team)"
  - Assessment: Good; you can make fast decisions
  - Recommendation: Document your boundary criteria (use Questions 1-5 as framework)
- "My team shares ownership (collaborative boundary definition)"
  - Assessment: Can work if formalized
  - Recommendation: Create a "Context Manifest" doc: what's always included, what's retrieved, what's excluded (and why)
- "No one owns it; it's ad-hoc / implicit"
  - Assessment: Critical risk; boundary will expand uncontrollably
  - Fix: Assign explicit ownership; schedule quarterly context audits
Agent recommends:
Best Practice: Create a Context Manifest
# Context Manifest: [Product/Feature Name]
## Always Persisted (Core Context)
- Product constraints (technical, regulatory)
- User preferences (role, permissions, preferences)
- Operational glossary (20 key terms)
## Retrieved On-Demand (Episodic Context)
- Historical PRDs (retrieve via semantic search)
- User interview transcripts (retrieve relevant quotes)
- Competitive analysis (retrieve when explicitly needed)
## Excluded (Out of Scope)
- Meeting notes older than 30 days (no longer relevant)
- Full codebase (use code search instead)
- Marketing materials (not decision-relevant)
## Boundary Owner: [Name]
## Last Reviewed: [Date]
## Next Review: [Date + 90 days]
User response: [Describe current ownership model]
Agent provides: Recommendation on formalizing ownership + template for Context Manifest.
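If a team wants the manifest to be checkable by tooling, it could also be represented as data. The sketch below is one possible shape, not part of the template itself: the class `ContextManifest`, its field names, and the example values (product name, owner, dates) are all hypothetical. The 90-day review interval comes from the template's "Next Review" line.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta


@dataclass
class ContextManifest:
    name: str
    owner: str
    last_reviewed: date
    persisted: list[str] = field(default_factory=list)  # always in context
    retrieved: list[str] = field(default_factory=list)  # fetched on demand
    excluded: list[str] = field(default_factory=list)   # out of scope

    @property
    def next_review(self) -> date:
        # Review interval taken from the template: last review + 90 days.
        return self.last_reviewed + timedelta(days=90)

    def review_overdue(self, today: date) -> bool:
        return today > self.next_review


manifest = ContextManifest(
    name="Checkout redesign",  # hypothetical feature
    owner="Dana",              # hypothetical boundary owner
    last_reviewed=date(2024, 1, 15),
    persisted=["Product constraints", "Operational glossary"],
    retrieved=["Historical PRDs", "User interview transcripts"],
    excluded=["Meeting notes older than 30 days"],
)
print(manifest.next_review, manifest.review_overdue(date(2024, 5, 1)))
```

A check like `review_overdue` could run in a scheduled job so that a stale boundary surfaces automatically.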
Step 5: Diagnostic Question 4: What Fails if We Exclude This?
Agent asks:
Fourth question: For each piece of context, what specific failure mode occurs if you exclude it?
This is the falsification test. If you can't identify a concrete failure, you don't need the context.
Offer 3 scenarios:
- "If I exclude product constraints, AI will recommend infeasible solutions"
  - Failure Mode: Clear and concrete
  - Assessment: Valid reason to persist constraints
- "If I exclude historical PRDs, AI won't understand our product evolution"
  - Failure Mode: Vague and hypothetical
  - Assessment: Historical context rarely needed for current decisions
  - Fix: Retrieve PRDs only when explicitly referencing past decisions
- "If I exclude this, I'm not sure anything would break; I just include it to be thorough"
  - Failure Mode: None identified
  - Assessment: Context stuffing; delete immediately
Agent recommends:
The Falsification Protocol:
For each context element, complete this statement:
"If I exclude [context element], then [specific failure] will occur in [specific scenario]."
Examples:
- ✅ Good: "If I exclude GDPR constraints, AI will recommend features that violate EU privacy law."
- ❌ Bad: "If I exclude this PRD, AI might not fully understand the product." (Vague)
User response: [Apply falsification test to their context]
Agent provides: List of context elements to delete (no concrete failure identified).
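The first pass of the falsification protocol can be mechanized: anything with no stated failure goes on the delete list. In this sketch the function name `split_by_failure` and the example elements are hypothetical. The code only checks whether a failure was written down at all; deciding whether a stated failure is concrete or vague still takes human judgment.

```python
from typing import Optional


def split_by_failure(elements: dict[str, Optional[str]]) -> tuple[list[str], list[str]]:
    """Split context elements into (keep, delete) by whether a failure is stated."""
    keep, delete = [], []
    for name, failure in elements.items():
        if failure and failure.strip():
            keep.append(name)    # a failure mode was identified
        else:
            delete.append(name)  # nothing breaks without it
    return keep, delete


elements = {
    "GDPR constraints": "AI recommends features that violate EU privacy law",
    "Old meeting notes": None,
    "Marketing deck": "   ",  # blank answers count as no failure identified
}
keep, delete = split_by_failure(elements)
print("keep:", keep)
print("delete:", delete)
```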
Step 6: Diagnostic Question 5: Are We Fixing Structure or Avoiding It?
Agent asks:
Fifth question: Is adding more context solving a problem, or masking a deeper structural issue?
The Root Cause Question:
Context stuffing often hides bad information architecture. Instead of fixing messy, ambiguous documents, teams add more documents hoping AI will "figure it out."
Offer 3 options:
- "I'm adding context because our docs are poorly structured/ambiguous"
  - Assessment: You're masking a structural problem
  - Fix: Clean up the docs first (remove ambiguity, add constraints, define terms)
  - Example: Instead of pasting 5 conflicting PRDs, reconcile them into 1 Source of Truth
- "I'm adding context because we don't have a shared operational glossary"