Plan Skill
Quick Ref: Decompose goal into trackable issues with waves. Output: .agents/plans/*.md + bd issues.
YOU MUST EXECUTE THIS WORKFLOW. Do not just describe it.
CLI dependencies: bd (issue creation). If bd is unavailable, write the plan to .agents/plans/ as markdown with issue descriptions, and use TaskList for tracking instead. The plan document is always created regardless of bd availability.
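The fallback rule above can be sketched as a small shell check. This is a minimal sketch under stated assumptions, not the skill's actual implementation: the function name `tracking_backend` and the labels it prints are hypothetical, and it only tests whether a `bd` executable is on PATH without invoking it.

```shell
# Hypothetical helper: report which tracker the plan should use.
# It only checks that a `bd` executable is on PATH; it never runs bd.
tracking_backend() {
  if command -v bd >/dev/null 2>&1; then
    echo "bd"
  else
    echo "tasklist"
  fi
}

# Hypothetical helper: the plan directory is created in both cases,
# because the plan document is written regardless of bd availability.
prepare_plan_dir() {
  plan_dir="${1:-.agents/plans}"
  mkdir -p "$plan_dir" && echo "$plan_dir"
}
```

Calling `prepare_plan_dir` first and `tracking_backend` second keeps the "plan document is always created" rule independent of which tracker is chosen.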
Flags
| Flag | Default | Description |
|------|---------|-------------|
| --auto | off | Skip human approval gate. Used by /rpi --auto for fully autonomous lifecycle. |
| --fast-path | off | Force Minimal detail template (see Step 3.2) |
| --skip-symbol-check | off | Skip symbol verification in Step 3.6 (for greenfield plans) |
| --skip-audit-gate | off | Skip baseline audit gate in Step 6 (for documentation-only plans) |
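As an illustration of how the flags in the table could be separated from the goal text, here is a minimal sketch. The skill is executed by an agent rather than a script, so the function `parse_plan_args` and the variable names (AUTO, FAST_PATH, SKIP_SYMBOL_CHECK, SKIP_AUDIT_GATE, OTHER_FLAGS, GOAL) are hypothetical. Flags that are not in the table are collected unchanged rather than rejected.

```shell
# Sketch: split "/plan <goal> [flags]" arguments into flag states and goal text.
# All variable names are hypothetical; every flag defaults to "off" as in the table.
parse_plan_args() {
  AUTO=off FAST_PATH=off SKIP_SYMBOL_CHECK=off SKIP_AUDIT_GATE=off
  OTHER_FLAGS="" GOAL=""
  for arg in "$@"; do
    case "$arg" in
      --auto)              AUTO=on ;;
      --fast-path)         FAST_PATH=on ;;
      --skip-symbol-check) SKIP_SYMBOL_CHECK=on ;;
      --skip-audit-gate)   SKIP_AUDIT_GATE=on ;;
      --*)                 OTHER_FLAGS="${OTHER_FLAGS:+$OTHER_FLAGS }$arg" ;;
      *)                   GOAL="${GOAL:+$GOAL }$arg" ;;
    esac
  done
}
```

For example, `parse_plan_args add rate limiting --auto` leaves GOAL set to "add rate limiting" and AUTO set to "on".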
Execution Steps
Given /plan <goal> [--auto]:
Step 1: Setup
mkdir -p .agents/plans
Step 2: Check for Prior Research
Look for existing research on this topic:
ls -la .agents/research/ 2>/dev/null | head -10
Use Grep to search .agents/ for related content. If research exists, read it with the Read tool to understand the context before planning.
Search knowledge flywheel for prior planning patterns:
if command -v ao &>/dev/null; then
  ao search "<topic> plan decomposition patterns" 2>/dev/null | head -10
  ao lookup --query "<goal>" --limit 5 2>/dev/null | head -30
fi
Apply retrieved knowledge (mandatory when results returned):
If ao returns relevant learnings or patterns, do NOT just load them as passive context. For each returned item:
- Check: does this learning apply to the current planning goal? (answer yes/no)
- If yes: incorporate it as a planning constraint. Does it warn about scope, suggest a decomposition, or flag a known pitfall?
- Cite applicable learnings by filename when they influence a planning decision
After reviewing, record each citation with the correct type:
ao metrics cite "<learning-path>" --type applied 2>/dev/null || true
ao metrics cite "<learning-path>" --type retrieved 2>/dev/null || true
Section evidence: When lookup results include section_heading, matched_snippet, or match_confidence fields, prefer the matched section over the whole file, because it pinpoints the relevant portion. Higher match_confidence (>0.7) means the section is a strong match; lower values (<0.4) are weaker signals. Use the matched_snippet as the primary context rather than reading the full file.
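The thresholds above can be read as a three-way bucket. The following is a minimal sketch: the function name `confidence_band` is hypothetical, and the "moderate" label for values between 0.4 and 0.7 is an assumption, since the text only names the two outer bands.

```shell
# Sketch: bucket a match_confidence value using the thresholds in the text.
# The "moderate" label for the 0.4 to 0.7 band is an assumption.
confidence_band() {
  awk -v c="$1" 'BEGIN {
    if (c > 0.7)      print "strong"
    else if (c < 0.4) print "weak"
    else              print "moderate"
  }'
}
```

A value of exactly 0.7 or 0.4 falls into the middle band, because the text uses strict comparisons.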
Skip silently if ao is unavailable or returns no results.
Step 2.1: Load Compiled Prevention First (Mandatory)
Before decomposition, load compiled planning rules from .agents/planning-rules/*.md when they exist. This is the primary prevention surface for /plan in the compiler-enabled flow.
Use the tracked contracts in docs/contracts/finding-compiler.md and docs/contracts/finding-registry.md:
- prefer compiled planning rules first
- match by finding ID, applicable_when overlap, language overlap, and literal goal-text overlap
- when file inventory is known, rank by changed-file overlap before falling back to weaker textual matches
- cap the injected set at top 5 findings / rule files
- if compiled planning rules are missing, incomplete, or fewer than the matched finding set, fall back to .agents/findings/registry.jsonl
- fail open:
  - missing compiled directory or registry -> skip silently
  - empty compiled directory or registry -> skip silently
  - malformed line -> warn and ignore that line
  - unreadable file -> warn once and continue without findings
Use the selected planning rules / active findings as hard planning context before issue decomposition. Record the applied finding IDs and how they changed the plan. These become required context for the written plan, not optional side notes.
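The fail-open behavior for the registry fallback can be sketched as follows. This is an illustrative sketch only: the function name `load_findings` is hypothetical, it covers the registry file but not the compiled rules directory, and the "starts with { and ends with }" test is a crude stand-in for real JSON validation, chosen so the example needs no extra tools.

```shell
# Sketch: read a findings registry (one JSON object per line) and fail open.
# Prints usable lines on stdout; warnings go to stderr.
load_findings() {
  registry="$1"
  [ -e "$registry" ] || return 0   # missing registry -> skip silently
  [ -s "$registry" ] || return 0   # empty registry -> skip silently
  if [ ! -r "$registry" ]; then
    # unreadable file -> warn once and continue without findings
    echo "warning: cannot read $registry; continuing without findings" >&2
    return 0
  fi
  while IFS= read -r line || [ -n "$line" ]; do
    case "$line" in
      "")    continue ;;
      \{*\}) printf '%s\n' "$line" ;;   # crude shape check, not a JSON parser
      *)     echo "warning: ignoring malformed registry line" >&2 ;;
    esac
  done < "$registry"
}
```

Every path returns success, so a damaged registry can reduce the planning context but never blocks the plan.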
Ranked packet contract: Treat compiled planning rules, active findings, and matching high-severity next-work.jsonl items as one ranked packet, not three unrelated lookups. The packet must prefer the strongest overlap in this order:
- literal goal-text overlap
- applicable_when / issue-type overlap
- language overlap
- changed-file overlap (once the file table exists)
- backlog severity / repo affinity for next-work items
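One way to picture the ranked packet is as a multi-key sort in the priority order listed above, capped at the top 5. This is a sketch under assumptions: the tab-separated input format, the 0/1 overlap columns, the numeric severity column, and the function name `rank_packet` are all hypothetical, and the overlap values are assumed to have been computed elsewhere.

```shell
# Sketch: rank candidate findings by the overlap order from the text.
# Input (stdin), tab-separated, one candidate per line:
#   id  goal_text  applicable_when  language  changed_file  severity
# Overlap columns are 1 (match) or 0 (no match); a higher severity ranks first.
rank_packet() {
  sort -t "$(printf '\t')" -k2,2nr -k3,3nr -k4,4nr -k5,5nr -k6,6nr |
    head -n 5 |
    cut -f1
}
```

Earlier sort keys dominate later ones, so a candidate with literal goal-text overlap always outranks one that only matches on language or changed files.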
Step 2.2: Read and Validate Research Content
If research files exist, read the most recent one and verify it contains substantive findings before proceeding:
LATEST_RESEARCH=$(ls -t .agents/research/*.md 2>/dev/null | head -1)
if [ -n "$LATEST_RESEARCH" ]; then
  if grep -qE '^## (Summary|Key Files|Findings|Key Findings|Architecture|Executive Summary|Recommendations|Part [0-9])' "$LATEST_RESEARCH"; then
    echo "Research validated: $LATEST_RESEARCH"
  else
    echo "WARNING: Research file exists but lacks standard sections (Summary, Key Files, Findings, Key Findings, Architecture, Executive Summary, or Recommendations)."
    echo "Consider running /research first for a thorough exploration."
  fi
fi
Read the validated research file with the Read tool before proceeding to Step 3. Do not plan based solely on file existence; understanding the research content is essential for accurate decomposition.
Step 3: Explore the Codebase (if needed)
USE THE TASK TOOL to dispatch an Explore agent. The explore prompt MUST request symbol-level detail:
Tool: Task
Parameters:
  subagent_type: "Explore"
  description: "Understand codebase for: <goal>"
  prompt: |
    Explore the codebase to understand what's needed for: <goal>
    1. Find relevant files and modules
    2. Understand current architecture
    3. Identify what needs to change
    For EACH file that needs modification, return:
    - Exact function/method signatures that need changes
    - Struct/type definitions that need new fields
    - Key functions to reuse (with file:line references)
    - Existing test file locations and naming conventions (e.g., TestFoo_Bar)
    - Import paths and package relationships
    Return: file inventory, per-file symbol details, reuse points with line numbers, test patterns
Pre-Planning Baseline Audit (Mandatory)
Before decomposing into issues, run a quantitative baseline audit to ground the plan in verified numbers. This is mandatory for ALL plans, not just cleanup/refactor. Any plan that makes quantitative claims (counts, sizes, coverage) must verify them mechanically.
Run grep/wc/ls commands to count the current state of what you're changing:
- Files to change: count with ls/find/wc -l
- Sections to add/remove: count with grep -l/grep -L
- Code to modify: count LOC, packages, import references
- Coverage gaps: count missing items with grep -L or find
Record the verification commands alongside their results. These become pre-mortem evidence and acceptance criteria.
| Bad | Good |
|-----|------|
| "14 missing refs/" | "14 missing refs/ (verified: ls -d skills/*/references/ \| wc -l = 20 of 34)" |
| "clean up dead code" | "Delete 3,003 LOC across 3 packages (verified: find src/old -name '*.go' \| xargs wc -l)" |
| "update stale docs" | "Rewrite 4 specs (verified: ls docs/specs/*.md \| wc -l = 4)" |
| "add missing sections" | "Add Examples to 27 skills (verified: grep -L '## Examples' skills/*/SKILL.md \| wc -l = 27)" |
- File size limits: check wc -l on files near size limits (especially SKILL.md files with the 800-line lint limit). If a planned change will push a file past the limit, split or refactor before implementation.
- Test fixtures affected: count test fixtures upstream of any filter/gate/hook being added or modified with grep -rn 'func Test' <test-dir>/ | wc -l. Changing a gate without updating its test fixtures causes false-green CI.
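The file size check above can be made mechanical with a small helper. This is a minimal sketch: the function name `check_line_limit` and its arguments are hypothetical, the 800-line default comes from the lint limit mentioned above, and the number of planned added lines is an estimate that the planner supplies.

```shell
# Sketch: warn when a planned change would push a file past its line limit.
# Usage: check_line_limit <file> <planned-added-lines> [limit]
check_line_limit() {
  file="$1"
  planned_added="$2"
  limit="${3:-800}"   # default taken from the SKILL.md lint limit in the text
  current=$(wc -l < "$file" | tr -d ' ')
  projected=$((current + planned_added))
  if [ "$projected" -gt "$limit" ]; then
    echo "OVER: $file would reach $projected lines (limit $limit)"
    return 1
  fi
  echo "OK: $file would reach $projected lines (limit $limit)"
}
```

A non-zero exit status signals that the file should be split or refactored before the change is planned as a single issue.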
Ground truth with numbers prevents scope creep and makes completion verifiable. In ol-571, the audit found 5,752 LOC to remove; without it, the plan would have been vague. In ag-dnu, wrong counts (11 vs 14, 0 vs 7) caused a pre-mortem FAIL that a simple grep audit would have prevented.
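Recording each count together with the command that produced it can be reduced to one helper. This is a sketch, not part of the skill: the function name `audit` and the output format are hypothetical, and it assumes the command prints a single number.

```shell
# Sketch: run a counting command and print the claim with its evidence.
# Usage: audit "<label>" "<shell command that prints a count>"
audit() {
  audit_label="$1"
  audit_cmd="$2"
  # Strip padding spaces that some wc implementations add around the count.
  audit_result=$(sh -c "$audit_cmd" 2>/dev/null | tr -d ' ')
  printf '%s: %s (verified: %s)\n' "$audit_label" "$audit_result" "$audit_cmd"
}
```

For example, `audit "Specs to rewrite" "ls docs/specs/*.md | wc -l"` prints the label, the count, and the command on one line, ready to paste into the plan as acceptance evidence.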
Step 3.2: Scale Detail by Complexity
Auto-select plan detail level based on issue count and goal complexity:
| Level | Criteria | Template | Description |
|-------|----------|----------|-------------|
| Minimal | 1-2 issues, fast complexity | Bullet points per issue | Title, 2-line description, acceptance criteria, files list |
| Standard | 3-6 issues, standard complexity | Current plan format | Full implementation specs, tests, verification |
| Deep | 7+ issues, full complexity, or --deep | Extended format | Symbol-level specs, data transformation tables, design briefs, cross-wave registry |
Read references/detail-templates.md for the template definitions.
Override: --deep forces Deep regardless of issue count. --fast-path forces Minimal.
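The selection rule and its overrides can be sketched as a single function. This is a simplified illustration: the function name `select_detail_level` is hypothetical, and it decides on issue count and override flag only, leaving out the goal-complexity criterion, which needs judgment rather than arithmetic.

```shell
# Sketch: choose a plan detail level from the issue count and an optional override.
# Usage: select_detail_level <issue-count> [--deep|--fast-path]
# Goal complexity (fast/standard/full) is deliberately not modeled here.
select_detail_level() {
  issue_count="$1"
  override="${2:-}"
  case "$override" in
    --deep)      echo "Deep";    return 0 ;;
    --fast-path) echo "Minimal"; return 0 ;;
  esac
  if [ "$issue_count" -le 2 ]; then
    echo "Minimal"
  elif [ "$issue_count" -le 6 ]; then
    echo "Standard"
  else
    echo "Deep"
  fi
}
```

The overrides are checked first, so `select_detail_level 2 --deep` yields Deep even though two issues would otherwise select Minimal.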
Step 3.5: Generate Implementation Detail (Mandatory)
After exploring the codebase, generate symbol-level implementation detail for EVERY file in the plan. This is what separates actionable specs from vague descriptions. A worker reading the plan should know exactly what to write without rediscovering function names, parameters, or code locations.
File Inventory Table
Start with a ## Files to Modify table listing EVERY file the plan touches:
## Files to Modify
| File | Change |
|------|--------|
| `src/auth/middleware.go` | Add rate limit check to `AuthMiddleware` |
| `src/config/config.go` | Add `RateLimit` section to `Config` struct |
| `src/auth/middleware_test.go` | **NEW**: rate limit middleware tests |
Mark new files with **NEW**. This table gives the implementer the full blast radius in 30 seconds.
Per-Section Implementation Specs
For each logical change group, provide symbol-level detail:
- Exact function signatures. Name the function, its parameters, and what changes:
  - "Add worktreePath string parameter to classifyRunStatus"
  - "Create new RPIConfig struct with WorktreeMode string field"
- Key functions to reuse, with file:line references from the explore step:
  - "Reuse readRunHeartbeat() at rpi_phased.go:1963"
  - "Call existing parsePhasedState() at rpi_phased.go:1924"
- Inline code blocks for non-obvious constructs (struct definitions, CLI flags, config snippets). Verify that all inline snippets compile with go build ./... before including them in issue descriptions, because workers copy them verbatim:
type RPIConfig struct {
    WorktreeMode string `yaml:"worktree_mode" json:"worktree_mode"`
}
- New struct fields with tags: exact field names and JSON/YAML tags
- CLI flag definitions: exact flag names, types, defaults, and help text
Named Test Functions
For each test file, list specific test functions with one-line descriptions:
**`src/auth/middleware_test.go`** - add:
- `TestRateLimitMiddleware_UnderLimit`: Request within limit returns 200
- `TestRateLimitMiddleware_OverLimit`: Request exceeding limit returns 429
- `TestRateLimitMiddleware_ResetAfterWindow`: Counter resets after time window
Test Level Classification
For each test in the plan, classify its pyramid level per the test pyramid standard (test-pyramid.md in the standards skill):
| Test | Level | Rationale |
|------|-------|-----------|
| TestRateLimitMiddleware_UnderLimit | L1 (Unit) | Single function behavior in isolation |
| TestRateLimitMiddleware_Integration | L2 (Integration) | Middleware + config store interaction |
| TestRateLimitMiddleware_E2E | L3 (Component) | Full request pipeline with mocked Redis |
Include test_levels metadata in each issue's validation block:
{
"test_levels": {
"required": ["L0", "L1"],
"recommended": ["L2"],
"rationale": "Reason for level selection"
}