Create Skill Command
This command provides guidance for creating effective skills.
Overview
Writing skills IS Test-Driven Development applied to process documentation.
Personal skills live in agent-specific directories (~/.claude/skills for Claude Code, ~/.codex/skills for Codex).
You write test cases (pressure scenarios with subagents), watch them fail (baseline behavior), write the skill (documentation), watch tests pass (agents comply), and refactor (close loopholes).
Core principle: If you didn't watch an agent fail without the skill, you don't know if the skill teaches the right thing.
REQUIRED BACKGROUND: You MUST understand Test-Driven Development before using this skill. That skill defines the fundamental RED-GREEN-REFACTOR cycle. This skill adapts TDD to documentation.
Official guidance: Anthropic's official skill authoring best practices are provided by the /customaize-agent:apply-anthropic-skill-best-practices command, and they extend the customize-agent:prompt-engineering skill. Use the skill together with that document, since they complement rather than duplicate each other. The document provides additional patterns and guidelines that complement the TDD-focused approach in this skill.
About Skills
Skills are modular, self-contained packages that extend Claude's capabilities by providing
specialized knowledge, workflows, and tools. Think of them as "onboarding guides" for specific
domains or tasks; they transform Claude from a general-purpose agent into a specialized agent
equipped with procedural knowledge that no model can fully possess.
What is a Skill?
A skill is a reference guide for proven techniques, patterns, or tools. Skills help future Claude instances find and apply effective approaches.
Skills are: Reusable techniques, patterns, tools, reference guides
Skills are NOT: Narratives about how you solved a problem once
What Skills Provide
- Specialized workflows - Multi-step procedures for specific domains
- Tool integrations - Instructions for working with specific file formats or APIs
- Domain expertise - Company-specific knowledge, schemas, business logic
- Bundled resources - Scripts, references, and assets for complex and repetitive tasks
TDD Mapping for Skills
| TDD Concept | Skill Creation |
|-------------|----------------|
| Test case | Pressure scenario with subagent |
| Production code | Skill document (SKILL.md) |
| Test fails (RED) | Agent violates rule without skill (baseline) |
| Test passes (GREEN) | Agent complies with skill present |
| Refactor | Close loopholes while maintaining compliance |
| Write test first | Run baseline scenario BEFORE writing skill |
| Watch it fail | Document exact rationalizations agent uses |
| Minimal code | Write skill addressing those specific violations |
| Watch it pass | Verify agent now complies |
| Refactor cycle | Find new rationalizations → plug → re-verify |
The entire skill creation process follows RED-GREEN-REFACTOR.
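The mapping can be turned into a small bookkeeping check. The following is a minimal sketch, assuming each pressure scenario is run twice (once without the skill, once with it) and the outcome is recorded by hand; `ScenarioRun` and `tdd_phase` are hypothetical names, not part of any skill tooling.

```python
from dataclasses import dataclass


@dataclass
class ScenarioRun:
    """One pressure scenario, run without and with the skill (hypothetical record)."""
    scenario: str
    complied_without_skill: bool  # baseline run, before the skill exists
    complied_with_skill: bool     # run with SKILL.md present


def tdd_phase(run: ScenarioRun) -> str:
    """Classify a scenario run in RED-GREEN-REFACTOR terms."""
    if run.complied_without_skill:
        # The agent already behaves correctly without the skill, so this
        # scenario never failed and cannot show that the skill teaches anything.
        return "invalid: no baseline failure"
    if not run.complied_with_skill:
        # Baseline failed and the skill has not fixed it yet.
        return "red: skill does not yet fix the violation"
    # Baseline failed, skill fixes it: move on to closing loopholes.
    return "green: look for new rationalizations and refactor"
```

A run classified as "invalid" corresponds to the core principle above: without a watched failure, there is no evidence the skill teaches the right thing.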
When to Create a Skill
Create when:
- Technique wasn't intuitively obvious to you
- You'd reference this again across projects
- Pattern applies broadly (not project-specific)
- Others would benefit
Don't create for:
- One-off solutions
- Standard practices well-documented elsewhere
- Project-specific conventions (put in CLAUDE.md)
Skill Types
Technique
Concrete method with steps to follow (condition-based-waiting, root-cause-tracing)
Pattern
Way of thinking about problems (flatten-with-flags, test-invariants)
Reference
API docs, syntax guides, tool documentation (office docs)
Directory Structure
skills/
  skill-name/
    SKILL.md              # Main reference (required)
    supporting-file.*     # Only if needed
Flat namespace - all skills in one searchable namespace
Separate files for:
- Heavy reference (100+ lines) - API docs, comprehensive syntax
- Reusable tools - Scripts, utilities, templates
Keep inline:
- Principles and concepts
- Code patterns (< 50 lines)
- Everything else
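The split between separate files and inline content can be written as a simple decision rule. This is a sketch only: the `kind` labels and the `placement` function are hypothetical, and treating a code pattern of 50 lines or more as a separate file is an assumption, since the list above only says that patterns under 50 lines stay inline.

```python
def placement(kind: str, line_count: int) -> str:
    """Return 'separate file' or 'inline' for one piece of skill content.

    kind is a hypothetical label: 'reference', 'tool', 'code', or 'concept'.
    """
    if kind == "tool":
        return "separate file"   # reusable scripts, utilities, templates
    if kind == "reference" and line_count >= 100:
        return "separate file"   # heavy reference: API docs, comprehensive syntax
    if kind == "code" and line_count >= 50:
        return "separate file"   # assumption: long code is treated like a tool
    return "inline"              # principles, short code patterns, everything else
```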
Anatomy of a Skill
Every skill consists of a required SKILL.md file and optional bundled resources:
skill-name/
├── SKILL.md (required)
│   ├── YAML frontmatter metadata (required)
│   │   ├── name: (required)
│   │   └── description: (required)
│   └── Markdown instructions (required)
└── Bundled Resources (optional)
    ├── scripts/          - Executable code (Python/Bash/etc.)
    ├── references/       - Documentation intended to be loaded into context as needed
    └── assets/           - Files used in output (templates, icons, fonts, etc.)
SKILL.md (required)
Metadata Quality: The name and description in YAML frontmatter determine when Claude will use the skill. Be specific about what the skill does and when to use it. Write in the third person (e.g., "This skill should be used when..." instead of "Use this skill when...").
SKILL.md Structure
Frontmatter (YAML):
- Only two fields supported: name and description
- Max 1024 characters total
name: Use letters, numbers, and hyphens only (no parentheses, special chars)
description: Third-person, includes BOTH what it does AND when to use it
- Start with "Use when..." to focus on triggering conditions
- Include specific symptoms, situations, and contexts
- Keep under 500 characters if possible
---
name: Skill-Name-With-Hyphens
description: Use when [specific triggering conditions and symptoms] - [what the skill does and how it helps, written in third person]
---
# Skill Name
## Overview
What is this? Core principle in 1-2 sentences.
## When to Use
[Small inline flowchart IF decision non-obvious]
Bullet list with SYMPTOMS and use cases
When NOT to use
## Core Pattern (for techniques/patterns)
Before/after code comparison
## Quick Reference
Table or bullets for scanning common operations
## Implementation
Inline code for simple patterns
Link to file for heavy reference or reusable tools
## Common Mistakes
What goes wrong + fixes
## Real-World Impact (optional)
Concrete results
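The frontmatter rules listed before the template (two supported fields, a 1024-character limit, a hyphenated name) can be checked mechanically. The sketch below is an illustration under stated assumptions: `check_frontmatter` is a hypothetical helper, it parses only simple single-line `key: value` pairs rather than full YAML, and it applies the 1024-character limit to the text between the two `---` lines.

```python
import re

NAME_RE = re.compile(r"^[A-Za-z0-9-]+$")  # letters, numbers, and hyphens only
MAX_FRONTMATTER_CHARS = 1024              # assumption: counted between the '---' lines


def check_frontmatter(skill_md: str) -> list[str]:
    """Return a list of problems found in the frontmatter of a SKILL.md string."""
    lines = skill_md.splitlines()
    if not lines or lines[0].strip() != "---":
        return ["missing opening '---'"]
    end = next((i for i in range(1, len(lines)) if lines[i].strip() == "---"), None)
    if end is None:
        return ["missing closing '---'"]

    block = lines[1:end]
    problems = []
    if len("\n".join(block)) > MAX_FRONTMATTER_CHARS:
        problems.append("frontmatter exceeds 1024 characters")

    # Not a YAML parser: only single-line 'key: value' pairs are recognized.
    fields = {}
    for line in block:
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()

    for extra in sorted(set(fields) - {"name", "description"}):
        problems.append(f"unsupported field: {extra}")
    for required in ("name", "description"):
        if not fields.get(required):
            problems.append(f"missing field: {required}")
    if fields.get("name") and not NAME_RE.match(fields["name"]):
        problems.append("name may contain only letters, numbers, and hyphens")
    return problems
```

An empty list means none of these checks failed; it does not mean the description is well written, which still needs human review.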
Bundled Resources (optional)
Scripts (scripts/)
Executable code (Python/Bash/etc.) for tasks that require deterministic reliability or are repeatedly rewritten.
- When to include: When the same code is being rewritten repeatedly or deterministic reliability is needed
- Example: scripts/rotate_pdf.py for PDF rotation tasks
- Benefits: Token efficient, deterministic, may be executed without loading into context
- Note: Scripts may still need to be read by Claude for patching or environment-specific adjustments
References (references/)
Documentation and reference material intended to be loaded as needed into context to inform Claude's process and thinking.
- When to include: For documentation that Claude should reference while working
- Examples: references/finance.md for financial schemas, references/mnda.md for company NDA template, references/policies.md for company policies, references/api_docs.md for API specifications
- Use cases: Database schemas, API documentation, domain knowledge, company policies, detailed workflow guides
- Benefits: Keeps SKILL.md lean, loaded only when Claude determines it's needed
- Best practice: If files are large (>10k words), include grep search patterns in SKILL.md
- Avoid duplication: Information should live in either SKILL.md or references files, not both. Prefer references files for detailed information unless it's truly core to the skill; this keeps SKILL.md lean while making information discoverable without hogging the context window. Keep only essential procedural instructions and workflow guidance in SKILL.md; move detailed reference material, schemas, and examples to references files.
Assets (assets/)
Files not intended to be loaded into context, but rather used within the output Claude produces.
- When to include: When the skill needs files that will be used in the final output
- Examples: assets/logo.png for brand assets, assets/slides.pptx for PowerPoint templates, assets/frontend-template/ for HTML/React boilerplate, assets/font.ttf for typography
- Use cases: Templates, images, icons, boilerplate code, fonts, sample documents that get copied or modified
- Benefits: Separates output resources from documentation, enables Claude to use files without loading them into context
Progressive Disclosure Design Principle
Skills use a three-level loading system to manage context efficiently:
- Metadata (name + description) - Always in context (~100 words)
- SKILL.md body - When skill triggers (<5k words)
- Bundled resources - As needed by Claude (Unlimited*)
*Unlimited because scripts can be executed without reading into context window.
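The three levels can be pictured with a toy model. This sketch is not how Claude loads skills internally; `SkillLoader` and its methods are hypothetical names, and the `context` list simply stands in for what has been read so far.

```python
from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class SkillLoader:
    """Toy model of three-level loading for one skill directory."""
    root: Path
    name: str
    description: str
    context: list[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Level 1: metadata (name + description) is always in context.
        self.context.append(f"{self.name}: {self.description}")

    def trigger(self) -> None:
        # Level 2: the SKILL.md body is read only when the skill triggers.
        self.context.append((self.root / "SKILL.md").read_text(encoding="utf-8"))

    def read_resource(self, relative_path: str) -> None:
        # Level 3: a bundled resource is read only when it is needed.
        self.context.append((self.root / relative_path).read_text(encoding="utf-8"))
```

Until `trigger()` is called, only the one-line metadata occupies the context, which is why the description has to carry enough information to decide whether the skill applies.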
Claude Search Optimization (CSO)
Critical for discovery: Future Claude needs to FIND your skill
1. Rich Description Field
Purpose: Claude reads description to decide which skills to load for a given task. Make it answer: "Should I read this skill right now?"
Format: Start with "Use when..." to focus on triggering conditions, then explain what it does
Content:
- Use concrete triggers, symptoms, and situations that signal this skill applies
- Describe the problem (race conditions, inconsistent behavior) not language-specific symptoms (setTimeout, sleep)
- Keep triggers technology-agnostic unless the skill itself is technology-specific
- If skill is technology-specific, make that explicit in the trigger
- Write in third person (injected into system prompt)
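# ❌ BAD: Too abstract, does not say when to use it
description: For async testing
# ❌ BAD: First person
description: I can help you with async tests when they're flaky
# ❌ BAD: Names a technology, but the skill is not specific to it
description: Use when tests use setTimeout/sleep and are flaky
# ✅ GOOD: Starts with "Use when", describes the problem, then what the skill does
description: Use when tests have race conditions, timing dependencies, or pass/fail inconsistently - replaces arbitrary timeouts with condition polling for reliable async tests
# ✅ GOOD: Technology-specific skill with an explicit trigger
description: Use when using React Router and handling authentication redirects - provides patterns for protected routes and auth state management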
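Some of the description guidelines lend themselves to a rough automated check. The sketch below is a heuristic, not an official linter: `lint_description` is a hypothetical name, the first-person pattern is deliberately crude, and looking for a " - " separator assumes the "triggers - what it does" layout used in the examples above. It cannot judge whether a trigger is technology-agnostic.

```python
import re

# Crude first-person detector; an assumption, not an exhaustive list.
FIRST_PERSON = re.compile(r"\b(I|we|my|our)\b", re.IGNORECASE)


def lint_description(description: str) -> list[str]:
    """Return heuristic warnings for a skill description string."""
    warnings = []
    if not description.startswith("Use when"):
        warnings.append('does not start with "Use when"')
    if FIRST_PERSON.search(description):
        warnings.append("uses first person")
    if len(description) > 500:
        warnings.append("longer than 500 characters")
    if " - " not in description:
        # Assumes triggers and the "what it does" part are separated by " - ".
        warnings.append("no ' - ' separating triggers from what the skill does")
    return warnings
```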
2. Keyword Coverage
Use words Claude would search for:
- Error messages: "Hook timed out", "ENOTEMPTY", "race condition"
- Symptoms: "flaky", "hanging", "zombie", "pollution"
- Synonyms: "timeout/hang/freeze", "cleanup/teardown/afterEach"
- Tools: Actual commands, library names, file types
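Keyword coverage can be spot-checked by listing the terms a future search is likely to use and seeing which ones the skill never mentions. This is a minimal sketch; `missing_keywords` is a hypothetical helper and uses plain case-insensitive substring matching.

```python
def missing_keywords(skill_text: str, keywords: list[str]) -> list[str]:
    """Return the keywords that never appear in the skill text (case-insensitive)."""
    haystack = skill_text.lower()
    return [kw for kw in keywords if kw.lower() not in haystack]
```

For example, calling it with the error messages and symptoms listed above shows which of them still need to be worked into the description or body.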
3. Descriptive Naming
Use active voice, verb-first:
- ✅ creating-skills not skill-creation
- ✅ testing-skills-with-subagents not subagent-skill-testing
4. Token Efficiency (Critical)
Problem: getting-started and frequently-referenced skills load into EVERY conversation. Every token counts.
Target word counts:
- getting-started workflows: <150 words each
- Frequently-loaded skills: <200 words total
- Other skills: <500 words (still be concise)
Techniques:
Move details to tool help:
# ❌ BAD: Document all flags in SKILL.md
search-conversations supports --text, --both, --after DATE, --before DATE, --limit N
# ✅ GOOD: Reference --help
search-conversations supports multiple modes and filters. Run --help for details.
Use cross-references:
# ❌ BAD: Repeat workflow details
When searching, dispatch subagent with template...
[20 lines of repeated instructions]
# ✅ GOOD: Reference other skill
Always use subagents (50-100x context savings). REQUIRED: Use [other-skill-name] for workflow.
Compress examples:
# ❌ BAD: Verbose example (42 words)
your human partner: "How did we handle authentication errors in React Router before?"
You: I'll search past conversations for React Router authentication patterns.
[Dispatch subagent with search query: "React Router authentication error handling 401"]
# ✅ GOOD: Minimal example (20 words)
Partner: "How did we handle auth errors in React Router?"
You: Searching...
[Dispatch subagent → synthesis]
Eliminate redundancy:
- Don't repeat what's in cross-referenced skills
- Don't explain what's obvious from command
- Don't include multiple examples of same pattern
Verification:
wc -w skills/path/SKILL.md
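The same check can be scripted against the target word counts listed above. In this sketch the category labels and the `within_target` function are hypothetical; the word count uses whitespace splitting, which is roughly what `wc -w` counts.

```python
# Upper bounds from the target word counts above (each is a strict "<" limit).
WORD_TARGETS = {
    "getting-started": 150,
    "frequently-loaded": 200,
    "other": 500,
}


def within_target(skill_text: str, category: str = "other") -> bool:
    """Return True if the text has fewer words than the target for its category."""
    word_count = len(skill_text.split())  # whitespace-separated words
    return word_count < WORD_TARGETS[category]
```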
Name by what you DO or core insight:
- ✅ condition-based-waiting > async-test-helpers
- ✅ using-skills not skill-usage
- ✅ flatten-with-flags > data-structure-refactoring
- ✅ root-cause-tracing > debugging-techniques
Gerunds (-ing) work well for processes:
- creating-skills, testing-skills, debugging-with-logs
- Active, describes the action you're taking
5. Cross-Referencing Other Skills
When writing documentation that references other skills:
Use skill name only, with explicit requirement markers:
- ✅ Good: **REQUIRED SUB-SKILL:** Use superpowers:test-driven-development
- ✅ Good: **REQUIRED BACKGROUND:** You MUST understand superpowers:systematic-debugging
- ❌ Bad: See skills/testing/test-driven-development (unclear if required)
- ❌ Bad: @skills/testing/test-driven-development/SKILL.md (force-loads, burns context)
Why no @ links: @ syntax force-loads files immediately, consuming 200k+ context before you need them.
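Force-loading references are easy to miss in review, so a small scan can help. The sketch below assumes such links look like an `@` at the start of a token followed by a path ending in `.md`; both the pattern and the `find_force_loads` name are illustrative assumptions, not a specification of the `@` syntax.

```python
import re

# Matches tokens such as "@skills/testing/test-driven-development/SKILL.md".
# The lookbehind skips '@' inside other tokens, e.g. email-like text.
FORCE_LOAD = re.compile(r"(?<!\S)@[\w./-]+\.md\b")


def find_force_loads(text: str) -> list[str]:
    """Return every @-style file reference found in the text."""
    return FORCE_LOAD.findall(text)
```

Any hit is a candidate for replacement with a skill name and an explicit requirement marker, as in the good examples above.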
Flowchart Usage
digraph when_flowchart {
"Need to show information?" [shape=diamond];
"Decision where I might go wrong?" [shape=diamond];
"Use markdown" [shape=box];
"Small inline flowchart" [shape=box];