skill-test
Donchitos/Claude-Code-Game-Studios · updated Apr 16, 2026
### Skill Test
| Field | Value |
|---|---|
| name | skill-test |
| description | "Validate skill files for structural compliance and behavioral correctness. Three modes: static (linter), spec (behavioral), audit (coverage report)." |
| argument-hint | "static [skill-name \| all] \| spec [skill-name] \| category [skill-name \| all] \| audit" |
| user-invocable | true |
| allowed-tools | Read, Glob, Grep, Write |
Validates .claude/skills/*/SKILL.md files for structural compliance and
behavioral correctness. No external dependencies — runs entirely within the
existing skill/hook/template architecture.
Four modes:
| Mode | Command | Purpose | Token Cost |
|---|---|---|---|
| static | `/skill-test static [name\|all]` | Structural linter: 7 compliance checks per skill | Low (~1k/skill) |
| spec | `/skill-test spec [name]` | Behavioral verifier: evaluates assertions in test spec | Medium (~5k/skill) |
| category | `/skill-test category [name\|all]` | Category rubric: checks skill against its category-specific metrics | Low (~2k/skill) |
| audit | `/skill-test audit` | Coverage report: skills, agent specs, last test dates | Low (~3k total) |
Phase 1: Parse Arguments
Determine mode from the first argument:
- `static [name]` → run 7 structural checks on one skill
- `static all` → run 7 structural checks on all skills (Glob `.claude/skills/*/SKILL.md`)
- `spec [name]` → read skill + test spec, evaluate assertions
- `category [name]` → run category-specific rubric from `CCGS Skill Testing Framework/quality-rubric.md`
- `category all` → run category rubric for every skill that has a `category:` in catalog
- `audit` (or no argument) → read catalog, list all skills and agents, show coverage
If the first argument is unrecognized, or a mode that requires a skill name is given without one, output usage and stop.
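The dispatch above can be sketched in Python. This is a minimal illustration, not the skill's actual implementation (the skill is executed by the agent following written instructions). The names `parse_arguments` and `MODES` are hypothetical, and the sketch assumes an empty argument falls back to `audit`, as the mode list states, and that `spec` does not accept `all`.

```python
from typing import Optional, Tuple

# Hypothetical constant: the four modes named in the mode table.
MODES = {"static", "spec", "category", "audit"}


def parse_arguments(raw: str) -> Tuple[Optional[str], Optional[str]]:
    """Return (mode, target). A mode of None means: print usage and stop."""
    parts = raw.split()
    if not parts:
        # Assumption: no argument behaves like `audit`.
        return ("audit", None)
    mode = parts[0]
    target = parts[1] if len(parts) > 1 else None
    if mode not in MODES:
        return (None, None)
    if mode == "spec" and target in (None, "all"):
        # Assumption: spec mode needs exactly one skill name.
        return (None, None)
    if mode in ("static", "category") and target is None:
        return (None, None)
    return (mode, target)
```

For example, `parse_arguments("static all")` yields `("static", "all")`, while an unknown first word yields `(None, None)` so the caller can print usage.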
Phase 2A: Static Mode — Structural Linter
For each skill being tested, read its SKILL.md fully and run all 7 checks:
Check 1 — Required Frontmatter Fields
The file must contain all of these in the YAML frontmatter block:
- `name:`
- `description:`
- `argument-hint:`
- `user-invocable:`
- `allowed-tools:`
FAIL if any are absent.
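As a rough sketch of Check 1, the snippet below pulls the frontmatter block out of a SKILL.md string and reports which required keys are missing. It assumes the frontmatter is delimited by `---` lines at the top of the file and only looks at top-level keys; the function names are hypothetical, and a real YAML parser would be more robust than this regex.

```python
import re

REQUIRED_FIELDS = [
    "name",
    "description",
    "argument-hint",
    "user-invocable",
    "allowed-tools",
]


def frontmatter_block(text: str) -> str:
    """Return the text between the leading '---' lines, or '' if absent."""
    match = re.match(r"^---\s*\n(.*?)\n---\s*(?:\n|$)", text, re.DOTALL)
    return match.group(1) if match else ""


def missing_frontmatter_fields(text: str) -> list:
    """List required fields that do not appear as top-level keys."""
    block = frontmatter_block(text)
    present = set(re.findall(r"^([A-Za-z][\w-]*):", block, re.MULTILINE))
    return [field for field in REQUIRED_FIELDS if field not in present]
```

An empty list corresponds to PASS; any remaining entries correspond to FAIL and can be named in the report.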
Check 2 — Multiple Phases
The skill must have ≥2 numbered phase headings. Look for patterns like:
- `## Phase N` or `## Phase N:`
- `## N.` (numbered top-level sections)
- At least 2 distinct `##` headings if phases aren't explicitly numbered
FAIL if fewer than 2 phase-like headings are found.
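One way to approximate this heading count is sketched below. It assumes phases are level-2 markdown headings, counts explicit `## Phase N` or `## N.` headings first, and falls back to distinct `##` headings otherwise. The function names are illustrative only.

```python
import re


def count_phase_headings(body: str) -> int:
    """Count phase-like level-2 headings in a skill body."""
    explicit = re.findall(
        r"^##\s+(?:Phase\s+\d+\w*:?|\d+\.)", body, re.MULTILINE
    )
    if explicit:
        return len(explicit)
    # Fallback: distinct level-2 headings when phases are not numbered.
    generic = re.findall(r"^##\s+(.+?)\s*$", body, re.MULTILINE)
    return len(set(generic))


def check_multiple_phases(body: str) -> str:
    """Check 2: FAIL when fewer than 2 phase-like headings are found."""
    return "PASS" if count_phase_headings(body) >= 2 else "FAIL"
```

Level-3 headings such as `### Sub-step` are deliberately ignored, because the pattern requires whitespace directly after `##`.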
Check 3 — Verdict Keywords
The skill must contain at least one of: PASS, FAIL, CONCERNS, APPROVED,
BLOCKED, COMPLETE, READY, COMPLIANT, NON-COMPLIANT
FAIL if none are present.
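A simple keyword scan for this check might look like the sketch below. It assumes matching is case-sensitive and on whole words, so that `COMPLETE` is not found inside `INCOMPLETE` and `COMPLIANT` is not double-counted inside `NON-COMPLIANT`; the skill text does not specify either rule.

```python
import re

VERDICT_KEYWORDS = [
    "PASS", "FAIL", "CONCERNS", "APPROVED", "BLOCKED",
    "COMPLETE", "READY", "COMPLIANT", "NON-COMPLIANT",
]


def find_verdict_keywords(text: str) -> list:
    """Return the verdict keywords that appear as standalone tokens."""
    found = []
    for keyword in VERDICT_KEYWORDS:
        # Reject matches glued to a word character or hyphen on either side.
        pattern = r"(?<![\w-])" + re.escape(keyword) + r"(?![\w-])"
        if re.search(pattern, text):
            found.append(keyword)
    return found
```

A non-empty result corresponds to PASS, and the matched keywords can be echoed in the report line as in the output format shown later.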
Check 4 — Collaborative Protocol Language
The skill must contain ask-before-write language. Look for:
- `"May I write"` (canonical form)
- `"before writing"` or `"approval"` near file-write instructions
- `"ask"` + `"write"` in close proximity (within same section)
WARN if absent and allowed-tools includes neither Write nor Edit (some read-only skills legitimately skip this).
FAIL if allowed-tools includes Write or Edit but no ask-before-write language is found.
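The WARN/FAIL split can be sketched as follows. This is a simplification under stated assumptions: matching is case-insensitive, `"approval"` is accepted anywhere rather than only near file-write instructions, and a "section" is whatever falls between markdown headings. Both function names are hypothetical.

```python
import re


def has_ask_before_write(body: str) -> bool:
    """Detect ask-before-write phrasing in a skill body."""
    lowered = body.lower()
    if any(p in lowered for p in ("may i write", "before writing", "approval")):
        return True
    # "ask" and "write" in the same heading-delimited section.
    sections = re.split(r"^#{1,6}\s", lowered, flags=re.MULTILINE)
    return any(
        re.search(r"\bask\b", section) and re.search(r"\bwrite\b", section)
        for section in sections
    )


def check_collaborative_protocol(allowed_tools: list, body: str) -> str:
    """Check 4: PASS, WARN (read-only skill), or FAIL (skill can write)."""
    if has_ask_before_write(body):
        return "PASS"
    can_write = any(tool in ("Write", "Edit") for tool in allowed_tools)
    return "FAIL" if can_write else "WARN"
```

With `allowed_tools=["Read", "Write"]` and a body that never asks permission, the result is `FAIL`; the same body with only read tools yields `WARN`.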
Check 5 — Next-Step Handoff
The skill must end with a recommended next action or follow-up path. Look for:
- A final section mentioning another skill (e.g., `/story-done`, `/gate-check`)
- "Recommended next" or "next step" phrasing
- A "Follow-Up" or "After this" section
WARN if absent.
Check 6 — Fork Context Complexity
If frontmatter contains context: fork, the skill should have ≥5 phase headings
(## level or numbered Phase N headers). Fork context is for complex multi-phase
skills; simple skills should not use it.
WARN if context: fork is set but fewer than 5 phases found.
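Expressed as code, this rule is a small guard. The sketch assumes the frontmatter has already been parsed into a dictionary and that the phase count comes from the same heading scan used for Check 2; treating non-fork skills as PASS is an assumption, since the text only defines the WARN case.

```python
def check_fork_context(frontmatter: dict, phase_count: int) -> str:
    """Check 6: WARN when `context: fork` is set on a skill with < 5 phases."""
    if frontmatter.get("context") != "fork":
        # Assumption: the check does not apply, so report PASS.
        return "PASS"
    return "PASS" if phase_count >= 5 else "WARN"
```

Note that this check never produces FAIL; it only flags a likely misuse of fork context.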
Check 7 — Argument Hint Plausibility
argument-hint must be non-empty. If the skill body mentions multiple modes
(e.g., "Mode A | Mode B"), the hint should reflect them. Cross-reference the
hint against the first phase's "Parse Arguments" section.
WARN if hint is "" or if documented modes don't match hint.
Static Mode Output Format
For a single skill:
=== Skill Static Check: /[name] ===
Check 1 — Frontmatter Fields: PASS
Check 2 — Multiple Phases: PASS (7 phases found)
Check 3 — Verdict Keywords: PASS (PASS, FAIL, CONCERNS)
Check 4 — Collaborative Protocol: PASS ("May I write" found)
Check 5 — Next-Step Handoff: WARN (no follow-up section found)
Check 6 — Fork Context Complexity: PASS (8 phases, context: fork set)
Check 7 — Argument Hint: PASS
Verdict: WARNINGS (1 warning, 0 failures)
Recommended: Add a "Follow-Up Actions" section at the end of the skill.
For static all, produce a summary table then list any non-compliant skills:
=== Skill Static Check: All 52 Skills ===
Skill | Result | Issues
-----------------------|--------------|-------
gate-check | COMPLIANT |
design-review | COMPLIANT |
story-readiness | WARNINGS | Check 5: no handoff
...
Summary: 48 COMPLIANT, 3 WARNINGS, 1 NON-COMPLIANT
Aggregate Verdict: N WARNINGS / N FAILURES
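The roll-up from per-check results to the verdict line might look like the sketch below. The labels COMPLIANT, WARNINGS, and NON-COMPLIANT come from the summary table above; mapping any FAIL to NON-COMPLIANT is inferred from that table rather than stated explicitly, and `static_verdict` is a hypothetical name.

```python
def static_verdict(results: dict) -> str:
    """Summarize check results, e.g. {'Check 1': 'PASS', 'Check 5': 'WARN'}."""
    failures = sum(1 for outcome in results.values() if outcome == "FAIL")
    warnings = sum(1 for outcome in results.values() if outcome == "WARN")
    if failures:
        label = "NON-COMPLIANT"  # assumption: any failure is non-compliant
    elif warnings:
        label = "WARNINGS"
    else:
        label = "COMPLIANT"
    warning_word = "warning" if warnings == 1 else "warnings"
    failure_word = "failure" if failures == 1 else "failures"
    return f"{label} ({warnings} {warning_word}, {failures} {failure_word})"
```

For the single-skill example above (one WARN on Check 5, no failures), this returns `WARNINGS (1 warning, 0 failures)`.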
Phase 2B: Spec Mode — Behavioral Verifier
Step 1 — Locate Files
Find skill at .claude/skills/[name]/SKILL.md.
Look up the spec path from CCGS Skill Testing Framework/catalog.yaml — use the
spec: field for the matching skill entry.
If either is missing:
- Missing skill: "Skill '[name]' not found in `.claude/skills/`."
- Missing spec path in catalog: "No spec path set for '[name]' in catalog.yaml."
- Spec file not found at path: "Spec file missing at [path]. Run `/skill-test audit` to see coverage gaps."
Step 2 — Read Both Files
Read the skill file and test spec file completely.
Step 3 — Evaluate Assertions
For each Test Case in the spec:
- Read the Fixture description (assumed state of project files)
- Read the Expected behavior steps
- Read each Assertion checkbox
For each assertion, evaluate whether the skill's written instructions, if followed correctly given the fixture state, would satisfy it. This is a Claude-evaluated reasoning check, not code execution.
Mark each assertion:
- PASS — skill instructions clearly satisfy this assertion
- PARTIAL — skill instructions partially address it, but with ambiguity
- FAIL — skill instructions would NOT satisfy this assertion given the fixture
For Protocol Compliance assertions (always present):
- Check whether the skill requires "May I write" before file writes
- Check whether the skill presents findings before requesting approval
- Check whether the skill ends with a recommended next step
- Check whether the skill avoids auto-creating files without approval
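The assertion marks still need to be rolled up into a case verdict and an overall verdict. The skill does not spell out the rule, so the sketch below assumes "worst mark wins" (FAIL over PARTIAL over PASS); the helper names are hypothetical.

```python
# Assumed ordering: a higher number is a worse outcome.
SEVERITY = {"PASS": 0, "PARTIAL": 1, "FAIL": 2}


def worst(marks: list) -> str:
    """Return the most severe mark; raises on an empty list or unknown mark."""
    if not marks:
        raise ValueError("at least one mark is required")
    return max(marks, key=SEVERITY.__getitem__)


def case_verdict(assertion_marks: list) -> str:
    """Verdict for one test case from its assertion marks."""
    return worst(assertion_marks)


def overall_verdict(case_verdicts: dict) -> str:
    """Verdict for the whole spec from per-case verdicts keyed by case name."""
    return worst(list(case_verdicts.values()))
```

Under this assumption, a spec with one failing case and one passing case reports FAIL overall, which matches the sample report in Step 4.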
Step 4 — Build Report
=== Skill Spec Test: /[name] ===
Date: [date]
Spec: CCGS Skill Testing Framework/skills/[category]/[name].md
Case 1: [Happy Path — name]
Fixture: [summary]
Assertions:
[PASS] [assertion text]
[FAIL] [assertion text]
Reason: The skill's Phase 3 says "..." but the fixture state means "..."
Case Verdict: FAIL
Case 2: [Edge Case — name]
...
Case Verdict: PASS
Protocol Compliance:
[PASS] Uses "May I write" before file writes
[PASS] Presents findings before asking approval
[WARN] No explicit next-step handoff at end
Overall Verdict: FAIL (1 case failed, 1 warning)
Step 5 — Offer to Write Results
"May I write these results to CCGS Skill Testing Framework/results/skill-test-spec-[name]-[date].md
and update CCGS Skill Testing Framework/catalog.yaml?"
If yes:
- Write results file to `CCGS Skill Testing Framework/results/`
- Update the skill's entry in `CCGS Skill Testing Framework/catalog.yaml`:
  - `last_spec: [date]`
  - `last_spec_result: PASS|PARTIAL|FAIL`
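For illustration, the results path and the catalog fields could be derived as below. The ISO `YYYY-MM-DD` date format is an assumption (the skill only writes `[date]`), the catalog entry is modeled as a plain dictionary rather than parsed YAML, and both function names are hypothetical.

```python
from datetime import date

RESULTS_DIR = "CCGS Skill Testing Framework/results"
VALID_RESULTS = ("PASS", "PARTIAL", "FAIL")


def results_path(name: str, run_date: date) -> str:
    """Build the results file path for a spec run."""
    return f"{RESULTS_DIR}/skill-test-spec-{name}-{run_date.isoformat()}.md"


def record_spec_result(entry: dict, run_date: date, result: str) -> dict:
    """Return a copy of a catalog entry with last_spec fields updated."""
    if result not in VALID_RESULTS:
        raise ValueError(f"unexpected result: {result}")
    updated = dict(entry)
    updated["last_spec"] = run_date.isoformat()
    updated["last_spec_result"] = result
    return updated
```

Returning a copy keeps the original entry untouched until the user has approved the write.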
Phase 2D: Category Mode — Rubric Evaluation
Step 1 — Locate Skill and Category
Find skill at .claude/skills/[name]/SKILL.md.
Look up category: field in CCGS Skill Testing Framework/catalog.yaml.
If skill not found: "Skill '[name]' not found."
If no category: field: "No category assigned for '[name]' in catalog.yaml.
Add category: [name] to the skill entry first."
For category all: collect all skills with a category: field and process each.
Skills with `category: utility` are evaluated only against U1 (static checks pass) and U2
(gate mode correct if applicable). For U1, run the static mode checks from Phase 2A.
Step 2 — Read Rubric Section
Read CCGS Skill Testing Framework/quality-rubric.md.
Extract the section matching the skill's category (e.g., ### gate, ### team).
Step 3 — Read Skill
Read the skill's SKILL.md fully.
Step 4 — Evaluate Rubric Metrics
For each metric in the category's rubric table:
- Check whether the skill's written instructions clearly satisfy the criterion
- Mark PASS, FAIL, or WARN
- For FAIL/WARN, identify the exact gap in the skill text (quote the relevant section or note its absence)
Step 5 — Output Report
=== Skill Category Check: /[name] ([category]) ===
Metric G1 — Review mode read: PASS
Metric G2 — Full mode directors: FAIL
Gap: Phase 3 spawns only CD-PHASE-GATE; TD-PHASE-GATE, PR-PHASE-GATE, AD-PHASE-GATE absent
Metric G3 — Lean mode: PHASE-GATE only: PASS
Metric G4 — Solo mode: no directors: PASS
Metric G5 — No auto-advance: PASS
Verdict: FAIL (1 failure, 0 warnings)
Fix: Add TD-PHASE-GATE, PR-PHASE-GATE, and AD-PHASE-GATE to the full-mode director
panel in Phase 3.
Step 6 — Offer to Update Catalog
"May I update CCGS Skill Testing Framework/catalog.yaml to record this category check
(last_category, last_category_result) for [name]?"
Phase 2C: Audit Mode — Coverage Report
Step 1 — Read Catalog
Read CCGS Skill Testing Framework/catalog.yaml. If missing, note that catalog doesn't exist
yet (first-run state).
Step 2 — Enumerate All Skills and Agents
Glob .claude/skills/*/SKILL.md to get the complete list of skills.
Extract skill name from each path (directory name).
Also read the agents: section from CCGS Skill Testing Framework/catalog.yaml to get the
complete list of agents.
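The enumeration step is roughly equivalent to the following sketch, which uses `pathlib` in place of the agent's Glob tool. The `project_root` parameter and the function name are assumptions for illustration.

```python
from pathlib import Path


def skill_names(project_root: str) -> list:
    """Return sorted skill names, one per .claude/skills/<name>/SKILL.md."""
    root = Path(project_root)
    return sorted(
        skill_file.parent.name
        for skill_file in root.glob(".claude/skills/*/SKILL.md")
    )
```

Directories under `.claude/skills/` that lack a `SKILL.md` are skipped, since the glob only matches the file itself.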
Step 3 — Build Skill Coverage Table
For each skill:
- Check if a spec file exists (use the `spec:` path from catalog, or glob `CCGS Skill Testing Framework/skills/*/[name].md`)
- Look up `last_static`, `last_static_result`, `last_spec`, `last_spec_result`, `last_category`, `last_category_result`, `category` from catalog (or mark as "never" / "—" if not in catalog)
- Priority comes from catalog `priority:` field (critical/high/medium/low)
Step 3b — Build Agent Coverage Table
For each agent in catalog's agents: section:
- Check if a spec file exists (use the `spec:` path from catalog, or glob `CCGS Skill Testing Framework/agents/*/[name].md`)
- Look up `last_spec`, `last_spec_result`, `category` from catalog
Step 4 — Output Report
=== Skill Test Coverage Audit ===
Date: [date]
SKILLS (72 total)
Specs written: 72 (100%) | Never static tested: 72 | Never category tested: 72
Skill | Cat | Has Spec | Last Static | S.Result | Last Cat | C.Result | Priority
-----------------------|----------|----------|-------------|----------|----------|----------|----------
gate-check | gate | YES | never | — | never | — | critical
design-review | review | YES | never | — | never | — | critical
...
AGENTS (49 total)
Agent specs written: 49 (100%)
Agent | Category | Has Spec | Last Spec | Result
-----------------------|------------|----------|-------------|--------
creative-director | director | YES | never | —
technical-director | director | YES | never | —
...
Top 5 Priority Gaps (skills with no spec, critical/high priority):
(none if all specs are written)
Skill coverage: 72/72 specs (100%)
Agent coverage: 49/49 specs (100%)
No file writes in audit mode.
Offer: "Would you like to run /skill-test static all to check structural
compliance across all skills? /skill-test category all to run category rubric
checks? Or /skill-test spec [name] to run a specific behavioral test?"
Phase 3: Recommended Next Steps
After any mode completes, offer contextual follow-up:
- After `static [name]`: "Run `/skill-test spec [name]` to validate behavioral correctness if a test spec exists."
- After `static all` with failures: "Address NON-COMPLIANT skills first. Run `/skill-test static [name]` individually for detailed remediation guidance."
- After `spec [name]` PASS: "Update `CCGS Skill Testing Framework/catalog.yaml` to record this pass date. Consider running `/skill-test audit` to find the next spec gap."
- After `spec [name]` FAIL: "Review the failing assertions and update the skill or the test spec to resolve the mismatch."
- After `audit`: "Start with the critical-priority gaps. Use the spec template at `CCGS Skill Testing Framework/templates/skill-test-spec.md` to create new specs."
How to use skill-test on Cursor
AI-first code editor with Composer
Prerequisites
Before installing skills in Cursor, ensure your development environment meets these requirements:
- Cursor installed and configured on your development machine
- Node.js version 16.0+ with npm package manager (verify with `node --version`)
- Active project directory or workspace where you want to add skill-test
Execute installation command
Execute the skills CLI command in your project's root directory to begin installation:
The skills CLI fetches skill-test from GitHub repository Donchitos/Claude-Code-Game-Studios and configures it for Cursor.
Select Cursor when prompted
The CLI will show a list of available agents. Use arrow keys to navigate and space to select Cursor:
Verify installation
Confirm successful installation by checking the skill directory location:
Reload or restart Cursor to activate skill-test. Access the skill through slash commands (e.g., /skill-test) or your agent's skill management interface.
Security & Verification Notice
We perform automated surface-level scans (Gen AI Scanner, Socket, Snyk) during installation. These checks detect common vulnerabilities but do not guarantee complete security. Always review skill source code and verify the publisher's reputation before production use.
Skills execute code in your development environment. Always verify the publisher's identity, review recent commits, and test in isolated environments before production deployment.