Karpathy-inspired Claude Code guidelines: andrej-karpathy-skills explained (2026)

What forrestchang/andrej-karpathy-skills adds to Claude Code: four principles from Andrej Karpathy’s LLM pitfalls post, plugin vs CLAUDE.md install, and how to combine with agent skills on explainx.ai.

5 min read · ExplainX Team
Tags: Claude Code, Agent Skills, Andrej Karpathy, Developer Tools, AI Coding Assistants, Best Practices

andrej-karpathy-skills is one of the most visible Claude Code community packages, and it answers a blunt question: how do you stop an AI coding agent from confidently doing the wrong thing? Maintainer forrestchang distilled Andrej Karpathy’s observations on LLM failure modes into a single CLAUDE.md and wrapped it as a Claude Code plugin so the same guardrails can follow you across projects. The repository has attracted roughly 30k GitHub stars and thousands of forks (exact counts change daily), a useful signal that teams want shared agent etiquette, not ad-hoc prompts.

This article is a field guide: what problem it solves, the four principles, how to install it, and how it fits next to agent skills and registries like explainx.ai.

TL;DR

| Question | Short answer |
| --- | --- |
| What is it? | A portable policy file (CLAUDE.md) + plugin that steers Claude Code toward calmer, smaller, test-backed edits. |
| Primary source | github.com/forrestchang/andrej-karpathy-skills (MIT). |
| Attribution | Principles trace to Karpathy’s thread on model behavior in coding workflows (X post). |
| Install (plugin) | /plugin marketplace add forrestchang/andrej-karpathy-skills, then /plugin install andrej-karpathy-skills@karpathy-skills. |
| Install (file only) | curl the raw CLAUDE.md from the repo into your project root (see below). |
| Related ecosystem | Pair with domain skills (e.g. MCP, marketing, security) and our agent skills guide. |

The README also points readers who want a managed agents platform to Multica (open source).

The problem Karpathy named

The README highlights three recurring failure modes, paraphrased here with the same intent as the upstream post:

  1. Silent assumptions — Models “make wrong assumptions on your behalf” instead of surfacing uncertainty, tradeoffs, or inconsistencies.
  2. Overengineering — A tendency toward bloated APIs, unnecessary abstraction, and large diffs when a smaller change would do.
  3. Collateral edits — Touching comments or code that was not part of the task, sometimes removing context the model did not fully understand.

Those bullets are not academic gripes; they show up as bad PRs, reverted commits, and lost trust in agent-assisted workflows. Packaging a counter-policy in CLAUDE.md makes the expectations durable and diffable like any other repo convention.

The four principles (and what each fixes)

The project organizes the remedy into four principles. This table mirrors the README’s mapping:

| Principle | What it pushes back against |
| --- | --- |
| Think Before Coding | Wrong assumptions, hidden confusion, missing tradeoffs |
| Simplicity First | Overcomplicated designs and speculative “flexibility” |
| Surgical Changes | Drive-by refactors and unrelated edits |
| Goal-Driven Execution | Vague tasks with no verification loop |

1. Think Before Coding

Intent: force explicit reasoning before keystrokes—state assumptions, spell out multiple interpretations when the prompt is ambiguous, push back when a simpler path exists, and stop to ask when something is unclear rather than guessing.

2. Simplicity First

Intent: minimum code that solves the request—no extra features, no abstraction for a one-off, no configuration surface nobody asked for, and no elaborate error handling for scenarios that will not occur. The README’s blunt test: Would a senior engineer call this overcomplicated?

3. Surgical Changes

Intent: touch only what the task requires; match local style; do not “clean up” unrelated code or comments; if you spot dead code outside scope, mention it instead of deleting it. Exception: remove orphans your change created (unused imports, dead helpers).
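
One way to check this principle from the outside is to compare the files a change touched against the paths the task was supposed to touch. The sketch below is an illustration, not part of the upstream repo: `out_of_scope` is a hypothetical helper name, and the path prefixes in the usage comment are placeholders.

```shell
# out_of_scope <allowed-prefix>...
# Reads changed file paths on stdin (one per line, for example from
# `git diff --name-only`) and prints every path that does not start
# with one of the allowed prefixes.
out_of_scope() {
    while IFS= read -r path; do
        in_scope=no
        for prefix in "$@"; do
            case $path in
                "$prefix"*) in_scope=yes; break ;;   # literal prefix match
            esac
        done
        [ "$in_scope" = yes ] || printf '%s\n' "$path"
    done
}

# Usage (prefixes are placeholders for your task's real scope):
#   git diff --name-only | out_of_scope src/auth/ tests/auth/
```

Any path it prints is a candidate collateral edit worth questioning in review. An empty result means the diff stayed inside the stated scope.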

4. Goal-Driven Execution

Intent: replace fuzzy asks with checkable outcomes—for example, “add validation” becomes “write failing tests for invalid inputs, then make them pass.” The README cites Karpathy directly: models are strong at looping until a specific goal is met, so weak criteria (“make it work”) waste cycles.
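
To make the "add validation" example concrete, here is a minimal sketch of a checkable goal in shell. Everything in it is hypothetical: `is_valid_port` stands in for whatever function the task targets, and `check_validation` plays the role of the tests that are written first and fail until the validation is correct.

```shell
# Hypothetical function under test: accept only integers from 1 to 65535.
is_valid_port() {
    case $1 in
        ''|*[!0-9]*) return 1 ;;   # empty, or contains a non-digit character
    esac
    [ "$1" -ge 1 ] && [ "$1" -le 65535 ]
}

# The checkable goal: written before the implementation, failing until
# every invalid input is rejected and a valid one is accepted.
check_validation() {
    is_valid_port 8080 || { printf 'rejected valid port: 8080\n'; return 1; }
    for bad in "" abc -1 0 65536 "80 80"; do
        if is_valid_port "$bad"; then
            printf 'accepted invalid port: %s\n' "$bad"
            return 1
        fi
    done
    printf 'all validation checks pass\n'
}
```

The point is the shape, not the port rules: the agent has a command it can run repeatedly, and "done" means that command exits successfully.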

For multi-step work, the template is simple: each step names a verify hook (command, test, or observable check).
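
The step-plus-verify template can be sketched as a small shell helper. This is an illustration under stated assumptions: `run_step` is a hypothetical name, and the commands in the usage comment are placeholders for your project's own checks, not something the upstream repo ships.

```shell
# run_step "<description>" <verify command...>
# Runs the verify command and reports whether the step's goal is met.
# Returns non-zero on failure so a chain of steps stops at the first broken one.
run_step() {
    desc=$1
    shift
    if "$@" >/dev/null 2>&1; then
        printf 'PASS: %s\n' "$desc"
    else
        printf 'FAIL: %s\n' "$desc"
        return 1
    fi
}

# Usage (the verify commands are placeholders; substitute your own):
#   run_step "invalid inputs are rejected" ./run-tests.sh validation &&
#   run_step "existing tests still pass"   ./run-tests.sh
```

Chaining steps with `&&` means a later step never builds on one whose check failed.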

Install paths (plugin vs per-project file)

Option A — Claude Code plugin (recommended in the README)

/plugin marketplace add forrestchang/andrej-karpathy-skills
/plugin install andrej-karpathy-skills@karpathy-skills

That path treats the guidelines like a cross-project capability inside Claude Code’s plugin model.

Option B — CLAUDE.md only

New project:

curl -o CLAUDE.md https://raw.githubusercontent.com/forrestchang/andrej-karpathy-skills/main/CLAUDE.md

Existing CLAUDE.md (append):

echo "" >> CLAUDE.md
curl https://raw.githubusercontent.com/forrestchang/andrej-karpathy-skills/main/CLAUDE.md >> CLAUDE.md
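
Running the append commands twice would add the guidelines twice. The following is a minimal sketch of a guarded append, assuming you first download the file to a local path. The function name `append_once` and the marker comment are hypothetical; the upstream repo does not define either.

```shell
# Hypothetical marker line used to detect a previous append.
MARKER='<!-- andrej-karpathy-skills guidelines -->'

# append_once <source-file> <target-file>
# Appends the source to the target unless the marker is already present.
append_once() {
    src=$1
    target=$2
    if [ ! -r "$src" ]; then
        printf 'cannot read %s\n' "$src" >&2
        return 1
    fi
    if [ -f "$target" ] && grep -qF -- "$MARKER" "$target"; then
        printf 'already present in %s\n' "$target"
        return 0
    fi
    {
        printf '\n%s\n' "$MARKER"
        cat "$src"
    } >> "$target"
    printf 'appended to %s\n' "$target"
}

# Usage (the temporary path is a placeholder):
#   curl -fsSL -o /tmp/karpathy-claude.md \
#     https://raw.githubusercontent.com/forrestchang/andrej-karpathy-skills/main/CLAUDE.md &&
#   append_once /tmp/karpathy-claude.md CLAUDE.md
```

The `-f` flag makes curl fail on an HTTP error instead of saving the error page, so a bad URL never ends up inside CLAUDE.md.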

After merging, add a project-specific section (TypeScript strictness, test commands, error-handling patterns) so global etiquette and local law do not fight each other—the README explicitly encourages that pattern.
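
As a rough illustration of such a section, the sketch below appends three project rules to a CLAUDE.md. The heading and the wording of each rule are assumptions made for this example, chosen to match the three topics named above. Replace them with your project's actual conventions.

```shell
# add_project_section [target-file]
# Appends a hypothetical project-specific section (default target: CLAUDE.md).
add_project_section() {
    target=${1:-CLAUDE.md}
    cat >> "$target" <<'EOF'

## Project-specific guidelines

- TypeScript: keep strict mode on; fix type errors instead of loosening compiler options.
- Tests: run the project's test command before reporting a step as done.
- Error handling: follow the patterns already used in this codebase.
EOF
}
```

Keeping local rules in their own section, below the shared principles, makes it easy to see in a diff which layer a later change belongs to.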

How you know it is working

The README lists four practical signals:

  • Smaller diffs aligned with the actual request
  • Less churn from overbuilt first drafts
  • Clarifying questions before the wrong implementation lands
  • Cleaner PRs without cosmetic or “while we are here” edits

Those line up with what engineering managers usually measure: rework rate, review noise, and time-to-merge.
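
The first signal, smaller diffs, is easy to track with a number. A minimal sketch, assuming you feed it the output of `git diff --numstat`; `diff_size` is a hypothetical helper name, not an upstream tool.

```shell
# diff_size
# Reads `git diff --numstat` output on stdin (added<TAB>deleted<TAB>path)
# and prints the file count and the total of added plus deleted lines.
# Binary files, which numstat reports as "-", count as zero lines.
diff_size() {
    awk -F '\t' '
        { files++; if ($1 != "-") lines += $1 + $2 }
        END { printf "%d files, %d changed lines\n", files, lines }
    '
}

# Usage:
#   git diff --numstat main...HEAD | diff_size
```

Comparing this figure for similar tasks before and after adopting the guidelines gives a rough, imperfect view of whether diffs are actually shrinking.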

How this pairs with explainx.ai skills

CLAUDE.md is an excellent repo-wide temperament layer. Agent skills (see what are agent skills?) are often domain playbooks: MCP integration, SEO/GEO content, PDF tooling, and hundreds of other specialties in the skills registry.

A pragmatic stack:

  1. Root policy — Karpathy-style principles in CLAUDE.md (or the plugin equivalent).
  2. Domain packages — Install only the skills you need from the registry so progressive disclosure stays lean.
  3. Publishing discipline — If you ship public pages or listings, stack seo-geo-style guidance so human readers and citation-style AI surfaces get clear structure.

Tradeoffs (from the upstream README)

The guidelines bias toward caution over speed. For a one-line typo, full ceremony is wasted; for refactors touching money paths, auth, or data integrity, the same bias prevents expensive mistakes. Treat the file as a default stance, not a religion.

Bottom line

andrej-karpathy-skills turns a widely cited Karpathy critique into actionable agent policy: think first, keep changes small and purposeful, and define success in tests and checks so the model’s strength—iterating—works for you instead of against you.

Primary repo: forrestchang/andrej-karpathy-skills on GitHub · Source thread: Karpathy on X

Next reads on explainx.ai: Agent skills: complete guide → · MCP explained → · Browse ranked skills →


Concepts and install commands summarized here follow the upstream README and CLAUDE.md as of 2026; star and fork counts on GitHub change over time. Verify plugin and CLI syntax against your current Claude Code release notes.
