
Karpathy LLM Wiki: The Pattern Behind Agent Memory (Complete Guide)

Andrej Karpathy's LLM Wiki gist (5K+ stars) replaces RAG re-retrieval with a persistent, agent-maintained markdown wiki. This guide covers the three layers, the ingest/query/lint operations, how the pattern compares with OKF and RAG, and the ecosystem it spawned.

10 min read · Yash Thakker
Tags: Andrej Karpathy, LLM Wiki, Agent Memory, RAG, Claude Code, Obsidian



Andrej Karpathy's LLM Wiki gist is one of the most influential agent-memory documents of 2026—5,000+ stars and 5,000+ forks on a single markdown file that fits on one screen. It is not a product. It is an idea file meant to be copy-pasted into Claude Code, Codex, OpenCode, or Pi so your agent builds the specifics with you.

The core reframe:

Most LLM + document workflows are RAG: upload files, retrieve chunks, generate an answer. The LLM rediscovers knowledge from scratch every time. Nothing accumulates.

LLM Wiki is different: the LLM incrementally builds and maintains a persistent wiki—structured, interlinked markdown between you and raw sources. Knowledge is compiled once and kept current, not re-derived on every query.

Google formalized a vendor-neutral version as Open Knowledge Format (OKF) days after the gist went viral. This guide covers Karpathy's original pattern, the three layers, the ingest/query/lint loop, when wiki beats RAG, and the ecosystem the gist spawned.


TL;DR

| Question | Answer |
|---|---|
| Gist URL | karpathy/llm-wiki.md |
| Core idea | Persistent, compounding wiki, not per-query retrieval |
| Layer 1 | Raw sources (immutable) |
| Layer 2 | Wiki (LLM-owned markdown graph) |
| Layer 3 | Schema (CLAUDE.md / AGENTS.md) |
| Operations | Ingest → Query → Lint |
| Special files | index.md (catalog), log.md (timeline) |
| Human role | Curate sources, ask questions, steer analysis |
| LLM role | Summarize, cross-reference, file, bookkeeping |
| vs RAG | Wiki wins below ~50K–100K tokens; RAG for millions+ |

The Problem With RAG-Only Workflows

NotebookLM, ChatGPT file uploads, and most enterprise RAG share a limitation Karpathy names explicitly:

Ask a subtle question requiring synthesis across five documents, and the LLM must find and piece together fragments every time. Cross-references are not pre-built. Contradictions are not pre-flagged. Synthesis does not compound.

LLM Wiki treats maintenance as the LLM's job:

  • When you add a source, the agent integrates it—updates entity pages, revises summaries, notes contradictions
  • Cross-references already exist when you query
  • The wiki gets richer with every source and every good answer you file back

Karpathy's workflow metaphor: Obsidian is the IDE; the LLM is the programmer; the wiki is the codebase. You browse graph view and links on one side; the agent edits on the other.


Three Layers

1. Raw sources (immutable)

Your curated collection: articles, papers, images, data files, meeting transcripts, code exports. The LLM reads them but never modifies them. This is the source of truth.

Typical layout:

raw/
├── articles/
├── papers/
├── assets/          # local images (Obsidian clipper + download)
└── ...

Tip from the gist: Obsidian Web Clipper converts web articles to markdown. Set a fixed attachment folder and bind "Download attachments for current file" so the LLM can reference images locally.

2. The wiki (LLM-owned)

A directory of LLM-generated markdown:

| Page type | Purpose |
|---|---|
| Source summaries | One page per ingested document |
| Entity pages | People, companies, concepts |
| Topic summaries | Evolving synthesis |
| Comparisons / analyses | Filed from query operations |
| overview.md | High-level map of the domain |

You read the wiki; the LLM writes and maintains it—including cross-links, consistency, and updates when new sources arrive.

3. The schema (CLAUDE.md / AGENTS.md)

The configuration file that makes the LLM a disciplined wiki maintainer rather than a generic chatbot:

  • Directory structure conventions
  • Page formats and naming
  • Ingest workflow (steps after new source)
  • Query workflow (search, cite, file good answers back)
  • Lint workflow (health checks)

You and the LLM co-evolve this file as you learn what works for your domain. See What is CLAUDE.md? for how this fits Claude Code specifically.


Three Operations

Ingest

Drop a source into raw/ and tell the LLM to process it.

Typical flow:

  1. LLM reads the source
  2. Discusses key takeaways with you (optional but Karpathy prefers staying involved)
  3. Writes a summary page in the wiki
  4. Updates index.md
  5. Updates relevant entity and concept pages (often 10–15 files per source)
  6. Appends entry to log.md

Karpathy prefers one source at a time with human review. Batch ingest with less supervision is possible—document your choice in the schema.
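The file bookkeeping in steps 3, 4, and 6 above can be sketched in a few lines of Python. This is a minimal illustration, not code from the gist: the helper names (`slugify`, `record_ingest`), the index line format, and the flat `wiki/` layout are all assumptions, and the summary text is expected to come from the LLM. Step 5 (updating entity and concept pages) needs the agent's judgment and is left out.

```python
from datetime import date
from pathlib import Path


def slugify(title: str) -> str:
    """Turn a source title into a lowercase, hyphen-separated file stem."""
    cleaned = "".join(c.lower() if c.isalnum() else "-" for c in title)
    return "-".join(part for part in cleaned.split("-") if part)


def record_ingest(wiki: Path, title: str, summary: str, today: date) -> Path:
    """Hypothetical helper: do the file bookkeeping for one ingested source."""
    wiki.mkdir(parents=True, exist_ok=True)
    page = wiki / f"{slugify(title)}.md"

    # Step 3: write the summary page. The summary itself is LLM output.
    page.write_text(f"# {title}\n\n{summary}\n", encoding="utf-8")

    # Step 4: append a catalog line to index.md (link plus one-line summary).
    # The "- [Title](file.md): summary" format is an assumption of this sketch.
    stripped = summary.strip()
    first_line = stripped.splitlines()[0] if stripped else ""
    with (wiki / "index.md").open("a", encoding="utf-8") as index:
        index.write(f"- [{title}]({page.name}): {first_line}\n")

    # Step 6: append a greppable heading to log.md.
    with (wiki / "log.md").open("a", encoding="utf-8") as log:
        log.write(f"## [{today.isoformat()}] ingest | {title}\n")

    return page
```

Calling `record_ingest(Path("wiki"), "Article Title", summary, date.today())` would create `wiki/article-title.md` and append one line each to `index.md` and `log.md`. Re-ingesting the same title would append a second index line, so a real workflow would need to update the existing entry instead.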

Query

Ask questions against the wiki, not raw files:

  1. LLM reads index.md to locate relevant pages
  2. Drills into pages
  3. Synthesizes answer with citations

Answers can take many forms: markdown, comparison tables, Marp slide decks, matplotlib charts. Critical insight: good answers should be filed back into the wiki as new pages—explorations compound like ingested sources.
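To make the index-first lookup concrete, here is a rough Python sketch of step 1. In the pattern as described, the LLM itself reads index.md and decides which pages to open; the keyword overlap below is only a crude stand-in for that judgment. The index line format (`- [Title](page.md): summary`) and the function names are assumptions of this example, not something the gist prescribes.

```python
import re

# Assumed index line format: - [Title](page.md): one-line summary
INDEX_LINE = re.compile(
    r"^- \[(?P<title>[^\]]+)\]\((?P<path>[^)]+)\):\s*(?P<summary>.*)$"
)


def parse_index(index_text: str) -> list[dict[str, str]]:
    """Extract title, path, and summary from each catalog line."""
    entries = []
    for line in index_text.splitlines():
        match = INDEX_LINE.match(line.strip())
        if match:
            entries.append(match.groupdict())
    return entries


def candidate_pages(index_text: str, question: str, limit: int = 5) -> list[str]:
    """Rank pages by how many question words appear in their title or summary."""
    # Ignore very short words; a real agent would reason about meaning instead.
    words = {w for w in re.findall(r"[a-z0-9]+", question.lower()) if len(w) > 3}
    scored = []
    for entry in parse_index(index_text):
        haystack = f"{entry['title']} {entry['summary']}".lower()
        score = sum(1 for word in words if word in haystack)
        if score:
            scored.append((score, entry["path"]))
    # Highest score first; ties broken alphabetically by path.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [path for _, path in scored[:limit]]
```

The point of the sketch is the shape of the operation: the catalog is small enough to scan in full, and the result is a short list of whole files to open, with no chunking or embeddings involved.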

Lint

Periodically health-check the wiki:

| Check | Action |
|---|---|
| Contradictions between pages | Flag or reconcile (domain-dependent) |
| Stale claims superseded by newer sources | Update or mark superseded |
| Orphan pages (no inbound links) | Link or merge |
| Concepts mentioned but no dedicated page | Create stub pages |
| Missing cross-references | Add links |
| Data gaps | Suggest web search or new sources |

The LLM suggests new questions to investigate—lint is proactive, not just cleanup.
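Two of the checks in the table are purely structural and can be sketched without an LLM: orphan pages and concepts that are linked but have no page. The example below assumes Obsidian-style `[[Page]]` wikilinks and pages keyed by file stem; the function name and the choice to ignore links coming from index and log are assumptions of this sketch. Contradictions and stale claims still need the model's judgment.

```python
import re

# Matches [[Page]], [[Page|alias]] and [[Page#heading]]; captures the page name.
WIKILINK = re.compile(r"\[\[([^\]|#]+)(?:[|#][^\]]*)?\]\]")


def lint_links(
    pages: dict[str, str],
    catalog: frozenset[str] = frozenset({"index", "log"}),
) -> dict[str, list[str]]:
    """Report orphan pages and link targets that have no page yet.

    `pages` maps a page name (file stem) to its markdown text. Links from
    catalog pages do not count as inbound, since the index links to everything.
    """
    inbound: dict[str, set[str]] = {name: set() for name in pages}
    missing: set[str] = set()

    for name, text in pages.items():
        for raw_target in WIKILINK.findall(text):
            target = raw_target.strip()
            if target not in pages:
                missing.add(target)  # candidate for a stub page
            elif target != name and name not in catalog:
                inbound[target].add(name)

    orphans = [
        name for name, sources in inbound.items()
        if not sources and name not in catalog
    ]
    return {"orphans": sorted(orphans), "missing": sorted(missing)}
```

The output maps directly onto two table rows: each name under `orphans` is a page to link or merge, and each name under `missing` is a stub page to create.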


index.md vs log.md

| File | Orientation | Purpose |
|---|---|---|
| index.md | Content | Catalog of all pages: link, one-line summary, optional metadata (date, source count). Updated on every ingest. Query entry point. |
| log.md | Chronological | Append-only timeline of ingests, queries, lint passes |

Greppable log tip from the gist:

## [2026-04-02] ingest | Article Title

Then: grep "^## \[" log.md | tail -5 for recent activity.
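If the agent or a script needs the same view programmatically, the grep-and-tail one-liner translates to a short Python function. This is a sketch that assumes every entry follows the heading format shown above (`## [YYYY-MM-DD] operation | title`); the function name is made up for the example.

```python
import re

# Same shape as the greppable heading: ## [2026-04-02] ingest | Article Title
LOG_HEADING = re.compile(r"^## \[(\d{4}-\d{2}-\d{2})\] (\w+) \| (.+)$")


def recent_activity(log_text: str, count: int = 5) -> list[tuple[str, str, str]]:
    """Return the last `count` log entries as (date, operation, title) tuples."""
    entries = []
    for line in log_text.splitlines():
        match = LOG_HEADING.match(line)
        if match:
            entries.append(match.groups())
    # Guard against count <= 0, since entries[-0:] would return everything.
    return entries[-count:] if count > 0 else []
```

Because log.md is append-only, the last entries in the file are also the most recent ones, so no date sorting is needed.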

At moderate scale (~100 sources, hundreds of pages), index-first navigation works surprisingly well—no embedding infrastructure required.


Use Cases (From the Gist)

| Domain | Example |
|---|---|
| Personal | Goals, health, psychology: journal + articles → structured self-model |
| Research | Papers over months → evolving thesis wiki |
| Reading a book | Chapter-by-chapter filing → personal Tolkien Gateway |
| Business/team | Slack, meetings, docs → LLM-maintained internal wiki |
| Competitive analysis | Due diligence, market maps |
| Trip planning, courses, hobbies | Any accumulating knowledge |

Karpathy links the idea to Vannevar Bush's Memex (1945)—private, curated knowledge with associative trails. Bush couldn't solve maintenance; the LLM handles that.


LLM Wiki vs RAG: The Magnitude Question

@Shilren's interview-doc-agent articulates the decision tree the gist community converged on:

| Corpus size | Best approach |
|---|---|
| < ~50K–100K tokens (~150–200 dense pages) | LLM Wiki / full context: 100% retrieval reliability, no vector DB, global reasoning |
| Millions of tokens+ | RAG: won't fit in context |
| In between / production | Hybrid: stable core in wiki, dynamic mass in RAG |

Important nuance: index.md is not RAG. It does not vector-match or chunk; it lets the agent open fewer whole files. A small library often fits entirely within modern 200K–1M token context windows, so the agent could read all of it; the index is an optimization, not a requirement.
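The size thresholds can be turned into a quick back-of-the-envelope check. The sketch below assumes roughly 4 characters per token, which is a common rule of thumb for English prose rather than an exact tokenizer count, and it picks 100K and 1M tokens as cutoffs from the ranges in the table. Both numbers and both function names are assumptions to adjust for your model and corpus.

```python
def estimate_tokens(text: str) -> int:
    """Rough token count, assuming about 4 characters per token."""
    return len(text) // 4


def choose_approach(
    total_tokens: int,
    wiki_limit: int = 100_000,   # upper end of the "~50K–100K" row
    rag_floor: int = 1_000_000,  # assumed reading of "millions of tokens"
) -> str:
    """Map a corpus size onto the wiki / hybrid / rag rows of the table."""
    if total_tokens <= wiki_limit:
        return "wiki"
    if total_tokens >= rag_floor:
        return "rag"
    return "hybrid"
```

For example, a corpus of about 240,000 characters estimates to 60,000 tokens and lands in the wiki row, while anything between the two cutoffs falls through to hybrid.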


Optional: CLI Tools

When the wiki outgrows index-only search:

  • qmd — local hybrid BM25/vector search + LLM re-ranking; CLI + MCP
  • Custom search scripts — "vibe-code a naive search script as the need arises" (Karpathy's words)

The gist is modular: skip image handling if text-only; skip Marp if you only want markdown; skip search if small.


Ecosystem: What People Built

The gist comments section became a design space catalog. Selected implementations:

| Project | Focus | Link |
|---|---|---|
| AutoSci | Research agent; contradiction edges; self-evolving wiki; 3 papers end-to-end | github.com/skyllwt/AutoSci |
| memwiki | Coding-agent memory (.memory/ + hooks for Claude/Cursor/Copilot) | github.com/hereisSwapnil/memwiki |
| secure-llm-wiki | Untrusted-source isolation; four-eyes review; provenance | github.com/NicoBleh/secure-llm-wiki |
| interview-doc-agent | Personal career library; context vs RAG proof | github.com/Shilren/interview-doc-agent |
| Dense-Mem | MCP memory server; typed claims, conflicts, graph | github.com/markhuangai/dense-mem |
| LLM-Wiki-MCP | Wiki as MCP-accessible system; provenance-aware ingest | github.com/Electro-resonance/LLM-WIKI-MCP |
| synthadoc | Web chat UI, lint, scheduled ingest | github.com/axoviq-ai/synthadoc |
| synto | Local-first; per-role providers; Ollama-friendly | github.com/kytmanov/synto |
| my-llm-wiki | Agentic arXiv/GitHub/YouTube ingest + D3 graph demo | github.com/MuhammadSaqlainAslam/my-llm-wiki |
| Google OKF v0.1 | Vendor-neutral spec formalizing the pattern | OKF guide |

Community design debates (worth knowing)

  • @pursultani: Contradiction-as-defect fits science; in humanities, contradiction is information—needs typed edges (contradicts, extends) and lint policy changes
  • @NicoBleh: Autonomous ingest is an indirect prompt-injection surface—untrusted sources must not become trusted wiki pages without gates
  • @watsonrm: Multi-writer wikis need append-only, partitioned writes—git merge solves text conflicts, not semantic duplicates
  • @Archimondstat: Socratic–Plato–Bayes variant—only promote ideas to the wiki after user refinement, not raw AI summaries

How to Start (Minimal)

  1. Copy the gist into your agent session or repo docs
  2. Scaffold directories: raw/, wiki/, plus root CLAUDE.md (schema)
  3. Create empty wiki/index.md and wiki/log.md
  4. Ingest one source with the agent; review updates together
  5. Query against the wiki; file a good answer as a new page
  6. Lint after ~10 sources or when links feel stale
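Steps 2 and 3 are simple enough to script if you prefer not to ask the agent. The following is a minimal sketch; the `scaffold` function name is made up for this example, and it creates empty files only, leaving the schema content in CLAUDE.md for you and the agent to write.

```python
from pathlib import Path


def scaffold(root: Path) -> list[Path]:
    """Create raw/, wiki/, and empty CLAUDE.md, index.md, log.md under root.

    Existing files are left untouched; only newly created files are returned.
    """
    for directory in (root / "raw", root / "wiki"):
        directory.mkdir(parents=True, exist_ok=True)

    created = []
    for file in (
        root / "CLAUDE.md",
        root / "wiki" / "index.md",
        root / "wiki" / "log.md",
    ):
        if not file.exists():
            file.touch()
            created.append(file)
    return created
```

Running `scaffold(Path("."))` in a fresh repo produces the layout described above, and running it again changes nothing.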

For Claude Code specifically:

/init                    # starter CLAUDE.md — extend with wiki schema
/memory                  # edit project memory
/plan "set up LLM wiki"  # agent proposes directory layout

See Claude Code commands reference.


LLM Wiki vs OKF vs CLAUDE.md

|  | Karpathy LLM Wiki | Google OKF | CLAUDE.md |
|---|---|---|---|
| What | Pattern / idea file | Formal spec v0.1 | Agent convention file |
| Scope | Any domain you define | Org knowledge graphs | Single-repo instructions |
| Required metadata | You define in schema | type in YAML frontmatter | None required |
| Interoperability | Bespoke per wiki | Cross-vendor bundles | Tool-specific |
| Best for | Personal/team wikis, research | Enterprise catalogs, BigQuery | Coding agent behavior |

They stack: CLAUDE.md can point agents at an OKF bundle or LLM wiki directory and define ingest/query/lint rules.


Why This Works (Karpathy's Argument)

The tedious part of maintaining a knowledge base is not the reading or the thinking — it's the bookkeeping. Updating cross-references, keeping summaries current, noting when new data contradicts old claims.

Humans abandon wikis because maintenance cost grows faster than value. LLMs don't get bored, don't forget cross-references, and can touch 15 files in one pass.

Your job: curate sources, direct analysis, ask good questions, think about meaning.
The LLM's job: everything else.


Summary

Karpathy's LLM Wiki is the clearest statement yet that agent memory should compound—not re-retrieve. Three layers (raw → wiki → schema), three operations (ingest → query → lint), two navigation files (index.md, log.md), and a git repo of markdown as the artifact.

Copy the gist into your agent. Let it build the rest with you. For organizational interoperability, layer Google's OKF on top when you need cross-team bundles.

The wiki is just markdown in git. Version history, branching, and collaboration come free.



Pattern and operations cited from Karpathy's LLM Wiki gist and gist comment ecosystem as of June 14, 2026.
