Karpathy's LLM Wiki: What It Means & How to Build One
Decision Card
Effort: A weekend afternoon — in Claude Code (or any agentic CLI) with an Opus-class model, make raw/ and wiki/ folders, write a CLAUDE.md schema, drop in a few PDFs/transcripts, and tell the agent to ingest; the video estimates ~1 hour for a basic version.
Honest take: The “build” is really just prompting an agent in plan mode. The heavy lifting is the schema design, which the video shows for ~2 seconds and never scrutinizes. It also quietly ignores cost: if one source “touches 10–15 wiki pages” and you ingest hundreds of articles at Opus pricing ($5/$25 per Mtok for input/output), maintenance is “near zero” in human effort but not in tokens. And the demo wiki (8 transcripts) is far too small to prove that the lint/maintenance loop actually keeps a 100-article wiki coherent, which is the whole point Karpathy is selling.
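To make the cost concern concrete, here is a back-of-envelope sketch in Python. It uses the $5/$25 per Mtok prices quoted above, read as input/output. Every token count (source size, tokens read and written per touched page) is an assumption chosen for illustration, and the function name `ingest_cost_usd` is hypothetical.

```python
# Back-of-envelope ingest cost. Prices are the Opus figures quoted above,
# read as $5 input / $25 output per million tokens. Every token count
# below is an ASSUMPTION for illustration, not a measurement.

INPUT_PRICE_PER_MTOK = 5.00
OUTPUT_PRICE_PER_MTOK = 25.00


def ingest_cost_usd(
    sources: int,
    pages_touched: int = 12,           # within the "10-15 pages" claim
    source_tokens: int = 8_000,        # assumed size of one raw source
    page_tokens_read: int = 1_500,     # assumed tokens read per touched page
    page_tokens_written: int = 1_000,  # assumed tokens rewritten per touched page
) -> float:
    """Estimate the dollar cost of ingesting `sources` documents."""
    input_tokens = sources * (source_tokens + pages_touched * page_tokens_read)
    output_tokens = sources * pages_touched * page_tokens_written
    return (
        input_tokens / 1_000_000 * INPUT_PRICE_PER_MTOK
        + output_tokens / 1_000_000 * OUTPUT_PRICE_PER_MTOK
    )


if __name__ == "__main__":
    for n in (8, 100, 500):
        print(f"{n:>4} sources: ${ingest_cost_usd(n):,.2f}")
```

Under these assumptions one ingest costs about $0.43, so the total scales linearly with the number of sources. The sketch leaves out agent overhead (system prompt, tool calls, planning turns), periodic lint passes, and any prompt-caching discounts, so treat it as a rough order-of-magnitude estimate only.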
Concrete next steps:
- Read Karpathy’s source gist before copying a YouTuber’s version — llm-wiki.md (~15 min).
- Scaffold a tiny wiki: raw/, wiki/, a CLAUDE.md schema with Obsidian [[wiki-links]], then ingest 3–5 sources and run a “lint” pass to see if cross-references hold (~1–2 hrs).
- Skip if you only ask one-off questions about a handful of docs: plain RAG / file-upload in ChatGPT or NotebookLM is less setup, and the “compounding” benefit only pays off over weeks of accumulation.
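The scaffold step in the list above could look something like the following Python sketch. The schema text, the `wiki/index.md` and `wiki/log.md` file names, and the `scaffold` function are all assumptions for illustration; they are not the contents of Karpathy's gist or of the CLAUDE.md shown in the video.

```python
from pathlib import Path

# Hypothetical starter schema. The section names and rules are assumptions,
# not the contents of Karpathy's gist or the video's CLAUDE.md.
SCHEMA_TEMPLATE = """\
# Wiki schema

## Layout
- raw/  : immutable sources. Read, never modify.
- wiki/ : LLM-maintained markdown pages, linked with [[wiki-links]].

## Operations
- ingest: summarize a new file in raw/, update related pages, index, and log.
- query: answer from wiki/ and offer to file the answer as a new page.
- lint: report contradictions, stale claims, and orphan pages.
"""


def scaffold(root: Path) -> list[Path]:
    """Create the folders and starter files, leaving existing files untouched."""
    created = []
    for folder in ("raw", "wiki"):
        (root / folder).mkdir(parents=True, exist_ok=True)
    # index.md and log.md are assumed names for the index and log pages.
    starters = {
        root / "CLAUDE.md": SCHEMA_TEMPLATE,
        root / "wiki" / "index.md": "# Index\n",
        root / "wiki" / "log.md": "# Log\n",
    }
    for path, text in starters.items():
        if not path.exists():
            path.write_text(text, encoding="utf-8")
            created.append(path)
    return created


if __name__ == "__main__":
    for path in scaffold(Path("my-wiki")):
        print("created", path)
```

Running it twice is safe because existing files are never overwritten, which matters once you and the agent have started editing the schema together.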
TL;DR
The video walks through Andrej Karpathy’s “LLM wiki” pattern: instead of RAG retrieving chunks at query time, an LLM incrementally builds and maintains a persistent, interlinked set of markdown files (raw sources → wiki → schema) that compounds over time. The creator then live-builds one for trading strategies in Claude Code with Opus 4.6, demonstrating ingest, query, and web-search backfill.
Key Points
- The whole idea originates from a Karpathy tweet/gist on using LLMs to build personal knowledge bases as a research tool 00:02
- The problem with RAG: nothing accumulates — the LLM rediscovers and re-pieces knowledge from scratch on every query, with no memory or cross-references 01:13
- The wiki pattern flips this — synthesis, cross-references, and flagged contradictions are built up front and persist 01:35
- Three-layer architecture: immutable raw sources → an LLM-owned markdown wiki → a schema/config file (like a CLAUDE.md) that you and the LLM co-evolve 02:34
- Analogy: the wiki is a codebase, Obsidian is the IDE, the LLM is the programmer, the schema is the style guide 03:36
- Three core operations: ingest (drop a source, LLM summarizes + cross-links), query (answers can be filed back as new pages), and lint (health-check for contradictions, stale claims, orphan pages) 03:52
- Ingesting one source is a 7-step pass — read, extract, write summary, update entity/concept pages, flag contradictions, update index, append to log 04:48
- Why it works where human wikis fail: LLMs don’t get bored or forget cross-references, so maintenance cost drops to near zero 06:05
- Four principles: it’s explicit (visible, no opaque embeddings), it’s yours (local files), file-over-app (universal markdown), and bring-your-own-AI (Claude, GPT, Codex, even fine-tuning) 06:34
- The live build uses Claude Code with Opus 4.6 in plan mode; when asked something outside the wiki, it web-searches and backfills new pages (e.g., order blocks, breaker blocks) automatically 13:57
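The mechanical half of the lint operation described above (orphan pages and broken cross-references) can be checked without an LLM. The Python sketch below is one possible approach, not the video's or the gist's implementation. It assumes every page is a `.md` file whose name matches its link target, and that `index` and `log` are bookkeeping pages exempt from the orphan check. The name `lint_links` is hypothetical.

```python
import re
from pathlib import Path

# Matches [[Target]], [[Target|alias]], and [[Target#heading]]; captures Target.
WIKI_LINK = re.compile(r"\[\[([^\]|#]+)(?:[#|][^\]]*)?\]\]")


def lint_links(wiki_dir: Path) -> dict[str, list[str]]:
    """Report broken [[wiki-links]] and orphan pages under wiki_dir."""
    pages = {p.stem: p for p in wiki_dir.rglob("*.md")}
    inbound = {name: 0 for name in pages}
    broken = []
    for name, path in pages.items():
        for match in WIKI_LINK.finditer(path.read_text(encoding="utf-8")):
            target = match.group(1).strip()
            if target in pages:
                if target != name:  # self-links do not count as inbound
                    inbound[target] += 1
            else:
                broken.append(f"{name} -> {target}")
    # index and log are assumed bookkeeping pages, so they are exempt.
    exempt = {"index", "log"}
    orphans = sorted(
        name for name, count in inbound.items()
        if count == 0 and name not in exempt
    )
    return {"broken": sorted(broken), "orphans": orphans}


if __name__ == "__main__":
    report = lint_links(Path("wiki"))
    print("broken links:", report["broken"])
    print("orphan pages:", report["orphans"])
```

This is a simplification: it matches link targets case-sensitively by file stem and ignores folder-qualified links. Contradictions and stale claims still need the agent to read and compare the pages themselves.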
Notable Quotes
“Nothing accumulates. Every time you ask a question, the LLM is rediscovering knowledge from scratch. It’s re-piecing together fragments every single time.” 01:15
“And the critical thing is you never write the wiki yourself, the LM writes and maintains all of it. You’re in charge of the important stuff, finding the good sources, exploring, asking the right questions.” 02:11
“Humans abandon wikis because the maintenance burden grows faster than the value… But nicely, the LMs don’t get bored. They don’t forget to update a cross-reference. They can touch 15 files in a single pass.” 06:05
Verified Claims
- The pattern originates from an Andrej Karpathy post about an LLM-maintained persistent wiki of interlinked markdown files. 00:02 — Confirmed by Karpathy’s own gist llm-wiki.md and coverage in Denser.ai and DataScienceDojo. Confirmed.
- The architecture is three layers — raw sources (immutable), an LLM-owned wiki, and a schema/config. 02:34 — Matches Karpathy’s gist, which states the LLM “reads from them but never modifies them. This is your source of truth,” and the MindStudio writeup. Confirmed.
- The three core operations are ingest, query, and lint. 03:52 — These exact operation names and definitions appear in Karpathy’s gist. Confirmed.
- “File over app” is a real, named philosophy underpinning the markdown-first design. 07:05 — Coined by Obsidian CEO Steph Ango; see stephango.com (referenced via rishikeshs.com summary). Confirmed.
- Claude Opus 4.6 exists and is what the demo uses. 08:46 — Anthropic released Opus 4.6 on Feb 5, 2026, with agent teams and a 1M-token context. Confirmed. (Note: a newer Opus 4.8 has since shipped.)
- RAG retrieves chunks at query time and re-synthesizes each time, with nothing persisting between queries. 01:13 — This is an accurate description of standard retrieval-augmented generation; the “knowledge doesn’t compound” framing is corroborated by Denser.ai’s RAG-vs-LLM-wiki comparison. Confirmed (the limitation is real, though hybrid RAG systems with persistent memory do exist — so it’s a fair but somewhat simplified contrast).
- A real Karpathy wiki reportedly grew to ~100 articles / ~400,000 words with no human writing. (Implied background to 00:11) — This figure appears in secondary coverage (agentpedia) attributed to Karpathy, but I could not confirm it from a primary source. Inconclusive.
Tools, Papers & Standards Mentioned
- Andrej Karpathy — llm-wiki.md gist — the canonical source for the pattern
- Claude Code — the agentic CLI used in the build
- Claude Opus 4.6 — the model used (now superseded by Opus 4.8)
- Obsidian — markdown editor used as the wiki “IDE,” with [[wiki-links]]
- “File over app” — Steph Ango — the design philosophy cited
- RAG (Retrieval-Augmented Generation) — the baseline being contrasted; see the original Lewis et al. paper
- Existing community re-implementations: toolboxmd/karpathy-wiki (Claude Code skills for the pattern)
- Codex, “open claw” (OpenClaw), and Hermes agent — mentioned as alternative agents 15:12
Follow-up Questions
- What does the schema (CLAUDE.md) actually need to contain to keep a wiki coherent past ~100 pages, and what are the failure modes when lint runs on a large, contradictory corpus?
- What is the real token/dollar cost of maintaining a compounding wiki over months, given each ingest “touches 10–15 pages,” and how does it compare to a hosted RAG/memory system?
- Does the “file it back as a new page” loop cause drift or duplication over time (the same concept re-derived slightly differently), and how is deduplication enforced?
Sources
- https://gist.github.com/karpathy/442a6bf555914893e9891c11519de94f
- https://denser.ai/blog/llm-wiki-karpathy-knowledge-base/
- https://datasciencedojo.com/blog/llm-wiki-tutorial/
- https://www.mindstudio.ai/blog/andrej-karpathy-llm-wiki-knowledge-base-claude-code
- https://agentpedia.codes/blog/karpathy-llm-wiki-idea-file
- https://github.com/toolboxmd/karpathy-wiki
- https://stephango.com/file-over-app
- https://rishikeshs.com/file-over-app/
- https://www.anthropic.com/news/claude-opus-4-6
- https://docs.claude.com/en/docs/claude-code/overview
- https://obsidian.md/
- https://arxiv.org/abs/2005.11401