Why LLM Wiki? 🧠 Future Of Knowledge For Agentic AI & Humans
Decision Card
Effort: Weekend project. Install Obsidian (free), clip a dozen sources into a vault with the Web Clipper, and point an AI agent at a wiki/ folder with a “read sources → write interlinked markdown pages” prompt. A first useful version takes a few hours; the payoff only shows after weeks of consistent note-taking.
Honest take: The video conflates three distinct things under one “knowledge graph” banner: Obsidian’s auto-generated link graph, formal knowledge-graph RAG (Microsoft’s GraphRAG), and Karpathy’s LLM Wiki. Only the last is what the title actually promises, and the “agent that builds and maintains it automatically” is asserted as already working but never demonstrated on screen. It’s also a soft sell: the practical setup (the part you’d actually need) is deferred to a hypothetical future video.
Concrete next steps:
- Read Karpathy’s original LLM Wiki gist (~10 min) to get the pattern from the source, not the summary: https://gist.github.com/karpathy/442a6bf555914893e9891c11519de94f
- Build a throwaway 10-note vault in Obsidian and write with [[wiki-links]] deliberately for a week (~1 hr setup + daily habit) to feel whether the graph actually surfaces connections for you.
- Skip if you don’t already take notes regularly: the video’s own thesis is that the graph is a byproduct of consistent note-taking, so there’s nothing to automate over an empty vault.
TL;DR
A former IP lawyer explains knowledge graphs (nodes, edges, triples) using Google, Wikipedia, and Obsidian, then pitches Andrej Karpathy’s “LLM Wiki”: a separate AI-maintained vault of interlinked markdown that gives all your AI tools a shared, persistent brain instead of siloed per-tool memory. It’s a conceptual overview grounded in real sources (GraphRAG, Karpathy’s gist) but stops short of showing the actual automated build.
Key Points
- A knowledge graph reduces to three primitives: a node (a thing/concept), an edge (a named relationship), and a triple (subject-relationship-object) 01:07
- Google’s search side-panel and Wikipedia’s inter-article links are real-world knowledge graphs you already use daily 01:50
- In Obsidian, wrapping text in [[double brackets]] auto-creates a node and edge, so the graph emerges from note-taking rather than being deliberately drawn 03:36
- The author keeps two separate vaults, a “human vault” for his own thinking and an “LLM vault” for AI-generated knowledge, to track provenance (what came from him vs. AI) 09:38
- Standard RAG (retrieval augmented generation) converts notes to vectors and retrieves similar chunks; it works for “what is X” but fails when answers live between documents 07:18
- Graph RAG follows relationships between sources instead of retrieving thousands of chunks, and the author claims it “significantly outperforms RAG” on large complex datasets 07:58
- Each AI tool’s built-in memory is siloed and “fails when you’re switching between tools”; an LLM Wiki is the shared structure that fixes this 08:21
- The LLM Wiki concept gained traction via an article by Andrej Karpathy (ex-OpenAI, coiner of “vibe coding”) 08:48
- The LLM Wiki has three layers: untouched raw sources, the AI-compiled interlinked wiki, and periodic AI maintenance that checks for contradictions, outdated info, and orphan pages 09:56
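The link-and-orphan mechanics in the points above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Obsidian's parser or Karpathy's tooling: the `links_to` relation name, the sample notes, and the helper names `extract_triples` and `find_orphans` are all hypothetical, and the regex only handles plain `[[Target]]`, `[[Target|alias]]`, and `[[Target#heading]]` links.

```python
import re

# Matches [[Target]], [[Target|alias]] and [[Target#heading]]; captures only Target.
WIKI_LINK = re.compile(r"\[\[([^\]|#]+)(?:[#|][^\]]*)?\]\]")


def extract_triples(notes):
    """Turn every wiki-link into a (subject, relationship, object) triple."""
    triples = []
    for title, body in notes.items():
        for match in WIKI_LINK.finditer(body):
            triples.append((title, "links_to", match.group(1).strip()))
    return triples


def find_orphans(notes, triples):
    """Return notes that neither link out nor are linked to."""
    connected = set()
    for subject, _, obj in triples:
        connected.update((subject, obj))
    return sorted(set(notes) - connected)


# Hypothetical four-note vault: note title -> markdown body.
notes = {
    "GraphRAG": "Builds on [[RAG]] by following a [[Knowledge graph|graph]].",
    "RAG": "Retrieves chunks by vector similarity.",
    "Knowledge graph": "Made of nodes, edges and triples.",
    "Shopping list": "Milk, eggs.",
}

triples = extract_triples(notes)
orphans = find_orphans(notes, triples)
print(triples)
print(orphans)
```

A maintenance pass of the kind the third layer describes could start from a report like `orphans`, though detecting contradictions or outdated information would need far more than link counting.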
Notable Quotes
“A triple is the atom of a knowledge graph: subject, relationship, and object. That’s the entire model: two things and one connector.” 01:20
“I didn’t try to build the graph. I just wrote about the relationship between different concepts. The knowledge graph is just what happens when you’re specific about how you take notes.” 04:27
“The knowledge isn’t lost, it’s just trapped in the silos of each tool.” 08:33
Verified Claims
Karpathy described the LLM Wiki as an LLM incrementally building/maintaining a persistent, interlinked collection of markdown files that sits between you and raw sources 08:57
- Sources: Karpathy’s llm-wiki GitHub gist
- Verdict: Confirmed. The quoted framing in the video matches the gist.
Andrej Karpathy coined the term “vibe coding” 08:55
- Sources: Vibe coding (Wikipedia), Simon Willison on vibe coding
- Verdict: Confirmed. The term was coined by Karpathy in a Feb 2025 post on X.
Karpathy is formerly from OpenAI 08:53
- Sources: Vibe coding (Wikipedia)
- Verdict: Confirmed. Karpathy is a co-founder of OpenAI and former Director of AI at Tesla.
On larger, complex datasets, graph RAG significantly outperforms naive RAG 07:58
- Sources: Microsoft Research: GraphRAG blog, GraphRAG project site
- Verdict: Confirmed. Microsoft reports 70-80% win rates over naive RAG on comprehensiveness/diversity for complex, multi-document tasks. (Caveat: the video presents this as universal; the advantage is dataset- and query-dependent, and GraphRAG is more expensive to build.)
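To make "following relationships between sources" concrete, here is a toy breadth-first search over hand-written triples. It is only a sketch of the general idea: it is not Microsoft's GraphRAG pipeline (which builds its graph with an LLM and does considerably more), and the sample triples and the `find_path` helper are hypothetical.

```python
from collections import deque


def find_path(triples, start, goal):
    """Breadth-first search; returns the chain of triples linking start to goal, or None."""
    neighbours = {}
    for subject, relation, obj in triples:
        neighbours.setdefault(subject, []).append((relation, obj))

    queue = deque([(start, [])])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for relation, nxt in neighbours.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [(node, relation, nxt)]))
    return None


# Hypothetical triples, as if extracted from three separate notes.
triples = [
    ("Karpathy", "wrote", "LLM Wiki gist"),
    ("LLM Wiki gist", "describes", "interlinked markdown"),
    ("Obsidian", "renders", "interlinked markdown"),
]

path = find_path(triples, "Karpathy", "interlinked markdown")
print(path)
```

The answer here spans two triples that could live in different notes, which is the "between documents" case where chunk-similarity retrieval tends to struggle.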
RAG converts notes into numbers (vectors) and retrieves the chunks most similar to your question 07:22
- Sources: IBM: What is GraphRAG?
- Verdict: Confirmed. This is an accurate description of standard embedding-based retrieval.
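The retrieve-by-similarity step can be illustrated with a deliberately crude stand-in for embeddings. In this sketch the "vector" is just a bag-of-words count (real systems use a learned embedding model), and the function names and sample chunks are hypothetical.

```python
import math
from collections import Counter


def embed(text):
    """Toy 'embedding': word counts. A real system would call an embedding model."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[word] * b[word] for word in a if word in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, chunks, k=1):
    """Return the k chunks most similar to the question."""
    q = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


chunks = [
    "a triple is a subject a relationship and an object",
    "obsidian stores notes as markdown files",
    "graph rag follows relationships between sources",
]

top = retrieve("what is a triple", chunks)
print(top)
```

Each chunk is scored on its own, so nothing in this loop can combine facts that sit in two different chunks.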
Google and Wikipedia function as knowledge graphs 01:55
- Sources: Google Knowledge Graph (Google blog)
- Verdict: Confirmed. Google’s Knowledge Graph powers the side-panel entity cards the video shows.
Tools, Papers & Standards Mentioned
- Obsidian (markdown note app with graph view and [[wiki-links]]): https://obsidian.md ; Web Clipper: https://obsidian.md/clipper
- Karpathy’s LLM Wiki gist (canonical source for the concept): https://gist.github.com/karpathy/442a6bf555914893e9891c11519de94f
- Microsoft GraphRAG (open-source graph-based RAG): https://microsoft.github.io/graphrag/
- Google Knowledge Graph: https://blog.google/products/search/introducing-knowledge-graph-things-not/
- RAG (Retrieval Augmented Generation): see IBM overview
- Hungry Minds / “Rebuilding Civilization” (sponsor, Kickstarter book): no official canonical doc verified; mentioned as paid sponsorship 04:50
Follow-up Questions
- What does the “agent that automatically builds and maintains” the LLM Wiki actually look like in practice: what prompts, MCP servers, or scripts drive the read-extract-integrate loop, and how reliable is the contradiction/orphan detection?
- For a personal-scale vault (hundreds to low thousands of notes), does Karpathy’s compile-once LLM Wiki genuinely outperform just feeding raw markdown into a long-context model, or is the GraphRAG advantage only material at corpus sizes most individuals never reach?
- How do you prevent the AI-maintained vault from drifting: accumulating subtly wrong syntheses or hallucinated cross-links that then poison every downstream AI tool that trusts the shared brain?
Sources
- https://gist.github.com/karpathy/442a6bf555914893e9891c11519de94f
- https://en.wikipedia.org/wiki/Vibe_coding
- https://simonwillison.net/2025/Mar/19/vibe-coding/
- https://www.microsoft.com/en-us/research/blog/graphrag-unlocking-llm-discovery-on-narrative-private-data/
- https://microsoft.github.io/graphrag/
- https://www.ibm.com/think/topics/graphrag
- https://blog.google/products/search/introducing-knowledge-graph-things-not/
- https://obsidian.md
- https://obsidian.md/clipper