
Knowledge Compilation: Why I Don't Use RAG in Obsidian
For a few months I tried to bolt vector search onto my Obsidian vault. It worked, but it always felt like a crutch. Andrej Karpathy's LLM Wiki method flipped the whole thing upside down — and I ripped out the entire RAG setup.
What RAG is and why it's inconvenient
RAG stands for Retrieval-Augmented Generation. The idea: your notes get chopped into chunks, each chunk becomes a vector — a long list of numbers that describes its meaning. When you ask a question, the question becomes a vector too, and the system finds the chunks that are closest in meaning. Those chunks get fed to the LLM along with your question, and the model answers.
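The retrieval step can be sketched in a few lines. This is a toy illustration with hand-made three-dimensional vectors and invented chunk names; in a real pipeline the vectors come from an embedding model and have hundreds of dimensions:

```python
import math

def cosine(a, b):
    """Cosine similarity: how close two vectors point in the same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm

# Toy "index": chunk text -> its embedding vector (hand-made for illustration).
chunks = {
    "note on stripe payments": [0.9, 0.1, 0.0],
    "meeting about governance": [0.1, 0.8, 0.3],
    "grocery list": [0.0, 0.1, 0.9],
}

def retrieve(query_vec, k=2):
    """Return the k chunks closest in meaning to the query vector."""
    ranked = sorted(chunks, key=lambda c: cosine(query_vec, chunks[c]), reverse=True)
    return ranked[:k]

print(retrieve([0.85, 0.2, 0.05]))
# → ['note on stripe payments', 'meeting about governance']
```

The top-k chunks, and nothing else, are what the model gets to see: everything outside those chunks, including the links between notes, is invisible to it.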
Sounds great. In practice the agent pulls three cosine-closest chunks and pretends it understood something. Context between chunks is lost, and so are the links between notes. If the answer needs to stitch thoughts together from five different places, you're lucky if it pulls it off. If you're unlucky, you get a confident hallucination.
Then there's the infrastructure hassle: a vector database, an embedding model, a pipeline to refresh the index every time a note changes. Another pain point is syncing across machines. I keep my vault on several computers and sync them through Syncthing. Markdown files sync trivially, but a vector database isn't something Syncthing can just pass around; you'd need a third-party service exposed over MCP. Building heavy infrastructure around a personal knowledge base isn't something I want to do.
What knowledge compilation is
Karpathy's idea is simple. You take a folder and drop raw material into it: articles, PDFs, transcripts, exports. The LLM thinks about each source once and writes a coherent wiki page for it — with concepts, relationships, and wikilinks. After that, any question boils down to "read the wiki and answer with links to the pages." It's not vector search — it's knowledge compilation.
The mental model is like working with code:
- Source — raw material in raw/, immutable
- Compiler — the LLM that processes and writes
- Executable — the finished wiki in markdown
- Linting — checks for contradictions, broken links, gaps
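The linting step is the easiest to make concrete. Here is a minimal broken-wikilink check over an in-memory toy wiki; the page names and contents are invented, and a real version would read the pages from disk instead:

```python
import re

# Toy wiki: page name -> markdown body (invented for illustration).
wiki = {
    "Stripe": "We migrated here from [[Medusa]]; see [[Payments ADR]].",
    "Medusa": "Dropped in favour of [[Stripe]].",
    "Payments ADR": "Decision record for [[Stripe]] and [[Governance]].",
}

# Capture the target part of [[Target]] or [[Target|alias]].
WIKILINK = re.compile(r"\[\[([^\]|]+)")

def broken_links(pages):
    """Return (source_page, target) pairs whose target page doesn't exist."""
    return [
        (page, target)
        for page, body in pages.items()
        for target in WIKILINK.findall(body)
        if target not in pages
    ]

print(broken_links(wiki))  # → [('Payments ADR', 'Governance')]
```

Run after every ingest, a check like this keeps the wiki graph honest: a dangling link means the compiler referenced a concept it never wrote a page for.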
The LLM thinks about the source once; after that, it reads the finished pages, not a vector index. The understanding work is done once, at ingest time.
Why this beats RAG for personal knowledge
The main thing I was missing before: Karpathy doesn't build one global wiki. He makes one wiki per topic. And that changes the economics.
A global wiki across a 2,700-note vault is a dead idea — the tokens run out before the agent even starts thinking. But a local wiki of 50–200 pages on one specific topic fits into the context window in one shot and is cheap to read. Vector search stops being necessary — the agent just reads everything and sees the links directly.
Plus the bonuses. Knowledge lives in plain markdown — readable without an LLM and synced across machines through Syncthing with no extra plumbing. The wiki graph plugs into my Obsidian graph. There's no infrastructure to maintain.

My use cases
Right now I have two wikis built this way.
The first is for a work project. Meetings, ADRs, technical documentation, Slack exports. Questions like "why did we drop Medusa for Stripe" or "who's in charge of governance" get answered with links to specific pages, not hallucinations from general context. When a new meeting comes in, I just say "ingest this file" once, and the agent updates the affected concepts and wires up the connections.
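In my setup "ingest this file" is a prompt to an agent, not a script, but the shape of the operation is roughly this. A sketch under loose assumptions: `llm` is a placeholder for whatever model call does the actual compilation, and the directory layout is my convention:

```python
from pathlib import Path

def ingest(source: Path, wiki_dir: Path, llm) -> None:
    """One source at a time: the model reads the new raw file together
    with the current wiki and returns only the pages it needs to create
    or rewrite; every other page stays untouched."""
    current = {p.name: p.read_text() for p in wiki_dir.glob("*.md")}
    updated = llm(source.read_text(), current)  # placeholder model call
    for name, body in updated.items():
        (wiki_dir / name).write_text(body)
```

The important property is the incremental update: a new meeting doesn't trigger a rebuild of the whole wiki, only of the concept pages it actually touches.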
The second is a psychology course of 40 lectures. Slides, summaries, transcripts — a three-layer ingest with a source marker on every statement: [slides], [transcript], [summaries]. When I need to "pull something from the PDFs only," I filter by markers and get exactly what was on the slides, without the LLM layering its own synthesis on top. For study material, where it's critical to tell "what the lecturer said" apart from "what the agent came up with," this really matters.
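Filtering by source marker needs nothing fancy: since every statement ends with its tag, plain line matching is enough. The page text below is invented for illustration:

```python
# A toy wiki page in the three-layer format: one statement per line,
# each ending with its source marker.
page = """\
Attachment styles form in early childhood. [slides]
The lecturer told a story about her own supervisor. [transcript]
Four styles: secure, anxious, avoidant, disorganized. [slides]
The model is descriptive, not diagnostic. [summaries]
"""

def statements(text, marker):
    """Keep only the statements carrying the given source marker."""
    tag = f"[{marker}]"
    return [line.removesuffix(tag).strip()
            for line in text.splitlines() if line.endswith(tag)]

print(statements(page, "slides"))
# → ['Attachment styles form in early childhood.',
#    'Four styles: secure, anxious, avoidant, disorganized.']
```

The same filter answers "what did the lecturer actually say" by passing `"transcript"`, which is the whole point of tagging at ingest time rather than trusting the model to reconstruct provenance later.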
Where this doesn't work
The method isn't universal. If the topic is fuzzy or short-lived, the wiki bloats and the cheap-context advantage disappears. For scattered interests, there's no point. It also takes discipline: sources stay immutable, and ingest happens by hand, one source at a time — otherwise the wiki drifts and you lose track of what's in it.
But for a topic that's narrow and long-lived — a course, a work project, a research direction — knowledge compilation has beaten every RAG attempt for me. And the nicest part: no separate infrastructure at all. It's all regular markdown I can read with my eyes and edit with my hands.