How I Implemented an LLM Wiki in Obsidian

In the previous article, I explained why I moved away from RAG in Obsidian. This is the practical part, covering how to reproduce the same setup yourself: which skill to install, which folders to create, how to run ingest, how to ask questions, and how to avoid turning the wiki into mush after the third source.

I use Astro-Han/karpathy-llm-wiki. It is an Agent Skills-compatible skill for Claude Code, Cursor, Codex, and other tools that support the Agent Skills format. It implements the LLM wiki idea: sources go into raw/, the agent compiles durable markdown pages into wiki/, answers questions with citations, and periodically checks the wiki through lint.

Below I will use a fictional example: Atlas Wiki. Imagine it is a personal wiki for researching urban spaces. The name does not matter. What matters is that the topic is narrow and long-lived.

Install the skill

Installing the skill looks like this:

npx add-skill Astro-Han/karpathy-llm-wiki

For Claude Code, Cursor, and OpenCode, this is the main path. For Codex CLI, the project's README mentions the manual option: copy the skill into .agents/skills/karpathy-llm-wiki/. For other agents, the principle is the same: the agent's skills directory needs SKILL.md and the references/ folder.

After installation, the agent gets three operations:

  • Ingest — fetch a source, save it in raw/, and compile knowledge into wiki/.
  • Query — read wiki/index.md, open relevant pages, and answer with links.
  • Lint — check wiki health: index entries, links, raw references, orphan pages, and possible contradictions.

This is not an app and not an Obsidian plugin. It is an instruction set for a coding agent. All the work happens through normal files in your vault.

Create a project folder

I do not create one global LLM wiki for the whole Obsidian vault. That is a bad idea: too many topics, too much noise, and a context that gets too expensive. Instead, I create a separate wiki for a specific domain.

For example:

Wiki/Atlas/

You do not have to create the initial structure by hand. According to the skill spec, it creates raw/, wiki/, wiki/index.md, and wiki/log.md during the first ingest if they do not exist yet. But I prefer knowing what the result should look like:

Wiki/Atlas/
├── raw/
│   ├── articles/
│   ├── books/
│   ├── notes/
│   └── interviews/
└── wiki/
    ├── index.md
    ├── log.md
    ├── concepts/
    ├── places/
    └── people/

raw/ is immutable source material. It holds articles, PDFs, notes, transcripts, exported threads, documentation snippets, and anything else you want to compile. After ingest, avoid editing or renaming sources. If you need a new version, add a new file.

wiki/ contains compiled articles. The agent maintains them: creates new pages, updates old ones, adds links, and writes the index.

wiki/index.md is the map of the wiki. The agent starts queries from it.

wiki/log.md is the operation log. Ingest, archived query, and lint events go there.
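If you would rather create the skeleton up front than wait for the first ingest, a few lines of Python are enough. This is a minimal sketch, not part of the skill: `scaffold` is a hypothetical helper, the subfolder names are copied from the tree above, and the `# Log` heading for log.md is my assumption about a reasonable starting line.

```python
from pathlib import Path

# Subfolder names follow the example tree; change them to fit your own domain.
RAW_DIRS = ["articles", "books", "notes", "interviews"]
WIKI_DIRS = ["concepts", "places", "people"]


def scaffold(root: Path) -> None:
    """Create the raw/ and wiki/ skeleton under root without overwriting anything."""
    for name in RAW_DIRS:
        (root / "raw" / name).mkdir(parents=True, exist_ok=True)
    for name in WIKI_DIRS:
        (root / "wiki" / name).mkdir(parents=True, exist_ok=True)
    # Seed index.md and log.md only when they are missing.
    # The "# Log" heading is an assumption, not taken from the skill spec.
    seeds = (("index.md", "# Knowledge Base Index"), ("log.md", "# Log"))
    for filename, heading in seeds:
        path = root / "wiki" / filename
        if not path.exists():
            path.write_text(heading + "\n", encoding="utf-8")
```

Calling `scaffold(Path("Wiki/Atlas"))` from the vault root produces the layout above. Running it again is harmless because existing files are left alone.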

Add local rules

The skill itself is generic. But generic is not enough. For the wiki to work well, the agent needs to know what a good page means for this specific topic.

I add a local instruction file at the root:

Wiki/Atlas/CLAUDE.md

Minimal example:

# Atlas Wiki

This is a topic-specific LLM wiki about urban spaces, city observation, architecture, walking routes, and visual research.

## Scope

Use this wiki only for sources related to urban spaces, photography walks, city patterns, architecture, public transport, signage, and local history.

Ignore unrelated personal notes, work notes, and generic productivity material.

## Wiki Structure

- `wiki/concepts/` — reusable concepts, methods, patterns, and recurring observations.
- `wiki/places/` — city districts, streets, buildings, and locations.
- `wiki/people/` — photographers, architects, authors, and researchers.

## Writing Rules

- Write wiki pages in English.
- Use standard Markdown.
- Use relative Markdown links inside `wiki/`.
- Prefer durable concepts over source-by-source summaries.
- Keep sources traceable through the `Raw` field.
- If sources contradict each other, mark the conflict explicitly.

## Query Rules

- Always start from `wiki/index.md`.
- Prefer wiki pages over model memory.
- Answer in English.
- Cite wiki pages with Markdown links.

Yes, this looks like paperwork. But without a local instruction, the agent guesses the structure. An LLM guessing the structure of a knowledge base is a fast path to entropy.

Run the first ingest

You can give the agent a URL, a file path, or pasted text. The skill should save the source into raw/, then compile it into wiki/.

Example with a URL:

Ingest this article into Atlas Wiki:
https://example.com/how-cities-use-signage

Example with a local file:

Ingest file `Wiki/Atlas/raw/articles/2026-04-30-city-signage.md` into Atlas Wiki.

During ingest, the skill performs two separate operations:

  1. Fetch into raw/ — save the source as a markdown file.
  2. Compile into wiki/ — create or update compiled knowledge pages.

The raw file template in the skill looks roughly like this:

# Source Title

> Source: URL or origin description
> Collected: 2026-04-30
> Published: 2026-04-25

Original content below.

The point of raw/ is to keep the source close to the original. Do not rewrite the author's opinion. Do not turn the article into a summary. Clean up formatting noise and preserve the text that lets you trace where the knowledge came from.
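When I paste text by hand instead of letting the agent fetch it, I want the file to carry the same header. The sketch below renders that header in Python. `render_raw` and `raw_filename` are hypothetical helpers, the date-plus-slug filename mirrors the example paths in this article, and dropping the Published line when the date is unknown is my choice rather than something the skill specifies.

```python
import re
from datetime import date
from typing import Optional


def raw_filename(collected: date, title: str) -> str:
    """Build a 'YYYY-MM-DD-slug.md' name from the collection date and the title."""
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    return f"{collected.isoformat()}-{slug}.md"


def render_raw(title: str, source: str, collected: date,
               published: Optional[date], body: str) -> str:
    """Wrap source text in the Source / Collected / Published header."""
    lines = [
        f"# {title}",
        "",
        f"> Source: {source}",
        f"> Collected: {collected.isoformat()}",
    ]
    # Assumption: omit the Published line when the date is unknown.
    if published is not None:
        lines.append(f"> Published: {published.isoformat()}")
    # The body is kept as-is apart from trimming surrounding whitespace.
    lines += ["", body.strip(), ""]
    return "\n".join(lines)
```

The body goes in untouched on purpose: the helper only adds provenance and never summarizes.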

What a compiled article looks like

After ingest, the agent creates or updates pages in wiki/. Important detail: the file is named after the concept, not the source.

For example, the raw file 2026-04-30-how-cities-use-signage.md may produce:

wiki/concepts/wayfinding-signage.md

The page follows the skill's template:

# Wayfinding Signage

> Sources: Example Magazine, 2026-04-25
> Raw: [How Cities Use Signage](../../raw/articles/2026-04-30-how-cities-use-signage.md)

## Overview

Wayfinding signage is the layer of visual navigation that helps people understand where they are, where they can go, and how to move through a city without asking for help.

## Key Principles

...

## See Also

- [Public Space Legibility](public-space-legibility.md)
- [Transit Maps](../places/transit-maps.md)

This is where the difference from RAG becomes concrete. A vector database would store chunks of the source article. An LLM wiki stores processed knowledge: a concept, relationships, sources, conflicts, and links to neighboring pages.

One source can update several pages. An article about city signage may touch wayfinding-signage, public-space-legibility, street-furniture, and a page about a specific district. That is expected. The wiki should accumulate connections, not put every source into a separate box.
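Because every page leans on relative links (the Raw field and See Also), it is worth checking that they resolve after an ingest. Here is a rough sketch of such a check. `broken_links` is a hypothetical helper, it only understands plain `[label](target)` links, and it treats anything containing `://` as external.

```python
import re
from pathlib import Path

# Matches [label](target) where the target has no spaces.
LINK_RE = re.compile(r"\[([^\]]+)\]\(([^)\s]+)\)")


def broken_links(page: Path) -> list[str]:
    """Return relative link targets in a page that do not resolve to an existing file."""
    missing = []
    text = page.read_text(encoding="utf-8")
    for _label, target in LINK_RE.findall(text):
        if "://" in target or target.startswith("#"):
            continue  # skip external URLs and in-page anchors
        path_part = target.split("#", 1)[0]
        # Relative links are resolved against the folder that holds the page.
        if not (page.parent / path_part).resolve().exists():
            missing.append(target)
    return missing
```

For the example page, this would flag `public-space-legibility.md` until that page actually exists, while `../../raw/articles/2026-04-30-how-cities-use-signage.md` passes as long as the raw file is in place.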

Check index.md and log.md

After ingest, I always check two files.

wiki/index.md should get a new or updated row:

# Knowledge Base Index

## Concepts

Concepts and recurring patterns in urban research.

| Article | Summary | Updated |
|---------|---------|---------|
| [Wayfinding Signage](concepts/wayfinding-signage.md) | Visual navigation systems that help people move through the city. | 2026-04-30 |

wiki/log.md should get an operation entry:

## [2026-04-30] ingest | How Cities Use Signage
- Updated: wiki/concepts/wayfinding-signage.md
- Updated: wiki/concepts/public-space-legibility.md

If either trace is missing, the ingest is incomplete. Without the index, the agent has nothing reliable to read during query. Without the log, you will not understand where changes came from a week later.
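This check is mechanical enough to script. The sketch below assumes the index row and log heading look like the two examples above. `ingest_traces` is a hypothetical helper, and the `## [date] ingest` prefix is taken from my example entry, so adjust it if your log uses a different shape.

```python
from pathlib import Path


def ingest_traces(wiki: Path, page: str, day: str) -> dict[str, bool]:
    """Check that index.md links to the page and log.md has an ingest entry for the day.

    page is a path relative to wiki/, for example "concepts/wayfinding-signage.md".
    day is an ISO date string such as "2026-04-30".
    """
    index_text = (wiki / "index.md").read_text(encoding="utf-8")
    log_text = (wiki / "log.md").read_text(encoding="utf-8")
    return {
        # The index links to pages with paths relative to wiki/.
        "index": f"]({page})" in index_text,
        # Assumed log heading format: "## [YYYY-MM-DD] ingest | Title".
        "log": any(
            line.startswith(f"## [{day}] ingest")
            for line in log_text.splitlines()
        ),
    }
```

A result with `False` in either slot is a signal to go back to the agent before touching the next source.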

Ask questions against the wiki

A query should explicitly target a specific wiki. Not:

What do I know about navigation?

but:

What do I know about wayfinding signage in Atlas Wiki?

A properly behaving agent should:

  1. Read wiki/index.md.
  2. Find relevant pages.
  3. Open those pages.
  4. Synthesize an answer.
  5. Link to wiki pages.
  6. Avoid changing files unless explicitly asked.

The last point matters. Query should not rewrite the knowledge base. It is a read operation. If the answer is useful, save it separately:

Save this answer as an archived query in Atlas Wiki.

Then the skill creates a separate archive page, updates index.md, and adds an entry to log.md.

Run lint

Once the wiki grows past twenty or thirty pages, it starts to drift a little. That is normal. This is what lint is for:

Lint Atlas Wiki.

According to the skill spec, auto-fix is only for deterministic issues:

  • a file exists in wiki/, but is missing from index.md;
  • an index entry points to a missing file;
  • a markdown link is broken, but the correct target can be found unambiguously;
  • a Raw link points to the wrong place;
  • See Also contains clearly broken links.

But ambiguous findings should only be reported:

  • contradictions between pages;
  • stale claims;
  • missing conflict annotations;
  • orphan pages;
  • concepts that are mentioned often but do not have their own page.

That split is correct. The agent can fix mechanics. Meaning-level decisions should not run on autopilot.
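The first two deterministic checks are simple set arithmetic between the files on disk and the links in index.md. The following is a sketch of that idea, not the skill's own lint: `index_drift` is a hypothetical helper, and it assumes index links are written relative to wiki/ without a leading `./`.

```python
import re
from pathlib import Path

# Captures link targets that end in .md, e.g. ](concepts/wayfinding-signage.md)
LINK_RE = re.compile(r"\]\(([^)\s]+\.md)\)")
# index.md and log.md are bookkeeping files, not articles.
SKIP = {"index.md", "log.md"}


def index_drift(wiki: Path) -> tuple[set[str], set[str]]:
    """Return (pages missing from index.md, index entries whose file is missing)."""
    on_disk = {p.relative_to(wiki).as_posix() for p in wiki.rglob("*.md")} - SKIP
    index_text = (wiki / "index.md").read_text(encoding="utf-8")
    indexed = {t for t in LINK_RE.findall(index_text) if "://" not in t}
    return on_disk - indexed, indexed - on_disk
```

Both sets are safe to act on mechanically. Contradictions and stale claims have no equivalent one-liner, which is why they stay in the report-only group.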

What not to automate

The biggest mistake is putting a file watcher on raw/ and ingesting automatically every time a new file appears. It sounds convenient. It is a bad idea.

Ingest is not format conversion. It is editorial work. The agent chooses which concepts to create, which pages to update, which links to add, and where to mark source conflicts. If this runs in the background, the knowledge base starts drifting. You only notice the problem a month later, when half the pages are written with weird assumptions.

My working loop is:

  1. Pick one source.
  2. Run ingest.
  3. Review the diff.
  4. Check index.md and log.md.
  5. Only then mark the source as processed.

Yes, it is slower, but it keeps the system manageable.
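One possible way to see which sources are still pending is to list raw files that no wiki page links to. This sketch assumes a source counts as processed once some page references it in a link under raw/, which is a convention I am introducing for illustration, not a rule from the skill. `unreferenced_sources` is a hypothetical helper.

```python
import re
from pathlib import Path

# Captures link targets that pass through a raw/ folder, e.g. ](../../raw/articles/x.md)
RAW_LINK_RE = re.compile(r"\]\(([^)\s]*raw/[^)\s]+)\)")


def unreferenced_sources(root: Path) -> list[str]:
    """List files under raw/ that no wiki page links to, as paths relative to root."""
    referenced = set()
    for page in (root / "wiki").rglob("*.md"):
        text = page.read_text(encoding="utf-8")
        for target in RAW_LINK_RE.findall(text):
            # Resolve each link against the folder of the page that contains it.
            referenced.add((page.parent / target).resolve())
    sources = [p for p in (root / "raw").rglob("*") if p.is_file()]
    return sorted(
        p.relative_to(root).as_posix()
        for p in sources
        if p.resolve() not in referenced
    )
```

It only reads files, so it fits the manual loop: it tells you what to pick next and leaves the ingest itself to a reviewed run.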

Minimal recipe

Condensed version:

1. Install the skill:
   npx add-skill Astro-Han/karpathy-llm-wiki

2. Create a folder:
   Wiki/Atlas/

3. Add local instructions:
   Wiki/Atlas/CLAUDE.md

4. Run the first ingest:
   Ingest this article into Atlas Wiki: <URL>

5. Check:
   wiki/index.md
   wiki/log.md
   wiki/<topic>/<article>.md

6. Ask questions:
   What do I know about X in Atlas Wiki?

7. Clean up periodically:
   Lint Atlas Wiki.

That is the whole system. No vector database, no separate server, no embeddings pipeline, and no index sync between machines. Just markdown, an agent skill, and discipline around ingest.

For a personal knowledge base, this turned out to be much more practical than RAG. Not because RAG is bad in general, but because my task is different. I do not need to retrieve random chunks from a huge corpus as fast as possible. I need knowledge around a narrow topic to accumulate, connect, and stay readable without the agent.