Bloom Vault

A living knowledge vault managed by Claude. Inspired by Karpathy's wiki pattern — raw sources are ingested and compiled into a wiki of interconnected concepts, unresolved questions, and emergent themes.

Technology Stack

Python
Obsidian
Claude Code
Trafilatura
YouTube Transcript API
Mermaid

Key Challenges

  • Designing a 2-source rule to prevent wiki bloat from half-baked ideas
  • Building an automated ingestion pipeline for URLs, PDFs, and YouTube transcripts
  • Creating a graph visualization system for vault connections
  • Managing cross-references and backlink integrity across hundreds of notes

Key Learnings

  • Knowledge management system design
  • Automated content extraction and parsing
  • Graph-based relationship visualization
  • LLM-assisted content synthesis
  • Obsidian vault architecture

Bloom Vault: A Living Knowledge Vault

Overview

Bloom Vault is an autonomous knowledge management system inspired by Andrej Karpathy's LLM wiki pattern. It follows a simple but powerful loop: you drop raw material into an inbox, and Claude progressively distils it into a wiki of interconnected concept articles with dense backlinks.

The vault turns unprocessed documents into fertile ground for research and discovery. Sources stack up as the raw substrate. Concepts harden where patterns repeat. Every question asked enriches the whole.

Core Philosophy

The Contract

You decide what enters. Claude owns everything downstream. That boundary is deliberate — your judgement filters what deserves attention, while Claude handles the synthesis that would exhaust a human at scale.

The 2-Source Rule

A single source is never enough for a concept. Themes are parked as candidates until another source independently corroborates them. This stops the wiki from bloating with half-baked ideas.
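
The rule can be sketched as a small promotion check. This is an illustrative sketch rather than Bloom's actual logic: the function name `promote_candidates`, the `(theme, source_id)` input shape, and the use of distinct source IDs as a stand-in for "independent corroboration" are all assumptions.

```python
from collections import defaultdict


def promote_candidates(theme_sightings, min_sources=2):
    """Split themes into promoted concepts and parked candidates.

    theme_sightings: iterable of (theme, source_id) pairs (hypothetical shape).
    A theme is promoted only when at least `min_sources` distinct sources
    mention it; repeated mentions within one source do not count.
    """
    sources_by_theme = defaultdict(set)
    for theme, source_id in theme_sightings:
        sources_by_theme[theme].add(source_id)

    promoted, parked = {}, {}
    for theme, sources in sources_by_theme.items():
        bucket = promoted if len(sources) >= min_sources else parked
        bucket[theme] = sorted(sources)
    return promoted, parked


sightings = [
    ("spaced repetition", "sources/article-a"),
    ("spaced repetition", "sources/talk-b"),
    ("zettelkasten", "sources/article-a"),
    ("zettelkasten", "sources/article-a"),  # same source twice: still one source
]
promoted, parked = promote_candidates(sightings)
```

Here "spaced repetition" is promoted because two different sources mention it, while "zettelkasten" stays parked until a second source appears.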

What Compounds

  • Sources stack up as the raw substrate
  • Concepts harden where patterns repeat
  • Queries generate research reports filed back into the wiki
  • Sessions capture working conversations as narrative wiki pages
  • Graphs visualise the entire vault as a connection network
  • Research threads surface once three or more concepts cluster around the same keywords
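
As a rough illustration of the last item, thread surfacing could be approximated by grouping concepts per keyword and keeping keywords shared by three or more concepts. The function name, the input mapping, and the simplification to single shared keywords (rather than keyword sets) are assumptions made for this sketch.

```python
from collections import defaultdict


def find_research_threads(concept_keywords, min_concepts=3):
    """Return {keyword: [concepts]} for keywords shared by enough concepts.

    concept_keywords: mapping of concept name -> iterable of keywords
    (a hypothetical shape; the real vault stores keywords in front matter).
    """
    concepts_by_keyword = defaultdict(set)
    for concept, keywords in concept_keywords.items():
        for keyword in keywords:
            concepts_by_keyword[keyword.lower()].add(concept)
    return {
        keyword: sorted(concepts)
        for keyword, concepts in concepts_by_keyword.items()
        if len(concepts) >= min_concepts
    }


threads = find_research_threads({
    "Attention": ["transformers", "memory"],
    "Context windows": ["Transformers", "scaling"],
    "Tokenisation": ["transformers"],
    "Working memory": ["memory"],
})
```

In this example only "transformers" qualifies as a thread, since "memory" and "scaling" are shared by fewer than three concepts.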

The Three Layers

| Directory | Purpose | Who writes |
|---|---|---|
| inbox/ | Staging for unprocessed drops (URLs, clippings, PDFs, pasted text) | You |
| sources/ | Detailed source notes, one per article/paper/transcript | Claude |
| wiki/ | Concept articles, people pages, query reports, index, log, and health | Claude |
| residuals/ | Processed inbox items kept as originals; never edited | Claude |

The Six Commands

| Command | What it does |
|---|---|
| /bloom-ingest | Turn inbox items into source notes, then move originals to residuals/. Auto-fetches YouTube transcripts and extracts web content from URLs |
| /bloom-compile | Forge or expand concept articles from un-compiled sources |
| /bloom-ask | Probe the vault with a question and write up the findings |
| /bloom-lint | Audit the vault: statistics, orphans, keyword drift |
| /bloom-graph | Generate a mermaid connection graph of the entire vault |
| /save | Capture the current Claude conversation as a narrative wiki page |
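
The archival half of /bloom-ingest (moving a processed original into residuals/) might look something like the sketch below. The helper name `archive_inbox_item` and the rename-on-collision behaviour are assumptions; the source only states that originals are moved and never edited.

```python
from pathlib import Path
import shutil


def archive_inbox_item(item, residuals_dir):
    """Move a processed inbox file into residuals/ without overwriting anything."""
    item, residuals_dir = Path(item), Path(residuals_dir)
    residuals_dir.mkdir(parents=True, exist_ok=True)

    # If a file with the same name was archived earlier, add a numeric suffix
    # so the earlier original is left untouched (assumed policy).
    target = residuals_dir / item.name
    counter = 1
    while target.exists():
        target = residuals_dir / f"{item.stem}-{counter}{item.suffix}"
        counter += 1

    shutil.move(str(item), str(target))
    return target
```

A call such as `archive_inbox_item("inbox/clip.md", "residuals")` would return the new path of the archived file.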

Smart Ingest

YouTube URLs

Transcripts are fetched automatically via youtube-transcript-api, and video titles are resolved through YouTube's oEmbed endpoint.
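
A minimal sketch of the URL handling this step needs: pulling the video ID out of common YouTube URL shapes and building the oEmbed request whose JSON response carries a `title` field. The function names are hypothetical, and the transcript call itself is left out because the youtube-transcript-api interface differs between releases.

```python
from urllib.parse import parse_qs, urlencode, urlparse

YOUTUBE_HOSTS = {"youtube.com", "www.youtube.com", "m.youtube.com"}


def youtube_video_id(url):
    """Return the video ID for common YouTube URL shapes, or None."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host == "youtu.be":
        return parsed.path.lstrip("/").split("/")[0] or None
    if host in YOUTUBE_HOSTS:
        if parsed.path == "/watch":
            return parse_qs(parsed.query).get("v", [None])[0]
        parts = parsed.path.strip("/").split("/")
        if len(parts) == 2 and parts[0] in {"embed", "shorts", "live"}:
            return parts[1]
    return None


def oembed_url(video_url):
    """Build the YouTube oEmbed request URL for a video page URL."""
    query = urlencode({"url": video_url, "format": "json"})
    return "https://www.youtube.com/oembed?" + query
```

An ingest step could use `youtube_video_id` to decide whether an inbox URL takes the transcript path at all, and fetch `oembed_url(...)` to read the title.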

Non-YouTube URLs

Content is extracted via trafilatura through a custom fetch script (scripts/bloom-fetch.py). If extraction fails, the item falls back to manual processing.
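
The extract-then-fall-back behaviour could be sketched as follows. This is not the contents of scripts/bloom-fetch.py: the function name, the `"ok"`/`"manual"` status values, and the injectable `fetch`/`extract` parameters are assumptions, and the markdown output format assumes a reasonably recent trafilatura release.

```python
def extract_article(url, fetch=None, extract=None):
    """Return (status, markdown), where status is "ok" or "manual".

    `fetch` and `extract` default to trafilatura's functions; they are
    injectable so the fallback path can be exercised offline.
    """
    if fetch is None or extract is None:
        import trafilatura  # imported lazily so this module loads without it

        fetch = fetch or trafilatura.fetch_url
        extract = extract or (
            lambda html: trafilatura.extract(
                html, output_format="markdown", with_metadata=True
            )
        )

    try:
        html = fetch(url)  # trafilatura.fetch_url returns None on failure
        text = extract(html) if html else None
    except Exception:
        text = None

    if not text:
        return "manual", None  # leave the item for manual processing
    return "ok", text
```

Returning a status instead of raising keeps one failed URL from interrupting the rest of an ingest batch, which is one plausible reading of the fallback described above.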

The Standard for a Good Source Note

A reader should be able to understand and use the ideas in the source without going back to the original. That means reconstructing arguments, not just cataloguing topics. Favour paragraphs over bullets. Preserve distinctions the author makes.

Connection Graph

/bloom-graph runs python3 scripts/bloom-graph.py and writes a mermaid diagram to wiki/_meta/graph.md with:

  • Nodes styled by type (concept, source, person, query, meta)
  • Edges from backlinks and keyword overlap
  • Orphan detection (nodes with < 2 connections)
  • Hub identification (most connected nodes)
  • Keyword clusters (groups of nodes sharing keywords)
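
The analysis steps listed above can be illustrated with a short sketch. It is not the actual scripts/bloom-graph.py: the function names, the undirected-edge model, and the mermaid layout (`graph LR`) are assumptions. The orphan threshold of fewer than 2 connections comes from the list above.

```python
def analyse_graph(nodes, edges, orphan_threshold=2, top_n=3):
    """Return (degree, orphans, hubs) for an undirected connection graph."""
    neighbours = {node: set() for node in nodes}
    for a, b in edges:
        if a != b:  # ignore self-links
            neighbours.setdefault(a, set()).add(b)
            neighbours.setdefault(b, set()).add(a)

    degree = {node: len(adjacent) for node, adjacent in neighbours.items()}
    orphans = sorted(n for n, d in degree.items() if d < orphan_threshold)
    # Hubs: highest degree first, ties broken alphabetically.
    hubs = sorted(degree, key=lambda n: (-degree[n], n))[:top_n]
    return degree, orphans, hubs


def to_mermaid(edges):
    """Render undirected edges as mermaid flowchart text with stable node IDs."""
    pairs = sorted({tuple(sorted(edge)) for edge in edges if edge[0] != edge[1]})
    names = sorted({name for pair in pairs for name in pair})
    ids = {name: f"n{i}" for i, name in enumerate(names)}
    lines = ["graph LR"]
    lines += [f'    {ids[name]}["{name}"]' for name in names]
    lines += [f"    {ids[a]} --- {ids[b]}" for a, b in pairs]
    return "\n".join(lines)


nodes = ["attention", "memory", "karpathy", "src-1", "lonely-note"]
edges = [
    ("attention", "memory"),
    ("attention", "karpathy"),
    ("attention", "src-1"),
    ("memory", "src-1"),
]
degree, orphans, hubs = analyse_graph(nodes, edges, top_n=2)
```

With this data, "karpathy" (one connection) and "lonely-note" (none) are flagged as orphans, and "attention" is the top hub.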

Front-Matter Schema

Front matter is written as plain key-value lines, not YAML. Each note carries exactly one #type/ tag:

  • source — an ingested external source (in sources/)
  • concept — a synthesised wiki page built from 2+ sources
  • query — a research report answering a question
  • person — an entity page for a thinker/author
  • session — a narrative reconstruction of a Claude conversation
  • meta — vault infrastructure (index, log, health)
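
Since the exact header layout is not spelled out here, the following sketch assumes `key: value` lines at the top of the note, ended by the first blank line, with the `#type/` tag appearing in one of the values. The function names and the sample `tags:` key are hypothetical.

```python
import re

VALID_TYPES = {"source", "concept", "query", "person", "session", "meta"}
TYPE_TAG = re.compile(r"#type/([a-z]+)")


def parse_front_matter(text):
    """Parse leading `key: value` lines until the first blank line."""
    fields = {}
    for line in text.splitlines():
        if not line.strip():
            break
        key, separator, value = line.partition(":")
        if not separator:
            break  # first line that is not key-value ends the header
        fields[key.strip().lower()] = value.strip()
    return fields


def note_type(text):
    """Return the note's single #type/ value, or raise ValueError."""
    header = parse_front_matter(text)
    found = set(TYPE_TAG.findall(" ".join(header.values())))
    if len(found) != 1:
        raise ValueError(f"expected exactly one #type/ tag, found {sorted(found)}")
    (kind,) = found
    if kind not in VALID_TYPES:
        raise ValueError(f"unknown note type: {kind}")
    return kind


note = "title: Attention\ntags: #type/concept #topic/ml\n\nBody mentions #type/source"
```

Only the header is inspected, so a `#type/` string in the body does not count toward the one-per-note check.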

Technical Architecture

Core Scripts

  • scripts/bloom-fetch.py — Web article extraction using trafilatura. Outputs markdown with title, date, and source in frontmatter.
  • scripts/bloom-graph.py — Graph generation and analysis. Parses frontmatter and ## Connections sections from all wiki and source files to build a connection network with mermaid visualization.
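
To make the parsing step concrete, here is a sketch of how wikilink targets might be pulled from a note's `## Connections` section. It assumes Obsidian-style `[[Target]]`, `[[Target|alias]]`, and `[[Target#heading]]` links and that the section ends at the next level-1 or level-2 heading; the function name is hypothetical and the real script may differ.

```python
import re

# Captures the target of [[Target]], [[Target|alias]] and [[Target#heading]].
WIKILINK = re.compile(r"\[\[([^\]|#]+)(?:#[^\]|]*)?(?:\|[^\]]*)?\]\]")
HEADING = re.compile(r"#{1,2} ")


def connections(markdown):
    """Return unique wikilink targets under `## Connections`, in order."""
    targets, in_section = [], False
    for line in markdown.splitlines():
        if HEADING.match(line):
            in_section = line.strip() == "## Connections"
            continue
        if in_section:
            for target in WIKILINK.findall(line):
                name = target.strip()
                if name not in targets:
                    targets.append(name)
    return targets


note = (
    "# Attention\n\nIntro mentions [[Ignored]]\n\n"
    "## Connections\n\n"
    "- [[Memory#Working memory]]\n"
    "- [[Karpathy|Andrej Karpathy]]\n"
    "- [[Memory]]\n\n"
    "## Prompts\n\n- [[Also ignored]]\n"
)
links = connections(note)
```

Each returned target can then become an edge from the current note in the connection graph.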

Key Features

  • Companion vault support — Link a personal vault as read-only. Claude cross-references your notes during compilation but never modifies them.
  • People pages — Three-tier system: always create for authors, create richer profiles for subjects, use wikilinks for passing references until the second independent citation.
  • Mermaid diagrams — Inline diagram support inside any note. Notes with diagrams are tracked in wiki/_meta/index.md.
  • First principles explanation mode — Explicit activation for ground-up causal explanations that cross-reference vault sources.

Why I Built This

The Problem

Knowledge work generates enormous amounts of raw material — articles, papers, videos, transcripts. Traditional note-taking leaves you with a graveyard of bookmarks and highlights that never synthesise into understanding.

The Solution

Bloom automates the synthesis layer. You curate the inputs. Claude handles the distillation, cross-referencing, and connection-building. The result is a compounding knowledge base where every new source strengthens the whole.

Use Cases

Research Tracking

Drop papers and articles into inbox/. Run /bloom-ingest to produce detailed source notes. Run /bloom-compile to forge concept articles from recurring themes.

Learning Acceleration

YouTube transcripts, course notes, and documentation all feed into the same system. Ask questions with /bloom-ask and get research reports that cite specific sources.

Writing Preparation

Concept articles serve as rough scaffolding for essays and posts. The prompts section of each concept suggests angles for your own writing.

Development Status

Current Features

  • Automated ingestion from URLs and YouTube
  • Source note generation with structured frontmatter
  • Concept compilation with 2-source rule enforcement
  • Query-driven research reports
  • Vault health auditing
  • Connection graph generation
  • Session saving for working conversations

In Development

  • Enhanced companion vault integration
  • Advanced research thread surfacing
  • Improved keyword drift detection

Impact & Vision

Bloom Vault is a bet on a different division of labour in personal knowledge management: AI handles the mechanical work of synthesis and connection, and humans focus on curation, questioning, and creative output. The goal is not a perfect wiki but fertile ground where ideas can collide, compound, and evolve.

Designed by sidmanale643
© 2026. All rights reserved.