AI Pipeline

Obsidian RAG Chat

RAG pipeline that turns an Obsidian vault into a searchable, citation-grounded knowledge system using semantic chunking, embeddings, vector retrieval, and a streaming chat interface.

Stack

Python, OpenAI GPT-4o-mini, text-embedding-3-small, ChromaDB, Streamlit

Impact

Shows the full shape of a useful RAG system: ingestion, chunking strategy, retrieval, citation grounding, and a user-facing interface.

This project turns a personal Obsidian vault into a live question-answering tool. Instead of searching notes manually, a user types a question and gets a grounded answer — with citations pointing back to the exact source notes the response is built from.

How it works

The pipeline runs in four stages:

  • Ingestion — Markdown files are loaded from the vault and cleaned of Obsidian-specific syntax
  • Chunking — Notes are split using heading-aware boundaries with overlap, so retrieved chunks retain their structural context
  • Retrieval — Chunks are embedded with text-embedding-3-small and stored in ChromaDB; at query time, top-k semantic matches are fetched
  • Response — GPT-4o-mini generates an answer from the retrieved context, and the Streamlit interface surfaces the source notes so the output is always inspectable
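In outline, the four stages can be sketched as below. This is a minimal, dependency-free illustration, not the project's actual code: the toy bag-of-words `embed` stands in for text-embedding-3-small, and the in-memory chunk list stands in for ChromaDB. All function names here are hypothetical.

```python
import math
import re
from collections import Counter

def clean_markdown(text: str) -> str:
    """Ingestion: strip Obsidian-specific syntax (embeds and wikilinks)."""
    text = re.sub(r"!\[\[[^\]]+\]\]", "", text)               # embeds: ![[image.png]]
    text = re.sub(r"\[\[([^\]|]+)(\|([^\]]+))?\]\]",          # wikilinks: [[Note|alias]]
                  lambda m: m.group(3) or m.group(1), text)
    return text

def embed(text: str) -> Counter:
    """Toy bag-of-words vector; the real pipeline calls text-embedding-3-small."""
    return Counter(re.findall(r"\w+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def top_k(query: str, chunks: list[dict], k: int = 2) -> list[dict]:
    """Retrieval: rank stored chunks by similarity to the query embedding."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c["text"])), reverse=True)[:k]

# Stand-in vault chunks; the real system stores these in ChromaDB with metadata.
chunks = [
    {"text": "Chunking splits notes at heading boundaries.", "source": "rag.md"},
    {"text": "Streamlit renders the chat interface.", "source": "ui.md"},
]
hits = top_k("How are notes split into chunks?", chunks)
# The Response stage would now pass `hits` as context to GPT-4o-mini.
```

The retrieved `hits` carry their `source` field through to the answer, which is what makes citation grounding possible downstream.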

Design decisions

Retrieval quality was the central design challenge. Splitting notes naively on character count discards the heading hierarchy that gives a note its meaning. Heading-aware chunking keeps related content together and preserves each section's place in the note's structure, which directly improves the relevance of the chunks surfaced for a given query.
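A heading-aware splitter with overlap can be sketched roughly as follows. This is a simplified sketch, assuming each chunk keeps only its nearest heading (a fuller version might carry the whole heading path), and the `overlap_chars` parameter is illustrative:

```python
import re

def chunk_by_headings(markdown: str, overlap_chars: int = 50) -> list[dict]:
    """Split a note at heading boundaries, carrying a tail of the
    previous section into each chunk as overlap for context."""
    # re.split with a capture group yields [preamble, heading, body, heading, body, ...]
    sections = re.split(r"(?m)^(#{1,6} .+)$", markdown)
    chunks, prev_tail = [], ""
    for i in range(1, len(sections), 2):
        heading, body = sections[i].strip(), sections[i + 1].strip()
        chunks.append({"heading": heading, "text": prev_tail + body})
        prev_tail = body[-overlap_chars:] + "\n" if body else ""
    return chunks
```

Keeping the heading with each chunk means a retrieved passage arrives with the context that explains what it is about, rather than as an orphaned slice of text.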

Citation grounding was equally deliberate. The goal was an answer you can verify, not just one that sounds right — so every response links back to its source chunks in the UI.
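One way to wire this up is to tag each retrieved chunk with a numbered source label before it reaches the model, so the answer can cite [1], [2], and the UI can resolve each tag back to a note. A minimal sketch (the function name and prompt wording are assumptions, not the project's actual implementation):

```python
def build_grounded_prompt(question: str, hits: list[dict]) -> tuple[str, list[str]]:
    """Assemble the LLM context with numbered source tags, and return the
    source list so the interface can link each citation to its note."""
    sources = [h["source"] for h in hits]
    context = "\n\n".join(
        f"[{i + 1}] ({src})\n{h['text']}"
        for i, (h, src) in enumerate(zip(hits, sources))
    )
    prompt = (
        "Answer using only the context below. Cite sources as [n].\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
    return prompt, sources
```

Because the source list is returned alongside the prompt, the interface can render the citations without re-querying the vector store.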

What this demonstrates

Building this end-to-end — from ingestion through a deployed Streamlit interface — shows how the components of a RAG system interact in practice: chunking strategy affects retrieval, retrieval quality constrains what the LLM can do, and the interface determines whether the output is actually trustworthy.

Source Code

View source code on GitHub ↗