How to Build Persistent Memory for Claude Using Markdown Files

Mar 17, 2026

Problem

Every time I start a new Claude session, it has zero memory of our previous conversations. I spent weeks building features, making architecture decisions, and documenting patterns—but Claude forgets it all the moment I close the tab.

I tried the obvious solutions first:

Manual copy-paste: I’d paste relevant context from previous chats. Works for small things, but becomes unwieldy quickly.
Vector databases: I looked into Pinecone, Weaviate, and Chroma. But setting up embeddings, managing indexes, and building retrieval pipelines felt like overkill for a personal workflow.
Custom RAG systems: Same problem—too much infrastructure for something that should be simple.

Then I realized: Claude now offers a 1M token context window on supported models. Why am I building retrieval systems when I can just… load everything as text?

Environment

Claude (with 1M context window access)
Markdown files (no special tools required)
Optional: Obsidian or any markdown editor
Optional: MCP server for automation

What Happened

I found a Reddit thread where someone mentioned they keep a running AGENTS.md file plus daily memory files, and with the 1M context window, it actually works as persistent memory across sessions. No vector DB, no retrieval complexity—just flat files.

This was the insight I needed. Let me show you what I built.

The Solution: Flat-File Memory System

The core idea is simple: maintain structured markdown files that Claude reads at the start of each session. Here’s the file structure I use:

/memory
  /AGENTS.md           # Master instructions
  /daily/
    2026-03-17.md      # Today's session
    2026-03-16.md      # Yesterday's session
    2026-03-15.md
  /knowledge/
    project-spec.md
    coding-standards.md
    api-reference.md
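
If you would rather script the setup than create the folders by hand, the layout above can be scaffolded with a few lines of Python. This is a minimal sketch: the helper name `scaffold_memory` and the one-line placeholder contents are illustrative assumptions, not part of any tool.

```python
from pathlib import Path

# File names mirror the tree above; everything else here is illustrative.
KNOWLEDGE_FILES = ["project-spec.md", "coding-standards.md", "api-reference.md"]


def scaffold_memory(root: str) -> list:
    """Create the memory layout under `root`; return the files newly created."""
    base = Path(root)
    (base / "daily").mkdir(parents=True, exist_ok=True)
    (base / "knowledge").mkdir(parents=True, exist_ok=True)

    targets = [base / "AGENTS.md"]
    targets += [base / "knowledge" / name for name in KNOWLEDGE_FILES]

    created = []
    for path in targets:
        if not path.exists():  # never overwrite notes that already exist
            path.write_text(f"# {path.stem}\n", encoding="utf-8")
            created.append(path)
    return created
```

Calling `scaffold_memory("memory")` a second time is safe: the files are already in place, so nothing is overwritten and the function returns an empty list.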

The Master File: AGENTS.md

This is the entry point. It contains behavioral instructions and links to other memory files:

# Claude Agent Instructions

## Role
You are a software development assistant with persistent memory.
Read all linked files before responding.

## Behavioral Guidelines
- Always check the daily memory files for recent context
- Maintain consistency with past decisions
- Document important choices in today's memory file
- Never repeat work already completed

## Current Context
- Project: Personal Knowledge Management System
- Tech Stack: Python, SQLite, Markdown
- Status: In development, Phase 2

## Memory Files
- [Today's Session](./daily/2026-03-17.md)
- [Project Spec](./knowledge/project-spec.md)
- [Coding Standards](./knowledge/coding-standards.md)
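
Because AGENTS.md uses ordinary markdown links, the list of files to load can also be derived mechanically instead of relying on Claude to follow every link. A rough sketch, assuming relative links like the ones in the example above; `linked_files` is a hypothetical helper name:

```python
import re
from pathlib import Path

# Matches [label](target); the lookbehind skips images such as ![alt](src).
LINK_RE = re.compile(r"(?<!!)\[([^\]]+)\]\(([^)\s]+)\)")


def linked_files(master: Path) -> list:
    """Return local files linked from `master`, in order, without duplicates."""
    text = master.read_text(encoding="utf-8")
    seen = set()
    result = []
    for _label, target in LINK_RE.findall(text):
        if "://" in target or target.startswith("#"):
            continue  # skip web URLs and in-page anchors
        path = (master.parent / target).resolve()
        if path.is_file() and path not in seen:  # ignore dead links
            seen.add(path)
            result.append(path)
    return result
```

Links that point to files that do not exist yet (for example, a daily file that has not been created) are silently skipped rather than raising an error.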

Daily Memory Template

Each day, I create a new memory file with a consistent structure:

# Session: 2026-03-17

## Focus
Implementing persistent memory system with markdown files.

## Key Decisions
1. Chose flat files over vector DB for simplicity
2. Using AGENTS.md as master configuration
3. Limiting context to last 5 days of daily files

## Discoveries
- Claude's 1M context window enables true persistent memory
- No retrieval complexity needed for typical use cases
- Markdown links work as navigable references

## Action Items
- [x] Test memory retention across sessions
- [ ] Document best practices
- [ ] Consider MCP server for automation

## Session Notes
Initial exploration of markdown-based memory system. Reddit thread
revealed practical implementation from power users.
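
Creating that file by hand every morning gets old quickly. Here is a small sketch that stamps out the template for a given day, assuming the daily/ folder from the layout above. The name `ensure_daily_file` is hypothetical, and the headings simply copy the template.

```python
from datetime import date
from pathlib import Path
from typing import Optional

TEMPLATE = """# Session: {day}

## Focus

## Key Decisions

## Discoveries

## Action Items

## Session Notes
"""


def ensure_daily_file(memory_root: str, day: Optional[date] = None) -> Path:
    """Create daily/<YYYY-MM-DD>.md from the template if it does not exist."""
    day = day or date.today()
    path = Path(memory_root) / "daily" / f"{day.isoformat()}.md"
    path.parent.mkdir(parents=True, exist_ok=True)
    if not path.exists():  # keep notes that were already written today
        path.write_text(TEMPLATE.format(day=day.isoformat()), encoding="utf-8")
    return path
```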

How It Works

When I start a new session in a Claude client that can read local files, I simply say:

Read /memory/AGENTS.md and all linked files. This is our context.

Claude loads everything into context, and suddenly it remembers:

What we built yesterday
Why we made certain architecture decisions
What’s left on the to-do list
Coding standards we agreed on

The flow looks like this:

┌─────────────────┐
│  Start Session  │
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│  Load AGENTS.md │
└────────┬────────┘
         │
         ▼
┌─────────────────────────────┐
│  Parse linked files:        │
│  - Daily memory (last N)    │
│  - Knowledge base files     │
└────────┬────────────────────┘
         │
         ▼
┌─────────────────┐
│   Work Session  │
└────────┬────────┘
         │
         ▼
┌─────────────────────────────┐
│  Update daily memory file:  │
│  - Key decisions            │
│  - Discoveries              │
│  - New action items         │
└─────────────────────────────┘
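
The loading half of this flow can be approximated outside of Claude as a single function that gathers the files into one block of text. This is a sketch under the assumption that the folder layout matches the one shown earlier; `build_context` and the HTML-comment file labels are illustrative choices, not a required format.

```python
from pathlib import Path


def _markdown_files(folder: Path) -> list:
    """Sorted *.md files in `folder`, or an empty list if it is missing."""
    return sorted(folder.glob("*.md")) if folder.is_dir() else []


def build_context(memory_root: str, last_n: int = 5) -> str:
    """Join AGENTS.md, the newest daily files, and the knowledge files."""
    root = Path(memory_root)
    files = [root / "AGENTS.md"] if (root / "AGENTS.md").is_file() else []

    # ISO dates sort chronologically as plain strings, so the tail of the
    # sorted list holds the most recent sessions.
    daily = _markdown_files(root / "daily")
    recent = daily[-last_n:] if last_n > 0 else []
    files.extend(reversed(recent))  # newest session first

    files.extend(_markdown_files(root / "knowledge"))

    # Label each file with its relative path so the sources stay distinguishable.
    parts = []
    for path in files:
        label = path.relative_to(root).as_posix()
        parts.append(f"<!-- {label} -->\n{path.read_text(encoding='utf-8')}")
    return "\n\n---\n\n".join(parts)
```

The returned string can be pasted or attached at the start of a session. Putting the newest daily file first is a design choice, so that the most recent decisions appear directly after the master instructions.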

Why This Works: Context Window Math

Let’s do the math. Here is roughly how I budget Claude’s 1M token context window:

Total context:     ~1,000,000 tokens
─────────────────────────────────────
Conversation:      ~200,000 tokens (current chat)
Output buffer:     ~100,000 tokens
─────────────────────────────────────
Available for memory: ~700,000 tokens

A typical markdown file with 1,000 words is about 1,300 tokens, so 700,000 tokens holds roughly 540 such files, or about 540,000 words. Even at a generous 1,000 words per page, that is more than 500 pages of markdown: weeks of daily session logs plus comprehensive project documentation.
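
Here is the same arithmetic as a small script, so the numbers can be re-run with a different budget. The 200,000 and 100,000 token reservations and the ratio of 1,300 tokens per 1,000 words are the rough figures from this section, not measured values; a real tokenizer will give different counts.

```python
TOTAL_TOKENS = 1_000_000
CONVERSATION_TOKENS = 200_000   # reserved for the current chat
OUTPUT_TOKENS = 100_000         # reserved for responses
TOKENS_PER_1000_WORDS = 1_300   # rule of thumb, not a tokenizer measurement


def memory_budget(total: int = TOTAL_TOKENS,
                  conversation: int = CONVERSATION_TOKENS,
                  output: int = OUTPUT_TOKENS) -> int:
    """Tokens left for memory files after the reservations."""
    return total - conversation - output


def words_that_fit(budget_tokens: int) -> int:
    """Approximate number of words that fit into a token budget."""
    return budget_tokens * 1000 // TOKENS_PER_1000_WORDS


def estimate_tokens(text: str) -> int:
    """Rough token estimate from a whitespace word count."""
    return round(len(text.split()) * TOKENS_PER_1000_WORDS / 1000)
```

With the defaults, `memory_budget()` is 700,000 tokens and `words_that_fit(700_000)` is 538,461 words, which is about 538 files of 1,000 words each.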

Best Practices I Learned

1. Keep Files Focused

One topic per file. I tried putting everything in a single memory file, and it became unmanageable. Now I split by:

Daily sessions (chronological)
Project specs (semantic)
Coding standards (reference)

2. Summarize Regularly

Old daily files accumulate. Every week, I condense the previous week’s entries into a single summary file:

# Week 11 Summary (March 10-16, 2026)

## Completed
- Implemented user authentication
- Added rate limiting to API endpoints
- Wrote unit tests for payment module

## Key Decisions
- Chose JWT over sessions for stateless auth
- Rate limit: 100 req/min per user

## Outstanding
- Payment webhook handling (in progress)
- Email notification system (not started)
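
To prepare the weekly condensation, the week's daily files can be gathered into one prompt. This is a minimal sketch, assuming the daily/YYYY-MM-DD.md naming from the file structure above; `week_files` and `summary_prompt` are hypothetical helpers, and the prompt wording is only a suggestion.

```python
from datetime import date, timedelta
from pathlib import Path


def week_files(memory_root: str, week_start: date) -> list:
    """Return the daily files that exist for the 7 days from week_start."""
    daily_dir = Path(memory_root) / "daily"
    days = [week_start + timedelta(days=i) for i in range(7)]
    paths = [daily_dir / f"{d.isoformat()}.md" for d in days]
    return [p for p in paths if p.is_file()]


def summary_prompt(memory_root: str, week_start: date) -> str:
    """Bundle one week of notes with a request for a three-section summary."""
    week_end = week_start + timedelta(days=6)
    header = (
        f"Condense these session notes ({week_start} to {week_end}) into one "
        "summary with the sections Completed, Key Decisions, and Outstanding.\n"
    )
    notes = [p.read_text(encoding="utf-8")
             for p in week_files(memory_root, week_start)]
    return header + "\n\n---\n\n".join(notes)
```

Days without a session file are simply skipped, so a week with only three sessions produces a prompt containing three notes.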

3. Date Everything

Timestamps matter. I put dates on every decision:

- 2026-03-15: Decided to use SQLite over PostgreSQL
  Rationale: Simpler deployment, adequate for projected load
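
To keep these entries uniform, the date and rationale can be formatted by a helper rather than typed freehand. This is a tiny sketch: `format_decision` and `log_decision` are hypothetical names, and the two-line layout simply copies the example above.

```python
from datetime import date
from pathlib import Path


def format_decision(day: date, decision: str, rationale: str) -> str:
    """Render one dated decision in the two-line layout shown above."""
    return f"- {day.isoformat()}: {decision}\n  Rationale: {rationale}\n"


def log_decision(notes_file: Path, day: date, decision: str, rationale: str) -> None:
    """Append a dated decision to a notes file, creating the file if needed."""
    with notes_file.open("a", encoding="utf-8") as handle:
        handle.write(format_decision(day, decision, rationale))
```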

4. Link Between Files

Create a navigable knowledge graph:

Related: [[coding-standards]], [[api-reference]], [[2026-03-15]]

Obsidian-style links work well if you use that editor, but even plain markdown links help Claude understand relationships.
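
If the files are loaded by a script rather than by Obsidian, the wiki-style links have to be mapped to files somehow. The sketch below assumes note names are matched against file names anywhere in the vault, with the first match winning when names collide. `resolve_wikilinks` is a hypothetical helper and only approximates how an editor would resolve links.

```python
import re
from pathlib import Path

# Captures the note name from [[name]], [[name#heading]] and [[name|alias]].
WIKILINK_RE = re.compile(r"\[\[([^\]|#]+)(?:#[^\]|]*)?(?:\|[^\]]*)?\]\]")


def resolve_wikilinks(text: str, vault: Path) -> dict:
    """Map each wiki-linked note name to its file in the vault, or None."""
    by_stem = {}
    for path in sorted(vault.rglob("*.md")):
        by_stem.setdefault(path.stem, path)  # first match wins on duplicates

    resolved = {}
    for raw in WIKILINK_RE.findall(text):
        name = raw.strip()
        resolved[name] = by_stem.get(name)  # None marks a dangling link
    return resolved
```

Entries that map to `None` are dangling links, which is a cheap way to spot references to notes that were renamed or never written.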

Advanced: MCP Server Integration

For automation, I built a simple MCP server that integrates with Obsidian:

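from datetime import date, timedelta
from pathlib import Path

from mcp.server import Server


class LibrarianMCPServer(Server):
    def __init__(self, vault_path: str):
        # The base Server needs a name; "librarian" is an arbitrary choice.
        super().__init__("librarian")
        self.vault_path = Path(vault_path)
        self.agents_file = self.vault_path / "AGENTS.md"

    @staticmethod
    def _get_date(session_date: str, offset_days: int) -> str:
        """Shift an ISO date (YYYY-MM-DD) by offset_days; return ISO format."""
        shifted = date.fromisoformat(session_date) + timedelta(days=offset_days)
        return shifted.isoformat()

    # Note: this helper still has to be registered as a tool or resource
    # before a client can call it.
    async def load_context(self, session_date: str) -> str:
        """Load relevant memory files for Claude session."""
        context = []

        # Load master instructions
        if self.agents_file.exists():
            context.append(self.agents_file.read_text(encoding="utf-8"))

        # Load last 5 days of memory, newest first
        for days_ago in range(5):
            day = self._get_date(session_date, -days_ago)
            daily_file = self.vault_path / "daily" / f"{day}.md"
            if daily_file.exists():
                context.append(daily_file.read_text(encoding="utf-8"))

        return "\n\n---\n\n".join(context)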

With the server connected to Claude and load_context exposed as a tool, context loads at the start of a session without any copy-pasting.

What Didn’t Work

I want to be clear about what I tried that failed:

Loading ALL daily files: After a month, context got too large. I now limit to 5 days.
Complex folder hierarchies: Deep nesting made it harder for Claude to follow links. Flat structures work better.
Binary files: Tried including SQLite dumps. Claude can’t read them. Stick to text.
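
The binary-file problem can be caught before anything reaches Claude by checking each file first. This is a rough heuristic, not a guarantee: `is_text_file` is a hypothetical helper that treats a file as text if it contains no NUL bytes and decodes as UTF-8.

```python
from pathlib import Path


def is_text_file(path: Path) -> bool:
    """Heuristic: text files have no NUL bytes and decode cleanly as UTF-8."""
    data = path.read_bytes()
    if b"\x00" in data:
        return False
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def loadable_files(folder: Path) -> list:
    """Return the files under `folder` that look like text, sorted by path."""
    return [p for p in sorted(folder.rglob("*"))
            if p.is_file() and is_text_file(p)]
```

A SQLite database fails the first check immediately, because its file header contains a NUL byte.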

When This Approach Breaks Down

This flat-file system isn’t perfect:

Very long projects: If you need months of history, you’ll hit context limits even with summarization.
Multi-project work: Context pollution becomes an issue when switching between unrelated projects.
Real-time data: This is memory, not a database. Frequent updates require rebuilding context.

For those cases, you might still need a proper RAG system. But for personal development workflows? Markdown files have been surprisingly sufficient.

The Key Insight

The Reddit thread that sparked this approach had one crucial line:

“No vector DB needed, no retrieval complexity—just flat files.”

That’s the paradigm shift. We spent years building retrieval systems because context windows were tiny. Now that Claude can hold 1M tokens, the simplest solution—loading text files—actually works.

I think we’ll see more of this pattern: infrastructure that once required engineering complexity becoming simple again as AI capabilities expand. The flat-file memory system is an early example.

Summary

In this post, I showed how to build a persistent memory system for Claude using nothing but markdown files. The key points:

Maintain an AGENTS.md master file with instructions and links
Create daily memory files with consistent structure
Load context at session start
Summarize regularly to stay within limits
Optional: automate with an MCP server

The approach works because Claude’s 1M context window can hold hundreds of pages of markdown. For many personal workflows, this removes the need for vector databases and retrieval systems.

Final Words + More Resources

My intention with this article was to share my knowledge and experience to help others. If you want to get in touch, you can contact me by email.

Here are also the most important links from this article along with some further resources that will help you in this scope:

👨‍💻 Reddit: 1 mil context is so good
👨‍💻 Anthropic MCP Documentation

Oh, and if you found these resources useful, don’t forget to support me by starring the repo on GitHub!