How Deep Agents Uses Memory and AGENTS.md for Persistent Context

Mar 20, 2026

The Problem: Agents Start Fresh Every Session

I built an AI agent to help with my development work. It worked great for the first session. But when I started a new conversation the next day, it forgot everything:

- The coding conventions we established
- The project architecture decisions
- The preferences I had expressed
- The lessons learned from previous debugging sessions

Every conversation was like starting from zero. I had to re-explain the project context, re-state my preferences, and re-teach the agent how I wanted things done.

Then I discovered Deep Agents and its memory system built around AGENTS.md files.

What I Tried First: Hardcoding Context

My first attempt was to bake everything into the system prompt:

from langchain.agents import create_agent

SYSTEM_PROMPT = """
You are a development assistant.

Project: E-commerce platform with Django backend
Database: PostgreSQL with the following schema...
Conventions:
- Use snake_case for Python
- Prefer composition over inheritance
- Always write tests first
- ...
"""

agent = create_agent(
    model="claude-sonnet-4-6",
    system_prompt=SYSTEM_PROMPT
)

This worked, but I ran into problems:

- Maintenance nightmare: Every time I wanted to update the agent’s knowledge, I had to modify Python code and redeploy
- No separation of concerns: Project-specific knowledge was mixed with agent behavior
- No dynamic updates: I couldn’t add new knowledge without restarting the agent

I needed something that could persist across sessions and be easily updated.

How Deep Agents Memory Works

Deep Agents solves this with MemoryMiddleware and AGENTS.md files. Here’s the architecture:

Agent Startup
     |
     v
MemoryMiddleware.before_agent()
     |
     +--> Load AGENTS.md files from sources
     |
     v
Store content in agent state
     |
     v
modify_request() injects into system prompt
     |
     v
Agent runs with persistent context

The key insight is that memory is loaded once at agent startup, then injected into every system prompt. The agent doesn’t need to “remember” across sessions because the knowledge is always present.
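To make the flow concrete, here is a minimal, library-free sketch of the load-once, inject-on-every-call pattern. The function names `load_memory_once` and `build_system_prompt` and the in-memory `fake_files` dictionary are hypothetical stand-ins for the hooks in the diagram. This is not the Deep Agents implementation.

```python
from typing import Callable


def load_memory_once(
    state: dict, read_file: Callable[[str], str], sources: list[str]
) -> dict:
    """Load memory files into state, but only if they are not there yet."""
    if "memory_contents" in state:
        return state  # already loaded: skip the file reads
    contents = {path: read_file(path) for path in sources}
    return {**state, "memory_contents": contents}


def build_system_prompt(base_prompt: str, state: dict) -> str:
    """Append the stored memory to the base prompt (done for every model call)."""
    memory = "\n\n".join(state.get("memory_contents", {}).values())
    return f"{base_prompt}\n\n{memory}" if memory else base_prompt


# Hypothetical in-memory "filesystem" so the sketch runs without real files.
fake_files = {"./AGENTS.md": "Always limit results to 5 rows."}
reads = []


def read_file(path: str) -> str:
    reads.append(path)  # record each read so we can see it happens once
    return fake_files[path]


state = load_memory_once({}, read_file, ["./AGENTS.md"])
state = load_memory_once(state, read_file, ["./AGENTS.md"])  # no second read
prompt = build_system_prompt("You are a SQL assistant.", state)
```

The second call to `load_memory_once` returns early because the state already holds `memory_contents`, while `build_system_prompt` can be called as often as needed without touching the filesystem.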

Setting Up Memory

Here’s how I configured memory for a text-to-SQL agent:

from deepagents import create_deep_agent

agent = create_deep_agent(
    model="claude-sonnet-4-6",
    memory=["./AGENTS.md"],
)

That’s it. The memory parameter accepts a list of file paths. Deep Agents loads these files and injects them into the system prompt.
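If you want the memory file to exist before the first run, a small helper can create it with starter content. This is a sketch under my own assumptions: `ensure_memory_file` and `DEFAULT_MEMORY` are hypothetical names, and the demo writes to a temporary directory rather than a real project.

```python
import tempfile
from pathlib import Path

# Hypothetical starter content; adjust the headings to your own project.
DEFAULT_MEMORY = "# Agent Instructions\n\n## Preferences\n"


def ensure_memory_file(path: Path, default_text: str = DEFAULT_MEMORY) -> str:
    """Create the memory file with starter content if it does not exist yet."""
    if not path.exists():
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(default_text, encoding="utf-8")
    return str(path)


# Demo in a temporary directory so nothing in a real project is touched.
with tempfile.TemporaryDirectory() as tmp:
    memory_path = ensure_memory_file(Path(tmp) / "AGENTS.md")
    memory_sources = [memory_path]  # the list you would pass as memory=[...]
    created_text = Path(memory_path).read_text(encoding="utf-8")
```

An existing file is left untouched, so the helper is safe to call on every startup.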

The AGENTS.md File Structure

The AGENTS.md file is standard Markdown. I structured mine with clear sections:

# Text-to-SQL Agent Instructions

You are a Deep Agent designed to interact with a SQL database.

## Your Role

Given a natural language question, you will:
1. Explore the available database tables
2. Examine relevant table schemas
3. Generate syntactically correct SQL queries
4. Execute queries and analyze results
5. Format answers in a clear, readable way

## Database Information

- Database type: SQLite (Chinook database)
- Contains data about a digital media store: artists, albums, tracks, customers

## Query Guidelines

- Always limit results to 5 rows unless the user specifies otherwise
- Order results by relevant columns to show the most interesting data
- Only query relevant columns, not SELECT *

## Safety Rules

**NEVER execute these statements:**
- INSERT
- UPDATE
- DELETE
- DROP
- ALTER

**You have READ-ONLY access. Only SELECT queries are allowed.**

The agent sees this content as part of its system prompt, so it always knows the database context and safety constraints.
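The Safety Rules above are instructions to the model, not an enforcement mechanism, so it can be worth repeating the check in code before a query reaches the database. The sketch below is one possible guard. `is_read_only` and `FORBIDDEN` are hypothetical names, and the check is deliberately conservative: it will also reject a harmless SELECT that happens to mention one of the forbidden words.

```python
import re

# The statements listed under "Safety Rules" in the AGENTS.md example.
FORBIDDEN = {"INSERT", "UPDATE", "DELETE", "DROP", "ALTER"}


def is_read_only(sql: str) -> bool:
    """Return True only for a single SELECT statement with no forbidden keywords."""
    statement = sql.strip().rstrip(";").strip()
    if ";" in statement:
        return False  # reject several statements packed into one string
    upper = statement.upper()
    if not re.match(r"SELECT\b", upper):
        return False  # anything that does not start with SELECT is refused
    words = set(re.findall(r"[A-Z_]+", upper))
    return words.isdisjoint(FORBIDDEN)
```

A tool that executes SQL could call `is_read_only(query)` first and return an error message to the agent when it fails, instead of running the query.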

Multiple Memory Sources

I discovered that Deep Agents supports multiple memory sources. This is useful for layering different types of context:

from deepagents import create_deep_agent
from deepagents.backends.filesystem import FilesystemBackend

backend = FilesystemBackend(root_dir="/")

agent = create_deep_agent(
    backend=backend,
    memory=[
        "~/.deepagents/AGENTS.md",      # Global user preferences
        "./.deepagents/AGENTS.md",      # Project-specific context
    ],
)

Sources are loaded in order and concatenated. Later sources appear after earlier ones, so you can override specific instructions at the project level while keeping global defaults.
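The ordering behavior can be illustrated with a few lines of plain Python. This is a sketch, not the library's code: `combine_sources` is a hypothetical helper and the file contents are made up for the example.

```python
def combine_sources(sources: list[str], files: dict[str, str]) -> str:
    """Concatenate memory files in the order given, skipping missing ones."""
    sections = []
    for path in sources:
        text = files.get(path)
        if text is None:
            continue  # a missing file is skipped rather than treated as an error
        sections.append(text.strip())
    return "\n\n".join(sections)


# Made-up contents standing in for the global and project-level files.
files = {
    "~/.deepagents/AGENTS.md": "- Prefer concise answers",
    "./.deepagents/AGENTS.md": "- For this project, include the full SQL in answers",
}
combined = combine_sources(
    ["~/.deepagents/AGENTS.md", "./.deepagents/AGENTS.md"], files
)
```

Because the project file is listed second, its text lands after the global text in `combined`, which is the position the project-level instructions end up in.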

What Gets Injected Into the Prompt

I was curious about what the agent actually sees. Looking at the source code, I found that the memory content gets wrapped in XML tags:

<agent_memory>
./AGENTS.md
# Text-to-SQL Agent Instructions
... content ...
</agent_memory>

<memory_guidelines>
    The above <agent_memory> was loaded from files in your filesystem.
    As you learn from your interactions with the user, you can save new
    knowledge by calling the `edit_file` tool.

    **Learning from feedback:**
    - When you need to remember something, updating memory must be your
      FIRST, IMMEDIATE action
    - When user says something is better/worse, capture WHY and encode it
    - Each correction is a chance to improve permanently

    **When to update memories:**
    - When the user explicitly asks you to remember something
    - When the user describes your role or how you should behave
    - When the user gives feedback on your work
    ...
</memory_guidelines>

The memory_guidelines section teaches the agent when and how to update its own memory during conversations.
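A formatter that produces the `<agent_memory>` layout shown above could look roughly like this. It is my own sketch: `format_agent_memory` here is a stand-alone function, and emitting one block per file is an assumption, not something I verified against the library.

```python
def format_agent_memory(contents: dict[str, str]) -> str:
    """Wrap each loaded file in an <agent_memory> block, with its path first."""
    if not contents:
        return ""  # nothing loaded, nothing to inject
    blocks = [
        f"<agent_memory>\n{path}\n{text.strip()}\n</agent_memory>"
        for path, text in contents.items()
    ]
    return "\n\n".join(blocks)


rendered = format_agent_memory(
    {"./AGENTS.md": "# Text-to-SQL Agent Instructions\n"}
)
```

Putting the path on the first line of each block is what lets the agent tell which file a given instruction came from.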

Memory Updates During Conversations

One feature I found particularly useful is that agents can update their own memory. When I correct the agent, it can save that learning:

User: Can you write me an example for creating a deep agent in LangChain?
Agent: Sure, I'll write you an example...
[provides Python code]

User: Can you do this in JavaScript instead?
Agent: Let me save this to my memory.
[calls edit_file to update AGENTS.md with JavaScript preference]
Agent: Sure, here is the JavaScript example...

The agent recognized that my preference for JavaScript examples was worth remembering and updated the memory file automatically.
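To picture what such an update does to the file, here is a sketch of the text transformation only. The `remember` function and the "## Preferences" heading are hypothetical. In the real flow, the agent makes the change itself through the `edit_file` tool.

```python
def remember(memory_text: str, note: str, heading: str = "## Preferences") -> str:
    """Return memory_text with note added as a bullet under heading."""
    bullet = f"- {note}"
    lines = memory_text.rstrip("\n").split("\n")
    if bullet in lines:
        return memory_text  # already recorded: avoid duplicate entries
    if heading in lines:
        # Find the end of the section: the next heading or the end of the file.
        start = lines.index(heading) + 1
        end = start
        while end < len(lines) and not lines[end].startswith("#"):
            end += 1
        # Step back over blank lines so the bullet joins the existing list.
        while end > start and lines[end - 1] == "":
            end -= 1
        lines.insert(end, bullet)
    else:
        # No such section yet: append a new one at the end of the file.
        lines += ["", heading, "", bullet]
    return "\n".join(lines) + "\n"


updated = remember("# Agent\n", "Prefer JavaScript examples")
```

Calling `remember` again with the same note returns the text unchanged, which keeps the file from accumulating repeated entries.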

How the Middleware Works Internally

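I dug into the source code to understand the implementation. The MemoryMiddleware class has three main responsibilities: loading the memory files, storing their contents in agent state, and injecting them into the system prompt: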

class MemoryMiddleware(AgentMiddleware):
    """Middleware for loading agent memory from AGENTS.md files."""

    def __init__(self, *, backend, sources: list[str]):
        self._backend = backend
        self.sources = sources

    def before_agent(self, state, runtime, config):
        """Load memory content before agent execution."""
        # Skip if already loaded
        if "memory_contents" in state:
            return None

        backend = self._get_backend(state, runtime, config)
        contents = {}

        # Load all memory files
        results = backend.download_files(list(self.sources))
        for path, response in zip(self.sources, results):
            if response.error == "file_not_found":
                continue
            if response.content:
                contents[path] = response.content.decode("utf-8")

        return {"memory_contents": contents}

    def modify_request(self, request):
        """Inject memory content into the system message."""
        contents = request.state.get("memory_contents", {})
        agent_memory = self._format_agent_memory(contents)
        new_system_message = append_to_system_message(
            request.system_message, agent_memory
        )
        return request.override(system_message=new_system_message)

The key methods are:

- before_agent: Loads memory files before the agent starts
- modify_request: Injects memory into the system prompt for each LLM call

Memory vs Skills: Understanding the Difference

Deep Agents distinguishes between memory and skills:

| Aspect | Memory (AGENTS.md) | Skills |
| --- | --- | --- |
| Loading | Always loaded | Loaded on-demand |
| Purpose | Persistent context | Specialized workflows |
| Content | Identity, rules, guidelines | Step-by-step procedures |
| Updates | Can be updated during conversations | Static workflows |

Memory (AGENTS.md):
- Who I am (agent identity)
- What I should always remember (rules, constraints)
- How I should behave (communication style)

Skills (SKILL.md):
- How to accomplish specific tasks
- Step-by-step workflows
- Domain-specific procedures

Use memory for knowledge the agent always needs. Use skills for workflows the agent might need occasionally.
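The difference in loading behavior can be shown with a toy model. Everything here is hypothetical (the `Skill` dataclass, `build_prompt`, and the `active` set). It only illustrates "always loaded" versus "loaded on demand" and says nothing about how Deep Agents actually loads skills.

```python
from dataclasses import dataclass


@dataclass
class Skill:
    name: str
    description: str  # short summary, cheap to keep in every prompt
    body: str         # full step-by-step procedure, included only when needed


def build_prompt(memory: str, skills: list[Skill], active: set[str]) -> str:
    """Memory is always included; a skill body is included only if active."""
    parts = [memory]
    for skill in skills:
        parts.append(f"Skill: {skill.name} - {skill.description}")
        if skill.name in active:
            parts.append(skill.body)
    return "\n".join(parts)


skills = [Skill("schema-report", "Summarize table schemas", "1. List tables\n2. Describe each")]
idle_prompt = build_prompt("Only SELECT queries are allowed.", skills, active=set())
busy_prompt = build_prompt("Only SELECT queries are allowed.", skills, active={"schema-report"})
```

In both prompts the memory line is present. The step-by-step body appears only in `busy_prompt`, which is why skills are the cheaper home for long procedures.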

Common Mistakes I Made

Mistake 1: Putting Everything in Memory

My first AGENTS.md file was 2000 lines long. I included every possible detail about the project. This bloated the context window and made responses slower.

Fix: Keep memory focused on essential context. Move detailed procedures to skills.
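A quick size check would have caught this early. The sketch below uses a rough rule of thumb of about four characters per token, which is an approximation and not a real tokenizer. The `memory_budget_report` name and the default budget of 2000 tokens are arbitrary choices of mine.

```python
def memory_budget_report(text: str, max_tokens: int = 2000) -> dict:
    """Estimate the prompt cost of a memory file with a 4-chars-per-token rule."""
    estimated_tokens = len(text) // 4  # rough heuristic, not a tokenizer
    # Count a final line even when the file lacks a trailing newline.
    line_count = text.count("\n") + (1 if text and not text.endswith("\n") else 0)
    return {
        "lines": line_count,
        "estimated_tokens": estimated_tokens,
        "over_budget": estimated_tokens > max_tokens,
    }


report = memory_budget_report("x\n" * 2000, max_tokens=500)
```

Running this against AGENTS.md before each deploy gives a simple signal for when it is time to move material into skills.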

Mistake 2: Not Updating Memory

I expected the agent to “learn” from conversations automatically. While it can update memory, I needed to explicitly ask it to remember important things.

Fix: Use phrases like “remember this” or “for future reference” to trigger memory updates.

Mistake 3: Ignoring the Display Name

The memory content shows the file path in the prompt. I initially used cryptic paths like ./m1.md.

Fix: Use descriptive paths like ./AGENTS.md or ./.deepagents/database-rules.md so the agent knows where information comes from.

When to Use Memory vs Other Approaches

Memory is not always the right solution. Here’s my decision matrix:

Use Memory (AGENTS.md) when:
- Agent needs to remember context across sessions
- Knowledge is project-specific but task-agnostic
- You want the agent to self-update its knowledge

Use System Prompt when:
- Context is temporary or per-session
- You need strict control over agent behavior
- Content shouldn't change during runtime

Use Skills when:
- Knowledge is task-specific
- You need progressive disclosure (load on-demand)
- Content follows a procedural workflow

Use External Tools when:
- Knowledge changes frequently
- You need real-time data
- Information is too large for context window

CLI Memory Support

The Deep Agents CLI has built-in memory support. Memory is persisted to ~/.deepagents/{agent_name}/:

# Start agent with memory
deepagents run --agent my-sql-agent

# Memory is stored at:
# ~/.deepagents/my-sql-agent/AGENTS.md

This means memory persists across CLI sessions without any configuration.
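If you want to inspect or back up that file from a script, the location can be derived from the agent name. This sketch simply mirrors the path pattern shown above. `cli_memory_path` is a hypothetical helper and not part of the CLI.

```python
from pathlib import Path
from typing import Optional


def cli_memory_path(agent_name: str, home: Optional[Path] = None) -> Path:
    """Build <home>/.deepagents/<agent_name>/AGENTS.md for a given agent."""
    base = home if home is not None else Path.home()
    return base / ".deepagents" / agent_name / "AGENTS.md"


memory_file = cli_memory_path("my-sql-agent")
```

The optional `home` argument exists only to make the function easy to test against a scratch directory.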

GitHub Actions Integration

For CI/CD workflows, Deep Agents supports memory with different scopes:

- uses: langchain-ai/deepagents@v1
  with:
    enable_memory: true
    memory_scope: pr  # Options: pr, branch, repo

Memory scopes determine how memory is cached:

- pr: Memory persists only for this pull request
- branch: Memory shared across the branch
- repo: Memory shared across the entire repository

Summary

In this post, I explored how Deep Agents implements persistent memory through AGENTS.md files loaded via MemoryMiddleware. The key point is that memory files are loaded at agent startup and injected into the system prompt, giving the agent access to project-specific knowledge, conventions, and accumulated learnings across sessions.

The system works by loading memory content before the agent executes, storing it in agent state, and injecting it into the system message for each LLM call. Multiple sources can be layered to combine global preferences with project-specific context.

Memory solves the problem of agents starting fresh every session. Instead of re-explaining project context and preferences, the agent always has access to persistent knowledge through the injected memory content.

Final Words + More Resources

My intention with this article was to share my knowledge and experience to help others. If you want to get in touch, you can contact me by email.
