How Deep Agents Uses Memory and AGENTS.md for Persistent Context
The Problem: Agents Start Fresh Every Session
I built an AI agent to help with my development work. It worked great for the first session. But when I started a new conversation the next day, it forgot everything:
- The coding conventions we established
- The project architecture decisions
- The preferences I had expressed
- The lessons learned from previous debugging sessions
Every conversation was like starting from zero. I had to re-explain the project context, re-state my preferences, and re-teach the agent how I wanted things done.
Then I discovered Deep Agents and its memory system built around AGENTS.md files.
What I Tried First: Hardcoding Context
My first attempt was to bake everything into the system prompt:
from langchain.agents import create_agent

SYSTEM_PROMPT = """You are a development assistant.

Project: E-commerce platform with Django backend
Database: PostgreSQL with the following schema...
Conventions:
- Use snake_case for Python
- Prefer composition over inheritance
- Always write tests first
- ...
"""

agent = create_agent(
    model="claude-sonnet-4-6",
    system_prompt=SYSTEM_PROMPT,
)

This worked, but I ran into problems:
- Maintenance nightmare: Every time I wanted to update the agent’s knowledge, I had to modify Python code and redeploy
- No separation of concerns: Project-specific knowledge was mixed with agent behavior
- No dynamic updates: I couldn’t add new knowledge without restarting the agent
I needed something that could persist across sessions and be easily updated.
How Deep Agents Memory Works
Deep Agents solves this with MemoryMiddleware and AGENTS.md files. Here’s the architecture:
Agent Startup
      |
      v
MemoryMiddleware.before_agent()
      |
      +--> Load AGENTS.md files from sources
      |
      v
Store content in agent state
      |
      v
modify_request() injects into system prompt
      |
      v
Agent runs with persistent context

The key insight is that memory is loaded once at agent startup, then injected into every system prompt. The agent doesn’t need to “remember” across sessions because the knowledge is always present.
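To make the load-once, inject-every-call flow concrete, here is a minimal standalone sketch. It does not use Deep Agents at all: `read_sources`, `ensure_memory`, and `build_system_prompt` are names I made up for illustration, and the file read is simulated so the example runs anywhere.

```python
load_calls = {"count": 0}  # tracks how often the (simulated) files are read


def read_sources() -> dict:
    """Stand-in for reading AGENTS.md files from disk (hypothetical helper)."""
    load_calls["count"] += 1
    return {"./AGENTS.md": "Always limit results to 5 rows."}


def ensure_memory(state: dict) -> dict:
    """Load memory into state on first use; later calls reuse the stored copy."""
    if "memory_contents" not in state:
        state["memory_contents"] = read_sources()
    return state


def build_system_prompt(base_prompt: str, state: dict) -> str:
    """Append the stored memory to the base prompt for one model call."""
    memory = "\n\n".join(state.get("memory_contents", {}).values())
    return f"{base_prompt}\n\n{memory}" if memory else base_prompt


# Simulate three model calls within one agent run.
state = {}
prompts = [
    build_system_prompt("You are a SQL assistant.", ensure_memory(state))
    for _ in range(3)
]
```

The files are read once, but every one of the three prompts carries the same memory text, which is the behavior the diagram describes.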
Setting Up Memory
Here’s how I configured memory for a text-to-SQL agent:
from deepagents import create_deep_agent

agent = create_deep_agent(
    model="claude-sonnet-4-6",
    memory=["./AGENTS.md"],
)

That’s it. The memory parameter accepts a list of file paths. Deep Agents loads these files and injects them into the system prompt.
The AGENTS.md File Structure
The AGENTS.md file is standard Markdown. I structured mine with clear sections:
# Text-to-SQL Agent Instructions

You are a Deep Agent designed to interact with a SQL database.

## Your Role

Given a natural language question, you will:
1. Explore the available database tables
2. Examine relevant table schemas
3. Generate syntactically correct SQL queries
4. Execute queries and analyze results
5. Format answers in a clear, readable way

## Database Information

- Database type: SQLite (Chinook database)
- Contains data about a digital media store: artists, albums, tracks, customers

## Query Guidelines

- Always limit results to 5 rows unless the user specifies otherwise
- Order results by relevant columns to show the most interesting data
- Only query relevant columns, not SELECT *

## Safety Rules

**NEVER execute these statements:**
- INSERT
- UPDATE
- DELETE
- DROP
- ALTER

**You have READ-ONLY access. Only SELECT queries are allowed.**

The agent sees this content as part of its system prompt, so it always knows the database context and safety constraints.
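Safety rules in a prompt are instructions, not enforcement, so it can help to back them up in the tool that actually runs the query. The following is a rough sketch of such a check, not something Deep Agents provides: `is_read_only` and `FORBIDDEN` are hypothetical names, and the check is deliberately conservative (it may reject a harmless query whose string literal contains a semicolon or a forbidden word).

```python
import re

# Statement keywords listed under "Safety Rules" in the AGENTS.md above.
FORBIDDEN = {"INSERT", "UPDATE", "DELETE", "DROP", "ALTER"}


def is_read_only(sql: str) -> bool:
    """Return True only for a single SELECT statement with no forbidden keyword."""
    statements = [part.strip() for part in sql.split(";") if part.strip()]
    if len(statements) != 1:
        return False  # empty input or several statements chained together
    tokens = re.findall(r"[A-Za-z_]+", statements[0].upper())
    if not tokens or tokens[0] != "SELECT":
        return False
    return not (FORBIDDEN & set(tokens))


allowed = is_read_only("SELECT Name FROM Artist LIMIT 5")
blocked_drop = is_read_only("DROP TABLE Artist")
blocked_chain = is_read_only("SELECT 1; DELETE FROM Artist")
```

A check like this would sit in front of the query execution tool, so a query the model should never have written is refused regardless of what the prompt says.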
Multiple Memory Sources
I discovered that Deep Agents supports multiple memory sources. This is useful for layering different types of context:
from deepagents import create_deep_agent
from deepagents.backends.filesystem import FilesystemBackend

backend = FilesystemBackend(root_dir="/")

agent = create_deep_agent(
    backend=backend,
    memory=[
        "~/.deepagents/AGENTS.md",   # Global user preferences
        "./.deepagents/AGENTS.md",   # Project-specific context
    ],
)

Sources are loaded in order and concatenated. Later sources appear after earlier ones, so you can override specific instructions at the project level while keeping global defaults.
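The ordering behavior can be illustrated without the library. This sketch reads files in the order given, skips any that are missing, and joins the rest. `combine_sources` is a name I chose for the example, and the temporary files stand in for the global and project-level AGENTS.md files.

```python
from pathlib import Path
import tempfile


def combine_sources(sources: list) -> str:
    """Concatenate memory files in the order given, skipping missing ones."""
    parts = []
    for source in sources:
        path = Path(source).expanduser()  # resolves a leading "~"
        if not path.is_file():
            continue
        text = path.read_text(encoding="utf-8").strip()
        parts.append(f"{source}\n{text}")
    return "\n\n".join(parts)


with tempfile.TemporaryDirectory() as tmp:
    global_file = Path(tmp) / "global.md"
    project_file = Path(tmp) / "project.md"
    global_file.write_text("Prefer Python examples.", encoding="utf-8")
    project_file.write_text("This project uses JavaScript.", encoding="utf-8")

    combined = combine_sources(
        [str(global_file), str(Path(tmp) / "missing.md"), str(project_file)]
    )
```

Because the project file comes last, its instruction is the most recent one the model reads. Note that nothing is actually replaced: both instructions are present, and the model has to resolve any conflict between them.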
What Gets Injected Into the Prompt
I was curious about what the agent actually sees. Looking at the source code, the memory content gets wrapped in XML tags:
<agent_memory>
./AGENTS.md
# Text-to-SQL Agent Instructions
... content ...
</agent_memory>

<memory_guidelines>
The above <agent_memory> was loaded from files in your filesystem. As you learn from your interactions with the user, you can save new knowledge by calling the `edit_file` tool.

**Learning from feedback:**
- When you need to remember something, updating memory must be your FIRST, IMMEDIATE action
- When user says something is better/worse, capture WHY and encode it
- Each correction is a chance to improve permanently

**When to update memories:**
- When the user explicitly asks you to remember something
- When the user describes your role or how you should behave
- When the user gives feedback on your work
...
</memory_guidelines>

The memory_guidelines section teaches the agent when and how to update its own memory during conversations.
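As a rough illustration of the wrapping step, the sketch below turns a mapping of path to content into a single tagged block. The function name mirrors the `_format_agent_memory` helper mentioned in the source, but the body is my own guess at the layout based on the prompt excerpt above, and the placeholder text for the empty case is an assumption.

```python
def format_agent_memory(contents: dict) -> str:
    """Wrap each source path and its content in one <agent_memory> block."""
    if not contents:
        # Assumed placeholder; the real wording may differ.
        return "<agent_memory>\n(No memory loaded)\n</agent_memory>"
    sections = [f"{path}\n{text.strip()}" for path, text in contents.items()]
    body = "\n\n".join(sections)
    return f"<agent_memory>\n{body}\n</agent_memory>"


block = format_agent_memory(
    {"./AGENTS.md": "# Text-to-SQL Agent Instructions\nLimit results to 5 rows.\n"}
)
```

Keeping the path on its own line before the content is what lets the agent tell which file a given instruction came from, and therefore which file to edit when it wants to update that instruction.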
Memory Updates During Conversations
One feature I found particularly useful is that agents can update their own memory. When I correct the agent, it can save that learning:
User: Can you write me an example for creating a deep agent in LangChain?
Agent: Sure, I'll write you an example...
[provides Python code]

User: Can you do this in JavaScript instead?
Agent: Let me save this to my memory.
[calls edit_file to update AGENTS.md with JavaScript preference]
Agent: Sure, here is the JavaScript example...

The agent recognized that my preference for JavaScript examples was worth remembering and updated the memory file automatically.
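The effect of such an update on the file can be sketched in a few lines. This is not the `edit_file` tool itself: `remember` and the "Learned Preferences" heading are hypothetical, and the sketch simply appends a bullet to the end of the file while avoiding exact duplicates.

```python
from pathlib import Path
import tempfile

SECTION = "## Learned Preferences"  # hypothetical heading, not a Deep Agents convention


def remember(memory_file: Path, note: str) -> None:
    """Append a note as a bullet at the end of the file.

    Adds the SECTION heading first if the file does not contain it yet,
    and does nothing if the exact bullet is already present.
    """
    text = memory_file.read_text(encoding="utf-8") if memory_file.exists() else ""
    lines = text.splitlines()
    bullet = f"- {note}"
    if bullet in lines:
        return
    if SECTION not in lines:
        if lines:
            lines.append("")  # blank line before the new heading
        lines.append(SECTION)
    lines.append(bullet)
    memory_file.write_text("\n".join(lines) + "\n", encoding="utf-8")


with tempfile.TemporaryDirectory() as tmp:
    memory_file = Path(tmp) / "AGENTS.md"
    memory_file.write_text("# Agent Instructions\n", encoding="utf-8")
    remember(memory_file, "Prefer JavaScript examples")
    remember(memory_file, "Prefer JavaScript examples")  # second call is a no-op
    result = memory_file.read_text(encoding="utf-8")
```

Because the change lands in the file rather than in the conversation, the next session picks it up when memory is loaded again.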
How the Middleware Works Internally
I dug into the source code to understand the implementation. The MemoryMiddleware class has three main responsibilities:
class MemoryMiddleware(AgentMiddleware):
    """Middleware for loading agent memory from AGENTS.md files."""

    def __init__(self, *, backend, sources: list[str]):
        self._backend = backend
        self.sources = sources

    def before_agent(self, state, runtime, config):
        """Load memory content before agent execution."""
        # Skip if already loaded
        if "memory_contents" in state:
            return None

        backend = self._get_backend(state, runtime, config)
        contents = {}

        # Load all memory files
        results = backend.download_files(list(self.sources))
        for path, response in zip(self.sources, results):
            if response.error == "file_not_found":
                continue
            if response.content:
                contents[path] = response.content.decode("utf-8")

        return {"memory_contents": contents}

    def modify_request(self, request):
        """Inject memory content into the system message."""
        contents = request.state.get("memory_contents", {})
        agent_memory = self._format_agent_memory(contents)
        new_system_message = append_to_system_message(
            request.system_message, agent_memory
        )
        return request.override(system_message=new_system_message)

The key methods are:

- before_agent: Loads memory files before the agent starts
- modify_request: Injects memory into the system prompt for each LLM call
Memory vs Skills: Understanding the Difference
Deep Agents distinguishes between memory and skills:
| Aspect | Memory (AGENTS.md) | Skills |
|---|---|---|
| Loading | Always loaded | Loaded on-demand |
| Purpose | Persistent context | Specialized workflows |
| Content | Identity, rules, guidelines | Step-by-step procedures |
| Updates | Can be updated during conversations | Static workflows |
Memory (AGENTS.md):
- Who I am (agent identity)
- What I should always remember (rules, constraints)
- How I should behave (communication style)

Skills (SKILL.md):
- How to accomplish specific tasks
- Step-by-step workflows
- Domain-specific procedures

Use memory for knowledge the agent always needs. Use skills for workflows the agent might need occasionally.
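The difference between "always loaded" and "loaded on-demand" can be shown with a toy prompt builder. This is only a sketch of the idea, not how Deep Agents implements skills: the `Skill` dataclass, `build_prompt`, and `load_skill` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Skill:
    name: str
    description: str  # one-line summary, cheap to keep in every prompt
    body: str         # full step-by-step instructions, loaded only when needed


def build_prompt(memory: str, skills: list) -> str:
    """Include memory in full, but only a one-line summary per skill."""
    index = "\n".join(f"- {skill.name}: {skill.description}" for skill in skills)
    return f"{memory}\n\nAvailable skills:\n{index}"


def load_skill(skills: list, name: str) -> str:
    """Return the full body of one skill when a task calls for it."""
    for skill in skills:
        if skill.name == name:
            return skill.body
    raise KeyError(name)


skills = [
    Skill(
        name="schema-migration",
        description="Plan and review a database schema change",
        body="1. Inspect the current schema\n2. Draft the migration\n3. Review",
    )
]
prompt = build_prompt("You have READ-ONLY access.", skills)
details = load_skill(skills, "schema-migration")
```

The prompt grows by one line per skill rather than by the full procedure, which is why long workflows fit better as skills than as memory.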
Common Mistakes I Made
Mistake 1: Putting Everything in Memory
My first AGENTS.md file was 2000 lines long. I included every possible detail about the project. This bloated the context window and made responses slower.
Fix: Keep memory focused on essential context. Move detailed procedures to skills.
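A quick size check makes it easier to notice when a memory file is getting out of hand. The sketch below is my own helper, not part of Deep Agents; the 200-line budget is an arbitrary threshold, and the four-characters-per-token figure is only a rough rule of thumb.

```python
def memory_report(text: str, max_lines: int = 200) -> dict:
    """Summarize the size of a memory file against an arbitrary line budget."""
    line_count = len(text.splitlines())
    return {
        "lines": line_count,
        "approx_tokens": len(text) // 4,  # crude estimate, varies by tokenizer
        "over_budget": line_count > max_lines,
    }


bloated = "\n".join(["- Some very specific project detail"] * 2000)
focused = "\n".join(["- Only SELECT queries are allowed"] * 20)

bloated_report = memory_report(bloated)
focused_report = memory_report(focused)
```

Since memory is injected into every model call, whatever this reports is paid on each request, not once per session.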
Mistake 2: Not Updating Memory
I expected the agent to “learn” from conversations automatically. While it can update memory, I needed to explicitly ask it to remember important things.
Fix: Use phrases like “remember this” or “for future reference” to trigger memory updates.
Mistake 3: Ignoring the Display Name
The memory content shows the file path in the prompt. I initially used cryptic paths like ./m1.md.
Fix: Use descriptive paths like ./AGENTS.md or ./.deepagents/database-rules.md so the agent knows where information comes from.
When to Use Memory vs Other Approaches
Memory is not always the right solution. Here’s my decision matrix:
Use Memory (AGENTS.md) when:
- Agent needs to remember context across sessions
- Knowledge is project-specific but task-agnostic
- You want the agent to self-update its knowledge

Use System Prompt when:
- Context is temporary or per-session
- You need strict control over agent behavior
- Content shouldn't change during runtime

Use Skills when:
- Knowledge is task-specific
- You need progressive disclosure (load on-demand)
- Content follows a procedural workflow

Use External Tools when:
- Knowledge changes frequently
- You need real-time data
- Information is too large for context window

CLI Memory Support
The Deep Agents CLI has built-in memory support. Memory is persisted to ~/.deepagents/{agent_name}/:
# Start agent with memory
deepagents run --agent my-sql-agent

# Memory is stored at:
# ~/.deepagents/my-sql-agent/AGENTS.md

This means memory persists across CLI sessions without any configuration.
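If you want to inspect or back up that file from a script, the path can be built from the agent name. This small helper is hypothetical and only follows the `~/.deepagents/{agent_name}/AGENTS.md` layout described above; the optional `home` parameter exists so the example does not depend on the real home directory.

```python
from pathlib import Path
from typing import Optional


def cli_memory_path(agent_name: str, home: Optional[Path] = None) -> Path:
    """Build the memory file path for a named CLI agent."""
    base = home if home is not None else Path.home()
    return base / ".deepagents" / agent_name / "AGENTS.md"


example_path = cli_memory_path("my-sql-agent", home=Path("/home/me"))
```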
GitHub Actions Integration
For CI/CD workflows, Deep Agents supports memory with different scopes:
- uses: langchain-ai/deepagents@v1
  with:
    enable_memory: true
    memory_scope: pr  # Options: pr, branch, repo

Memory scopes determine how memory is cached:

- pr: Memory persists only for this pull request
- branch: Memory shared across the branch
- repo: Memory shared across the entire repository
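One way to picture the three scopes is as different cache keys: the narrower the scope, the more identifying parts go into the key. The sketch below is purely illustrative. I have not checked how the action derives its keys, and `memory_cache_key` and the key format are my own invention.

```python
def memory_cache_key(scope: str, repo: str, branch: str, pr_number: int) -> str:
    """Derive a cache key whose granularity matches the memory scope."""
    if scope == "repo":
        return f"memory-{repo}"
    if scope == "branch":
        return f"memory-{repo}-{branch}"
    if scope == "pr":
        return f"memory-{repo}-pr-{pr_number}"
    raise ValueError(f"unknown memory scope: {scope}")


keys = {
    scope: memory_cache_key(scope, "shop-backend", "feature-x", 42)
    for scope in ("pr", "branch", "repo")
}
```

Under this model, two pull requests from the same branch would share memory with the branch scope but not with the pr scope.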
Summary
In this post, I explored how Deep Agents implements persistent memory through AGENTS.md files loaded via MemoryMiddleware. The key point is that memory files are loaded at agent startup and injected into the system prompt, giving the agent access to project-specific knowledge, conventions, and accumulated learnings across sessions.
The system works by loading memory content before the agent executes, storing it in agent state, and injecting it into the system message for each LLM call. Multiple sources can be layered to combine global preferences with project-specific context.
Memory solves the problem of agents starting fresh every session. Instead of re-explaining project context and preferences, the agent always has access to persistent knowledge through the injected memory content.
Final Words + More Resources
My intention with this article was to share my knowledge and experience to help others. If you want to get in touch, you can contact me by email.
Here are also the most important links from this article along with some further resources that will help you in this scope:
Oh, and if you found these resources useful, don’t forget to support me by starring the repo on GitHub!