How Do I Give AI Agents Persistent Memory Across Sessions?
I was halfway through refactoring the authentication middleware when my AI assistant suddenly lost all context. “What were we doing again?” it asked, as if the previous two hours of discussion about OAuth flows and token refresh strategies had never happened. That’s when I realized: my AI agent had amnesia.
Every new chat session meant re-explaining my preferences, project architecture, and the decisions I’d already made. The agent knew I preferred TypeScript—but it forgot I was in the middle of migrating our REST API to GraphQL. This isn’t just annoying; it’s a productivity killer.
The Problem: AI Agents with Goldfish Memory
Traditional AI agents start each session with zero context. This creates several problems:
- Repeated explanations: You must re-establish preferences and project details every time
- Lost decisions: Critical choices made in previous sessions vanish
- No continuity: Long-running projects become disjointed
- Inefficient workflows: Time spent on context setup instead of actual work
The Reddit thread “10 MCP servers that together give your AI agent an actual brain” caught my attention because it highlighted OpenMemory as a “game changer for long-running projects.” The promise was compelling: “persistent memory across sessions, agent remembers your preferences, past conversations, project context.”
But when I tried it, I discovered the memory problem has two distinct layers:
- General Memory: User preferences, coding style, common patterns
- Session-Level Context: Current git branch, recent decisions, ongoing tasks, next steps
User kyletraz articulated this gap perfectly: “OpenMemory is great for general preferences and conversation history, but one gap I kept hitting was session-level dev context, like what branch I was on, what decisions I’d made, and what the next step was supposed to be.”
The Solution: MCP Memory Servers
The Model Context Protocol (MCP) provides a standardized way for AI agents to access external tools and resources. Memory servers built on MCP can persist data across sessions.
Option A: OpenMemory MCP Server
OpenMemory stores general preferences and conversation history. Here’s how I set it up:
First, I added it to my Claude Desktop configuration:
```json
{
  "mcpServers": {
    "openmemory": {
      "command": "uvx",
      "args": ["openmemory"]
    }
  }
}
```

After restarting Claude Desktop, the agent can now store and retrieve memories:

```
User: Remember that I prefer functional components over class components in React.
Agent: I've stored this preference. You prefer functional components over class components in React.

User: What's my React component preference?
Agent: Based on my memory, you prefer functional components over class components in React.
```

This works well for cross-project preferences. The agent remembers I use TypeScript, prefer arrow functions, and avoid any types. But as kyletraz noted, it doesn’t track what I was doing yesterday.
Option B: Mem0
Mem0 offers a more sophisticated memory solution with user-specific and session-specific scopes:
```python
from mem0 import Memory

m = Memory()

# Store user preference
m.add("User prefers dark mode in all IDEs", user_id="developer_1")

# Store session context
m.add(
    "Currently refactoring auth middleware, switched from JWT to session-based auth",
    user_id="developer_1",
    metadata={"session_id": "session_123", "project": "webapp"},
)

# Retrieve relevant memories
memories = m.search("authentication", user_id="developer_1")
```

Mem0 provides both cloud and self-hosted options. The metadata field is key for session-level context: it lets you attach project and session identifiers to memories.
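To make the role of the metadata concrete, here is a minimal sketch of narrowing search hits down to one session. It does not call Mem0 at all: `filter_by_metadata` is a hypothetical helper, and the shape of each hit (a dict with `memory` and `metadata` keys) is an assumption that may differ between Mem0 versions, so check what `search` actually returns in yours.

```python
from typing import Any


def filter_by_metadata(hits: list[dict[str, Any]], **wanted: str) -> list[dict[str, Any]]:
    """Keep only hits whose metadata contains every wanted key/value pair."""
    kept = []
    for hit in hits:
        # Memories stored without metadata are treated as having none.
        metadata = hit.get("metadata") or {}
        if all(metadata.get(key) == value for key, value in wanted.items()):
            kept.append(hit)
    return kept


# Hypothetical hits, shaped like the two memories stored above (assumed format).
hits = [
    {"memory": "User prefers dark mode in all IDEs", "metadata": None},
    {
        "memory": "Currently refactoring auth middleware",
        "metadata": {"session_id": "session_123", "project": "webapp"},
    },
]

session_hits = filter_by_metadata(hits, session_id="session_123", project="webapp")
for hit in session_hits:
    print(hit["memory"])
```

Filtering after the search keeps the general preference out of the way when all you want is the state of one session.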
Option C: Custom MCP Memory Server
When OpenMemory and Mem0 didn’t fully address my session-level context needs, I built a custom solution. Here’s a simplified version:
```python
import json
import sqlite3
from datetime import datetime

from mcp.server.fastmcp import FastMCP

# FastMCP provides the tool() decorator used below.
app = FastMCP("dev-memory")

# SQLite for persistence
conn = sqlite3.connect("dev_memory.db")
conn.execute("""
    CREATE TABLE IF NOT EXISTS memories (
        id INTEGER PRIMARY KEY,
        project TEXT,
        session_id TEXT,
        memory_type TEXT,
        content TEXT,
        created_at TIMESTAMP
    )
""")


@app.tool()
async def store_memory(project: str, session_id: str, memory_type: str, content: str) -> str:
    """Store a development context memory."""
    conn.execute(
        "INSERT INTO memories (project, session_id, memory_type, content, created_at) "
        "VALUES (?, ?, ?, ?, ?)",
        # Store an ISO 8601 string so ordering by created_at stays chronological.
        (project, session_id, memory_type, content, datetime.now().isoformat()),
    )
    conn.commit()
    return f"Stored {memory_type} memory for project {project}"


@app.tool()
async def get_project_context(project: str) -> str:
    """Retrieve the 20 most recent memories for a project as JSON."""
    cursor = conn.execute(
        "SELECT memory_type, content, created_at FROM memories "
        "WHERE project = ? ORDER BY created_at DESC LIMIT 20",
        (project,),
    )
    results = cursor.fetchall()
    return json.dumps([{"type": r[0], "content": r[1], "when": r[2]} for r in results])


@app.tool()
async def get_current_task(project: str) -> str:
    """Get the most recent task context for a project."""
    # Not filtered by session_id: the task is usually read from a later
    # session than the one that stored it.
    cursor = conn.execute(
        "SELECT content FROM memories "
        "WHERE project = ? AND memory_type = 'current_task' "
        "ORDER BY created_at DESC LIMIT 1",
        (project,),
    )
    result = cursor.fetchone()
    return result[0] if result else "No current task recorded"


if __name__ == "__main__":
    # Serves over stdio by default.
    app.run()
```

This custom server tracks three types of memories:
- preferences: Long-term choices (editor, language, framework)
- decisions: Architectural or implementation choices
- current_task: What I was doing when I closed the session
The key insight is separating memory types. General preferences live forever; session-level context has a shorter lifespan but higher immediate relevance.
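One way to act on that separation is to give each memory type its own retention window. The sketch below is an illustration, not part of the server above: the `RETENTION` table, the 90-day and 7-day windows, and the `prune` helper are all assumptions chosen for the example.

```python
from datetime import datetime, timedelta

# Hypothetical retention windows; None means the memory never expires.
RETENTION = {
    "preferences": None,
    "decisions": timedelta(days=90),
    "current_task": timedelta(days=7),
}


def is_expired(memory_type: str, created_at: datetime, now: datetime) -> bool:
    """Return True when a memory is older than its type's retention window."""
    window = RETENTION.get(memory_type)  # unknown types are kept forever
    if window is None:
        return False
    return now - created_at > window


def prune(memories: list[dict], now: datetime) -> list[dict]:
    """Drop memories that have outlived their retention window."""
    return [m for m in memories if not is_expired(m["memory_type"], m["created_at"], now)]


now = datetime(2024, 6, 1)
memories = [
    {"memory_type": "preferences", "content": "Prefers TypeScript", "created_at": datetime(2020, 1, 1)},
    {"memory_type": "decisions", "content": "Token bucket rate limiter", "created_at": datetime(2024, 5, 1)},
    {"memory_type": "current_task", "content": "Update refresh token logic", "created_at": datetime(2024, 5, 1)},
]
kept = prune(memories, now)
print([m["memory_type"] for m in kept])
```

With these made-up windows, a month-old task note is dropped while the four-year-old preference and the month-old decision survive.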
Why This Matters
Implementing persistent memory transforms how you work with AI agents:
Productivity: No more repeating context setup. My agent knows I’m working on the feature/auth-refactor branch and that we decided to use refresh tokens instead of sliding sessions.
Consistency: Agents maintain coherent behavior across sessions. My coding style preferences carry over without re-explanation.
Project Continuity: Resume work exactly where you left off. The agent remembers the half-implemented rate limiter and why we chose the token bucket algorithm.
Learning: Agents improve over time. They learn from your feedback and accumulated decisions.
When Does the Agent Access Memory?
User looktwise asked a critical question: “When does the agent access openmem / when does it decide to act on the task differently because of these memories?”
Based on my experiments, here’s what I’ve learned:
- At session start: The agent should proactively retrieve project context, not wait for you to remind it.
- When making decisions: If you’re choosing between libraries, the agent should recall your past preferences.
- Before completing tasks: The agent should check if the proposed solution aligns with stored decisions.
I configured the agent to auto-load context from my custom server:
```python
# In the agent's system prompt
TOOL_RULES = """At the start of each session, call get_project_context with the current project name.
Before suggesting architectural changes, retrieve recent decisions.
When storing new decisions, include the reasoning for future reference.
"""
```

This makes memory retrieval automatic rather than manual.
Common Mistakes to Avoid
Over-relying on general memory: Storing everything as a “preference” clutters retrieval. I initially stored “current task” as a preference, which polluted the memory space with transient data.
No retrieval strategy: If the agent doesn’t know when to query memories, they’re useless. Define explicit rules for when memory should be accessed.
Storing irrelevant data: Every memory should have a retrieval use case. I made the mistake of storing entire conversation transcripts—which made semantic search noisy.
Ignoring privacy: Memory servers store potentially sensitive data. My initial implementation logged API keys by accident. Always sanitize inputs and consider encryption for sensitive projects.
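As a starting point for sanitizing inputs, here is a minimal sketch that redacts obvious secrets before a memory is written. The `redact` helper and its two patterns are assumptions for illustration; they catch only simple `key=value` assignments and bearer tokens, so treat this as a first filter rather than a complete safeguard.

```python
import re

# Hypothetical patterns; extend them for the secret formats your projects use.
SECRET_PATTERNS = [
    # Assignments such as api_key=..., token: ..., password = ...
    re.compile(r"(?i)\b(api[_-]?key|token|secret|password)\b(\s*[:=]\s*)\S+"),
    # Bearer tokens in HTTP headers
    re.compile(r"(?i)\b(bearer)(\s+)[A-Za-z0-9._~+/-]+=*"),
]


def redact(text: str) -> str:
    """Replace the value part of anything that looks like a credential."""
    for pattern in SECRET_PATTERNS:
        # Keep the label and separator, replace only the secret itself.
        text = pattern.sub(lambda m: f"{m.group(1)}{m.group(2)}[REDACTED]", text)
    return text


print(redact("Set API_KEY=sk-12345 in .env"))
print(redact("Authorization: Bearer abc.def-123"))
```

Calling something like this on `content` before the INSERT keeps the label, so the memory still says an API key was configured, while the value itself never reaches the database.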
Putting It All Together
Here’s my current workflow:
```
Session Start:
1. Agent loads project context automatically
2. Agent retrieves current task and recent decisions
3. Agent summarizes state: "Continuing auth refactor. You were switching from JWT to sessions. Next step was updating the refresh token logic."

During Work:
4. Agent stores new decisions with reasoning
5. Agent references past preferences when suggesting solutions

Session End:
6. Agent stores current task state for next session
```

The result? I can close my laptop at 6 PM and return at 9 AM without spending 20 minutes re-establishing context. The agent knows what I was doing, why I was doing it, and what comes next.
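The session-start steps of this workflow can be sketched against the same `memories` table the custom server uses. This is an illustration under assumptions: `summarize_session_start` is a hypothetical helper, the sample rows are invented, and an in-memory database stands in for `dev_memory.db`.

```python
import sqlite3


def summarize_session_start(conn: sqlite3.Connection, project: str, max_decisions: int = 3) -> str:
    """Build a short state summary from the latest task and recent decisions."""
    task_row = conn.execute(
        "SELECT content FROM memories "
        "WHERE project = ? AND memory_type = 'current_task' "
        "ORDER BY created_at DESC LIMIT 1",
        (project,),
    ).fetchone()
    decision_rows = conn.execute(
        "SELECT content FROM memories "
        "WHERE project = ? AND memory_type = 'decisions' "
        "ORDER BY created_at DESC LIMIT ?",
        (project, max_decisions),
    ).fetchall()

    lines = [f"Project: {project}"]
    lines.append(f"Current task: {task_row[0] if task_row else 'none recorded'}")
    for (content,) in decision_rows:
        lines.append(f"Decision: {content}")
    return "\n".join(lines)


# Demo with an in-memory database and invented rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE memories (id INTEGER PRIMARY KEY, project TEXT, session_id TEXT, "
    "memory_type TEXT, content TEXT, created_at TEXT)"
)
conn.executemany(
    "INSERT INTO memories (project, session_id, memory_type, content, created_at) "
    "VALUES (?, ?, ?, ?, ?)",
    [
        ("webapp", "session_122", "decisions",
         "Use refresh tokens instead of sliding sessions", "2024-01-01T17:00:00"),
        ("webapp", "session_123", "current_task",
         "Update the refresh token logic", "2024-01-01T18:00:00"),
    ],
)
summary = summarize_session_start(conn, "webapp")
print(summary)
```

The agent can turn a summary like this into the opening message of a session, which is what steps 1 to 3 describe.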
Final Words + More Resources
My intention with this article was to share my knowledge and experience with others. If you want to get in touch, you can reach me by email.