How Do I Give AI Agents Persistent Memory Across Sessions?

Mar 17, 2026

I was halfway through refactoring the authentication middleware when my AI assistant suddenly lost all context. “What were we doing again?” it asked, as if the previous two hours of discussion about OAuth flows and token refresh strategies had never happened. That’s when I realized: my AI agent had amnesia.

Every new chat session meant re-explaining my preferences, project architecture, and the decisions I’d already made. The agent knew I preferred TypeScript—but it forgot I was in the middle of migrating our REST API to GraphQL. This isn’t just annoying; it’s a productivity killer.

The Problem: AI Agents with Goldfish Memory

Traditional AI agents start each session with zero context. This creates several problems:

Repeated explanations: You must re-establish preferences and project details every time
Lost decisions: Critical choices made in previous sessions vanish
No continuity: Long-running projects become disjointed
Inefficient workflows: Time spent on context setup instead of actual work

The Reddit thread “10 MCP servers that together give your AI agent an actual brain” caught my attention because it highlighted OpenMemory as a “game changer for long-running projects.” The promise was compelling: “persistent memory across sessions, agent remembers your preferences, past conversations, project context.”

But when I tried it, I discovered the memory problem has two distinct layers:

General Memory: User preferences, coding style, common patterns
Session-Level Context: Current git branch, recent decisions, ongoing tasks, next steps

User kyletraz articulated this gap perfectly: “OpenMemory is great for general preferences and conversation history, but one gap I kept hitting was session-level dev context, like what branch I was on, what decisions I’d made, and what the next step was supposed to be.”

The Solution: MCP Memory Servers

The Model Context Protocol (MCP) provides a standardized way for AI agents to access external tools and resources. Memory servers built on MCP can persist data across sessions.

Option A: OpenMemory MCP Server

OpenMemory stores general preferences and conversation history. Here’s how I set it up:

First, I added it to my Claude Desktop configuration:

{
  "mcpServers": {
    "openmemory": {
      "command": "uvx",
      "args": ["openmemory"]
    }
  }
}

After restarting Claude Desktop, the agent can now store and retrieve memories:

User: Remember that I prefer functional components over class components in React.
Agent: I've stored this preference. You prefer functional components over class components in React.

User: What's my React component preference?
Agent: Based on my memory, you prefer functional components over class components in React.

This works well for cross-project preferences. The agent remembers I use TypeScript, prefer arrow functions, and avoid any types. But as kyletraz noted, it doesn’t track what I was doing yesterday.

Option B: Mem0

Mem0 offers a more sophisticated memory solution with user-specific and session-specific scopes:

from mem0 import Memory

m = Memory()

# Store user preference
m.add("User prefers dark mode in all IDEs", user_id="developer_1")

# Store session context
m.add(
    "Currently refactoring auth middleware, switched from JWT to session-based auth",
    user_id="developer_1",
    metadata={"session_id": "session_123", "project": "webapp"}
)

# Retrieve relevant memories
memories = m.search("authentication", user_id="developer_1")

Mem0 provides both cloud and self-hosted options. The metadata field is key for session-level context—it lets you attach project and session identifiers to memories.

Option C: Custom MCP Memory Server

When OpenMemory and Mem0 didn’t fully address my session-level context needs, I built a custom solution. Here’s a simplified version:

from mcp.server import Server
from mcp.server.stdio import stdio_server
import json
import sqlite3
from datetime import datetime

app = Server("dev-memory")

# SQLite for persistence
conn = sqlite3.connect("dev_memory.db")
conn.execute("""
    CREATE TABLE IF NOT EXISTS memories (
        id INTEGER PRIMARY KEY,
        project TEXT,
        session_id TEXT,
        memory_type TEXT,
        content TEXT,
        created_at TIMESTAMP
    )
""")

@app.tool()
async def store_memory(project: str, session_id: str, memory_type: str, content: str):
    """Store a development context memory."""
    conn.execute(
        "INSERT INTO memories (project, session_id, memory_type, content, created_at) VALUES (?, ?, ?, ?, ?)",
        (project, session_id, memory_type, content, datetime.now())
    )
    conn.commit()
    return f"Stored {memory_type} memory for project {project}"

@app.tool()
async def get_project_context(project: str):
    """Retrieve all context for a project."""
    cursor = conn.execute(
        "SELECT memory_type, content, created_at FROM memories WHERE project = ? ORDER BY created_at DESC LIMIT 20",
        (project,)
    )
    results = cursor.fetchall()
    return json.dumps([{"type": r[0], "content": r[1], "when": r[2]} for r in results])

@app.tool()
async def get_current_task(project: str, session_id: str):
    """Get the most recent task context."""
    cursor = conn.execute(
        "SELECT content FROM memories WHERE project = ? AND memory_type = 'current_task' ORDER BY created_at DESC LIMIT 1",
        (project,)
    )
    result = cursor.fetchone()
    return result[0] if result else "No current task recorded"

async def main():
    async with stdio_server() as (read_stream, write_stream):
        await app.run(read_stream, write_stream, app.create_initialization_options())

if __name__ == "__main__":
    import asyncio
    asyncio.run(main())

This custom server tracks three types of memories:

preferences: Long-term choices (editor, language, framework)
decisions: Architectural or implementation choices
current_task: What I was doing when I closed the session

The key insight is separating memory types. General preferences live forever; session-level context has a shorter lifespan but higher immediate relevance.

Why This Matters

Implementing persistent memory transforms how you work with AI agents:

Productivity: No more repeating context setup. My agent knows I’m working on the feature/auth-refactor branch and that we decided to use refresh tokens instead of sliding sessions.

Consistency: Agents maintain coherent behavior across sessions. My coding style preferences carry over without re-explanation.

Project Continuity: Resume work exactly where you left off. The agent remembers the half-implemented rate limiter and why we chose the token bucket algorithm.

Learning: Agents improve over time. They learn from your feedback and accumulated decisions.

When Does the Agent Access Memory?

User looktwise asked a critical question: “When does the agent access openmem / when does it decide to act on the task differently because of these memories?”

Based on my experiments, here’s what I’ve learned:

At session start: The agent should proactively retrieve project context, not wait for you to remind it.
When making decisions: If you’re choosing between libraries, the agent should recall your past preferences.
Before completing tasks: The agent should check if the proposed solution aligns with stored decisions.

I configured my custom server to auto-load context:

# In the agent's system prompt
TOOL_RULES = """
At the start of each session, call get_project_context with the current project name.
Before suggesting architectural changes, retrieve recent decisions.
When storing new decisions, include the reasoning for future reference.
"""

This makes memory retrieval automatic rather than manual.

Common Mistakes to Avoid

Over-relying on general memory: Storing everything as a “preference” clutters retrieval. I initially stored “current task” as a preference, which polluted the memory space with transient data.

No retrieval strategy: If the agent doesn’t know when to query memories, they’re useless. Define explicit rules for when memory should be accessed.

Storing irrelevant data: Every memory should have a retrieval use case. I made the mistake of storing entire conversation transcripts—which made semantic search noisy.

Ignoring privacy: Memory servers store potentially sensitive data. My initial implementation logged API keys by accident. Always sanitize inputs and consider encryption for sensitive projects.

Putting It All Together

Here’s my current workflow:

Session Start:
1. Agent loads project context automatically
2. Agent retrieves current task and recent decisions
3. Agent summarizes state: "Continuing auth refactor. You were switching from JWT to sessions. Next step was updating the refresh token logic."

During Work:
4. Agent stores new decisions with reasoning
5. Agent references past preferences when suggesting solutions

Session End:
6. Agent stores current task state for next session

The result? I can close my laptop at 6 PM and return at 9 AM without spending 20 minutes re-establishing context. The agent knows what I was doing, why I was doing it, and what comes next.

Final Words + More Resources

My intention with this article was to help others share my knowledge and experience. If you want to contact me, you can contact by email: Email me

Here are also the most important links from this article along with some further resources that will help you in this scope:

Oh, and if you found these resources useful, don’t forget to support me by starring the repo on GitHub!