How to Build Multi-Agent Systems with Persistent Memory for Business Automation
Purpose
This post shows how to build multi-agent systems with persistent memory for business automation.
Problem
I built a single AI agent to handle my business operations. It worked for simple tasks. When I asked it to handle complex workflows across multiple days, everything broke.
Day 1: "Set up meeting with John" - Agent creates calendar event
Day 2: "What did I schedule with John?" - Agent: "I don't know who John is"

# No memory across sessions
# No coordination between tasks
# Using expensive model for simple operations

The agent forgot context between sessions. It used Claude Sonnet for every task, even simple ones. My monthly AI cost hit $200 for basic operations.
Environment
- Python 3.12
- LangGraph 0.2 for orchestration
- Mem0 for persistent memory
- ChromaDB as vector database backend
- DeepSeek V3.2, MiniMax, Claude Haiku, Claude Sonnet for different agent roles
Solution
A multi-agent system with persistent memory needs three components:
- Orchestration Layer (LangGraph) - Routes tasks to correct agents
- Memory Layer (Mem0) - Stores facts, decisions, and preferences
- Storage Layer (ChromaDB) - Vector embeddings for memory search
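The hand-off between these three layers can be sketched without any of the real libraries. In this minimal sketch every name (`FactMemory`, `route`, `handle`) is hypothetical: keyword rules stand in for the LLM classifier, word overlap stands in for vector similarity, and a plain dict stands in for ChromaDB.

```python
from dataclasses import dataclass, field


@dataclass
class FactMemory:
    """Memory layer stand-in; the dict plays the role of the storage layer."""
    store: dict[str, list[str]] = field(default_factory=dict)

    def add(self, user_id: str, fact: str) -> None:
        self.store.setdefault(user_id, []).append(fact)

    def search(self, user_id: str, query: str) -> list[str]:
        # Word overlap stands in for the embedding similarity a vector store computes.
        words = set(query.lower().split())
        return [f for f in self.store.get(user_id, []) if words & set(f.lower().split())]


def route(task: str) -> str:
    """Orchestration stand-in: keyword rules instead of an LLM classifier."""
    text = task.lower()
    if "draft" in text or "email" in text:
        return "outreach_qa"
    if "negotiate" in text or "decide" in text:
        return "webmaster_mgr"
    return "ops_engine"


def handle(memory: FactMemory, user_id: str, task: str) -> dict:
    """Classify, fetch context, then record what happened for later sessions."""
    agent = route(task)
    context = memory.search(user_id, task)
    memory.add(user_id, f"Handled by {agent}: {task}")
    return {"agent": agent, "context": context}


memory = FactMemory()
memory.add("user_123", "John prefers email over Slack")
first = handle(memory, "user_123", "Draft an email to John")
print(first["agent"], first["context"])
```

The order inside `handle` is the point of the sketch: context is read before the agent runs, and the outcome is written back afterwards, so a later session can find it.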
Architecture Overview
I designed the system with a router agent that dispatches tasks to specialized agents:
          ┌─────────────────────────────────────────────┐
          │           ROUTER AGENT (DeepSeek)           │
          │  Classifies task, dispatches to specialist  │
          └──────────────────────┬──────────────────────┘
                                 │
         ┌───────────────────────┼───────────────────────┐
         │                       │                       │
         ▼                       ▼                       ▼
┌──────────────────┐    ┌──────────────────┐    ┌──────────────────┐
│ OPS ENGINE       │    │ OUTREACH QA      │    │ WEBMASTER MGR    │
│ (MiniMax)        │    │ (Claude Haiku)   │    │ (Claude Sonnet)  │
│                  │    │                  │    │                  │
│ Bulk operations  │    │ Draft emails     │    │ Complex decisions│
│ Batch processing │    │ Simple content   │    │ Negotiations     │
└──────────────────┘    └──────────────────┘    └──────────────────┘
         │                       │                       │
         └───────────────────────┼───────────────────────┘
                                 │
                                 ▼
          ┌─────────────────────────────────────────────┐
          │               Mem0 + ChromaDB               │
          │    Persistent Memory Storage & Retrieval    │
          │                                             │
          │  Facts: "User prefers email over Slack"     │
          │  Decisions: "Approved budget of $500"       │
          │  Preferences: "Meeting times: 9am-11am"     │
          └─────────────────────────────────────────────┘

Memory Layers
Mem0 provides four memory layers with different lifetimes:
┌─────────────────┬───────────────────┬───────────────────────────────┐
│ Layer           │ Lifetime          │ Best For                      │
├─────────────────┼───────────────────┼───────────────────────────────┤
│ Conversation    │ Single response   │ Tool execution details        │
│ Session         │ Minutes to hours  │ Multi-step workflow context   │
│ User            │ Weeks to forever  │ Personal preferences          │
│ Organizational  │ Global            │ Shared FAQs, policies         │
└─────────────────┴───────────────────┴───────────────────────────────┘

I implemented this hierarchy to match business needs:
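import asyncio
from datetime import datetime
from enum import Enum

from mem0 import Memory


class MemoryLayer(Enum):
    CONVERSATION = "conversation"  # Dies after response
    SESSION = "session"            # Lives during workflow
    USER = "user"                  # Persists per user
    ORGANIZATIONAL = "org"         # Shared across all


class MemoryManager:
    def __init__(self, chroma_client=None, path: str = "./chroma_db"):
        # Mem0 opens its own Chroma collection from this config, so
        # chroma_client is accepted for call-site compatibility but not used.
        # Mem0 also needs an LLM and an embedder; by default it reads OPENAI_API_KEY.
        self.memory = Memory.from_config({
            "vector_store": {
                "provider": "chroma",
                "config": {
                    "collection_name": "agent_memory",
                    "path": path,
                },
            }
        })

    async def store_fact(self, user_id: str, fact: str, layer: MemoryLayer):
        """Store a fact in the appropriate memory layer"""
        # Memory.add is synchronous, so run it in a worker thread
        await asyncio.to_thread(
            self.memory.add,
            [{"role": "system", "content": fact}],
            user_id=user_id,
            metadata={
                "layer": layer.value,
                "timestamp": datetime.now().isoformat(),
            },
        )

    async def get_relevant_context(self, user_id: str, query: str) -> list:
        """Retrieve relevant facts for current task"""
        results = await asyncio.to_thread(
            self.memory.search, query, user_id=user_id, limit=10
        )
        # Recent Mem0 releases wrap hits in {"results": [...]}; older ones return a list
        hits = results["results"] if isinstance(results, dict) else results
        # Normalize to the {"content", "score"} shape used in the rest of this post
        return [{"content": h.get("memory", ""), "score": h.get("score")} for h in hits]

When I store a user preference: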
# Store preference
await memory.store_fact(
    user_id="user_123",
    fact="User prefers email communication over Slack for important updates",
    layer=MemoryLayer.USER
)

# Later, retrieve it
context = await memory.get_relevant_context(
    user_id="user_123",
    query="How should I notify about the budget approval?"
)

# Output: [{"content": "User prefers email...", "score": 0.92}]

Agent Router with LangGraph
I built a router that classifies tasks and dispatches to specialized agents:
from enum import Enum
from typing import TypedDict

from langgraph.graph import StateGraph, END

from memory_config import MemoryLayer


class TaskType(Enum):
    SIMPLE_OPERATION = "simple"   # MiniMax handles
    CONTENT_DRAFT = "draft"       # Claude Haiku handles
    COMPLEX_DECISION = "complex"  # Claude Sonnet handles


class AgentState(TypedDict):
    input: str
    task_type: TaskType
    user_id: str
    context: list
    result: str
    cost: float


class AgentRouter:
    def __init__(self, memory_manager, agents):
        self.memory = memory_manager
        self.agents = agents
        self.graph = self._build_graph()

    def _build_graph(self):
        """Build and compile the routing graph"""
        workflow = StateGraph(AgentState)

        # Add nodes
        workflow.add_node("classify", self.classify_task)
        workflow.add_node("retrieve_context", self.retrieve_context)
        workflow.add_node("dispatch", self.dispatch_to_agent)
        workflow.add_node("store_result", self.store_result)

        # Add edges
        workflow.set_entry_point("classify")
        workflow.add_edge("classify", "retrieve_context")
        workflow.add_edge("retrieve_context", "dispatch")
        workflow.add_edge("dispatch", "store_result")
        workflow.add_edge("store_result", END)

        return workflow.compile()

    async def classify_task(self, state: AgentState) -> AgentState:
        """Classify task complexity using DeepSeek (cheap)"""
        classification_prompt = f"""
        Classify this task complexity:
        Task: {state['input']}

        Options:
        - SIMPLE_OPERATION: Bulk tasks, batch processing, data retrieval
        - CONTENT_DRAFT: Writing emails, drafting content, simple summaries
        - COMPLEX_DECISION: Negotiations, strategic decisions, technical architecture

        Return only the classification type.
        """

        response = await self.agents["classifier"].generate(classification_prompt)
        # The prompt asks for a member name (e.g. CONTENT_DRAFT), so look it up
        # by name; TaskType(...) would look up by value ("draft") and fail.
        # An unexpected label raises KeyError.
        state["task_type"] = TaskType[response.strip().upper()]
        return state

    async def retrieve_context(self, state: AgentState) -> AgentState:
        """Get relevant memory context"""
        context = await self.memory.get_relevant_context(
            user_id=state["user_id"],
            query=state["input"]
        )
        state["context"] = context
        return state

    async def dispatch_to_agent(self, state: AgentState) -> AgentState:
        """Route to appropriate specialist agent"""
        agent_map = {
            TaskType.SIMPLE_OPERATION: "ops_engine",
            TaskType.CONTENT_DRAFT: "outreach_qa",
            TaskType.COMPLEX_DECISION: "webmaster_mgr"
        }

        agent_name = agent_map[state["task_type"]]
        agent = self.agents[agent_name]

        # Build prompt with context
        context_str = "\n".join([c["content"] for c in state["context"]])
        prompt = f"""
        Context from memory:
        {context_str}

        Task: {state['input']}
        """

        result = await agent.generate(prompt)
        state["result"] = result
        state["cost"] = agent.get_cost()
        return state

    async def store_result(self, state: AgentState) -> AgentState:
        """Store decision in memory"""
        await self.memory.store_fact(
            user_id=state["user_id"],
            fact=f"Decision made: {state['result']}",
            layer=MemoryLayer.SESSION
        )
        return state

When I run the router:
# Initialize router
router = AgentRouter(memory_manager, agents)

# Process task (ainvoke is the async entry point of the compiled LangGraph graph)
result = await router.graph.ainvoke({
    "input": "Draft an email to John about the meeting schedule",
    "user_id": "user_123"
})

# Classification flow:
# 1. classify -> CONTENT_DRAFT (using DeepSeek, $0.001)
# 2. retrieve_context -> ["User prefers email...", "Meeting times: 9am"]
# 3. dispatch -> outreach_qa (Claude Haiku, $0.01)
# 4. store_result -> Memory saved

Multi-Agent Configuration
I configured each agent with a model matched to its task complexity:
import asyncio
import os

import requests
from anthropic import AsyncAnthropic


class AgentConfig:
    """Agent configuration with cost tracking"""

    agents = {
        "classifier": {
            "model": "deepseek-chat",
            "provider": "deepseek",
            "cost_per_1k_tokens": 0.0014,  # $1.40 per million
            "role": "Task classification only"
        },
        "ops_engine": {
            "model": "MiniMax-m2.1",
            "provider": "minimax",
            "cost_per_1k_tokens": 0.002,   # $2 per million
            "role": "Bulk operations, batch processing"
        },
        "outreach_qa": {
            "model": "claude-3-5-haiku",
            "provider": "anthropic",
            "cost_per_1k_tokens": 0.008,   # $8 per million
            "role": "Draft emails, simple content"
        },
        "webmaster_mgr": {
            "model": "claude-3-5-sonnet",
            "provider": "anthropic",
            "cost_per_1k_tokens": 0.03,    # $30 per million
            "role": "Complex decisions, negotiations"
        }
    }


class AgentPool:
    def __init__(self):
        self.agents = {}
        self._initialize_agents()

    def _initialize_agents(self):
        """Initialize all agents with their models"""
        for name, config in AgentConfig.agents.items():
            if config["provider"] == "anthropic":
                self.agents[name] = AnthropicAgent(config)
            elif config["provider"] == "deepseek":
                self.agents[name] = DeepSeekAgent(config)
            elif config["provider"] == "minimax":
                # MiniMaxAgent is not shown in this post; it needs the same
                # generate() / get_cost() interface as the classes below.
                self.agents[name] = MiniMaxAgent(config)

    def get_agent(self, name: str):
        return self.agents[name]


class AnthropicAgent:
    def __init__(self, config):
        # The async client reads ANTHROPIC_API_KEY from the environment
        self.client = AsyncAnthropic()
        self.model = config["model"]
        self.cost_per_1k = config["cost_per_1k_tokens"]
        self.tokens_used = 0

    async def generate(self, prompt: str) -> str:
        response = await self.client.messages.create(
            model=self.model,
            max_tokens=1024,
            messages=[{"role": "user", "content": prompt}]
        )
        self.tokens_used += response.usage.input_tokens + response.usage.output_tokens
        return response.content[0].text

    def get_cost(self) -> float:
        # Cumulative cost of every call made through this agent so far
        return (self.tokens_used / 1000) * self.cost_per_1k


class DeepSeekAgent:
    def __init__(self, config):
        self.api_key = os.environ.get("DEEPSEEK_API_KEY")
        self.model = config["model"]
        self.cost_per_1k = config["cost_per_1k_tokens"]
        self.tokens_used = 0

    async def generate(self, prompt: str) -> str:
        # requests is blocking, so run the HTTP call in a worker thread
        response = await asyncio.to_thread(
            requests.post,
            "https://api.deepseek.com/v1/chat/completions",
            headers={"Authorization": f"Bearer {self.api_key}"},
            json={
                "model": self.model,
                "messages": [{"role": "user", "content": prompt}]
            },
            timeout=60,
        )
        response.raise_for_status()
        data = response.json()
        self.tokens_used += data["usage"]["total_tokens"]
        return data["choices"][0]["message"]["content"]

    def get_cost(self) -> float:
        return (self.tokens_used / 1000) * self.cost_per_1k

When I compare costs for 1000 tasks:
Single Claude Sonnet for all tasks:
- 1000 tasks * 2k tokens average * $0.03/1k = $60/month

Multi-agent with model matching (same 2k tokens average per task):
- 600 simple tasks * 2k tokens * MiniMax at $0.002/1k = $2.40
- 300 draft tasks * 2k tokens * Haiku at $0.008/1k = $4.80
- 100 complex tasks * 2k tokens * Sonnet at $0.03/1k = $6.00
- Classification overhead: $0.50
- Total: $13.70/month

Savings: 77% cost reduction

ChromaDB Vector Store Setup
I set up ChromaDB as the vector storage backend:
import chromadb
from chromadb.config import Settings


class VectorStore:
    def __init__(self, path: str = "./chroma_db"):
        self.client = chromadb.PersistentClient(
            path=path,
            settings=Settings(
                anonymized_telemetry=False,
                allow_reset=True
            )
        )
        self.collection = self.client.get_or_create_collection(
            name="agent_memory",
            metadata={"description": "Multi-agent persistent memory"}
        )

    async def add_memory(self, id: str, content: str, metadata: dict):
        """Add memory entry with embeddings"""
        # upsert instead of add, so re-running with the same id overwrites
        # the entry rather than colliding with the existing one
        self.collection.upsert(
            documents=[content],
            metadatas=[metadata],
            ids=[id]
        )

    async def search_memory(self, query: str, n_results: int = 10) -> dict:
        """Semantic search for relevant memories"""
        results = self.collection.query(
            query_texts=[query],
            n_results=n_results
        )
        return results

    async def get_by_user(self, user_id: str) -> dict:
        """Get all memories for a user"""
        results = self.collection.get(
            where={"user_id": user_id}
        )
        return results


# Initialize
vector_store = VectorStore()

# Add organizational memory (shared rules)
await vector_store.add_memory(
    id="org_rule_1",
    content="Always confirm budget approvals with finance team before execution",
    metadata={"layer": "org", "type": "policy"}
)

# Semantic search
results = await vector_store.search_memory(
    query="How do I handle budget requests?",
    n_results=5
)

Complete System Integration
I integrated all components into a complete system:
from agent_router import AgentRouter
from memory_config import MemoryManager, MemoryLayer
from agents_config import AgentPool
from chroma_setup import VectorStore


class MultiAgentSystem:
    def __init__(self):
        # Initialize components
        self.vector_store = VectorStore()
        self.memory = MemoryManager(self.vector_store)
        self.agent_pool = AgentPool()
        self.router = AgentRouter(self.memory, self.agent_pool.agents)

    async def setup(self):
        """Load organizational rules (async, so it cannot run inside __init__)"""
        await self._load_org_memory()

    async def _load_org_memory(self):
        """Load shared organizational memory"""
        org_rules = [
            "Budget approvals require finance team confirmation",
            "All client communications must be logged",
            "Meeting times preferred: 9am-11am, 2pm-4pm",
            "Response SLA: 24 hours for emails, 2 hours for Slack"
        ]

        for i, rule in enumerate(org_rules):
            await self.vector_store.add_memory(
                id=f"org_rule_{i}",
                content=rule,
                metadata={"layer": "org", "type": "policy"}
            )

    async def process_task(self, user_id: str, task: str) -> dict:
        """Process a task through the multi-agent system"""
        # The router holds the compiled LangGraph graph; ainvoke runs it asynchronously
        result = await self.router.graph.ainvoke({
            "input": task,
            "user_id": user_id
        })

        return {
            "result": result["result"],
            "task_type": result["task_type"].value,
            "cost": result["cost"],
            "context_used": len(result["context"])
        }

    async def add_user_preference(self, user_id: str, preference: str):
        """Store user preference in memory"""
        await self.memory.store_fact(
            user_id=user_id,
            fact=preference,
            layer=MemoryLayer.USER
        )


# Usage
system = MultiAgentSystem()
await system.setup()

# Add user preference
await system.add_user_preference(
    user_id="john_123",
    preference="John prefers morning meetings between 9am and 11am"
)

# Process task
result = await system.process_task(
    user_id="john_123",
    task="Schedule a meeting with the client team about the new project"
)

# Output:
# {
#     "result": "Meeting scheduled for 9:30am tomorrow with client team...",
#     "task_type": "simple",
#     "cost": 0.002,
#     "context_used": 3
# }

Real-World Example: SEO Agency
I found a Reddit post about an SEO agency running 5 agents with this architecture:
Agent Roles:
- Steve (DeepSeek V3.2): Main agent, handles WhatsApp & Slack daily ops
- ops-engine (MiniMax): Bulk tasks, batch SEO operations
- outreach-qa (Claude Haiku): Draft outreach emails, review content
- webmaster-mgr (Claude Sonnet): Negotiate with webmasters, technical decisions

Memory Facts Stored:
- "Client X prefers weekly reports"
- "Budget limit for outreach: $500/month"
- "Webmaster Y negotiated $200 for link placement"
- "Previous successful outreach template: [stored]"

Monthly Cost: ~$20-30 (vs $200+ for single premium model)

The agency reported: “We use Mem0 + ChromaDB - every decision, rule, and preference gets stored as a searchable fact.”
Common Mistakes I Made
1. Using one model for everything
   - Mistake: Claude Sonnet for simple classification
   - Fix: Use DeepSeek for routing, Haiku for drafts
2. Ignoring memory architecture
   - Mistake: All memories in one bucket
   - Fix: Layer by lifetime (conversation, session, user, org)
3. No agent routing logic
   - Mistake: Random agent assignment
   - Fix: Classification step before dispatch
4. Over-engineering first iteration
   - Mistake: 10 agents on day one
   - Fix: Start with 3 agents, add as needed
5. Storing secrets in memory
   - Mistake: API keys stored in vector DB
   - Fix: Only store business facts, use env vars for secrets

Summary
In this post, I showed how to build multi-agent systems with persistent memory for business automation. The key point is combining LangGraph for orchestration, Mem0 for memory management, and matching models to task complexity.
The architecture uses three layers: Orchestration (LangGraph router), Memory (Mem0 with ChromaDB backend), and Storage (ChromaDB vector embeddings). I configured four memory layers: Conversation (single response), Session (workflow context), User (preferences), and Organizational (shared rules).
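One way to picture the four lifetimes is a small store that expires entries by layer. This is a minimal sketch and not Mem0's API: `LayeredStore`, `Entry`, and the TTL numbers are all assumptions chosen for illustration, since the post gives only rough ranges for each layer.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional

# Assumed lifetimes in seconds; None means "never expires".
LAYER_TTL: dict[str, Optional[float]] = {
    "conversation": 60.0,   # stand-in for "a single response"
    "session": 3600.0,      # stand-in for "minutes to hours"
    "user": None,
    "org": None,
}


@dataclass
class Entry:
    text: str
    layer: str
    created_at: float


class LayeredStore:
    def __init__(self, clock: Callable[[], float] = time.monotonic):
        # The clock is injectable so expiry can be tested without sleeping.
        self._clock = clock
        self._entries: list[Entry] = []

    def add(self, text: str, layer: str) -> None:
        if layer not in LAYER_TTL:
            raise ValueError(f"unknown layer: {layer}")
        self._entries.append(Entry(text, layer, self._clock()))

    def live(self) -> list[str]:
        """Return texts whose layer lifetime has not elapsed, pruning the rest."""
        now = self._clock()
        kept = []
        for entry in self._entries:
            ttl = LAYER_TTL[entry.layer]
            if ttl is None or now - entry.created_at < ttl:
                kept.append(entry)
        self._entries = kept
        return [entry.text for entry in kept]
```

Keeping the lifetime on the layer rather than on each entry means a caller only decides which layer a fact belongs to, which mirrors the `layer` metadata field stored alongside each fact earlier in the post.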
For model matching, I use DeepSeek for classification ($0.0014/1k), MiniMax for operations ($0.002/1k), Claude Haiku for drafts ($0.008/1k), and Claude Sonnet only for complex decisions ($0.03/1k). This reduced my monthly AI costs from $200 to ~$20-30.
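The per-tier arithmetic behind model matching can be written as a small calculator. This is a sketch under stated assumptions: the rates are the per-1k figures quoted in this post, while the 1000-task workload, the 600/300/100 split, the 2k tokens per task, and the function names are illustrative. The result moves directly with the assumed token count, so it is worth rerunning with measured usage.

```python
# Rates in USD per 1k tokens, as quoted in this post.
RATE_PER_1K = {
    "minimax": 0.002,
    "haiku": 0.008,
    "sonnet": 0.03,
}


def tier_cost(tasks: int, tokens_per_task: int, rate_per_1k: float) -> float:
    """Cost of `tasks` requests that each use `tokens_per_task` tokens."""
    return tasks * tokens_per_task / 1000 * rate_per_1k


def monthly_cost(task_mix: dict[str, int], tokens_per_task: int,
                 overhead: float = 0.0) -> float:
    """Sum tier costs for a {model: task_count} mix, plus a fixed overhead."""
    total = sum(
        tier_cost(count, tokens_per_task, RATE_PER_1K[model])
        for model, count in task_mix.items()
    )
    return total + overhead


# Hypothetical workload: 1000 tasks at an assumed 2k tokens each.
single = monthly_cost({"sonnet": 1000}, tokens_per_task=2000)
mixed = monthly_cost(
    {"minimax": 600, "haiku": 300, "sonnet": 100},
    tokens_per_task=2000,
    overhead=0.50,  # classification overhead figure from this post
)
savings = 1 - mixed / single
print(f"single=${single:.2f} mixed=${mixed:.2f} savings={savings:.0%}")
```

Applying the same `tokens_per_task` to every tier keeps the comparison like-for-like; mixing different token assumptions between the single-model and multi-agent cases would overstate or understate the savings.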
Final Words + More Resources
My intention with this article was to help others by sharing my knowledge and experience. If you want to get in touch, you can reach me by email.
Here are also the most important links from this article along with some further resources that will help you in this scope:
- 👨💻 Reddit: SEO Agency Multi-Agent System
- 👨💻 LangGraph Documentation
- 👨💻 Mem0 Documentation
- 👨💻 ChromaDB Documentation
Oh, and if you found these resources useful, don’t forget to support me by starring the repo on GitHub!