How to integrate memsearch with LangChain, LlamaIndex, and CrewAI
The Problem
I started building AI agents and hit the memory fragmentation problem. Each framework had its own memory solution:
- LangChain with InMemoryStore
- LlamaIndex with VectorStoreIndex
- CrewAI with built-in opaque storage
When I wanted to switch frameworks, I had to rewrite all the memory logic. When I needed to debug what my agent “remembered,” I was stuck with opaque databases that required specific tools to inspect.
The real problem: framework-specific memory creates lock-in and makes it hard to understand what your agent actually knows.
Framework-Agnostic Memory
The solution I found is memsearch, a Python library that treats markdown files as the source of truth for memory. The vector database (Milvus) is just a rebuildable index.
Here’s the core API:
```python
from memsearch import MemSearch

mem = MemSearch(paths=["./memory"])
await mem.index()  # Index markdown files
results = await mem.search("query", top_k=3)  # Semantic search
```
Memory lives in markdown files you can read and edit:
```markdown
## Team
- Alice: frontend lead
- Bob: backend lead
- Charlie: devops

## Technical Decisions
- Chose Redis for caching over Memcached
- Using PostgreSQL for primary database
- Deploying on AWS ECS
```
This approach gives you:
- Human-readable memory
- Git version control
- Zero vendor lock-in
- Same API across all frameworks
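To make the "rebuildable index" idea concrete, here is a minimal sketch of how an index can be derived entirely from the markdown files. It is a toy illustration, not memsearch's actual chunking or indexing code: `chunk_markdown` and `rebuild_index` are hypothetical helpers, and the "index" is just a list of dicts with no embeddings.

```python
import tempfile
from pathlib import Path


def chunk_markdown(text: str) -> list[dict]:
    """Split markdown into one chunk per '## ' heading (illustrative only)."""
    chunks = []
    heading, lines = None, []
    for line in text.splitlines():
        if line.startswith("## "):
            if heading is not None:
                chunks.append({"heading": heading, "content": "\n".join(lines).strip()})
            heading, lines = line[3:].strip(), []
        elif heading is not None:
            lines.append(line)
    if heading is not None:
        chunks.append({"heading": heading, "content": "\n".join(lines).strip()})
    return chunks


def rebuild_index(memory_dir: Path) -> list[dict]:
    """Rebuild the whole toy index from the markdown files on disk."""
    index = []
    for path in sorted(memory_dir.glob("*.md")):
        for chunk in chunk_markdown(path.read_text(encoding="utf-8")):
            index.append({"source": path.name, **chunk})
    return index


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        memory_dir = Path(tmp)
        (memory_dir / "notes.md").write_text(
            "## Team\n- Alice: frontend lead\n\n"
            "## Technical Decisions\n- Chose Redis for caching\n",
            encoding="utf-8",
        )
        # Prints ['Team', 'Technical Decisions']
        print([entry["heading"] for entry in rebuild_index(memory_dir)])
```

Because the index is a pure function of the files, deleting it loses nothing: running the rebuild again recovers the same entries.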
LangChain Integration
LangChain uses retrievers as the interface for memory. You create a BaseRetriever that wraps memsearch:
```python
import asyncio
from typing import Any, List, Optional

from langchain_core.callbacks import (
    AsyncCallbackManagerForRetrieverRun,
    CallbackManagerForRetrieverRun,
)
from langchain_core.documents import Document
from langchain_core.retrievers import BaseRetriever
from memsearch import MemSearch
from pydantic import PrivateAttr


def _to_documents(results: List[dict]) -> List[Document]:
    """Convert memsearch result dicts to LangChain Documents."""
    return [
        Document(
            page_content=result["content"],
            metadata={
                "score": result["score"],
                "source": result.get("source", ""),
                "chunk_id": result.get("chunk_id", ""),
            },
        )
        for result in results
    ]


class MemSearchRetriever(BaseRetriever):
    """LangChain BaseRetriever wrapper for memsearch."""

    top_k: int = 3
    # BaseRetriever is a pydantic model, so the MemSearch instance lives in a
    # private attribute rather than an undeclared field.
    _mem: Any = PrivateAttr(default=None)

    def __init__(
        self, memory_paths: Optional[List[str]] = None, top_k: int = 3, **kwargs: Any
    ):
        super().__init__(top_k=top_k, **kwargs)
        self._mem = MemSearch(paths=memory_paths or ["./memory"])

    def _get_relevant_documents(
        self, query: str, *, run_manager: CallbackManagerForRetrieverRun
    ) -> List[Document]:
        """Retrieve documents synchronously (only when no event loop is running)."""
        results = asyncio.run(self._mem.search(query, top_k=self.top_k))
        return _to_documents(results)

    async def _aget_relevant_documents(
        self, query: str, *, run_manager: AsyncCallbackManagerForRetrieverRun
    ) -> List[Document]:
        """Retrieve documents asynchronously."""
        results = await self._mem.search(query, top_k=self.top_k)
        return _to_documents(results)
```
Use it in a LangChain LCEL chain:
```python
import asyncio
from operator import itemgetter

from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI
from memsearch import MemSearch


def format_docs(docs) -> str:
    """Join retrieved documents into one context string."""
    return "\n\n".join(doc.page_content for doc in docs)


async def langchain_example():
    # Initialize memsearch and index
    mem = MemSearch(paths=["./memory"])
    await mem.index()

    # Create retriever
    retriever = MemSearchRetriever(memory_paths=["./memory"], top_k=3)

    # Build LCEL chain
    llm = ChatOpenAI(model="gpt-4o-mini")
    prompt = ChatPromptTemplate.from_template(
        """Answer using the following retrieved memories:

Memories:
{context}

Question: {question}

Answer:"""
    )

    chain = (
        {
            # Pass only the question string to the retriever, not the whole input dict
            "context": itemgetter("question") | retriever | format_docs,
            "question": itemgetter("question"),
        }
        | prompt
        | llm
        | StrOutputParser()
    )

    # Query
    result = await chain.ainvoke({"question": "What decisions have we made about caching?"})
    print(result)


asyncio.run(langchain_example())
```
LangGraph Tool Integration
For LangGraph ReAct agents, wrap memsearch as a tool:
```python
import asyncio

from langchain_core.tools import tool
from langchain_openai import ChatOpenAI
from langgraph.prebuilt import create_react_agent
from memsearch import MemSearch

# Initialize memsearch
mem = MemSearch(paths=["./memory"])


@tool
async def search_memory(query: str, top_k: int = 3) -> str:
    """Search through persistent markdown memories.

    Use this tool when you need to recall previous information, decisions,
    or knowledge stored in the memory system. Returns relevant memories
    based on semantic similarity.

    Args:
        query: The search query to find relevant memories
        top_k: Number of top results to return (default: 3)
    """
    results = await mem.search(query, top_k=top_k)

    if not results:
        return "No relevant memories found."

    formatted = []
    for i, result in enumerate(results, 1):
        formatted.append(
            f"Memory {i} (Similarity: {result['score']:.3f}):\n"
            f"{result['content'][:200]}...\n"
        )

    return "\n".join(formatted)


async def langgraph_example():
    # Index memories
    await mem.index()

    # create_react_agent binds the tools to the model itself,
    # so no separate bind_tools() call is needed
    llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)

    # Create ReAct agent
    # (older LangGraph releases named the `prompt` parameter `state_modifier`)
    agent = create_react_agent(
        model=llm,
        tools=[search_memory],
        prompt=(
            "You have access to a memory search tool. "
            "Always search memory before answering questions about past decisions or knowledge."
        ),
    )

    # Execute agent
    response = await agent.ainvoke(
        {"messages": [("user", "What caching solution did we choose?")]}
    )

    print(response["messages"][-1].content)


asyncio.run(langgraph_example())
```
The tool docstring tells the agent when and how to use memory search.
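To see why the docstring matters, here is a rough sketch of what a tool decorator extracts from a function. This is a simplified stand-in written with the standard library only, not LangChain's implementation: `describe_tool` is a hypothetical helper, and the body of `search_memory` is a placeholder.

```python
import inspect


def describe_tool(func) -> dict:
    """Build a tool spec from a function's name, docstring, and parameters."""
    signature = inspect.signature(func)
    return {
        "name": func.__name__,
        # The cleaned docstring is what the model reads when choosing a tool
        "description": inspect.getdoc(func) or "",
        "parameters": {
            name: {
                "type": getattr(param.annotation, "__name__", str(param.annotation)),
                "required": param.default is inspect.Parameter.empty,
            }
            for name, param in signature.parameters.items()
        },
    }


async def search_memory(query: str, top_k: int = 3) -> str:
    """Search persistent markdown memories.

    Use this when you need to recall past decisions or stored knowledge.
    """
    return "No relevant memories found."  # placeholder body for the sketch


spec = describe_tool(search_memory)
print(spec["name"])        # search_memory
print(spec["parameters"])  # query is required, top_k is optional
```

The model never sees the function body, only a spec like this one, so a vague docstring tends to produce an agent that calls the tool at the wrong time or not at all.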
LlamaIndex Integration
LlamaIndex requires extending BaseRetriever:
```python
import asyncio
from typing import List, Optional

from llama_index.core.query_engine import RetrieverQueryEngine
from llama_index.core.retrievers import BaseRetriever
from llama_index.core.schema import NodeWithScore, QueryBundle, TextNode
from llama_index.llms.openai import OpenAI
from memsearch import MemSearch


def _to_nodes(results: List[dict]) -> List[NodeWithScore]:
    """Convert memsearch results to LlamaIndex NodeWithScore objects."""
    nodes_with_scores = []
    for result in results:
        node = TextNode(
            text=result["content"],
            metadata={
                "score": result["score"],
                "source": result.get("source", ""),
                "chunk_id": result.get("chunk_id", ""),
            },
        )
        nodes_with_scores.append(NodeWithScore(node=node, score=result["score"]))
    return nodes_with_scores


class MemSearchRetriever(BaseRetriever):
    """LlamaIndex BaseRetriever wrapper for memsearch."""

    def __init__(self, memory_paths: Optional[List[str]] = None, top_k: int = 3):
        super().__init__()
        self.mem = MemSearch(paths=memory_paths or ["./memory"])
        self.top_k = top_k

    def _retrieve(self, query_bundle: QueryBundle) -> List[NodeWithScore]:
        """Sync path: asyncio.run() only works when no event loop is running."""
        results = asyncio.run(
            self.mem.search(query_bundle.query_str, top_k=self.top_k)
        )
        return _to_nodes(results)

    async def _aretrieve(self, query_bundle: QueryBundle) -> List[NodeWithScore]:
        """Async path, used by query_engine.aquery()."""
        results = await self.mem.search(query_bundle.query_str, top_k=self.top_k)
        return _to_nodes(results)


async def llamaindex_example():
    # Initialize memsearch and index
    mem = MemSearch(paths=["./memory"])
    await mem.index()

    # Create custom retriever
    retriever = MemSearchRetriever(memory_paths=["./memory"], top_k=3)

    # Create query engine with custom retriever
    llm = OpenAI(model="gpt-4o-mini")
    query_engine = RetrieverQueryEngine.from_args(
        retriever=retriever,
        llm=llm,
        verbose=True,
    )

    # Query with the async API: we are already inside an event loop here,
    # so the sync query() path would fail inside asyncio.run()
    response = await query_engine.aquery(
        "What decisions have we made about system architecture?"
    )
    print(response)


asyncio.run(llamaindex_example())
```
CrewAI Integration
CrewAI uses tool decorators or BaseTool classes:
```python
import asyncio
from typing import Any, List, Optional, Type

from crewai import LLM, Agent, Crew, Process, Task
from crewai.tools import BaseTool
from memsearch import MemSearch
from pydantic import BaseModel, Field, PrivateAttr


class MemSearchInput(BaseModel):
    """Input schema for MemSearchTool."""

    query: str = Field(..., description="The search query to find relevant memories")
    top_k: int = Field(default=3, description="Number of top results to return")


class MemSearchTool(BaseTool):
    """CrewAI tool wrapper for memsearch."""

    name: str = "MemSearch"
    description: str = (
        "Search through persistent markdown memories stored in memsearch. "
        "Use this tool when you need to recall previous information, "
        "decisions, or knowledge stored in the memory system. "
        "Returns relevant memories based on semantic similarity."
    )
    args_schema: Type[BaseModel] = MemSearchInput

    # BaseTool is a pydantic model, so the MemSearch instance is kept in a
    # private attribute instead of an undeclared field.
    _mem: Any = PrivateAttr(default=None)

    def __init__(self, memory_paths: Optional[List[str]] = None, **kwargs: Any):
        super().__init__(**kwargs)
        self._mem = MemSearch(paths=memory_paths or ["./memory"])

    def _run(self, query: str, top_k: int = 3) -> str:
        """Execute memsearch synchronously."""
        try:
            # Run async search in sync context
            results = asyncio.run(self._mem.search(query, top_k=top_k))

            if not results:
                return "No relevant memories found."

            # Format results for CrewAI
            formatted_results = []
            for i, result in enumerate(results, 1):
                formatted_results.append(
                    f"Memory {i} (Similarity: {result['score']:.3f}):\n"
                    f"{result['content'][:200]}...\n"
                )

            return "\n".join(formatted_results)
        except Exception as e:
            return f"Error searching memory: {e}"


def crewai_example():
    # Index up front. crew.kickoff() below is synchronous, so this function is
    # synchronous too: the tool's asyncio.run() must not be nested inside an
    # already running event loop.
    mem = MemSearch(paths=["./memory"])
    asyncio.run(mem.index())

    # Create memsearch tool
    memory_tool = MemSearchTool(memory_paths=["./memory"])

    # Create agent with memory tool
    llm = LLM(model="openai/gpt-4o-mini")
    agent = Agent(
        role="Memory-Enabled Assistant",
        goal="Provide answers with context from persistent memory",
        backstory=(
            "You have access to a memory system containing past "
            "decisions and knowledge. Always search memory before "
            "answering questions about history or context."
        ),
        tools=[memory_tool],
        llm=llm,
        verbose=True,
    )

    # Create task
    task = Task(
        description="Answer: {question}",
        expected_output="A comprehensive answer based on memory search results",
        agent=agent,
    )

    # Create and execute crew
    crew = Crew(
        agents=[agent],
        tasks=[task],
        process=Process.sequential,
        verbose=True,
    )

    result = crew.kickoff(inputs={"question": "What caching solution did we choose?"})
    print(result)


crewai_example()
```
Common Integration Pitfalls
I ran into these problems while integrating across frameworks:
Forgetting to index before search:
```python
# WRONG: Search without indexing
mem = MemSearch(paths=["./memory"])
results = await mem.search("query")  # Returns empty

# CORRECT: Index first
mem = MemSearch(paths=["./memory"])
await mem.index()  # Build vector index
results = await mem.search("query")
```
Mixing async/sync in LangChain:
```python
# WRONG: Using await in a sync method
class BadRetriever(BaseRetriever):
    def _get_relevant_documents(self, query):
        return await self.mem.search(query)  # SyntaxError

# CORRECT: Run async in sync context
# (asyncio.run raises RuntimeError if an event loop is already running)
import asyncio

class GoodRetriever(BaseRetriever):
    def _get_relevant_documents(self, query):
        return asyncio.run(self.mem.search(query))
```
Wrong return type in LlamaIndex:
```python
# WRONG: Returning dict instead of NodeWithScore
def _retrieve(self, query_bundle):
    return {"content": "...", "score": 0.9}

# CORRECT: Return a list of NodeWithScore objects
from llama_index.core.schema import NodeWithScore, TextNode

def _retrieve(self, query_bundle):
    return [NodeWithScore(node=TextNode(text="..."), score=0.9)]
```
Not handling async in CrewAI tools:
```python
# WRONG: await inside sync _run
class BadTool(BaseTool):
    def _run(self, query):
        return await self.mem.search(query)  # SyntaxError

# CORRECT: Use asyncio.run
import asyncio

class GoodTool(BaseTool):
    def _run(self, query):
        return asyncio.run(self.mem.search(query))
```
Provider mismatch:
```python
# WRONG: Provider mismatch
#   pip install memsearch  (only OpenAI included)
mem = MemSearch(paths=["./memory"], embedding_provider="ollama")  # Error!

# CORRECT: Install required extras
#   pip install "memsearch[ollama]"
mem = MemSearch(paths=["./memory"], embedding_provider="ollama")
```
Framework-Agnostic Agent
Here’s a complete agent that does not depend on any agent framework. It uses the OpenAI client, but another LLM provider's client could be swapped in:
```python
import asyncio
from datetime import date
from pathlib import Path
from typing import Optional

from memsearch import MemSearch
from openai import AsyncOpenAI


class UniversalMemoryAgent:
    """Framework-agnostic agent with persistent memory.

    Works with OpenAI, Anthropic Claude, Ollama, or any LLM provider.
    Memory is managed by memsearch and stored as markdown files.
    """

    def __init__(
        self,
        memory_paths: Optional[list] = None,
        llm_provider: str = "openai",
        top_k: int = 3,
    ):
        self.memory_paths = memory_paths or ["./memory"]
        self.mem = MemSearch(paths=self.memory_paths)
        self.top_k = top_k
        self.llm_provider = llm_provider  # only "openai" is wired up in this example
        # Another provider's client could go here; chat() would need its call style
        self.llm = AsyncOpenAI()

    def save_memory(self, content: str, category: str = "general") -> None:
        """Append a note to today's memory log."""
        memory_dir = Path(self.memory_paths[0])
        memory_dir.mkdir(parents=True, exist_ok=True)

        file_path = memory_dir / f"{date.today()}.md"

        with open(file_path, "a", encoding="utf-8") as f:
            f.write(f"\n## {category.title()}\n{content}\n")

    async def recall(self, query: str) -> list[dict]:
        """Recall relevant memories for query."""
        return await self.mem.search(query, top_k=self.top_k)

    async def chat(self, user_input: str) -> str:
        """Process user input with memory recall and storage.

        Implements Recall-Think-Remember pattern:
        1. Recall - search past memories for relevant context
        2. Think - call LLM with memory context
        3. Remember - save this exchange and index it
        """
        # 1. Recall
        memories = await self.recall(user_input)
        context = "\n".join(f"- {m['content'][:200]}..." for m in memories)

        # 2. Think
        system_prompt = (
            f"You have these memories:\n{context}\n\n"
            "Use this context to inform your response. "
            "If memories are relevant, reference them. "
            "If not, answer based on your general knowledge."
        )

        resp = await self.llm.chat.completions.create(
            model="gpt-4o-mini",
            messages=[
                {"role": "system", "content": system_prompt},
                {"role": "user", "content": user_input},
            ],
        )
        answer = resp.choices[0].message.content

        # 3. Remember (save_memory adds the "## Conversation" heading itself)
        self.save_memory(f"### {user_input}\n{answer}", "conversation")
        await self.mem.index()

        return answer


async def main():
    # Initialize agent
    agent = UniversalMemoryAgent(memory_paths=["./memory"])

    # Seed some initial knowledge (save_memory writes the "## ..." heading)
    agent.save_memory(
        "- Alice: frontend lead\n- Bob: backend lead\n- Charlie: devops",
        "team",
    )
    agent.save_memory(
        "- Chose Redis for caching over Memcached\n"
        "- Using PostgreSQL for primary database\n"
        "- Deploying on AWS ECS",
        "technical decisions",
    )
    await agent.mem.index()

    # Chat with memory
    while True:
        user_input = input("\nYou: ")
        if user_input.lower() in ["quit", "exit"]:
            break

        response = await agent.chat(user_input)
        print(f"\nAgent: {response}")


asyncio.run(main())
```
The Reason
The key insight is that framework-agnostic memory separates concerns:
- Memory system: Stores and retrieves knowledge (memsearch)
- Agent framework: Orchestrates tool use and reasoning (LangChain, LlamaIndex, CrewAI)
- LLM provider: Generates responses (OpenAI, Anthropic, Ollama)
When each layer does one thing well, you can swap components without rewriting everything.
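The three layers can be sketched with plain Python protocols. Everything in this example is hypothetical and exists only for illustration: `Memory` and `ChatModel` are interfaces I made up, `KeywordMemory` is a toy substring matcher rather than semantic search, and `EchoModel` stands in for a real LLM client.

```python
import asyncio
from typing import Protocol


class Memory(Protocol):
    """Memory layer: anything with an async search method."""

    async def search(self, query: str, top_k: int = 3) -> list[dict]: ...


class ChatModel(Protocol):
    """LLM layer: anything that turns a prompt into text."""

    async def complete(self, system: str, user: str) -> str: ...


class KeywordMemory:
    """Toy memory layer: substring match over in-memory notes."""

    def __init__(self, notes: list[str]) -> None:
        self.notes = notes

    async def search(self, query: str, top_k: int = 3) -> list[dict]:
        words = query.lower().split()
        hits = [n for n in self.notes if any(w in n.lower() for w in words)]
        return [{"content": n, "score": 1.0} for n in hits[:top_k]]


class EchoModel:
    """Toy LLM layer: returns the context it was given."""

    async def complete(self, system: str, user: str) -> str:
        return f"[context]\n{system}\n[question]\n{user}"


async def answer(memory: Memory, model: ChatModel, question: str) -> str:
    """Orchestration layer: recall first, then ask the model."""
    memories = await memory.search(question)
    context = "\n".join(f"- {m['content']}" for m in memories)
    return await model.complete(system=context, user=question)


if __name__ == "__main__":
    notes = ["Chose Redis for caching", "Alice: frontend lead"]
    print(asyncio.run(answer(KeywordMemory(notes), EchoModel(), "caching choice")))
```

The `answer` function only depends on the two protocols, so replacing `KeywordMemory` with a memsearch-backed object, or `EchoModel` with a real client, should not require touching the orchestration code.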
Summary
I showed how to integrate memsearch with LangChain, LlamaIndex, and CrewAI. The pattern is consistent: wrap memsearch’s simple API to match each framework’s expected interface.
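One detail the wrappers share is the bridge from a synchronous hook (`_get_relevant_documents`, `_retrieve`, `_run`) to the async search call. `asyncio.run()` raises a `RuntimeError` when an event loop is already running in the same thread, so a shared helper is one possible way to handle both cases. The sketch below is an assumption on my part, not part of memsearch: `run_sync` is a hypothetical helper and `fake_search` is a stand-in coroutine.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Coroutine


def run_sync(coro: Coroutine[Any, Any, Any]) -> Any:
    """Run a coroutine to completion from synchronous code."""
    try:
        asyncio.get_running_loop()
    except RuntimeError:
        # No loop is running in this thread: asyncio.run() is safe.
        return asyncio.run(coro)
    # A loop is already running: run the coroutine on its own loop in a
    # worker thread and block until it finishes.
    with ThreadPoolExecutor(max_workers=1) as pool:
        return pool.submit(asyncio.run, coro).result()


async def fake_search(query: str, top_k: int = 3) -> list[dict]:
    """Hypothetical stand-in for an async search call."""
    await asyncio.sleep(0)
    return [{"content": f"note about {query}", "score": 0.9}][:top_k]


if __name__ == "__main__":
    # Called from plain synchronous code
    print(run_sync(fake_search("redis")))

    # Called from a sync function that happens to run inside an event loop
    async def inside_loop() -> list[dict]:
        return run_sync(fake_search("ecs"))

    print(asyncio.run(inside_loop()))
```

Two caveats: the worker-thread path blocks the calling event loop until the search returns, and clients that are tied to the loop they were created on may not work when driven from a second loop. Where the framework offers an async hook, using it directly is usually the simpler option.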
Key takeaways:
- Use BaseRetriever for LangChain and LlamaIndex
- Use @tool decorator or BaseTool for CrewAI and LangGraph
- Markdown files are the source of truth; the vector DB is just a derived index
- The same MemSearch instance can be shared across multiple frameworks
- Zero vendor lock-in means easy framework switching
Final Words + More Resources
My intention with this article was to share my knowledge and experience with others. If you want to contact me, you can reach me by email.
Here are also the most important links from this article along with some further resources that will help you in this scope:
- memsearch - A Markdown-first memory system
- LangChain Documentation - Retrievers
- LangGraph Persistence
- LlamaIndex Documentation
- CrewAI Documentation
Oh, and if you found these resources useful, don’t forget to support me by starring the repo on GitHub!