What Tools and Primitives Exist for Building AI Agents in 2025
Purpose
When I started building AI agents, I kept asking myself: what building blocks do I actually need? I knew about LangChain, but what about memory? What about giving agents access to real tools? What about letting them browse the web?
After digging through documentation, Reddit threads, and building a few agents myself, I found that the AI agent stack has matured into clear categories. Each category solves a specific problem.
The Stack Overview
Here’s how I think about the agent tool landscape:
```
┌────────────────────────────────────────────────────────────────────────┐
│                               YOUR AGENT                               │
│                            (LangChain Core)                            │
└────────────────────────────────────────────────────────────────────────┘
        │                  │                  │                  │
        ▼                  ▼                  ▼                  ▼
┌───────────────┐  ┌───────────────┐  ┌───────────────┐  ┌───────────────┐
│    Memory     │  │     Tools     │  │    Compute    │  │   Browsers    │
│    (Mem0)     │  │  (Composio)   │  │     (E2B)     │  │ (Browserbase) │
└───────────────┘  └───────────────┘  └───────────────┘  └───────────────┘
        │                  │                  │                  │
        ▼                  ▼                  ▼                  ▼
┌───────────────┐  ┌───────────────┐  ┌───────────────┐  ┌───────────────┐
│    Context    │  │   800+ SaaS   │  │   Sandboxed   │  │      Web      │
│   Retention   │  │    Actions    │  │     Exec      │  │  Automation   │
└───────────────┘  └───────────────┘  └───────────────┘  └───────────────┘
```

Let me walk through each layer.
Core Framework: LangChain
LangChain sits at the center. It handles the agent loop - reasoning, deciding which tools to call, and processing results.
```python
from langchain.agents import create_agent
from langchain_anthropic import ChatAnthropic
from langchain.tools import tool


@tool
def search_web(query: str) -> str:
    """Search the web for information."""
    # Placeholder body: call a real search backend here.
    return f"No search backend configured for: {query}"


# The agent combines a model, the tools it may call, and a system prompt.
agent = create_agent(
    model=ChatAnthropic(model="claude-sonnet-4-5-20250929"),
    tools=[search_web],
    system_prompt="You are a helpful autonomous agent.",
)
```

The framework handles orchestration. But agents need more than orchestration.
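To make the loop itself concrete (reason, pick a tool, process the result), here is a framework-free sketch. It is not how LangChain is implemented: `fake_model`, `TOOLS`, and `run_agent` are hypothetical names, and the "model" is a hard-coded stub standing in for a real LLM call.

```python
from typing import Callable

# Hypothetical tool registry: tool name -> plain Python function.
TOOLS: dict[str, Callable[[str], str]] = {
    "search_web": lambda query: f"(stub) top result for {query!r}",
}


def fake_model(messages: list[dict]) -> dict:
    """Stand-in for an LLM call; a real model would choose the next step itself."""
    if not any(m["role"] == "tool" for m in messages):
        # No tool result yet, so request one.
        return {"tool": "search_web", "input": messages[0]["content"]}
    # A tool result is available, so produce the final answer.
    return {"answer": f"Based on my search: {messages[-1]['content']}"}


def run_agent(question: str, max_steps: int = 5) -> str:
    """Loop: ask the model, run the tool it requests, feed the result back."""
    messages = [{"role": "user", "content": question}]
    for _ in range(max_steps):
        decision = fake_model(messages)
        if "answer" in decision:
            return decision["answer"]
        result = TOOLS[decision["tool"]](decision["input"])
        messages.append({"role": "tool", "content": result})
    return "Stopped: step limit reached."


print(run_agent("agent frameworks"))
```

The step limit is the one design choice worth noting: without it, a model that keeps requesting tools would loop forever.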
Memory Layer: Mem0
Agents forget things between sessions. Mem0 gives them persistent memory.
```python
from mem0 import MemoryClient

# Store a fact about the user, then retrieve memories relevant to a query.
mem0 = MemoryClient()
mem0.add([{"role": "user", "content": "User prefers Italian food"}], user_id="alice")
memories = mem0.search("food recommendations", user_id="alice", limit=3)
```

I use this when my agents need to remember user preferences across conversations. Without it, every session starts from scratch.
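As a rough mental model of what a memory layer does (store facts per user, retrieve the relevant ones later), here is a toy in-process version. It is only a sketch under simplifying assumptions: `ToyMemory` is a hypothetical class, it ranks by plain keyword overlap, and nothing persists across processes. It says nothing about how Mem0 works internally.

```python
from collections import defaultdict


class ToyMemory:
    """Toy per-user memory with keyword-overlap search (illustrative only)."""

    def __init__(self) -> None:
        self._store: dict[str, list[str]] = defaultdict(list)

    def add(self, text: str, user_id: str) -> None:
        self._store[user_id].append(text)

    def search(self, query: str, user_id: str, limit: int = 3) -> list[str]:
        query_words = set(query.lower().split())
        scored = []
        for text in self._store.get(user_id, []):
            overlap = len(query_words & set(text.lower().split()))
            if overlap:
                scored.append((overlap, text))
        # Highest overlap first; list.sort is stable, so ties keep insertion order.
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [text for _, text in scored[:limit]]


memory = ToyMemory()
memory.add("User prefers Italian food", user_id="alice")
memory.add("User works in Berlin", user_id="alice")
print(memory.search("food recommendations", user_id="alice"))
```

Scoping every call by `user_id` is the part that carries over to a real memory service: one user's memories should never show up in another user's results.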
Tool Integration: Composio
This is where agents get real work done. Composio connects to 800+ SaaS tools - Gmail, Slack, GitHub, Notion, and many more.
```python
from composio import Composio
from composio_openai_agents import OpenAIAgentsProvider

# Create a per-user session and fetch the tools that session exposes.
composio = Composio(provider=OpenAIAgentsProvider())
session = composio.create(user_id="user_123")
tools = session.tools()
```

Instead of writing API integrations myself, I get pre-built actions my agents can call directly.
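To illustrate what "pre-built actions" means in practice, here is a small sketch of an action catalog: the model sees names and descriptions, and a dispatcher runs the matching handler. Everything here is hypothetical (`Action`, `CATALOG`, `SEND_MESSAGE`, `call_tool`), and the handler only formats a string. It is not Composio's API.

```python
import json
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class Action:
    name: str
    description: str
    handler: Callable[..., Any]


# Hypothetical catalog; a hosted service would ship many of these.
CATALOG = {
    "SEND_MESSAGE": Action(
        name="SEND_MESSAGE",
        description="Post a message to a channel.",
        handler=lambda channel, text: f"sent {text!r} to #{channel}",
    ),
}


def describe_tools() -> list[dict]:
    """What the model sees: names and descriptions, not implementations."""
    return [{"name": a.name, "description": a.description} for a in CATALOG.values()]


def call_tool(name: str, arguments_json: str) -> Any:
    """Dispatch a model-issued call (tool name plus JSON-encoded arguments)."""
    action = CATALOG[name]
    return action.handler(**json.loads(arguments_json))


print(call_tool("SEND_MESSAGE", '{"channel": "general", "text": "hi"}'))
```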
Secure Execution: E2B and Daytona
Agents sometimes need to run code. E2B and Daytona provide sandboxed environments where agents can execute Python, JavaScript, or shell commands safely.
| Sandbox | Use Case | Key Feature |
|---|---|---|
| E2B | Code execution | Fast startup (about 150 ms, per E2B's docs) |
| Daytona | Full dev environment | Persistent workspaces |
I use these when building agents that write and run their own code. The sandboxing means a rogue agent can’t delete my actual files.
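The interface a sandbox offers is simple: send code in, get an exit code and output back, with a time limit. The sketch below imitates that interface with a local child process. It is loudly NOT a real sandbox: the child can still touch anything the current user can, which is exactly the gap E2B and Daytona close. `run_untrusted` is a hypothetical helper name.

```python
import subprocess
import sys
import tempfile


def run_untrusted(code: str, timeout_s: float = 5.0) -> tuple[int, str, str]:
    """Run Python code in a child process inside a throwaway directory.

    This is NOT real isolation: the child can still read and write anything
    the current user can. It only illustrates the interface a sandbox offers.
    """
    with tempfile.TemporaryDirectory() as workdir:
        try:
            proc = subprocess.run(
                [sys.executable, "-I", "-c", code],  # -I: isolated mode
                cwd=workdir,
                capture_output=True,
                text=True,
                timeout=timeout_s,
            )
        except subprocess.TimeoutExpired:
            # subprocess.run kills the child before raising.
            return -1, "", "timed out"
        return proc.returncode, proc.stdout, proc.stderr


print(run_untrusted("print(2 + 2)"))
```

The timeout matters as much as the isolation: agent-written code that loops forever should cost you a few seconds, not a hung process.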
Browser Automation: Browserbase and Hyperbrowser
Web browsing agents need a browser they control. Browserbase and Hyperbrowser provide headless browser instances.
```
   Agent Request
         │
         ▼
┌─────────────────┐
│   Browserbase   │
│ Headless Chrome │
└─────────────────┘
         │
         ▼
Navigate, Click, Extract Data
```

I use these for agents that need to fill forms, scrape dynamic pages, or interact with JavaScript-heavy sites.
Web Data Extraction: Firecrawl
When agents need data from websites, Firecrawl handles the scraping and returns clean markdown.
```python
from firecrawl import FirecrawlApp

app = FirecrawlApp(api_key="your-key")
# Scrape a single page, then crawl a whole documentation site.
result = app.scrape_url("https://example.com")
docs = app.crawl_url("https://docs.example.com")
```

This is cleaner than parsing HTML myself. Firecrawl handles JavaScript rendering, rate limiting, and structured extraction.
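For a sense of what "parsing HTML myself" involves, here is a minimal standard-library sketch that turns a tiny subset of HTML into markdown. It is an assumption-laden toy (`MarkdownExtractor` and `html_to_markdown` are hypothetical names, only `h1`-`h3`, `p`, and `li` are handled, and there is no JavaScript rendering), not a description of how Firecrawl works.

```python
from html.parser import HTMLParser
from typing import Optional


class MarkdownExtractor(HTMLParser):
    """Convert a tiny subset of HTML (h1-h3, p, li) to markdown lines."""

    PREFIX = {"h1": "# ", "h2": "## ", "h3": "### ", "p": "", "li": "- "}
    SKIP = {"script", "style"}

    def __init__(self) -> None:
        super().__init__()
        self.lines: list[str] = []
        self._current: Optional[str] = None  # tag whose text is being collected
        self._skip_depth = 0
        self._buffer: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1
        elif tag in self.PREFIX:
            self._current = tag
            self._buffer = []

    def handle_endtag(self, tag):
        if tag in self.SKIP:
            self._skip_depth = max(0, self._skip_depth - 1)
        elif tag == self._current:
            # Collapse whitespace and emit one markdown line per block.
            text = " ".join("".join(self._buffer).split())
            if text:
                self.lines.append(self.PREFIX[tag] + text)
            self._current = None

    def handle_data(self, data):
        if self._current and not self._skip_depth:
            self._buffer.append(data)


def html_to_markdown(html: str) -> str:
    parser = MarkdownExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.lines)


sample = "<h1>Docs</h1><script>track()</script><p>Hello <b>world</b>.</p><ul><li>One</li></ul>"
print(html_to_markdown(sample))
```

Even this toy needs special cases for scripts, nested inline tags, and whitespace, which is a fair hint at why a dedicated scraping service is attractive.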
Communication Channels
Agents can talk to the world through specialized services:
| Channel | Service | What It Does |
|---|---|---|
| Email | AgentMail | Send/receive emails |
| Phone | AgentPhone | Voice calls |
| Messaging | Kapso | Messaging |
| Voice | ElevenLabs, Vapi | Speech synthesis |
Search: Exa
Exa provides AI-native web search optimized for agents. It returns relevant results with context, not just keyword matches.
Comparison Table
Here’s how I think about choosing between options:
| Primitive | Tools | Best For | Cost Model |
|---|---|---|---|
| Framework | LangChain | Core agent logic | Open source |
| Memory | Mem0 | Context retention | Freemium |
| SaaS Tools | Composio | External integrations | Usage-based |
| Sandbox | E2B, Daytona | Code execution | Usage-based |
| Browser | Browserbase, Hyperbrowser | Web automation | Usage-based |
| Scraping | Firecrawl | Data extraction | Freemium |
| Voice | ElevenLabs, Vapi | Speech | Usage-based |
| Search | Exa | Web search | Usage-based |
| Email | AgentMail | Email communication | Usage-based |
Choosing What You Need
Not every agent needs every primitive. Here’s my decision process:
- Does my agent need memory? If it should learn from past interactions, add Mem0.
- Should it take actions? If yes, Composio gives access to external tools.
- Will it run code? E2B or Daytona for sandboxed execution.
- Does it browse the web? Browserbase for interaction, Firecrawl for extraction.
- Should it communicate? Pick AgentMail for email or ElevenLabs for voice, depending on the channel.
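The checklist above can be written down as a small function, which is handy when you want to record why an agent uses a given stack. This is just one way to encode it: `choose_primitives` and its flag names are hypothetical, and the mapping only covers the tools named in this post.

```python
def choose_primitives(
    needs_memory: bool = False,
    takes_actions: bool = False,
    runs_code: bool = False,
    browses_web: bool = False,
    extracts_web_data: bool = False,
    channels: tuple[str, ...] = (),
) -> list[str]:
    """Map requirement flags to the primitives named in the checklist."""
    stack = ["LangChain"]  # the core loop is always present
    if needs_memory:
        stack.append("Mem0")
    if takes_actions:
        stack.append("Composio")
    if runs_code:
        stack.append("E2B or Daytona")
    if browses_web:
        stack.append("Browserbase")
    if extracts_web_data:
        stack.append("Firecrawl")
    # Communication services depend on the channel; unknown channels are ignored.
    channel_services = {"email": "AgentMail", "voice": "ElevenLabs"}
    for channel in channels:
        if channel in channel_services:
            stack.append(channel_services[channel])
    return stack


print(choose_primitives(needs_memory=True, runs_code=True))
```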
A Simple Agent Architecture
Here’s how these pieces fit together:
```
                    User Request
                          │
                          ▼
       ┌─────────────────────────────────────┐
       │           LangChain Agent           │
       │   ┌─────────────────────────────┐   │
       │   │    Planning & Reasoning     │   │
       │   └─────────────────────────────┘   │
       │                  │                  │
       │                  ▼                  │
       │   ┌─────────────────────────────┐   │
       │   │       Tool Selection        │   │
       │   └─────────────────────────────┘   │
       └─────────────────────────────────────┘
                          │
     ┌─────────────┬──────┴──────┬─────────────┐
     ▼             ▼             ▼             ▼
┌──────────┐  ┌──────────┐  ┌──────────┐  ┌──────────┐
│   Mem0   │  │ Composio │  │   E2B    │  │   Exa    │
│  Memory  │  │  Tools   │  │ Sandbox  │  │  Search  │
└──────────┘  └──────────┘  └──────────┘  └──────────┘
```

What I Learned
The key insight is that building agents is now about choosing and combining primitives, not building everything from scratch. Each tool in the stack solves a specific problem.
Start with LangChain for the core loop. Add primitives based on what your agent actually needs to do. Don’t over-engineer - a simple agent might only need LangChain and one or two tools.
The ecosystem is still evolving fast. New primitives appear regularly. But the categories - memory, tools, compute, browsers, communication - stay stable even as specific tools change.
Summary
In this post, I showed the current AI agent tool stack and how different primitives fit together. The key point is that each category solves a specific problem: LangChain for orchestration, Mem0 for memory, Composio for tool access, E2B for execution, and specialized services for browsing, scraping, and communication. Choose primitives based on what your agent actually needs, not because they exist.
Final Words + More Resources
My intention with this article was to help others by sharing my knowledge and experience. If you want to get in touch, you can reach me by email.
Here are also the most important links from this article along with some further resources that will help you in this scope:
- LangChain Documentation
- Mem0 Documentation
- Composio Documentation
- E2B Sandbox
- Browserbase
- Firecrawl
- Reddit Discussion: AI Agent Primitives
Oh, and if you found these resources useful, don’t forget to support me by starring the repo on GitHub!