What Tools and Primitives Exist for Building AI Agents in 2025

Mar 31, 2026

Purpose

When I started building AI agents, I kept asking myself: what building blocks do I actually need? I knew about LangChain, but what about memory? What about giving agents access to real tools? What about letting them browse the web?

After digging through documentation, Reddit threads, and building a few agents myself, I found that the AI agent stack has matured into clear categories. Each category solves a specific problem.

The Stack Overview

Here’s how I think about the agent tool landscape:

┌─────────────────────────────────────────────────────────────────┐
│                        YOUR AGENT                               │
│                    (LangChain Core)                             │
└─────────────────────────────────────────────────────────────────┘
         │               │               │               │
         ▼               ▼               ▼               ▼
  ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐
  │   Memory    │ │    Tools    │ │   Compute   │ │  Browsers   │
  │   (Mem0)    │ │ (Composio)  │ │    (E2B)    │ │(Browserbase)│
  └─────────────┘ └─────────────┘ └─────────────┘ └─────────────┘
         │               │               │               │
         ▼               ▼               ▼               ▼
  ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐
  │   Context   │ │  800+ SaaS  │ │  Sandboxed  │ │     Web     │
  │  Retention  │ │   Actions   │ │  Execution  │ │ Automation  │
  └─────────────┘ └─────────────┘ └─────────────┘ └─────────────┘

Let me walk through each layer.

Core Framework: LangChain

LangChain sits at the center. It handles the agent loop - reasoning, deciding which tools to call, and processing results.

from langchain.agents import create_agent
from langchain_anthropic import ChatAnthropic
from langchain.tools import tool

@tool
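def search_web(query: str) -> str:
    """Search the web for information."""
    # Placeholder body: swap in a real search API call here.
    return f"No search backend configured for: {query}"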

agent = create_agent(
    model=ChatAnthropic(model="claude-sonnet-4-5-20250929"),
    tools=[search_web],
    system_prompt="You are a helpful autonomous agent."
)

The framework handles orchestration. But agents need more than orchestration.
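To make the loop itself concrete, here is a minimal sketch in plain Python with no LangChain involved. Everything in it is an assumption for illustration: `fake_model` is a hypothetical stand-in for an LLM call, and `run_agent` and `TOOLS` are my own names, not LangChain's API. It only shows the shape of "reason, pick a tool, feed the result back".

```python
from typing import Callable

# Hypothetical tool registry: tool name -> plain Python function.
TOOLS: dict[str, Callable[[str], str]] = {
    "search_web": lambda query: f"(stub) results for: {query}",
}

def fake_model(history: list[dict]) -> dict:
    """Stand-in for an LLM call (assumption): ask for one tool call, then answer."""
    last = history[-1]
    if last["role"] == "user":
        return {"type": "tool_call", "tool": "search_web", "input": last["content"]}
    return {"type": "final", "content": f"Answer based on {last['content']}"}

def run_agent(user_input: str, max_steps: int = 5) -> str:
    """Loop: ask the model, run the tool it picks, feed the result back."""
    history = [{"role": "user", "content": user_input}]
    for _ in range(max_steps):
        decision = fake_model(history)
        if decision["type"] == "final":
            return decision["content"]
        result = TOOLS[decision["tool"]](decision["input"])
        history.append({"role": "tool", "content": result})
    return "Stopped: step limit reached."

print(run_agent("agent frameworks"))
```

The `max_steps` cap is there so a model that keeps requesting tools cannot loop forever.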

Memory Layer: Mem0

Agents forget things between sessions. Mem0 gives them persistent memory.

from mem0 import MemoryClient

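# With no api_key argument, the client reads MEM0_API_KEY from the environment.
mem0 = MemoryClient()
# Store a fact for one user, then fetch the memories most relevant to a query.
mem0.add([{"role": "user", "content": "User prefers Italian food"}], user_id="alice")
memories = mem0.search("food recommendations", user_id="alice", limit=3)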

I use this when my agents need to remember user preferences across conversations. Without it, every session starts from scratch.
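To show the idea behind a memory layer without any service, here is a toy sketch. `ToyMemory` is a hypothetical class of my own and is not how Mem0 works internally: it keeps facts per user in a dict and ranks them by naive word overlap, where a real memory service would typically use semantic retrieval.

```python
from collections import defaultdict

class ToyMemory:
    """Per-user memory with naive word-overlap search (illustrative only)."""

    def __init__(self) -> None:
        self._store: dict[str, list[str]] = defaultdict(list)

    def add(self, text: str, user_id: str) -> None:
        self._store[user_id].append(text)

    def search(self, query: str, user_id: str, limit: int = 3) -> list[str]:
        query_words = set(query.lower().split())
        scored = []
        for text in self._store.get(user_id, []):
            # Score = number of words shared between query and stored fact.
            overlap = len(query_words & set(text.lower().split()))
            if overlap:
                scored.append((overlap, text))
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [text for _, text in scored[:limit]]

memory = ToyMemory()
memory.add("User prefers Italian food", user_id="alice")
memory.add("User lives in Berlin", user_id="alice")
print(memory.search("food recommendations", user_id="alice"))
```

Keying everything by `user_id` is the part that carries over: memories from one user never leak into another user's session.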

Tool Integration: Composio

This is where agents get real work done. Composio connects to 800+ SaaS tools - Gmail, Slack, GitHub, Notion, and many more.

from composio import Composio
from composio_openai_agents import OpenAIAgentsProvider

composio = Composio(provider=OpenAIAgentsProvider())
session = composio.create(user_id="user_123")
tools = session.tools()

Instead of writing API integrations myself, I get pre-built actions my agents can call directly.

Secure Execution: E2B and Daytona

Agents sometimes need to run code. E2B and Daytona provide sandboxed environments where agents can execute Python, JavaScript, or shell commands safely.

Sandbox	Use Case	Key Feature
E2B	Code execution	~150ms startup
Daytona	Full dev environment	Persistent workspaces

I use these when building agents that write and run their own code. The sandboxing means a rogue agent can’t delete my actual files.
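For intuition about the interface these services expose, here is a local sketch using only the standard library. The `run_untrusted` helper is a hypothetical name of mine, and this is explicitly not a real sandbox: the child process still has my user's OS permissions, which is exactly the gap that E2B and Daytona close. It only shows the "send code in, get stdout/stderr back, enforce a timeout" shape.

```python
import subprocess
import sys
import tempfile

def run_untrusted(code: str, timeout_s: float = 5.0) -> dict:
    """Run Python code in a child process with a timeout and a throwaway cwd.

    NOT real isolation: the child keeps the parent's OS permissions.
    """
    with tempfile.TemporaryDirectory() as workdir:
        try:
            proc = subprocess.run(
                [sys.executable, "-I", "-c", code],  # -I: isolated mode
                cwd=workdir,
                capture_output=True,
                text=True,
                timeout=timeout_s,
            )
        except subprocess.TimeoutExpired:
            return {"ok": False, "stdout": "", "stderr": "timed out"}
    return {"ok": proc.returncode == 0, "stdout": proc.stdout, "stderr": proc.stderr}

result = run_untrusted("print(1 + 1)")
print(result["stdout"].strip())
```

Returning a plain dict with `ok`, `stdout`, and `stderr` makes it easy to hand the result straight back to the agent as a tool observation.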

Browser Automation: Browserbase and Hyperbrowser

Web browsing agents need a browser they control. Browserbase and Hyperbrowser provide headless browser instances.

Agent Request
      │
      ▼
┌─────────────────┐
│  Browserbase    │
│  Headless Chrome│
└─────────────────┘
      │
      ▼
  Navigate, Click,
  Extract Data

I use these for agents that need to fill forms, scrape dynamic pages, or interact with JavaScript-heavy sites.

Web Data Extraction: Firecrawl

When agents need data from websites, Firecrawl handles the scraping and returns clean markdown.

from firecrawl import FirecrawlApp

app = FirecrawlApp(api_key="your-key")
result = app.scrape_url("https://example.com")
docs = app.crawl_url("https://docs.example.com")

This is cleaner than parsing HTML myself. Firecrawl handles JavaScript rendering, rate limiting, and structured extraction.
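To illustrate what "clean markdown" means here, this is a toy converter built on the standard library's `html.parser`. It is my own sketch, not Firecrawl's method: `ToyMarkdown` and `html_to_markdown` are hypothetical names, it handles only headings, paragraphs, and links, and it does no JavaScript rendering or rate limiting.

```python
from html.parser import HTMLParser

class ToyMarkdown(HTMLParser):
    """Convert a tiny subset of HTML (h1-h3, p, a) to Markdown."""

    def __init__(self) -> None:
        super().__init__()
        self.parts: list[str] = []
        self._skip = 0      # depth inside <script>/<style>
        self._href = None   # current link target, if any

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1
        elif tag in ("h1", "h2", "h3"):
            self.parts.append("\n" + "#" * int(tag[1]) + " ")
        elif tag == "p":
            self.parts.append("\n")
        elif tag == "a":
            self._href = dict(attrs).get("href")
            self.parts.append("[")

    def handle_endtag(self, tag):
        if tag in ("script", "style"):
            self._skip = max(0, self._skip - 1)
        elif tag in ("h1", "h2", "h3", "p"):
            self.parts.append("\n")
        elif tag == "a":
            self.parts.append(f"]({self._href or ''})")
            self._href = None

    def handle_data(self, data):
        # Drop text that sits inside <script> or <style>.
        if not self._skip:
            self.parts.append(data)

def html_to_markdown(html: str) -> str:
    parser = ToyMarkdown()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts).strip()

page = '<h1>Docs</h1><script>x()</script><p>See <a href="https://example.com">the site</a>.</p>'
print(html_to_markdown(page))
```

Even this small subset shows why I would rather not maintain it myself: every real site adds nested tags, entities, and scripts that a production scraper has to handle.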

Communication Channels

Agents can talk to the world through specialized services:

Channel	Service	What It Does
Email	AgentMail	Send/receive emails
Phone	AgentPhone	Voice calls
WhatsApp	Kapso	Messaging
Voice	ElevenLabs, Vapi	Speech synthesis

Search: Exa

Exa provides AI-native web search optimized for agents. It returns relevant results with context, not just keyword matches.

Comparison Table

Here’s how I think about choosing between options:

Primitive	Tools	Best For	Cost Model
Framework	LangChain	Core agent logic	Open source
Memory	Mem0	Context retention	Freemium
SaaS Tools	Composio	External integrations	Usage-based
Sandbox	E2B, Daytona	Code execution	Usage-based
Browser	Browserbase, Hyperbrowser	Web automation	Usage-based
Scraping	Firecrawl	Data extraction	Freemium
Voice	ElevenLabs, Vapi	Speech	Usage-based
Search	Exa	Web search	Usage-based
Email	AgentMail	Email communication	Usage-based

Choosing What You Need

Not every agent needs every primitive. Here’s my decision process:

Does my agent need memory? If it should learn from past interactions, add Mem0.
Should it take actions? If yes, Composio gives access to external tools.
Will it run code? E2B or Daytona for sandboxed execution.
Does it browse the web? Browserbase for interaction, Firecrawl for extraction.
Should it communicate? Pick AgentMail for email or ElevenLabs for voice, depending on the channel.
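The checklist above can be written down as a small helper. This is only a sketch of my own decision process: `AgentNeeds` and `pick_primitives` are hypothetical names, and the mapping simply mirrors the questions in the list.

```python
from dataclasses import dataclass

@dataclass
class AgentNeeds:
    """Hypothetical checklist mirroring the questions above."""
    memory: bool = False
    takes_actions: bool = False
    runs_code: bool = False
    browser_interaction: bool = False
    web_extraction: bool = False
    email: bool = False
    voice: bool = False

def pick_primitives(needs: AgentNeeds) -> list[str]:
    """Start from the core framework and add one primitive per requirement."""
    stack = ["LangChain"]
    if needs.memory:
        stack.append("Mem0")
    if needs.takes_actions:
        stack.append("Composio")
    if needs.runs_code:
        stack.append("E2B or Daytona")
    if needs.browser_interaction:
        stack.append("Browserbase")
    if needs.web_extraction:
        stack.append("Firecrawl")
    if needs.email:
        stack.append("AgentMail")
    if needs.voice:
        stack.append("ElevenLabs")
    return stack

print(pick_primitives(AgentNeeds(memory=True, runs_code=True)))
```

An agent with no requirements set comes back with just the framework, which matches the "don't add a primitive unless you need it" rule.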

A Simple Agent Architecture

Here’s how these pieces fit together:

User Request
      │
      ▼
┌─────────────────────────────────────┐
│         LangChain Agent             │
│  ┌─────────────────────────────┐    │
│  │      Planning & Reasoning   │    │
│  └─────────────────────────────┘    │
│              │                      │
│              ▼                      │
│  ┌─────────────────────────────┐    │
│  │        Tool Selection       │    │
│  └─────────────────────────────┘    │
└─────────────────────────────────────┘
         │
    ┌────┴─────┬──────────┬──────────┐
    ▼          ▼          ▼          ▼
┌────────┐ ┌────────┐ ┌────────┐ ┌────────┐
│  Mem0  │ │Composio│ │  E2B   │ │  Exa   │
│ Memory │ │ Tools  │ │Sandbox │ │ Search │
└────────┘ └────────┘ └────────┘ └────────┘

What I Learned

The key insight is that building agents is now about choosing and combining primitives, not building everything from scratch. Each tool in the stack solves a specific problem.

Start with LangChain for the core loop. Add primitives based on what your agent actually needs to do. Don’t over-engineer - a simple agent might only need LangChain and one or two tools.

The ecosystem is still evolving fast. New primitives appear regularly. But the categories - memory, tools, compute, browsers, communication - stay stable even as specific tools change.

Summary

In this post, I showed the current AI agent tool stack and how different primitives fit together. The key point is that each category solves a specific problem: LangChain for orchestration, Mem0 for memory, Composio for tool access, E2B for execution, and specialized services for browsing, scraping, and communication. Choose primitives based on what your agent actually needs, not because they exist.

Final Words + More Resources

My intention with this article was to help others by sharing my knowledge and experience. If you want to get in touch, you can contact me by email.

Here are the most important links from this article, along with some further resources on this topic:

👨‍💻 LangChain Documentation
👨‍💻 Mem0 Documentation
👨‍💻 Composio Documentation
👨‍💻 E2B Sandbox
👨‍💻 Browserbase
👨‍💻 Firecrawl
👨‍💻 Reddit Discussion: AI Agent Primitives

Oh, and if you found these resources useful, don’t forget to support me by starring the repo on GitHub!