
What Tools and Primitives Exist for Building AI Agents in 2025

Purpose

When I started building AI agents, I kept asking myself: what building blocks do I actually need? I knew about LangChain, but what about memory? What about giving agents access to real tools? What about letting them browse the web?

After digging through documentation, Reddit threads, and building a few agents myself, I found that the AI agent stack has matured into clear categories. Each category solves a specific problem.

The Stack Overview

Here’s how I think about the agent tool landscape:

┌────────────────────────────────────────────────────────────────┐
│                           YOUR AGENT                           │
│                        (LangChain Core)                        │
└────────────────────────────────────────────────────────────────┘
       │                │                │                │
       ▼                ▼                ▼                ▼
┌─────────────┐  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐
│   Memory    │  │    Tools    │  │   Compute   │  │  Browsers   │
│   (Mem0)    │  │ (Composio)  │  │    (E2B)    │  │(Browserbase)│
└─────────────┘  └─────────────┘  └─────────────┘  └─────────────┘
       │                │                │                │
       ▼                ▼                ▼                ▼
┌─────────────┐  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐
│   Context   │  │  800+ SaaS  │  │  Sandboxed  │  │     Web     │
│  Retention  │  │   Actions   │  │  Execution  │  │ Automation  │
└─────────────┘  └─────────────┘  └─────────────┘  └─────────────┘

Let me walk through each layer.

Core Framework: LangChain

LangChain sits at the center. It handles the agent loop - reasoning, deciding which tools to call, and processing results.

agent.py
from langchain.agents import create_agent
from langchain.tools import tool
from langchain_anthropic import ChatAnthropic


@tool
def search_web(query: str) -> str:
    """Search the web for information."""
    # Placeholder: wire in a real search backend here.
    return f"No search backend configured for: {query}"


agent = create_agent(
    model=ChatAnthropic(model="claude-sonnet-4-5-20250929"),
    tools=[search_web],
    system_prompt="You are a helpful autonomous agent.",
)
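
To make the loop itself concrete, here is a minimal sketch in plain Python of the reason, call, observe cycle that a framework like LangChain automates. It is not LangChain's implementation: `fake_model`, `TOOLS`, and `run_agent` are hypothetical names, and the "model" is a scripted stand-in for a real LLM call.

```python
from typing import Callable


def search_web(query: str) -> str:
    """Stand-in for a real search call."""
    return f"Top result for '{query}'"


# Hypothetical tool registry: tool name -> callable.
TOOLS: dict[str, Callable[[str], str]] = {"search_web": search_web}


def fake_model(messages: list[dict]) -> dict:
    """Scripted stand-in for an LLM: request a tool once, then answer."""
    last = messages[-1]
    if last["role"] == "user":
        return {"type": "tool_call", "name": "search_web", "input": last["content"]}
    return {"type": "final", "content": f"Answer based on: {last['content']}"}


def run_agent(user_input: str, max_steps: int = 5) -> str:
    """Ask the model, run the tool it picks, feed the result back, repeat."""
    messages = [{"role": "user", "content": user_input}]
    for _ in range(max_steps):
        decision = fake_model(messages)
        if decision["type"] == "final":
            return decision["content"]
        tool = TOOLS[decision["name"]]
        result = tool(decision["input"])
        messages.append({"role": "tool", "content": result})
    return "Stopped: step limit reached"


answer = run_agent("agent frameworks")
print(answer)
```

The step limit is a design choice worth keeping in any real loop: it stops an agent that keeps calling tools without ever reaching an answer.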

The framework handles orchestration. But agents need more than orchestration.

Memory Layer: Mem0

Agents forget things between sessions. Mem0 gives them persistent memory.

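memory.py
from mem0 import MemoryClient

# The client picks up its API key from the MEM0_API_KEY environment variable.
mem0 = MemoryClient()

# Store a fact from the conversation under this user's ID.
mem0.add([{"role": "user", "content": "User prefers Italian food"}], user_id="alice")

# In a later session, retrieve the memories most relevant to a query.
memories = mem0.search("food recommendations", user_id="alice", limit=3)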

I use this when my agents need to remember user preferences across conversations. Without it, every session starts from scratch.
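
To show what "persistent" means here, the sketch below keeps facts in a JSON file so that a second session can recall what the first one stored. It is a toy under stated assumptions, not how Mem0 works internally: `FileMemory` is a hypothetical class, and its search ranks facts by shared words rather than by meaning.

```python
import json
import tempfile
from pathlib import Path


class FileMemory:
    """Hypothetical persistent memory: one JSON file, facts grouped by user."""

    def __init__(self, path: Path) -> None:
        self.path = path
        self.data: dict[str, list[str]] = (
            json.loads(path.read_text()) if path.exists() else {}
        )

    def add(self, fact: str, user_id: str) -> None:
        self.data.setdefault(user_id, []).append(fact)
        self.path.write_text(json.dumps(self.data))  # persist on every write

    def search(self, query: str, user_id: str, limit: int = 3) -> list[str]:
        """Rank facts by words shared with the query (naive, not semantic)."""
        words = set(query.lower().split())
        scored = [
            (len(words & set(fact.lower().split())), fact)
            for fact in self.data.get(user_id, [])
        ]
        hits = [fact for score, fact in sorted(scored, reverse=True) if score > 0]
        return hits[:limit]


store_path = Path(tempfile.mkdtemp()) / "memory.json"

# Session 1: the agent learns a preference.
FileMemory(store_path).add("User prefers Italian food", user_id="alice")

# Session 2: a fresh object (a new process in practice) still recalls it.
recalled = FileMemory(store_path).search("food recommendations", user_id="alice")
print(recalled)
```

Keying every read and write by `user_id` is what keeps one user's memories from leaking into another user's session.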

Tool Integration: Composio

This is where agents get real work done. Composio connects to 800+ SaaS tools - Gmail, Slack, GitHub, Notion, and many more.

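tools.py
from composio import Composio
from composio_openai_agents import OpenAIAgentsProvider

# The provider decides which agent framework the tools are formatted for.
composio = Composio(provider=OpenAIAgentsProvider())

# Create a session scoped to one end user, then fetch the tools it exposes.
session = composio.create(user_id="user_123")
tools = session.tools()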

Instead of writing API integrations myself, I get pre-built actions my agents can call directly.
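
Under the hood, a "pre-built action" is roughly a named function plus a schema the model can read. The sketch below illustrates that idea with a tiny registry and dispatcher. It is not Composio's API: `Action`, `REGISTRY`, `execute`, and `send_slack_message` are hypothetical, and the handler only echoes its input instead of calling Slack.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class Action:
    """Hypothetical pre-built action: a schema for the model plus a handler."""
    name: str
    description: str
    parameters: dict[str, str]  # parameter name -> type name
    handler: Callable[..., Any]


def send_slack_message(channel: str, text: str) -> dict:
    # A real integration would call the Slack API here; this only echoes.
    return {"ok": True, "channel": channel, "text": text}


REGISTRY = {
    "slack_send_message": Action(
        name="slack_send_message",
        description="Post a message to a Slack channel.",
        parameters={"channel": "string", "text": "string"},
        handler=send_slack_message,
    )
}


def execute(name: str, arguments: dict[str, Any]) -> Any:
    """Check a model-issued tool call against the schema, then run it."""
    action = REGISTRY[name]
    missing = set(action.parameters) - set(arguments)
    if missing:
        raise ValueError(f"Missing arguments: {sorted(missing)}")
    return action.handler(**arguments)


result = execute("slack_send_message", {"channel": "#general", "text": "Deploy done"})
print(result)
```

Validating arguments before dispatch matters because the model, not your code, writes the call.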

Secure Execution: E2B and Daytona

Agents sometimes need to run code. E2B and Daytona provide sandboxed environments where agents can execute Python, JavaScript, or shell commands safely.

Sandbox | Use Case             | Key Feature
--------|----------------------|----------------------
E2B     | Code execution       | 50ms startup
Daytona | Full dev environment | Persistent workspaces

I use these when building agents that write and run their own code. The sandboxing means a rogue agent can’t delete my actual files.
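
For a feel of the basic pattern, the sketch below runs agent-written code in a separate interpreter with a scratch directory and a time limit. This is an illustration only and not a substitute for E2B or Daytona: `run_untrusted` is a hypothetical helper, and a child process is not a security boundary, since it can still reach the network and any file the current user can read.

```python
import subprocess
import sys
import tempfile


def run_untrusted(code: str, timeout_s: float = 5.0) -> subprocess.CompletedProcess:
    """Run code in a child interpreter with a scratch cwd and a time limit.

    Process isolation only, NOT a sandbox: real sandboxes add VM or
    container boundaries on top of this.
    """
    with tempfile.TemporaryDirectory() as scratch:
        return subprocess.run(
            [sys.executable, "-I", "-c", code],  # -I: isolated mode
            cwd=scratch,             # relative file writes land in the scratch dir
            capture_output=True,     # collect stdout/stderr for the agent to read
            text=True,
            timeout=timeout_s,       # raises subprocess.TimeoutExpired on overrun
        )


result = run_untrusted("print(2 + 2)")
print(result.stdout.strip())
```

The captured `stdout`, `stderr`, and `returncode` are what you would hand back to the agent as the tool result.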

Browser Automation: Browserbase and Hyperbrowser

Web browsing agents need a browser they control. Browserbase and Hyperbrowser provide headless browser instances.

   Agent Request
         │
         ▼
┌─────────────────┐
│   Browserbase   │
│ Headless Chrome │
└─────────────────┘
         │
         ▼
 Navigate, Click,
   Extract Data

I use these for agents that need to fill forms, scrape dynamic pages, or interact with JavaScript-heavy sites.

Web Data Extraction: Firecrawl

When agents need data from websites, Firecrawl handles the scraping and returns clean markdown.

scrape.py
from firecrawl import FirecrawlApp

app = FirecrawlApp(api_key="your-key")

# Scrape a single page.
result = app.scrape_url("https://example.com")

# Crawl a site starting from this URL.
docs = app.crawl_url("https://docs.example.com")

This is cleaner than parsing HTML myself. Firecrawl handles JavaScript rendering, rate limiting, and structured extraction.
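
To illustrate what "clean markdown" means in practice, here is a small sketch using only Python's standard library. It is not Firecrawl's implementation and does none of the JavaScript rendering or rate limiting: `MarkdownExtractor` is a hypothetical class that keeps headings, paragraphs, and links while dropping scripts, styles, and navigation.

```python
import re
from html.parser import HTMLParser
from typing import Optional


class MarkdownExtractor(HTMLParser):
    """Hypothetical minimal converter: headings, paragraphs, and links only."""

    SKIP = {"script", "style", "nav", "footer"}

    def __init__(self) -> None:
        super().__init__()
        self.parts: list[str] = []
        self.skip_depth = 0  # > 0 while inside a skipped element
        self.href: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self.skip_depth += 1
            return
        if self.skip_depth:
            return
        if tag in ("h1", "h2", "h3"):
            self.parts.append("\n\n" + "#" * int(tag[1]) + " ")
        elif tag == "p":
            self.parts.append("\n\n")
        elif tag == "a":
            self.href = dict(attrs).get("href")
            self.parts.append("[")

    def handle_endtag(self, tag):
        if tag in self.SKIP:
            self.skip_depth = max(0, self.skip_depth - 1)
            return
        if self.skip_depth:
            return
        if tag == "a":
            self.parts.append(f"]({self.href})")
            self.href = None

    def handle_data(self, data):
        if self.skip_depth == 0 and data.strip():
            self.parts.append(re.sub(r"\s+", " ", data))  # collapse whitespace

    def markdown(self) -> str:
        return "".join(self.parts).strip()


html = """
<html><head><style>body{color:red}</style></head>
<body>
<nav><a href="/home">Home</a></nav>
<h1>Pricing</h1>
<p>See the <a href="https://example.com/docs">docs</a> for details.</p>
<script>track()</script>
</body></html>
"""

extractor = MarkdownExtractor()
extractor.feed(html)
print(extractor.markdown())
```

Even this toy version shows why markdown suits agents: the navigation, styling, and tracking code disappear, and what remains costs far fewer tokens than raw HTML.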

Communication Channels

Agents can talk to the world through specialized services:

Channel  | Service          | What It Does
---------|------------------|--------------------
Email    | AgentMail        | Send/receive emails
Phone    | AgentPhone       | Voice calls
WhatsApp | Kapso            | Messaging
Voice    | ElevenLabs, Vapi | Speech synthesis

Search: Exa

Exa provides AI-native web search optimized for agents. It returns relevant results with context, not just keyword matches.

Comparison Table

Here’s how I think about choosing between options:

Primitive  | Tools                     | Best For              | Cost Model
-----------|---------------------------|-----------------------|------------
Framework  | LangChain                 | Core agent logic      | Open source
Memory     | Mem0                      | Context retention     | Freemium
SaaS Tools | Composio                  | External integrations | Usage-based
Sandbox    | E2B, Daytona              | Code execution        | Usage-based
Browser    | Browserbase, Hyperbrowser | Web automation        | Usage-based
Scraping   | Firecrawl                 | Data extraction       | Freemium
Voice      | ElevenLabs, Vapi          | Speech                | Usage-based
Search     | Exa                       | Web search            | Usage-based
Email      | AgentMail                 | Email communication   | Usage-based

Choosing What You Need

Not every agent needs every primitive. Here’s my decision process:

  1. Does my agent need memory? If it should learn from past interactions, add Mem0.
  2. Should it take actions in other apps? If yes, Composio gives it access to external tools.
  3. Will it run code? Use E2B or Daytona for sandboxed execution.
  4. Does it browse the web? Use Browserbase for interaction and Firecrawl for extraction.
  5. Should it communicate? Pick AgentMail, ElevenLabs, or another service based on the channel.
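
The checklist above can be written down as a small function, which is handy when you want the stack decision to live next to the code. This is only a sketch of my own decision process: `AgentNeeds`, `CHANNEL_SERVICES`, and `choose_primitives` are hypothetical names, and the tool choices simply mirror the ones named in this post.

```python
from dataclasses import dataclass


@dataclass
class AgentNeeds:
    """Hypothetical checklist mirroring the five questions above."""
    remembers_users: bool = False
    takes_actions: bool = False
    runs_code: bool = False
    browses_web: bool = False
    extracts_web_data: bool = False
    channels: tuple[str, ...] = ()  # e.g. ("email", "voice")


# Channel -> service, taken from the communication table in this post.
CHANNEL_SERVICES = {
    "email": "AgentMail",
    "phone": "AgentPhone",
    "whatsapp": "Kapso",
    "voice": "ElevenLabs or Vapi",
}


def choose_primitives(needs: AgentNeeds) -> list[str]:
    stack = ["LangChain"]  # the core loop is always present
    if needs.remembers_users:
        stack.append("Mem0")
    if needs.takes_actions:
        stack.append("Composio")
    if needs.runs_code:
        stack.append("E2B or Daytona")
    if needs.browses_web:
        stack.append("Browserbase")
    if needs.extracts_web_data:
        stack.append("Firecrawl")
    for channel in needs.channels:
        stack.append(CHANNEL_SERVICES[channel])  # KeyError on an unknown channel
    return stack


support_bot = AgentNeeds(remembers_users=True, channels=("email",))
print(choose_primitives(support_bot))
```

An agent with no special needs comes back as just `["LangChain"]`, which matches the advice to start small.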

A Simple Agent Architecture

Here’s how these pieces fit together:

               User Request
                     │
                     ▼
┌─────────────────────────────────────────┐
│             LangChain Agent             │
│  ┌───────────────────────────────────┐  │
│  │       Planning & Reasoning        │  │
│  └───────────────────────────────────┘  │
│                    │                    │
│                    ▼                    │
│  ┌───────────────────────────────────┐  │
│  │          Tool Selection           │  │
│  └───────────────────────────────────┘  │
└─────────────────────────────────────────┘
    ┌──────────┬─────┴────┬──────────┐
    ▼          ▼          ▼          ▼
┌────────┐ ┌────────┐ ┌────────┐ ┌────────┐
│  Mem0  │ │Composio│ │  E2B   │ │  Exa   │
│ Memory │ │ Tools  │ │Sandbox │ │ Search │
└────────┘ └────────┘ └────────┘ └────────┘

What I Learned

The key insight is that building agents is now about choosing and combining primitives, not building everything from scratch. Each tool in the stack solves a specific problem.

Start with LangChain for the core loop. Add primitives based on what your agent actually needs to do. Don’t over-engineer - a simple agent might only need LangChain and one or two tools.

The ecosystem is still evolving fast. New primitives appear regularly. But the categories - memory, tools, compute, browsers, communication - stay stable even as specific tools change.

Summary

In this post, I showed the current AI agent tool stack and how different primitives fit together. The key point is that each category solves a specific problem: LangChain for orchestration, Mem0 for memory, Composio for tool access, E2B for execution, and specialized services for browsing, scraping, and communication. Choose primitives based on what your agent actually needs, not because they exist.

Final Words + More Resources

My intention with this article was to help others by sharing my knowledge and experience. If you want to contact me, you can reach me by email.

Here are also the most important links from this article along with some further resources that will help you in this scope:

Oh, and if you found these resources useful, don’t forget to support me by starring the repo on GitHub!
