
LangGraph vs CrewAI vs Simple API: When to Use Each?

Problem

I was building an email parsing system for a client. My first instinct? Install LangGraph, set up a multi-node workflow, create separate agents for extraction, validation, and formatting.

Two weeks later, I had a complex system that cost $0.15 per email and took 8 seconds to process. Then I tried something embarrassing - a single API call with a good prompt.

simple_solution.py
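from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment
# prompt_with_examples (defined elsewhere): instructions plus a few worked examples
response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": prompt_with_examples}],
)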

Same accuracy. $0.03 per email. 3 seconds latency. I had over-engineered from the start.

The Framework Trap

A Reddit thread from someone who built 25+ production agents confirmed what I experienced:

“Before you reach for CrewAI or LangGraph, ask yourself: Could a single API call with a really good prompt solve 80% of this problem?”

The profitable AI systems they built all use the same stack:

production_stack.txt
OpenAI API + n8n (or webhook/cron) + Supabase for persistence
No frameworks. No orchestration. No complex chains.
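
To show how little glue that stack needs, here is a minimal sketch. It is not the original systems' code: Python's built-in `sqlite3` stands in for Supabase, the LLM call is passed in as a plain function, and `init_db`, `handle_email`, `fake_extract`, and the table layout are hypothetical names chosen for this example.

```python
import json
import sqlite3
from typing import Callable


def init_db(conn: sqlite3.Connection) -> None:
    """Create the single table this sketch persists results into."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS parsed_emails ("
        "id INTEGER PRIMARY KEY, raw TEXT NOT NULL, parsed TEXT NOT NULL)"
    )


def handle_email(raw: str, extract: Callable[[str], dict], conn: sqlite3.Connection) -> int:
    """Webhook/cron entry point: one extraction call, one insert."""
    parsed = extract(raw)  # in production this would be the single LLM call
    cur = conn.execute(
        "INSERT INTO parsed_emails (raw, parsed) VALUES (?, ?)",
        (raw, json.dumps(parsed)),
    )
    conn.commit()
    return cur.lastrowid


def fake_extract(raw: str) -> dict:
    """Hypothetical stand-in for the LLM call so the sketch runs offline."""
    return {"has_invoice": "invoice" in raw.lower()}


conn = sqlite3.connect(":memory:")  # assumption: in-memory DB instead of Supabase
init_db(conn)
row_id = handle_email("Invoice #12 attached", fake_extract, conn)
```

Because the extraction step is just a function argument, swapping the stub for a real API call does not change the surrounding code.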

Real examples:

| System               | Revenue    | Architecture |
|----------------------|------------|--------------|
| Email-to-CRM updater | $200/month | Simple API   |
| Resume parser        | $50/seat   | Simple API   |
| Invoice extractor    | $500/month | Simple API   |

Meanwhile, developers report the same pattern with frameworks:

framework_headaches.txt
"I had a whole planner-executor-reviewer pipeline going and spent
more time debugging agent handoffs than the actual task logic."
"Ditched it for one agent with a really detailed spec file and
it just works."
"When I do need parallelism I run completely independent agents
that share nothing except a lock file."
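
The lock-file idea in the last quote can be sketched with the standard library alone. This shows one common way to do it (atomic file creation with `O_EXCL`), which is not necessarily what the commenter used; `try_acquire`, `release`, and the lock path are hypothetical names.

```python
import os
import tempfile


def try_acquire(lock_path: str) -> bool:
    """Atomically create the lock file; return False if another agent holds it."""
    try:
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    with os.fdopen(fd, "w") as f:
        f.write(str(os.getpid()))  # record the owner for debugging
    return True


def release(lock_path: str) -> None:
    """Remove the lock file so the next agent can proceed."""
    os.remove(lock_path)


lock_path = os.path.join(tempfile.mkdtemp(), "agents.lock")
first = try_acquire(lock_path)   # first agent gets the lock
second = try_acquire(lock_path)  # second agent is refused while it is held
release(lock_path)
third = try_acquire(lock_path)   # lock is available again after release
release(lock_path)
```

A crashed agent would leave a stale lock behind in this sketch, so a real setup would likely need a timeout or an owner check on top of it.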

Why We Reach for Frameworks

I fell into these traps:

  1. Marketing makes orchestration feel essential - Every framework demo shows complex multi-agent setups
  2. Complex feels more “professional” - Simple solutions seem amateur
  3. FOMO on features - What if I need checkpointing later?
  4. Research paper envy - Academic papers showcase multi-agent patterns

The result:

hidden_costs.txt
- Framework lock-in and dependency management
- Debugging agent handoffs instead of task logic
- Hidden costs from multiple LLM calls per request
- Latency multiplied by orchestration layers
- Premature complexity before understanding requirements

Decision Framework

After testing all three approaches, I built this decision tree:

decision_tree.txt
START: What does your task need?

1. Clear input/output transformation?
   +-- YES --> Can examples demonstrate expected behavior?
   |           +-- YES --> SIMPLE API (80% of cases)
   |           +-- NO --> Does task need state management?
   |                      +-- YES --> LANGGRAPH
   |                      +-- NO --> SIMPLE API with better prompt

2. Complex branching logic?
   +-- YES --> LANGGRAPH (conditional execution paths)

3. Distinct agent personas with specific roles?
   +-- YES --> Does role separation add genuine value?
               +-- YES --> CREWAI
               +-- NO --> Try single agent with tool use first

4. Parallel processing with shared state?
   +-- YES --> LANGGRAPH (parallel nodes with synchronization)
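
The same tree can be written as a small function, which makes the order of the questions explicit. This is a sketch under two assumptions: the questions are checked top to bottom with the first match winning, and a task that matches nothing falls back to the simple API. `TaskProfile`, its field names, and the returned labels are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class TaskProfile:
    """Answers to the questions in the decision tree (hypothetical field names)."""
    clear_io: bool = False
    examples_suffice: bool = False
    needs_state: bool = False
    complex_branching: bool = False
    distinct_roles: bool = False
    roles_add_value: bool = False
    parallel_shared_state: bool = False


def choose_approach(task: TaskProfile) -> str:
    """Walk the tree top to bottom and return the first matching recommendation."""
    if task.clear_io:
        if task.examples_suffice:
            return "simple_api"
        return "langgraph" if task.needs_state else "simple_api_better_prompt"
    if task.complex_branching:
        return "langgraph"
    if task.distinct_roles:
        return "crewai" if task.roles_add_value else "single_agent_with_tools"
    if task.parallel_shared_state:
        return "langgraph"
    # Assumption: default to the simplest option when no question matches
    return "simple_api"


email_parser = TaskProfile(clear_io=True, examples_suffice=True)
recommendation = choose_approach(email_parser)
```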

Simple API (Start Here - 80% of Cases)

When a single LLM call with a good prompt achieves the goal, frameworks are overhead.

Use simple API when:

  • Task has clear input/output transformation
  • Examples can demonstrate expected behavior
  • Response time matters (< 5 seconds target)
  • Cost efficiency is important
  • Task doesn’t require state management
simple_api.py
from openai import OpenAI
from pydantic import BaseModel


class ContentAnalysis(BaseModel):
    sentiment: str
    topics: list[str]
    action_items: list[str]
    confidence: float


def analyze_content(text: str) -> ContentAnalysis:
    """Simple API call - no frameworks needed"""
    client = OpenAI()
    # Doubled braces in the example output are literal braces inside the f-string
    prompt = f"""
    Analyze the content and return JSON with:
    - sentiment: positive/negative/neutral
    - topics: list of main topics
    - action_items: list of action items mentioned
    - confidence: 0.0 to 1.0
    Examples:
    Text: "We need to schedule a meeting about Q4 targets. Team morale is high."
    Output: {{"sentiment": "positive", "topics": ["Q4 targets", "meeting"], "action_items": ["schedule meeting"], "confidence": 0.95}}
    Text: {text}
    Output:
    """
    response = client.chat.completions.create(
        # JSON mode needs a model that supports response_format; the base
        # "gpt-4" snapshot may reject it, so use a newer GPT-4 variant if it does
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}],
        response_format={"type": "json_object"},
    )
    # Raises pydantic.ValidationError if the reply does not match the schema
    return ContentAnalysis.model_validate_json(response.choices[0].message.content)


# Usage: One function, one call, done
# email_content is the raw email body as a string
result = analyze_content(email_content)
# Cost: ~$0.03, Latency: ~3 seconds

What this approach gives you:

simple_benefits.txt
Single API call:
- 1 LLM call
- ~2,000 tokens
- $0.03 per request
- 2-4 seconds latency
- Easy to debug
- Easy to test
- No framework lock-in
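
"Easy to test" is worth making concrete: the step that parses the model's reply is a pure function, so it can be checked against canned replies without calling the API. The sketch below deliberately swaps Pydantic for a standard-library dataclass so it runs with no dependencies; `Analysis`, `parse_analysis`, and the range checks are illustrative assumptions, not part of the code above.

```python
import json
from dataclasses import dataclass

ALLOWED_SENTIMENTS = {"positive", "negative", "neutral"}


@dataclass
class Analysis:
    sentiment: str
    topics: list
    action_items: list
    confidence: float


def parse_analysis(raw: str) -> Analysis:
    """Parse the model's JSON reply and reject values outside the prompt's contract."""
    data = json.loads(raw)  # raises json.JSONDecodeError on malformed output
    analysis = Analysis(
        sentiment=str(data["sentiment"]).lower(),
        topics=list(data["topics"]),
        action_items=list(data["action_items"]),
        confidence=float(data["confidence"]),
    )
    if analysis.sentiment not in ALLOWED_SENTIMENTS:
        raise ValueError(f"unexpected sentiment: {analysis.sentiment!r}")
    if not 0.0 <= analysis.confidence <= 1.0:
        raise ValueError(f"confidence out of range: {analysis.confidence}")
    return analysis


# A canned reply in the same shape as the prompt's example output
sample = (
    '{"sentiment": "positive", "topics": ["Q4 targets"], '
    '"action_items": ["schedule meeting"], "confidence": 0.95}'
)
parsed = parse_analysis(sample)
```

A handful of canned replies like `sample`, including malformed ones, covers most of the failure modes before any money is spent on real calls.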

LangGraph (When You Need Stateful Workflows)

LangGraph shines when workflows have complex branching logic or need state management across multiple steps.

Use LangGraph when:

  • Workflow has conditional execution paths
  • State management across multiple steps required
  • Need for checkpointing/resumable workflows
  • Parallel execution with synchronization points
  • Fine-grained control over agent flow
langgraph_workflow.py
from typing import TypedDict

from langchain_openai import ChatOpenAI
from langgraph.graph import StateGraph, END  # both are exported from langgraph.graph


class WorkflowState(TypedDict):
    input: str
    research_result: str
    analysis_result: str
    needs_review: bool
    final_output: str


def research_node(state: WorkflowState) -> dict:
    """Research phase - gathers information"""
    llm = ChatOpenAI(model="gpt-4")
    result = llm.invoke(f"Research: {state['input']}")
    return {"research_result": result.content}


def analyze_node(state: WorkflowState) -> dict:
    """Analysis phase - processes research"""
    llm = ChatOpenAI(model="gpt-4")
    result = llm.invoke(f"Analyze: {state['research_result']}")
    # Crude heuristic: flag for review when the analysis mentions "complex"
    needs_review = "complex" in result.content.lower()
    return {"analysis_result": result.content, "needs_review": needs_review}


def review_node(state: WorkflowState) -> dict:
    """Optional review - only for complex cases"""
    llm = ChatOpenAI(model="gpt-4")
    result = llm.invoke(f"Review: {state['analysis_result']}")
    return {"final_output": result.content}


def finalize_node(state: WorkflowState) -> dict:
    """Direct to final - for simple cases"""
    return {"final_output": state["analysis_result"]}


# Build the graph with conditional logic
workflow = StateGraph(WorkflowState)
workflow.add_node("research", research_node)
workflow.add_node("analyze", analyze_node)
workflow.add_node("review", review_node)
workflow.add_node("finalize", finalize_node)
workflow.set_entry_point("research")
workflow.add_edge("research", "analyze")

# Conditional branching based on state
workflow.add_conditional_edges(
    "analyze",
    lambda state: "review" if state["needs_review"] else "finalize",
    {"review": "review", "finalize": "finalize"},
)
workflow.add_edge("review", END)
workflow.add_edge("finalize", END)
app = workflow.compile()
# Cost: ~$0.10-0.15, Latency: ~8-12 seconds

The branching logic is the key feature:

branching_diagram.txt
START
|
v
research
|
v
analyze
|
+-- needs_review=true --> review --> END
|
+-- needs_review=false --> finalize --> END

This is harder to express with simple API calls, and LangGraph provides the structure to manage it cleanly.
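
The routing decision itself is an ordinary Python function of the state, so it can be pulled out of the lambda and unit-tested without an LLM call or a LangGraph import. A minimal sketch, where `route_after_analysis` is a hypothetical name for the function you would pass to `add_conditional_edges`:

```python
from typing import TypedDict


class WorkflowState(TypedDict, total=False):
    analysis_result: str
    needs_review: bool


def route_after_analysis(state: WorkflowState) -> str:
    """Return the name of the next node for the conditional edge.

    Unlike the inline lambda, a missing needs_review key falls through
    to "finalize" instead of raising KeyError (a choice made for this sketch).
    """
    return "review" if state.get("needs_review") else "finalize"


complex_case = route_after_analysis({"analysis_result": "...", "needs_review": True})
simple_case = route_after_analysis({"analysis_result": "...", "needs_review": False})
```

Keeping the router as a named function means branching bugs show up in a fast unit test rather than in an 8-12 second end-to-end run.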

CrewAI (When You Need Role-Based Collaboration)

CrewAI is designed for scenarios where distinct agent personas with specific roles add genuine value.

Use CrewAI when:

  • Need distinct agent personas with specific roles
  • Collaborative problem-solving benefits from role separation
  • Each “crew member” has specialized tools/knowledge
  • Task naturally decomposes into expert domains
  • Want human-like team dynamics (researcher, writer, reviewer)
crewai_example.py
from crewai import Agent, Task, Crew
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4")

# Define agents with specific roles and personas
researcher = Agent(
    role="Research Specialist",
    goal="Gather comprehensive information on the topic",
    backstory="Expert researcher with 10 years of experience",
    llm=llm,
    verbose=True,
)
writer = Agent(
    role="Content Writer",
    goal="Create engaging, well-structured content",
    backstory="Professional writer specializing in technical content",
    llm=llm,
    verbose=True,
)
editor = Agent(
    role="Senior Editor",
    goal="Ensure quality, accuracy, and consistency",
    backstory="Editor with eye for detail and quality standards",
    llm=llm,
    verbose=True,
)

# Define tasks for each agent
research_task = Task(
    description="Research the topic: {topic}",
    agent=researcher,
    expected_output="Comprehensive research notes",
)
writing_task = Task(
    description="Write article based on research",
    agent=writer,
    expected_output="Draft article",
)
editing_task = Task(
    description="Edit and finalize the article",
    agent=editor,
    expected_output="Final polished article",
)

# Assemble the crew
crew = Crew(
    agents=[researcher, writer, editor],
    tasks=[research_task, writing_task, editing_task],
    verbose=True,
)
result = crew.kickoff(inputs={"topic": "AI Agent Frameworks"})
# Cost: ~$0.15-0.20, Latency: ~10-15+ seconds

The role separation can help when:

role_separation_benefits.txt
- Researcher focuses on gathering facts (different system prompt)
- Writer focuses on narrative flow (different tools/examples)
- Editor focuses on quality gates (different evaluation criteria)
Each agent has:
- Distinct backstory and expertise
- Specific tools for their domain
- Clear output expectations

But I’ve found this often overcomplicates tasks that a single agent with a comprehensive prompt could handle.

Cost and Latency Comparison

I tracked actual production costs:

cost_comparison.txt
Per request comparison:
| Approach | LLM Calls | Tokens | Cost | Latency |
|---------------|-----------|---------|---------|------------|
| Simple API | 1 | ~2,000 | $0.03 | 2-4 sec |
| LangGraph (3) | 3+ | ~8,000 | $0.12 | 6-12 sec |
| CrewAI (3) | 3+ | ~10,000 | $0.15 | 10-15+ sec |
Monthly cost (1,000 requests/day, 30 days):
- Simple API: ~$900/month
- LangGraph: ~$3,600/month
- CrewAI: ~$4,500/month
Annual difference vs. Simple API: ~$32,400 (LangGraph) to ~$43,200 (CrewAI)
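
The monthly figures are just per-request cost times requests per day times days. Here is a small sketch of that arithmetic using the per-request costs from the table; the 30-day month and the function and variable names are assumptions for illustration.

```python
def monthly_cost(cost_per_request: float, requests_per_day: int, days: int = 30) -> float:
    """Monthly spend = per-request cost x daily volume x days in the month."""
    return cost_per_request * requests_per_day * days


# Per-request costs taken from the comparison table above
PER_REQUEST = {"simple_api": 0.03, "langgraph": 0.12, "crewai": 0.15}

monthly = {name: round(monthly_cost(cost, 1000), 2) for name, cost in PER_REQUEST.items()}

# Extra annual spend of each framework relative to the simple API
annual_extra = {
    name: round((monthly[name] - monthly["simple_api"]) * 12, 2)
    for name in ("langgraph", "crewai")
}

for name, cost in monthly.items():
    print(f"{name}: ${cost:,.0f}/month")
```

Running the same calculation with your own request volume before picking an architecture takes a minute and avoids the surprise on the monthly bill.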

Development overhead also differs:

development_overhead.txt
| Approach | Setup Complexity | Debugging Difficulty |
|-------------|------------------|----------------------|
| Simple API | Low | Low |
| LangGraph | Medium | Medium |
| CrewAI | Medium-High | High |
With CrewAI, I spent more time debugging agent handoffs than the actual task logic.

Common Mistakes I Made

Mistake 1: Framework-First Thinking

wrong_approach.py
# WRONG: Choose framework first, then fit the problem
from crewai import Crew
crew = Crew(agents=[...], tasks=[...])
result = crew.kickoff()
# RIGHT: Solve the problem first, add framework if needed
response = client.chat.completions.create(...)
# If that works, ship it. Only add complexity when you hit walls.

Mistake 2: Not Calculating Costs

I didn’t realize my 3-agent system cost 5x more per request until I saw the monthly bill.

Mistake 3: Ignoring Latency

latency_impact.txt
User waiting for email analysis:
- Simple API: 3 seconds (feels instant)
- LangGraph: 10 seconds (feels slow)
- CrewAI: 15+ seconds (user might refresh)
Latency affects user experience and conversion rates.

Mistake 4: Copying Research Paper Patterns

Academic papers showcase multi-agent architectures because that’s what gets published. Production systems need reliability, not novelty.

academic_vs_production.txt
Academic paper priorities:
- Novel architecture
- Complex agent interactions
- Publishable contribution
Production priorities:
- Reliable execution
- Minimal failure points
- Cost efficiency

When Frameworks Are Worth It

I’m not saying frameworks are always wrong. They’re just overused.

LangGraph justified:

  • Customer support with conditional escalation paths
  • Multi-step research workflows with decision trees
  • Human-in-the-loop workflows with approval gates
  • Document processing with validation checkpoints

CrewAI justified:

  • Multi-perspective content creation where different viewpoints add value
  • Educational simulations with distinct expert roles
  • Business analysis crews (analyst, strategist, reviewer)
  • Code analysis covering whole systems (security, performance, style)

A Reddit comment summarized it well:

“For research assistant you can easily use one single agent and one single high quality prompt. For code analyzers covering whole systems it won’t work with single agent.”

Summary

I spent weeks building complex multi-agent systems before realizing I was solving the wrong problem. The frameworks weren’t the solution - they were the obstacle.

Key takeaways:

  1. Start with a simple API call and excellent prompts
  2. Measure results before adding complexity
  3. Add LangGraph when workflow complexity demands state management
  4. Add CrewAI when role-based collaboration adds genuine value
  5. Calculate costs before committing to an architecture

Before your next AI feature, try this:

  1. Prototype with a single API call
  2. Measure accuracy, cost, latency
  3. If it achieves 80% of your goal, ship it
  4. Only add framework complexity when you hit specific walls
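
Step 2 can be a very small harness. The sketch below measures accuracy, average latency, and total cost for any function that maps input text to an output; `evaluate`, `fake_predict`, the exact-match scoring, and the flat per-call cost are simplifying assumptions, and the stub stands in for the real API call.

```python
import time
from statistics import mean
from typing import Callable, Iterable, Tuple


def evaluate(
    predict: Callable[[str], str],
    cases: Iterable[Tuple[str, str]],
    cost_per_call: float,
) -> dict:
    """Run each (input, expected) pair through predict and summarize the results."""
    latencies, correct, total = [], 0, 0
    for text, expected in cases:
        start = time.perf_counter()
        output = predict(text)
        latencies.append(time.perf_counter() - start)
        correct += int(output == expected)  # exact match; swap in your own scorer
        total += 1
    return {
        "accuracy": correct / total if total else 0.0,
        "avg_latency_s": mean(latencies) if latencies else 0.0,
        "total_cost": cost_per_call * total,
    }


def fake_predict(text: str) -> str:
    """Hypothetical stand-in for the single API call."""
    return "invoice" if "invoice" in text.lower() else "other"


report = evaluate(
    fake_predict,
    [
        ("Invoice 42 attached", "invoice"),
        ("Lunch on Friday?", "other"),
        ("Thanks for the invoice reminder", "other"),  # the stub gets this one wrong
    ],
    cost_per_call=0.03,
)
```

Running the same cases through a framework-based version later gives a like-for-like comparison before committing to the extra complexity.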

Most AI systems work best with the simple stack: OpenAI API + webhook/cron trigger + database for persistence. No frameworks, no orchestration, no complex chains. That’s the whole thing.

Final Words + More Resources

My intention with this article was to share my knowledge and experience to help others. If you want to get in touch, you can contact me by email.
