LangGraph vs CrewAI vs Simple API: When to Use Each?
Problem
I was building an email parsing system for a client. My first instinct? Install LangGraph, set up a multi-node workflow, create separate agents for extraction, validation, and formatting.
Two weeks later, I had a complex system that cost $0.15 per email and took 8 seconds to process. Then I tried something embarrassing - a single API call with a good prompt.

    response = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt_with_examples}],
    )

Same accuracy. $0.03 per email. 3 seconds latency. I had over-engineered from the start.
The Framework Trap
A Reddit thread from someone who built 25+ production agents confirmed what I experienced:
“Before you reach for CrewAI or LangGraph, ask yourself: Could a single API call with a really good prompt solve 80% of this problem?”
The profitable AI systems they built all use the same stack:
OpenAI API + n8n (or webhook/cron) + Supabase for persistence

No frameworks. No orchestration. No complex chains.

Real examples:
| System | Revenue | Architecture |
|---|---|---|
| Email-to-CRM updater | $200/month | Simple API |
| Resume parser | $50/seat | Simple API |
| Invoice extractor | $500/month | Simple API |
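The "Simple API" architecture in the table is not spelled out in the source thread, so the following is only a rough sketch of what one prompt, one model call, and one database write might look like. The names `handle_incoming_email`, `init_db`, `fake_llm` and the `contacts` schema are hypothetical, SQLite stands in for Supabase so the example runs locally, and the model call is passed in as a plain function rather than tied to a specific client.

```python
import json
import sqlite3
from typing import Callable

# Hypothetical type: any function that takes a prompt and returns the model's text.
LLMCall = Callable[[str], str]


def init_db(conn: sqlite3.Connection) -> None:
    """Create the (hypothetical) contacts table if it does not exist yet."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS contacts ("
        "email TEXT PRIMARY KEY, name TEXT, company TEXT)"
    )


def handle_incoming_email(body: str, call_llm: LLMCall, conn: sqlite3.Connection) -> dict:
    """One prompt, one model call, one upsert. No orchestration layer."""
    prompt = (
        "Extract the sender's contact details from the email below.\n"
        'Return JSON with the keys "email", "name" and "company".\n\n'
        f"Email:\n{body}"
    )
    record = json.loads(call_llm(prompt))
    # Upsert keyed on the email address so repeated emails update the same row.
    conn.execute(
        "INSERT OR REPLACE INTO contacts (email, name, company) VALUES (?, ?, ?)",
        (record["email"], record["name"], record["company"]),
    )
    conn.commit()
    return record


def fake_llm(prompt: str) -> str:
    """Stand-in for the real API call so the sketch runs offline."""
    return json.dumps({"email": "ana@example.com", "name": "Ana", "company": "Acme"})


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    init_db(conn)
    print(handle_incoming_email("Hi, Ana from Acme here...", fake_llm, conn))
```

In a real deployment the webhook or cron job would call `handle_incoming_email` with a function that wraps the actual API request; everything else stays the same.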
Meanwhile, developers report the same pattern with frameworks:
"I had a whole planner-executor-reviewer pipeline going and spent more time debugging agent handoffs than the actual task logic."
"Ditched it for one agent with a really detailed spec file and it just works."
"When I do need parallelism I run completely independent agents that share nothing except a lock file."

Why We Reach for Frameworks
I fell into these traps:
- Marketing makes orchestration feel essential - Every framework demo shows complex multi-agent setups
- Complex feels more “professional” - Simple solutions seem amateur
- FOMO on features - What if I need checkpointing later?
- Research paper envy - Academic papers showcase multi-agent patterns
The result:
- Framework lock-in and dependency management
- Debugging agent handoffs instead of task logic
- Hidden costs from multiple LLM calls per request
- Latency multiplied by orchestration layers
- Premature complexity before understanding requirements

Decision Framework
After testing all three approaches, I built this decision tree:

    START: What does your task need?

    1. Clear input/output transformation?
       +-- YES --> Can examples demonstrate expected behavior?
           |
           +-- YES --> SIMPLE API (80% of cases)
           |
           +-- NO --> Does task need state management?
               |
               +-- YES --> LANGGRAPH
               |
               +-- NO --> SIMPLE API with better prompt

    2. Complex branching logic?
       +-- YES --> LANGGRAPH (conditional execution paths)

    3. Distinct agent personas with specific roles?
       +-- YES --> Does role separation add genuine value?
           +-- YES --> CREWAI
           +-- NO --> Try single agent with tool use first

    4. Parallel processing with shared state?
       +-- YES --> LANGGRAPH (parallel nodes with synchronization)

Simple API (Start Here - 80% of Cases)
When a single LLM call with a good prompt achieves the goal, frameworks are overhead.
Use simple API when:
- Task has clear input/output transformation
- Examples can demonstrate expected behavior
- Response time matters (< 5 seconds target)
- Cost efficiency is important
- Task doesn’t require state management

    from openai import OpenAI
    from pydantic import BaseModel


    class ContentAnalysis(BaseModel):
        sentiment: str
        topics: list[str]
        action_items: list[str]
        confidence: float


    def analyze_content(text: str) -> ContentAnalysis:
        """Simple API call - no frameworks needed"""
        client = OpenAI()

        prompt = f"""
        Analyze the content and return JSON with:
        - sentiment: positive/negative/neutral
        - topics: list of main topics
        - action_items: list of action items mentioned
        - confidence: 0.0 to 1.0

        Examples:
        Text: "We need to schedule a meeting about Q4 targets. Team morale is high."
        Output: {{"sentiment": "positive", "topics": ["Q4 targets", "meeting"], "action_items": ["schedule meeting"], "confidence": 0.95}}

        Text: {text}
        Output:
        """

        # JSON mode needs a model that supports response_format;
        # older gpt-4 snapshots may reject it, so swap the model name if needed.
        response = client.chat.completions.create(
            model="gpt-4",
            messages=[{"role": "user", "content": prompt}],
            response_format={"type": "json_object"},
        )

        # Pydantic validates the JSON and raises if a field is missing or mistyped
        return ContentAnalysis.model_validate_json(response.choices[0].message.content)


    # Usage: One function, one call, done
    result = analyze_content(email_content)
    # Cost: ~$0.03, Latency: ~3 seconds

What this approach gives you:

Single API call:
- 1 LLM call
- ~2,000 tokens
- $0.03 per request
- 2-4 seconds latency
- Easy to debug
- Easy to test
- No framework lock-in

LangGraph (When You Need Stateful Workflows)
LangGraph shines when workflows have complex branching logic or need state management across multiple steps.
Use LangGraph when:
- Workflow has conditional execution paths
- State management across multiple steps required
- Need for checkpointing/resumable workflows
- Parallel execution with synchronization points
- Fine-grained control over agent flow

    from typing import TypedDict

    from langchain_openai import ChatOpenAI
    from langgraph.graph import StateGraph, END  # StateGraph and END live in langgraph.graph


    class WorkflowState(TypedDict):
        input: str
        research_result: str
        analysis_result: str
        needs_review: bool
        final_output: str


    def research_node(state: WorkflowState) -> dict:
        """Research phase - gathers information"""
        llm = ChatOpenAI(model="gpt-4")
        result = llm.invoke(f"Research: {state['input']}")
        return {"research_result": result.content}


    def analyze_node(state: WorkflowState) -> dict:
        """Analysis phase - processes research"""
        llm = ChatOpenAI(model="gpt-4")
        result = llm.invoke(f"Analyze: {state['research_result']}")
        needs_review = "complex" in result.content.lower()
        return {"analysis_result": result.content, "needs_review": needs_review}


    def review_node(state: WorkflowState) -> dict:
        """Optional review - only for complex cases"""
        llm = ChatOpenAI(model="gpt-4")
        result = llm.invoke(f"Review: {state['analysis_result']}")
        return {"final_output": result.content}


    def finalize_node(state: WorkflowState) -> dict:
        """Direct to final - for simple cases"""
        return {"final_output": state["analysis_result"]}


    # Build the graph with conditional logic
    workflow = StateGraph(WorkflowState)
    workflow.add_node("research", research_node)
    workflow.add_node("analyze", analyze_node)
    workflow.add_node("review", review_node)
    workflow.add_node("finalize", finalize_node)

    workflow.set_entry_point("research")
    workflow.add_edge("research", "analyze")

    # Conditional branching based on state
    workflow.add_conditional_edges(
        "analyze",
        lambda state: "review" if state["needs_review"] else "finalize",
        {"review": "review", "finalize": "finalize"},
    )
    workflow.add_edge("review", END)
    workflow.add_edge("finalize", END)

    app = workflow.compile()
    # Cost: ~$0.10-0.15, Latency: ~8-12 seconds

The branching logic is the key feature:

    START
      |
      v
    research
      |
      v
    analyze
      |
      +-- needs_review=true --> review --> END
      |
      +-- needs_review=false --> finalize --> END

This is harder to express with simple API calls, and LangGraph provides the structure to manage it cleanly.
CrewAI (When You Need Role-Based Collaboration)
CrewAI is designed for scenarios where distinct agent personas with specific roles add genuine value.
Use CrewAI when:
- Need distinct agent personas with specific roles
- Collaborative problem-solving benefits from role separation
- Each “crew member” has specialized tools/knowledge
- Task naturally decomposes into expert domains
- Want human-like team dynamics (researcher, writer, reviewer)

    from crewai import Agent, Task, Crew
    from langchain_openai import ChatOpenAI

    llm = ChatOpenAI(model="gpt-4")

    # Define agents with specific roles and personas
    researcher = Agent(
        role="Research Specialist",
        goal="Gather comprehensive information on the topic",
        backstory="Expert researcher with 10 years of experience",
        llm=llm,
        verbose=True,
    )

    writer = Agent(
        role="Content Writer",
        goal="Create engaging, well-structured content",
        backstory="Professional writer specializing in technical content",
        llm=llm,
        verbose=True,
    )

    editor = Agent(
        role="Senior Editor",
        goal="Ensure quality, accuracy, and consistency",
        backstory="Editor with eye for detail and quality standards",
        llm=llm,
        verbose=True,
    )

    # Define tasks for each agent
    research_task = Task(
        description="Research the topic: {topic}",
        agent=researcher,
        expected_output="Comprehensive research notes",
    )

    writing_task = Task(
        description="Write article based on research",
        agent=writer,
        expected_output="Draft article",
    )

    editing_task = Task(
        description="Edit and finalize the article",
        agent=editor,
        expected_output="Final polished article",
    )

    # Assemble the crew
    crew = Crew(
        agents=[researcher, writer, editor],
        tasks=[research_task, writing_task, editing_task],
        verbose=True,
    )

    result = crew.kickoff(inputs={"topic": "AI Agent Frameworks"})
    # Cost: ~$0.15-0.20, Latency: ~10-15+ seconds

The role separation can help when:
- Researcher focuses on gathering facts (different system prompt)
- Writer focuses on narrative flow (different tools/examples)
- Editor focuses on quality gates (different evaluation criteria)

Each agent has:
- Distinct backstory and expertise
- Specific tools for their domain
- Clear output expectations

But I’ve found this often overcomplicates tasks that a single agent with a comprehensive prompt could handle.
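To make the single-agent alternative concrete, here is a minimal sketch of how the three crew roles might be folded into one prompt. The `ROLE_INSTRUCTIONS` mapping and `build_single_agent_prompt` are hypothetical names, the phase wording is adapted from the agent goals above, and whether one prompt matches the crew's output quality is something to measure rather than assume.

```python
# Hypothetical mapping: one instruction per phase, adapted from the agent goals above.
ROLE_INSTRUCTIONS = {
    "Research": "Gather comprehensive information on the topic and list the key facts.",
    "Write": "Turn the facts into an engaging, well-structured article.",
    "Edit": "Check the draft for quality, accuracy, and consistency, then fix it.",
}


def build_single_agent_prompt(topic: str) -> str:
    """Build one prompt that walks a single model through all three phases."""
    steps = "\n".join(
        f"{number}. {name}: {instruction}"
        for number, (name, instruction) in enumerate(ROLE_INSTRUCTIONS.items(), start=1)
    )
    return (
        f"You are producing an article about: {topic}\n\n"
        "Work through these phases in order, in a single response:\n"
        f"{steps}\n\n"
        "Return only the final polished article."
    )


if __name__ == "__main__":
    print(build_single_agent_prompt("AI Agent Frameworks"))
```

The resulting string can be sent as the single user message of a plain chat completion call, which keeps the request at one LLM call instead of three.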
Cost and Latency Comparison
I tracked actual production costs:
Per request comparison:
| Approach | LLM Calls | Tokens | Cost | Latency |
|---|---|---|---|---|
| Simple API | 1 | ~2,000 | $0.03 | 2-4 sec |
| LangGraph (3) | 3+ | ~8,000 | $0.12 | 6-12 sec |
| CrewAI (3) | 3+ | ~10,000 | $0.15 | 10-15+ sec |

Monthly cost (1000 requests/day):
- Simple API: ~$900/month
- LangGraph: ~$3,600/month
- CrewAI: ~$4,500/month

Annual difference: roughly $32,000 (LangGraph) to $43,000 (CrewAI) more than the simple approach

Development overhead also differs:

| Approach | Setup Complexity | Debugging Difficulty |
|---|---|---|
| Simple API | Low | Low |
| LangGraph | Medium | Medium |
| CrewAI | Medium-High | High |

With CrewAI, I spent more time debugging agent handoffs than the actual task logic.

Common Mistakes I Made
Mistake 1: Framework-First Thinking

    # WRONG: Choose framework first, then fit the problem
    from crewai import Crew
    crew = Crew(agents=[...], tasks=[...])
    result = crew.kickoff()

    # RIGHT: Solve the problem first, add framework if needed
    response = client.chat.completions.create(...)
    # If that works, ship it. Only add complexity when you hit walls.

Mistake 2: Not Calculating Costs
I didn’t realize my 3-agent system cost 5x more per request until I saw the monthly bill.
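A back-of-the-envelope calculation is enough to surface that multiplier before the bill arrives. The sketch below uses the per-request figures from the comparison table; the 30-day month and the function names `monthly_cost` and `cost_report` are assumptions for illustration.

```python
# Per-request costs taken from the comparison table above (USD).
PER_REQUEST = {"Simple API": 0.03, "LangGraph": 0.12, "CrewAI": 0.15}


def monthly_cost(cost_per_request: float, requests_per_day: int, days: int = 30) -> float:
    """Projected spend for one month, assuming a flat daily request volume."""
    return cost_per_request * requests_per_day * days


def cost_report(requests_per_day: int) -> dict[str, dict[str, float]]:
    """Monthly cost per approach plus its multiple of the simple-API baseline."""
    baseline = PER_REQUEST["Simple API"]
    return {
        name: {
            "monthly": round(monthly_cost(cost, requests_per_day), 2),
            "vs_simple": round(cost / baseline, 1),
        }
        for name, cost in PER_REQUEST.items()
    }


if __name__ == "__main__":
    for name, row in cost_report(1000).items():
        print(f"{name}: ${row['monthly']:,.0f}/month ({row['vs_simple']}x simple API)")
```

Swapping in your own per-request cost and traffic estimate takes a minute and makes the trade-off visible before any architecture is chosen.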
Mistake 3: Ignoring Latency
User waiting for email analysis:
- Simple API: 3 seconds (feels instant)
- LangGraph: 10 seconds (feels slow)
- CrewAI: 15+ seconds (user might refresh)

Latency affects user experience and conversion rates.

Mistake 4: Copying Research Paper Patterns
Academic papers showcase multi-agent architectures because that’s what gets published. Production systems need reliability, not novelty.
Academic paper priorities:
- Novel architecture
- Complex agent interactions
- Publishable contribution

Production priorities:
- Reliable execution
- Minimal failure points
- Cost efficiency

When Frameworks Are Worth It
I’m not saying frameworks are always wrong. They’re just overused.
LangGraph justified:
- Customer support with conditional escalation paths
- Multi-step research workflows with decision trees
- Human-in-the-loop workflows with approval gates
- Document processing with validation checkpoints
CrewAI justified:
- Multi-perspective content creation where different viewpoints add value
- Educational simulations with distinct expert roles
- Business analysis crews (analyst, strategist, reviewer)
- Code analysis covering whole systems (security, performance, style)
A Reddit comment summarized it well:
“For research assistant you can easily use one single agent and one single high quality prompt. For code analyzers covering whole systems it won’t work with single agent.”
Summary
I spent weeks building complex multi-agent systems before realizing I was solving the wrong problem. The frameworks weren’t the solution - they were the obstacle.
Key takeaways:
- Start with a simple API call and excellent prompts
- Measure results before adding complexity
- Add LangGraph when workflow complexity demands state management
- Add CrewAI when role-based collaboration adds genuine value
- Calculate costs before committing to an architecture
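These takeaways condense the decision tree from earlier, which could be encoded as a small function. This is only a sketch: `TaskProfile` and `choose_approach` are hypothetical names, the questions are checked in the order the tree numbers them (an assumption, since the tree does not say how to resolve overlaps), and the final fallback to the simple API is also an assumption.

```python
from dataclasses import dataclass


@dataclass
class TaskProfile:
    """Yes/no answers to the questions in the decision tree (hypothetical shape)."""
    clear_io_transformation: bool = False
    examples_show_behavior: bool = False
    needs_state_management: bool = False
    complex_branching: bool = False
    distinct_roles: bool = False
    roles_add_value: bool = False
    parallel_with_shared_state: bool = False


def choose_approach(task: TaskProfile) -> str:
    """Walk the four questions in order and return the first recommendation hit."""
    # 1. Clear input/output transformation?
    if task.clear_io_transformation:
        if task.examples_show_behavior:
            return "simple_api"
        if task.needs_state_management:
            return "langgraph"
        return "simple_api_with_better_prompt"
    # 2. Complex branching logic?
    if task.complex_branching:
        return "langgraph"
    # 3. Distinct agent personas with specific roles?
    if task.distinct_roles:
        return "crewai" if task.roles_add_value else "single_agent_with_tools"
    # 4. Parallel processing with shared state?
    if task.parallel_with_shared_state:
        return "langgraph"
    # Not covered by the tree: fall back to the simplest option (assumption).
    return "simple_api"


if __name__ == "__main__":
    email_parser = TaskProfile(clear_io_transformation=True, examples_show_behavior=True)
    print(choose_approach(email_parser))
```

Writing the answers down as explicit booleans is mostly useful as a forcing function: it makes you state why a framework is needed before installing one.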
Before your next AI feature, try this:
- Prototype with a single API call
- Measure accuracy, cost, latency
- If it achieves 80% of your goal, ship it
- Only add framework complexity when you hit specific walls
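The measuring step in this checklist can be a very small harness. The sketch below assumes you have a handful of labeled examples and a `predict` function wrapping your single API call; `evaluate`, `should_ship`, and `fake_predict` are hypothetical names, exact-match accuracy is a simplification, and reading "80% of your goal" as 0.8 accuracy with a 5-second latency budget is an assumption.

```python
import time
from typing import Callable


def evaluate(predict: Callable[[str], str], examples: list[tuple[str, str]]) -> dict[str, float]:
    """Run predict on each (input, expected) pair and record accuracy and latency."""
    if not examples:
        raise ValueError("need at least one labeled example")
    correct = 0
    latencies = []
    for text, expected in examples:
        start = time.perf_counter()
        output = predict(text)
        latencies.append(time.perf_counter() - start)
        if output == expected:  # exact match; real tasks may need a looser comparison
            correct += 1
    return {
        "accuracy": correct / len(examples),
        "avg_latency_s": sum(latencies) / len(latencies),
        "max_latency_s": max(latencies),
    }


def should_ship(metrics: dict[str, float], min_accuracy: float = 0.8,
                max_latency_s: float = 5.0) -> bool:
    """Assumed thresholds: 0.8 accuracy and a 5-second average latency budget."""
    return metrics["accuracy"] >= min_accuracy and metrics["avg_latency_s"] <= max_latency_s


def fake_predict(text: str) -> str:
    """Offline stand-in for the real single-call function."""
    return "positive" if "great" in text.lower() else "negative"


if __name__ == "__main__":
    labeled = [
        ("Great work", "positive"),
        ("This is bad", "negative"),
        ("great but late", "negative"),
        ("Terrible", "negative"),
        ("Great!", "positive"),
    ]
    metrics = evaluate(fake_predict, labeled)
    print(metrics, should_ship(metrics))
```

Cost can be tracked the same way by summing the token usage the API reports per call; it is left out here to keep the sketch free of client-specific details.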
Most AI systems work best with the simple stack: OpenAI API + webhook/cron trigger + database for persistence. No frameworks, no orchestration, no complex chains. That’s the whole thing.
Final Words + More Resources
My intention with this article was to help others by sharing my knowledge and experience. If you want to contact me, you can reach me by email.
Here are also the most important links from this article along with some further resources that will help you in this scope:
- Reddit: Framework-first thinking is a trap
- LangGraph Documentation
- CrewAI Framework
- OpenAI API Reference
Oh, and if you found these resources useful, don’t forget to support me by starring the repo on GitHub!