How to Build Opus-Sonnet-Haiku Multi-Model Orchestration
Purpose
I needed to build an AI system that could handle complex workflows without burning my budget. A single-model approach meant either overspending on simple tasks or getting poor results on complex ones. The solution: a hierarchical multi-model architecture.
The Architecture
                +------------------+
                |    OPUS TIER     |
                |   Orchestrator   |
                +--------+---------+
                         |
          +--------------+--------------+
          |                             |
+---------v----------+       +----------v---------+
|    SONNET TIER     |       |    SONNET TIER     |
|    Executor #1     |       |    Executor #2     |
+---------+----------+       +----------+---------+
          |                             |
+---------v----------+       +----------v---------+
|    HAIKU SWARM     |       |    HAIKU SWARM     |
|     Sub-Agents     |       |     Sub-Agents     |
|  [H][H][H][H][H]   |       |  [H][H][H][H][H]   |
+--------------------+       +--------------------+

Opus plans and coordinates. Sonnet executes complex tasks. Haiku handles high-volume, narrow operations.
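The hierarchy in the diagram can also be written down as plain data, which makes the fan-out explicit. This is a minimal sketch for illustration only: the `Tier` dataclass, its field names, and the node names are hypothetical, and the counts (two executors, five sub-agents per swarm) are simply read off the diagram.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Tier:
    """One node in the hierarchy (hypothetical structure, for illustration only)."""
    name: str
    role: str
    children: List["Tier"] = field(default_factory=list)


def build_hierarchy(executors: int = 2, swarm_size: int = 5) -> Tier:
    """Build the Opus -> Sonnet -> Haiku tree shown in the diagram."""
    sonnet_nodes = []
    for i in range(1, executors + 1):
        swarm = [Tier(f"haiku-{i}-{j}", "sub-agent") for j in range(1, swarm_size + 1)]
        sonnet_nodes.append(Tier(f"sonnet-{i}", "executor", swarm))
    return Tier("opus", "orchestrator", sonnet_nodes)


def count_agents(node: Tier) -> int:
    """Count this node plus everything beneath it."""
    return 1 + sum(count_agents(child) for child in node.children)


hierarchy = build_hierarchy()
print(count_agents(hierarchy))  # 1 orchestrator + 2 executors + 10 sub-agents = 13
```

Changing `executors` or `swarm_size` is a quick way to see how many model calls a given fan-out implies before committing to it.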
What Each Tier Does
Opus (Orchestrator)
Opus handles strategic decisions:
- Analyze complex problems and decompose into subtasks
- Create execution plans with dependencies
- Route tasks to appropriate executors
- Synthesize results from multiple agents
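To make "execution plans with dependencies" concrete, here is a minimal sketch of how a plan could be ordered into waves of subtasks that can run in parallel. It uses `graphlib.TopologicalSorter` from the Python standard library (3.9+). The plan format (`id`, `assignee`, `depends_on`) and the task names are hypothetical, made up for this example.

```python
from graphlib import TopologicalSorter

# Hypothetical plan: each subtask names the subtasks it depends on.
plan = [
    {"id": "gather_sources", "assignee": "haiku", "depends_on": []},
    {"id": "extract_facts", "assignee": "haiku", "depends_on": ["gather_sources"]},
    {"id": "draft_report", "assignee": "sonnet", "depends_on": ["extract_facts"]},
    {"id": "check_citations", "assignee": "haiku", "depends_on": ["extract_facts"]},
]


def execution_waves(subtasks):
    """Group subtask ids into waves; every task in a wave can run in parallel."""
    sorter = TopologicalSorter({t["id"]: t["depends_on"] for t in subtasks})
    sorter.prepare()  # raises graphlib.CycleError if the plan has a cycle
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only for stable output
        waves.append(ready)
        sorter.done(*ready)
    return waves


for number, wave in enumerate(execution_waves(plan), start=1):
    print(number, wave)
```

Each wave only starts once the previous one has finished, so independent subtasks (here, drafting and citation checking) end up in the same wave.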
Sonnet (Executor)
Sonnet does the heavy lifting:
- Execute plans created by Opus
- Implement features from specifications
- Debug and fix issues requiring reasoning
- Generate content requiring nuance
Haiku (Sub-Agent)
Haiku swarms handle volume:
- Execute narrow, well-defined operations
- Process large batches in parallel
- Perform validation and classification
- Extract and transform data
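Processing large batches in parallel usually needs a cap on in-flight requests so the swarm stays within rate limits. Below is a minimal sketch using `asyncio.Semaphore`. `fake_haiku_classify` is a stand-in I made up so the example runs without an API key, and `run_swarm` is a hypothetical helper name.

```python
import asyncio


async def fake_haiku_classify(text: str) -> str:
    """Stand-in for a real Haiku call (hypothetical); replace with your client."""
    await asyncio.sleep(0.01)  # simulate network latency
    return "positive" if "good" in text else "negative"


async def run_swarm(items, worker, max_concurrent: int = 50):
    """Run worker(item) for every item, with at most max_concurrent in flight."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def guarded(item):
        async with semaphore:
            return await worker(item)

    # gather preserves input order, so results line up with items
    return await asyncio.gather(*(guarded(item) for item in items))


reviews = ["good value", "broke after a day", "really good", "too slow"]
labels = asyncio.run(run_swarm(reviews, fake_haiku_classify, max_concurrent=2))
print(list(zip(reviews, labels)))
```

Unlike slicing the task list, the semaphore still processes every item; it only limits how many run at the same time.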
How to Implement It
Here's a LangGraph implementation:
import asyncio
import operator
from typing import Annotated, List, TypedDict

from langgraph.graph import StateGraph, END

# opus_client, sonnet_client, and haiku_client are assumed to be async wrappers
# defined elsewhere: generate() returns parsed JSON as a dict, and execute()
# returns the result of a single task.


class OrchestratorState(TypedDict):
    task: str
    plan: dict
    subtasks: List[dict]
    results: Annotated[List[dict], operator.add]  # merged across parallel nodes
    final_output: dict


async def run_bounded(client, tasks, limit):
    """Run every task through client.execute with at most `limit` in flight."""
    semaphore = asyncio.Semaphore(limit)

    async def run_one(task):
        async with semaphore:
            result = await client.execute(task["task"])
        return {"task": task["task"], "result": result}

    return await asyncio.gather(*(run_one(t) for t in tasks))


def build_multi_model_graph():
    graph = StateGraph(OrchestratorState)

    # Opus creates the plan
    async def opus_orchestrator(state: OrchestratorState):
        plan = await opus_client.generate(
            f"""Analyze this task and create an execution plan:
            {state['task']}

            Output JSON with:
            - subtasks: list of {{"task": str, "assignee": "sonnet"|"haiku", "priority": int}}
            - coordination_strategy: "parallel" | "sequential" | "hybrid"
            """
        )
        return {"plan": plan, "subtasks": plan["subtasks"]}

    # Sonnet executes complex tasks
    async def sonnet_executor(state: OrchestratorState):
        sonnet_tasks = [t for t in state["subtasks"] if t["assignee"] == "sonnet"]
        # Run every task, at most 5 at a time (slicing would silently drop the rest)
        return {"results": await run_bounded(sonnet_client, sonnet_tasks, limit=5)}

    # Haiku handles high-volume tasks
    async def haiku_swarm(state: OrchestratorState):
        haiku_tasks = [t for t in state["subtasks"] if t["assignee"] == "haiku"]
        # Run every task, at most 50 at a time
        return {"results": await run_bounded(haiku_client, haiku_tasks, limit=50)}

    # Opus synthesizes final output
    async def opus_synthesizer(state: OrchestratorState):
        final = await opus_client.generate(
            f"""Synthesize these results:
            Original task: {state['task']}
            Results: {state['results']}

            Create a comprehensive response."""
        )
        return {"final_output": final}

    # Build the graph
    graph.add_node("orchestrator", opus_orchestrator)
    graph.add_node("sonnet_executor", sonnet_executor)
    graph.add_node("haiku_swarm", haiku_swarm)
    graph.add_node("synthesizer", opus_synthesizer)

    # Define edges: fan out to both executors, then fan in to the synthesizer
    graph.set_entry_point("orchestrator")
    graph.add_edge("orchestrator", "sonnet_executor")
    graph.add_edge("orchestrator", "haiku_swarm")
    graph.add_edge("sonnet_executor", "synthesizer")
    graph.add_edge("haiku_swarm", "synthesizer")
    graph.add_edge("synthesizer", END)

    # The nodes are async, so run the compiled graph with `await app.ainvoke(...)`
    return graph.compile()

Task Router
Route tasks based on complexity:
class MultiModelRouter:
    def __init__(self, opus, sonnet, haiku):
        # Each client is assumed to expose an async generate() method;
        # haiku.generate() is assumed to return parsed JSON for the classifier prompt.
        self.opus = opus
        self.sonnet = sonnet
        self.haiku = haiku

    async def classify_task(self, task: str) -> str:
        """Use Haiku to classify task complexity."""
        classification = await self.haiku.generate(
            f"""Classify this task's complexity:

            Task: {task}

            Rules:
            - "haiku": Narrow, well-defined, explicit instructions
            - "sonnet": Moderate complexity, needs context
            - "opus": Complex reasoning, architectural decisions

            Output JSON: {{"tier": str, "reasoning": str}}"""
        )
        return classification["tier"]

    async def route(self, task: str) -> str:
        tier = await self.classify_task(task)

        if tier == "opus":
            return await self.opus.generate(task)
        elif tier == "sonnet":
            return await self.sonnet.generate(task)
        else:
            # "haiku" and any unrecognized tier fall through to the cheapest model
            return await self.haiku.generate(task)

Cost Comparison
The savings compound quickly:
Example: Research Report Generation
Traditional (Single Opus):
- Cost: $15.00 for 100k input + 20k output tokens
- Time: 45 minutes sequential
Multi-Model Orchestration:
- Opus (planning): 10k tokens = $1.50
- Sonnet (drafting): 30k tokens = $0.90
- Haiku swarm (data gathering): 200k tokens = $2.00
- Opus (synthesis): 15k tokens = $2.25
- Total: $6.65 (56% savings)
- Time: 15 minutes with parallel execution
Common Mistakes
Mistake 1: Opus Doing Grunt Work
# WRONG: Opus classifying reviews
def analyze_sentiment(reviews):
    return opus.batch_classify(reviews, ["positive", "negative"])

# RIGHT: Opus plans, Haiku executes
async def analyze_sentiment(reviews):
    criteria = await opus.define_criteria("Create sentiment classification rules")
    results = await haiku_swarm.classify(reviews, criteria=criteria)
    return results

Mistake 2: Sequential When Parallel Works
# WRONG: Sequential processing
results = []
for doc in documents:
    results.append(haiku.extract(doc))

# RIGHT: Parallel processing (inside an async function, with haiku.extract as a coroutine)
results = await asyncio.gather(*[
    haiku.extract(doc) for doc in documents
])

Summary
In this post, I showed how to build multi-model orchestration with Opus, Sonnet, and Haiku. The key point is: Opus orchestrates, Sonnet executes, Haiku handles volume. Build systems that route intelligently between all three based on task complexity.
Final Words + More Resources
My intention with this article was to share my knowledge and experience to help others. If you want to contact me, you can reach me by email.
Here are also the most important links from this article along with some further resources that will help you in this scope:
- LangGraph Documentation
- Anthropic Claude Models
- LangChain Multi-Agent Patterns
Oh, and if you found these resources useful, don't forget to support me by starring the repo on GitHub!