Skip to content

How to Build Multi-Agent Workflows with Claude Code

Purpose

This post shows how to build multi-agent workflows with Claude Code. The key point is defining specialized agents for specific tasks and coordinating them through proper orchestration.

The Problem

When I started using Claude Code for complex projects, I hit a wall. I tried to do everything in one conversation. Documentation, testing, feature implementation, patching - all mixed together in a single thread.

Here’s what happened:

Single Agent Chaos
Session 1 (2 hours): Started feature implementation
Session 2 (1 hour): Got distracted by documentation requests
Session 3 (3 hours): Context lost, re-explained everything
Session 4 (2 hours): Testing conflicts with previous changes

The problem got worse as my project grew. Claude’s context filled up with unrelated discussions. I lost track of which decisions were made for features vs tests. Merge conflicts appeared because I tried to run multiple changes simultaneously.

I needed a way to split the work. Different agents for different concerns, all working in parallel.

The Solution

Multi-agent workflows let you run specialized agents simultaneously. Each agent focuses on one domain - testing, documentation, features, or patching.

Step 1: Define Agent Roles

First, I defined what each agent should do. Here’s my agent configuration:

agents.yaml
agents:
documentation:
role: "Write and update documentation"
focus: ["README files", "API docs", "Inline comments"]
avoid: ["Code changes", "Test writing"]
testing:
role: "Write and maintain tests"
focus: ["Unit tests", "Integration tests", "Test coverage"]
avoid: ["Feature implementation", "Documentation"]
feature:
role: "Implement new features"
focus: ["New code", "Refactoring", "Bug fixes"]
avoid: ["Documentation", "Test writing (until feature complete)"]
patch:
role: "Fix urgent issues"
focus: ["Hotfixes", "Security patches", "Critical bugs"]
avoid: ["New features", "Major refactoring"]

I store this configuration in my project’s .claude/agents.yaml file.

Step 2: Establish Coordination Layer

With roles defined, I needed a way to coordinate agents. I tried three approaches and settled on LangGraph for complex projects.

Option A: Manual Coordination (Simple Projects)

For small projects, I coordinate manually:

Terminal
# Terminal 1: Feature agent
claude --agent feature
# Terminal 2: Testing agent (after feature is ready)
claude --agent testing
# Terminal 3: Documentation agent (after tests pass)
claude --agent documentation

This works for projects with 1-2 developers. But it requires manual synchronization.

Option B: LangGraph Orchestration (Recommended)

For complex projects, I use LangGraph. It handles agent coordination automatically:

workflow.py
from langgraph.graph import StateGraph, END
from typing import TypedDict, Annotated
import operator
class AgentState(TypedDict):
messages: Annotated[list, operator.add]
current_task: str
completed_tasks: list
code_changes: list
def feature_agent(state: AgentState) -> AgentState:
"""Feature implementation agent"""
# Your feature implementation logic
task = state["current_task"]
# Simulate feature implementation
code_changes = [
{"file": "src/feature.py", "type": "create", "description": f"Implement {task}"}
]
return {
**state,
"messages": [f"Feature agent: Implemented {task}"],
"code_changes": code_changes,
"completed_tasks": state["completed_tasks"] + ["feature"]
}
def test_agent(state: AgentState) -> AgentState:
"""Testing agent"""
# Generate tests for the implemented features
code_changes = state.get("code_changes", [])
test_changes = [
{"file": "tests/test_feature.py", "type": "create", "description": "Tests for feature"}
]
return {
**state,
"messages": ["Test agent: Generated tests for all features"],
"code_changes": code_changes + test_changes,
"completed_tasks": state["completed_tasks"] + ["test"]
}
def doc_agent(state: AgentState) -> AgentState:
"""Documentation agent"""
code_changes = state.get("code_changes", [])
doc_changes = [
{"file": "docs/feature.md", "type": "create", "description": "Feature documentation"}
]
return {
**state,
"messages": ["Doc agent: Generated documentation"],
"code_changes": code_changes + doc_changes,
"completed_tasks": state["completed_tasks"] + ["doc"]
}
def should_continue(state: AgentState) -> str:
"""Decide which agent runs next"""
completed = state["completed_tasks"]
if "feature" not in completed:
return "feature"
elif "test" not in completed:
return "test"
elif "doc" not in completed:
return "doc"
else:
return "end"
# Build the workflow
workflow = StateGraph(AgentState)
# Add agent nodes
workflow.add_node("feature", feature_agent)
workflow.add_node("test", test_agent)
workflow.add_node("doc", doc_agent)
# Define the flow
workflow.set_entry_point("feature")
workflow.add_edge("feature", "test")
workflow.add_edge("test", "doc")
workflow.add_edge("doc", END)
# Compile and run
app = workflow.compile()
result = app.invoke({
"messages": [],
"current_task": "user authentication",
"completed_tasks": [],
"code_changes": []
})
print(result["messages"])
# Output: ['Feature agent: Implemented user authentication',
# 'Test agent: Generated tests for all features',
# 'Doc agent: Generated documentation']

Option C: Git Worktrees (Parallel Development)

For truly parallel work, I use git worktrees with separate agents:

Terminal
# Create worktrees for parallel agents
git worktree add .claude/worktrees/feature-branch feature-agent
git worktree add .claude/worktrees/test-branch test-agent
# Run agents in parallel
cd .claude/worktrees/feature-branch && claude --agent feature &
cd .claude/worktrees/test-branch && claude --agent testing &
# Wait and merge
wait
git merge feature-agent test-agent

Step 3: Implement Context Sharing

Agents need to share context. I tried three approaches:

Approach 1: Shared Configuration Files

shared-context.json
{
"project": "my-app",
"current_sprint": "sprint-5",
"active_features": ["auth", "dashboard"],
"coding_standards": {
"language": "TypeScript",
"testing_framework": "Vitest",
"coverage_threshold": 80
},
"recent_decisions": [
{
"date": "2026-03-15",
"decision": "Use PostgreSQL for user data",
"agent": "feature",
"rationale": "Better JSON support than MySQL"
}
]
}

Approach 2: State File Pattern

state_manager.py
import json
from pathlib import Path
from datetime import datetime
class AgentStateManager:
def __init__(self, state_dir: str = ".claude/state"):
self.state_dir = Path(state_dir)
self.state_dir.mkdir(parents=True, exist_ok=True)
def save_state(self, agent_name: str, state: dict):
state_file = self.state_dir / f"{agent_name}_state.json"
state["updated_at"] = datetime.now().isoformat()
with open(state_file, "w") as f:
json.dump(state, f, indent=2)
def load_state(self, agent_name: str) -> dict:
state_file = self.state_dir / f"{agent_name}_state.json"
if not state_file.exists():
return {}
with open(state_file, "r") as f:
return json.load(f)
def get_global_context(self) -> dict:
"""Load context from all agents"""
context = {}
for state_file in self.state_dir.glob("*_state.json"):
agent_name = state_file.stem.replace("_state", "")
with open(state_file, "r") as f:
context[agent_name] = json.load(f)
return context
# Usage in agents
state_manager = AgentStateManager()
# Feature agent saves state
state_manager.save_state("feature", {
"current_task": "implementing auth",
"files_modified": ["src/auth.py", "src/models/user.py"],
"decisions": ["Using bcrypt for password hashing"]
})
# Test agent loads context
feature_context = state_manager.load_state("feature")
global_context = state_manager.get_global_context()

Step 4: Execute in Parallel

With coordination and context in place, I run agents in parallel:

parallel_executor.py
import asyncio
from typing import List, Callable
from dataclasses import dataclass
@dataclass
class AgentTask:
name: str
agent_func: Callable
dependencies: List[str] = None
class ParallelAgentExecutor:
def __init__(self):
self.results = {}
self.running = set()
async def execute_agent(self, task: AgentTask, context: dict):
"""Execute a single agent"""
print(f"Starting {task.name} agent...")
# Wait for dependencies
if task.dependencies:
while not all(dep in self.results for dep in task.dependencies):
await asyncio.sleep(0.5)
# Run the agent
result = await task.agent_func(context)
self.results[task.name] = result
print(f"Completed {task.name} agent")
return result
async def run_all(self, tasks: List[AgentTask], context: dict):
"""Run all agents in parallel where possible"""
coroutines = [
self.execute_agent(task, context)
for task in tasks
]
await asyncio.gather(*coroutines)
return self.results
# Define agents
async def documentation_agent(context):
# Simulate documentation work
await asyncio.sleep(1)
return {"docs_generated": 5, "files_updated": ["README.md", "docs/api.md"]}
async def feature_agent(context):
# Simulate feature work
await asyncio.sleep(2)
return {"features_implemented": 1, "files_created": ["src/new_feature.py"]}
async def test_agent(context):
# Depends on feature agent
await asyncio.sleep(1)
return {"tests_written": 10, "coverage": 0.85}
# Run parallel execution
async def main():
executor = ParallelAgentExecutor()
tasks = [
AgentTask(
name="documentation",
agent_func=documentation_agent,
dependencies=[]
),
AgentTask(
name="feature",
agent_func=feature_agent,
dependencies=[]
),
AgentTask(
name="test",
agent_func=test_agent,
dependencies=["feature"]
)
]
results = await executor.run_all(tasks, {"project": "my-app"})
print(f"All agents completed: {results}")
asyncio.run(main())

Step 5: Coordinate and Merge

The final step is merging agent outputs. This is where most multi-agent setups fail.

merge_coordinator.py
from pathlib import Path
import difflib
class MergeCoordinator:
def __init__(self, repo_path: str):
self.repo_path = Path(repo_path)
def detect_conflicts(self, agent_changes: dict) -> list:
"""Detect files modified by multiple agents"""
file_agents = {}
conflicts = []
for agent_name, changes in agent_changes.items():
for change in changes.get("files_modified", []):
if change not in file_agents:
file_agents[change] = []
file_agents[change].append(agent_name)
for file_path, agents in file_agents.items():
if len(agents) > 1:
conflicts.append({
"file": file_path,
"agents": agents,
"type": "concurrent_modification"
})
return conflicts
def auto_merge(self, file_path: str, agent_versions: dict) -> bool:
"""Attempt automatic merge of conflicting versions"""
# Load all versions
versions = {}
for agent_name, content in agent_versions.items():
versions[agent_name] = content.splitlines()
# Use diff3-style merge
# This is simplified - real implementation would use git merge-file
merged_lines = []
all_agents = list(versions.keys())
# Find common lines
common = set(versions[all_agents[0]])
for agent in all_agents[1:]:
common &= set(versions[agent])
# Build merged version
for line in versions[all_agents[0]]:
if line in common:
merged_lines.append(line)
else:
# Mark conflicting lines
merged_lines.append(f"<<<<<<< {all_agents[0]}")
merged_lines.append(line)
for other_agent in all_agents[1:]:
if line in versions[other_agent]:
continue
merged_lines.append(f"=======")
merged_lines.append(f">>>>>>> {other_agent}")
return "\n".join(merged_lines)
def create_merge_report(self, conflicts: list) -> str:
"""Generate a human-readable merge report"""
report = ["# Agent Merge Report\n"]
if not conflicts:
report.append("No conflicts detected. All agents modified different files.\n")
return "".join(report)
report.append(f"## Conflicts Found: {len(conflicts)}\n\n")
for conflict in conflicts:
report.append(f"### {conflict['file']}\n")
report.append(f"- **Type**: {conflict['type']}\n")
report.append(f"- **Agents**: {', '.join(conflict['agents'])}\n")
report.append("\n**Resolution needed**: Manual merge required.\n\n")
return "".join(report)
# Usage
coordinator = MergeCoordinator("/path/to/repo")
# After all agents complete
agent_changes = {
"feature": {
"files_modified": ["src/auth.py", "src/models/user.py"]
},
"test": {
"files_modified": ["src/auth.py", "tests/test_auth.py"]
}
}
conflicts = coordinator.detect_conflicts(agent_changes)
report = coordinator.create_merge_report(conflicts)
print(report)

Why This Matters

Multi-agent workflows solve three core problems:

  1. Context Pollution: Each agent maintains its own context, focused on its domain. No more feature discussions mixing with test decisions.

  2. Parallel Execution: Features, tests, and docs can progress simultaneously. My development velocity increased 3x.

  3. Separation of Concerns: Each agent has clear responsibilities. This reduces cognitive load and improves code quality.

Common Mistakes

I made several mistakes implementing multi-agent workflows. Here’s what to avoid:

1. Over-Parallelization

Running too many agents at once creates chaos. I started with 8 agents (docs, tests, features, patches, refactoring, linting, security, performance). Complete mess.

What Went Wrong
Agent 1: Modified auth.py line 45
Agent 2: Modified auth.py line 47 (conflict)
Agent 3: Deleted auth.py (Agent 1 and 2 broke)
Agent 4: Waiting for Agent 1... (deadlock)

The solution: Limit to 3-4 active agents maximum.

2. Insufficient Context Sharing

Agents need to know what other agents are doing. Without shared context, I got:

Agent Conversation
Feature Agent: "I'll use SQLite for simplicity"
Test Agent: "I'll write tests assuming PostgreSQL"
Result: Tests fail, wasted effort

The solution: Always implement state sharing before running agents in parallel.

3. No Coordination Protocol

I launched agents and hoped they would figure it out. They didn’t.

Chaos Without Protocol
Documentation: Updated README with new API
Feature: Changed API signature (docs now wrong)
Test: Tests pass because they use old mock
Result: Broken production

The solution: Use LangGraph or a similar orchestrator to enforce sequencing.

4. Ignoring Merge Complexity

Merging multiple agent outputs is hard. I assumed git would handle it. It didn’t.

The solution: Build a merge coordinator that detects conflicts early.

5. Same Agent, Different Name

I thought naming agents differently was enough. It’s not.

Wrong Approach
agents:
frontend-dev: # Just a renamed feature agent
role: "Implement features"
backend-dev: # Another renamed feature agent
role: "Implement features"
fullstack-dev: # Yet another feature agent
role: "Implement features"

These agents will conflict because they have overlapping responsibilities. The solution: Each agent must have distinct, non-overlapping scope.

What Worked for Me

After many iterations, here’s my current setup:

my-claude-agents.yaml
agents:
feature:
scope: "New code only"
handoff: ["testing"]
testing:
scope: "Tests only, runs after feature handoff"
handoff: ["documentation"]
documentation:
scope: "Docs only, runs after testing handoff"
handoff: []
coordination:
type: "sequential"
merge_strategy: "auto_detect"
conflict_resolution: "manual_review"
context_sharing:
method: "state_files"
location: ".claude/state/"
sync_frequency: "per_agent_completion"

This produces a clean pipeline: Feature -> Test -> Document. No conflicts, no context pollution, no confusion.

Summary

Multi-agent workflows with Claude Code require careful planning. Define clear agent roles, establish coordination through LangGraph or manual orchestration, share context through state files, and handle merges systematically.

The benefits are worth it: faster development through parallelization, better code quality through separation of concerns, and cleaner context management through agent isolation.

Start simple: one feature agent and one test agent. Add documentation when you’re comfortable. Then expand to more specialized agents as your project grows.

Final Words + More Resources

My intention with this article was to help others share my knowledge and experience. If you want to contact me, you can contact by email: Email me

Here are also the most important links from this article along with some further resources that will help you in this scope:

Oh, and if you found these resources useful, don’t forget to support me by starring the repo on GitHub!

Comments