Claude Code Architecture: How Anthropic's Agent Balances Simplicity and Power

Mar 25, 2026

I assumed Claude Code used a sophisticated multi-agent architecture. Complex tool orchestration, distributed task queues, maybe even a graph-based workflow engine. When I finally understood the actual implementation, I was shocked: it’s basically a while loop.

But that simplicity is exactly why it works. Let me explain how Anthropic built an agent that contributed 4% of GitHub’s public commits in February 2026 with an architecture you could explain in five minutes.

The Core Loop: Deceptively Simple

At its heart, Claude Code is a single-threaded agent that follows this pattern:

┌─────────────────────────────────────────────────────────────┐
│  1. Receive user request                                     │
│  2. Build message history (system + user + tool results)     │
│  3. Call Claude API with tools available                     │
│  4. If no tool calls → return response                       │
│  5. If tool calls → execute each, append results to history  │
│  6. Go to step 3                                             │
└─────────────────────────────────────────────────────────────┘

I expected something like LangGraph’s state machine or AutoGen’s multi-agent conversation patterns. Instead, here’s the simplified core:

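def claude_code_loop(user_request):
    # `claude`, `tools`, `system_prompt`, and `execute_tool` are placeholders.
    # The message schema is simplified for readability.
    messages = [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_request},
    ]

    while True:
        response = claude.chat(messages=messages, tools=tools)

        # Record the assistant turn so the model sees its own tool calls
        messages.append({
            "role": "assistant",
            "content": response.content,
            "tool_calls": response.tool_calls
        })

        if not response.tool_calls:
            return response.content

        for tool_call in response.tool_calls:
            result = execute_tool(tool_call)

            # Only append - preserve cache
            messages.append({
                "role": "tool_result",
                "content": result
            })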

That’s it. No complex state management. No multi-agent coordination. Just a flat message history that grows with each tool interaction.
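To see the loop actually run, here is a minimal, self-contained sketch. Everything in it is an assumption for illustration: `ScriptedModel` is a hypothetical stand-in that replays canned responses instead of calling a real model, and `ToolCall`, `Response`, and `run_loop` are names I made up. I also added a `max_turns` bound, which the simplified loop above does not have.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class ToolCall:
    name: str
    arguments: dict


@dataclass
class Response:
    content: str
    tool_calls: List[ToolCall] = field(default_factory=list)


class ScriptedModel:
    """Hypothetical stand-in for a model client: replays canned responses."""

    def __init__(self, responses: List[Response]):
        self._responses = iter(responses)

    def chat(self, messages: List[dict], tools: Dict[str, Callable]) -> Response:
        return next(self._responses)


def run_loop(model, tools: Dict[str, Callable], user_request: str, max_turns: int = 10):
    messages = [{"role": "user", "content": user_request}]
    for _ in range(max_turns):  # bounded, so a misbehaving model cannot spin forever
        response = model.chat(messages=messages, tools=tools)
        # Record the assistant turn before any tool results
        messages.append({"role": "assistant", "content": response.content})
        if not response.tool_calls:
            return response.content, messages
        for call in response.tool_calls:
            result = tools[call.name](**call.arguments)
            # Append only: earlier messages are never modified
            messages.append({"role": "tool_result", "content": str(result)})
    raise RuntimeError("max_turns reached without a final answer")


model = ScriptedModel([
    Response("Let me add those.", [ToolCall("add", {"a": 2, "b": 3})]),
    Response("The sum is 5."),
])
answer, history = run_loop(model, {"add": lambda a, b: a + b}, "What is 2 + 3?")
print(answer)        # The sum is 5.
print(len(history))  # 4
```

The history ends up as user, assistant, tool_result, assistant: four messages, each one appended after the previous.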

Why This Works: The Cache Advantage

I initially thought this was a limitation. But the flat message history has a critical advantage: prompt caching.

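With Anthropic’s prompt caching, the first request pays a small premium to write the cache (1.25x the base input price for the default 5-minute cache), but cached tokens on subsequent calls are billed at only about 10% of the base input price. Claude Code maximizes this by: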

- Keeping the system prompt at the start (always cached)
- Only appending new content to the end
- Never restructuring the message list

def build_messages(system_prompt, user_request, tool_results):
    # System prompt is always first for caching
    messages = [{"role": "system", "content": system_prompt}]

    # User request follows
    messages.append({"role": "user", "content": user_request})

    # Tool results are appended, never inserted
    for result in tool_results:
        messages.append({"role": "tool_result", "content": result})

    return messages
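The `system` and `tool_result` roles in the snippet above are shorthand. In Anthropic’s Messages API, the system prompt is a top-level `system` parameter, tool results travel as `tool_result` content blocks inside a user message, and caching is requested with a `cache_control` marker. The sketch below only builds the keyword arguments you would pass to `client.messages.create(...)` and never sends them. The helper name `build_cached_request`, the `max_tokens` value, and the `<model-id>` placeholder are my own assumptions.

```python
from typing import Any, Dict, List


def build_cached_request(system_prompt: str, messages: List[dict], model: str) -> Dict[str, Any]:
    """Build keyword arguments for client.messages.create(...) without sending them."""
    return {
        "model": model,      # caller supplies the model ID
        "max_tokens": 1024,  # arbitrary value for this sketch
        "system": [
            {
                "type": "text",
                "text": system_prompt,
                # Marks the end of the prefix that should be cached
                "cache_control": {"type": "ephemeral"},
            }
        ],
        "messages": messages,
    }


request = build_cached_request(
    system_prompt="You are a coding agent.",
    messages=[{"role": "user", "content": "List the files in src/"}],
    model="<model-id>",  # placeholder, not a real model name
)
print(request["system"][0]["cache_control"])  # {'type': 'ephemeral'}
```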

If Claude Code restructured messages (like some multi-agent frameworks do), it would lose cache hits and cost significantly more.
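The reason is that caching works on the request prefix: only the leading part that is identical to the previous request can be read from cache. Here is a rough sketch of the effect. It is a toy model, not Anthropic’s billing logic: I assume every message is 1,000 tokens, that cached tokens cost 10% of the normal rate, and I ignore the cache-write premium. The function names are made up.

```python
from typing import List


def shared_prefix_len(previous: List[dict], current: List[dict]) -> int:
    """Number of leading messages that are identical in both requests."""
    count = 0
    for old, new in zip(previous, current):
        if old != new:
            break
        count += 1
    return count


def estimated_input_cost(previous, current, tokens_per_message=1000, cached_rate=0.1):
    """Relative input cost in full-price token units (assumes a flat message size)."""
    cached = shared_prefix_len(previous, current)
    fresh = len(current) - cached
    return tokens_per_message * (cached * cached_rate + fresh)


turn_1 = [{"role": "system", "content": "S"}, {"role": "user", "content": "U"}]
appended = turn_1 + [{"role": "tool_result", "content": "R1"}]
# Same length, but the first message was rewritten, so nothing after it matches
restructured = [{"role": "system", "content": "S (rewritten)"}] + appended[1:]

print(estimated_input_cost(turn_1, appended))      # 1200.0
print(estimated_input_cost(turn_1, restructured))  # 3000.0
```

Changing a single early message turns a mostly cached request into a fully uncached one, even though both requests are the same length.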

The Evolution: From TODO to Tasks

In January 2026 (v2.1.16), Anthropic added the Tasks system. I assumed this was just a renamed TODO list. I was wrong.

The old TODO approach was:

class TodoList:
    def __init__(self):
        self.items = []  # In-memory only

    def add(self, description):
        self.items.append({"description": description, "done": False})

    def complete(self, index):
        self.items[index]["done"] = True

This had three fatal flaws:

- Lost on restart: if Claude Code crashed, the TODO list vanished
- No dependencies: you couldn’t say “Task B depends on Task A”
- Single-agent only: multiple agents couldn’t share the list

The new Tasks system solves all three:

from dataclasses import dataclass, field
from typing import List, Set
from pathlib import Path
import json

@dataclass
class Task:
    id: str
    description: str
    dependencies: List[str] = field(default_factory=list)
    status: str = "pending"  # pending, in_progress, completed, failed

    def can_start(self, completed_task_ids: Set[str]) -> bool:
        """Check if all dependencies are satisfied."""
        return all(dep in completed_task_ids for dep in self.dependencies)

class TaskManager:
    def __init__(self, tasks_dir: str = "~/.claude/tasks"):
        self.tasks_dir = Path(tasks_dir).expanduser()
        self.tasks_dir.mkdir(parents=True, exist_ok=True)

    def create_task(self, task: Task) -> None:
        """Persist task to disk."""
        task_file = self.tasks_dir / f"{task.id}.json"
        task_file.write_text(json.dumps({
            "id": task.id,
            "description": task.description,
            "dependencies": task.dependencies,
            "status": task.status
        }))

    def load_tasks(self) -> List[Task]:
        """Load all tasks from disk."""
        tasks = []
        for task_file in self.tasks_dir.glob("*.json"):
            data = json.loads(task_file.read_text())
            tasks.append(Task(**data))
        return tasks

    def get_ready_tasks(self) -> List[Task]:
        """Get tasks with all dependencies satisfied."""
        tasks = self.load_tasks()
        completed = {t.id for t in tasks if t.status == "completed"}
        return [t for t in tasks if t.status == "pending" and t.can_start(completed)]
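To make the crash-recovery point concrete, here is a small self-contained sketch that uses plain dicts instead of the `Task` class above. It writes tasks to a temporary directory, then recomputes the ready list purely from what is on disk, which is what a restarted process would do. The atomic write through a temporary file and `os.replace` is my own addition, not something I know the real Tasks system does.

```python
import json
import os
import tempfile
from pathlib import Path
from typing import List


def save_task(tasks_dir: Path, task: dict) -> None:
    """Write one task file atomically so a crash never leaves half-written JSON."""
    target = tasks_dir / f"{task['id']}.json"
    tmp = target.with_suffix(".json.tmp")
    tmp.write_text(json.dumps(task))
    os.replace(tmp, target)  # atomic rename on the same filesystem


def ready_task_ids(tasks_dir: Path) -> List[str]:
    """Re-read everything from disk, as a freshly restarted process would."""
    tasks = [json.loads(p.read_text()) for p in tasks_dir.glob("*.json")]
    completed = {t["id"] for t in tasks if t["status"] == "completed"}
    return sorted(
        t["id"] for t in tasks
        if t["status"] == "pending" and all(d in completed for d in t["dependencies"])
    )


with tempfile.TemporaryDirectory() as tmp_dir:
    tasks_dir = Path(tmp_dir)
    save_task(tasks_dir, {"id": "schema", "dependencies": [], "status": "completed"})
    save_task(tasks_dir, {"id": "api", "dependencies": ["schema"], "status": "pending"})
    save_task(tasks_dir, {"id": "ui", "dependencies": ["api"], "status": "pending"})
    ready_after_restart = ready_task_ids(tasks_dir)

print(ready_after_restart)  # ['api']
```

Only `api` is ready: `ui` is still blocked because `api` has not completed yet.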

The key differences:

Feature        | Old TODO  | New Tasks
Persistence    | In-memory | File-based at ~/.claude/tasks/
Dependencies   | None      | DAG with dependency tracking
Crash recovery | Lost      | Survives restart
Multi-agent    | No        | Shared with file locking
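The `TaskManager` sketch above has no locking in it, so here is one common way the “shared with file locking” row could work: claiming a task by atomically creating a lock file. This is an assumption about the general technique, not a description of Claude Code’s actual locking. `try_claim`, `release`, and the `.lock` suffix are hypothetical names.

```python
import os
import tempfile
from pathlib import Path


def try_claim(tasks_dir: Path, task_id: str, agent_id: str) -> bool:
    """Claim a task by atomically creating a lock file; False if already claimed."""
    lock_path = tasks_dir / f"{task_id}.lock"
    try:
        # O_CREAT | O_EXCL fails if the file already exists, so only one caller wins
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    with os.fdopen(fd, "w") as lock_file:
        lock_file.write(agent_id)  # record the owner for debugging
    return True


def release(tasks_dir: Path, task_id: str) -> None:
    """Drop the claim so another agent can pick the task up."""
    (tasks_dir / f"{task_id}.lock").unlink()


with tempfile.TemporaryDirectory() as tmp_dir:
    tasks_dir = Path(tmp_dir)
    first = try_claim(tasks_dir, "task-1", "agent-a")
    second = try_claim(tasks_dir, "task-1", "agent-b")
    release(tasks_dir, "task-1")
    third = try_claim(tasks_dir, "task-1", "agent-b")

print(first, second, third)  # True False True
```

A real system would also need to handle stale locks left behind by a crashed agent, which this sketch ignores.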

Agent Teams: When Simple Isn’t Enough

For most tasks, single-agent Claude Code is sufficient. But sometimes you need parallel work. That’s where Agent Teams comes in.

I tried using Agent Teams for a large refactoring project. Here’s what I learned:

┌─────────────────────────────────────────────────────────────┐
│                      Coordinator Agent                       │
│                    (main Claude Code instance)               │
└─────────────────────────────────────────────────────────────┘
                              │
          ┌──────────────────┼──────────────────┐
          │                  │                  │
          ▼                  ▼                  ▼
   ┌─────────────┐   ┌─────────────┐   ┌─────────────┐
   │ Teammate 1  │   │ Teammate 2  │   │ Teammate 3  │
   │ (independent│   │ (independent│   │ (independent│
   │  context)   │   │  context)   │   │  context)   │
   └─────────────┘   └─────────────┘   └─────────────┘
          │                  │                  │
          └──────────────────┼──────────────────┘
                              │
                              ▼
                   ┌─────────────────────┐
                   │   Shared Tasks      │
                   │   (file-based)      │
                   └─────────────────────┘

Each Teammate is an independent Claude Code instance:

- Own context: loads the project CLAUDE.md independently
- Own Skills: can use specialized skills
- Shared tasks: coordinates via the file-based task list
- Mailbox system: asynchronous message passing
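I don’t know how the mailbox is implemented internally, but the idea of asynchronous message passing is easy to sketch with files: the sender appends to the recipient’s inbox and moves on, and the recipient reads whenever it is ready. Everything below (`send`, `receive`, the `.jsonl` inbox layout) is hypothetical, and the read-then-delete step is not safe against concurrent writers without extra locking.

```python
import json
import tempfile
from pathlib import Path
from typing import List


def send(mail_dir: Path, recipient: str, sender: str, body: str) -> None:
    """Append one JSON line to the recipient's inbox file and return immediately."""
    with open(mail_dir / f"{recipient}.jsonl", "a", encoding="utf-8") as inbox:
        inbox.write(json.dumps({"from": sender, "body": body}) + "\n")


def receive(mail_dir: Path, recipient: str) -> List[dict]:
    """Read and clear the inbox; the recipient polls whenever it is ready."""
    inbox_path = mail_dir / f"{recipient}.jsonl"
    if not inbox_path.exists():
        return []
    lines = inbox_path.read_text(encoding="utf-8").splitlines()
    inbox_path.unlink()
    return [json.loads(line) for line in lines]


with tempfile.TemporaryDirectory() as tmp_dir:
    mail_dir = Path(tmp_dir)
    send(mail_dir, "teammate-2", "coordinator", "Please review module B")
    send(mail_dir, "teammate-2", "teammate-1", "Interface for module A is final")
    inbox = receive(mail_dir, "teammate-2")
    empty = receive(mail_dir, "teammate-2")

print([m["from"] for m in inbox])  # ['coordinator', 'teammate-1']
print(empty)                       # []
```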

But there’s a catch. The experimental status warning in the docs is real:

- Token cost: ~5x compared to single agent
- Context duplication: Each agent maintains full context
- Coordination overhead: Messages between agents add tokens
- Not production-ready: Still experimental in v2.1.71

I used Agent Teams to parallelize a documentation update across 50 files. It worked, but my token bill was 4.2x higher than expected. For most tasks, the single-agent approach is more cost-effective.
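A back-of-the-envelope model shows where the multiplier comes from. This is a toy formula of my own, with made-up numbers, not a measurement: I assume every agent loads the same shared context, pays a fixed coordination overhead, and that the total amount of real work stays the same when it is split up.

```python
def team_cost_multiplier(n_agents: int, context_tokens: int, work_tokens: int,
                         coordination_tokens: int) -> float:
    """Ratio of team token usage to single-agent usage under a toy model."""
    single = context_tokens + work_tokens
    # Every teammate loads the shared context and pays its own coordination cost;
    # the actual work is split, so its total stays the same.
    team = n_agents * (context_tokens + coordination_tokens) + work_tokens
    return team / single


# Made-up numbers, purely illustrative
multiplier = team_cost_multiplier(
    n_agents=4, context_tokens=50_000, work_tokens=30_000, coordination_tokens=5_000
)
print(multiplier)  # 3.125
```

The larger the shared context is relative to the actual work, the closer the multiplier gets to the number of agents.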

The Extreme Test: 16 Agents Building a Compiler

The most impressive demonstration of Agent Teams was when Anthropic engineers used 16 parallel agents to build a 100K-line C compiler, written in Rust, that could compile the Linux 6.9 kernel.

How they made it work, as a simplified sketch:

from dataclasses import dataclass
from typing import List
from enum import Enum

class AgentRole(Enum):
    LEXER = "lexer"
    PARSER = "parser"
    CODEGEN = "codegen"
    OPTIMIZER = "optimizer"
    TESTER = "tester"

@dataclass
class AgentAssignment:
    agent_id: str
    role: AgentRole
    files_assigned: List[str]
    dependencies: List[str]  # Other agents this one depends on

def coordinate_compiler_build(agents: List[AgentAssignment]) -> None:
    """
    Coordinate 16 agents to build a C compiler.

    Key strategies:
    1. Clear module boundaries (lexer, parser, etc.)
    2. Interface contracts defined upfront
    3. Shared test suite for integration
    4. File locking for concurrent writes
    """
    for agent in agents:
        # Each agent works on its module
        # Results shared via file-based tasks
        pass

This demonstrates when multi-agent makes sense: highly parallelizable work with clear boundaries.
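The `dependencies` field is what decides how much can actually run in parallel. As a sketch, the standard library’s `graphlib.TopologicalSorter` can group agents into waves, where everything in one wave can run at the same time. The role-level dependency graph below is hypothetical and only mirrors the `AgentRole` names from the snippet above. It is not how Anthropic scheduled the real build.

```python
from graphlib import TopologicalSorter
from typing import Dict, List


def plan_waves(dependencies: Dict[str, List[str]]) -> List[List[str]]:
    """Group agents into waves; every agent in a wave can run in parallel."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError if the dependencies form a cycle
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only for stable output
        waves.append(ready)
        sorter.done(*ready)
    return waves


# Hypothetical module-level dependencies: each key lists what it depends on
deps = {
    "lexer": [],
    "parser": ["lexer"],
    "codegen": ["parser"],
    "optimizer": ["parser"],
    "tester": ["codegen", "optimizer"],
}
waves = plan_waves(deps)
print(waves)
# [['lexer'], ['parser'], ['codegen', 'optimizer'], ['tester']]
```

In this toy graph only one wave has more than one member, which is a reminder that parallel agents pay off only when the dependency graph is wide, not deep.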

Design Philosophy: Start Simple, Add Complexity Only When Needed

What struck me most about Claude Code’s architecture is the restraint. Anthropic could have built:

- A graph-based workflow engine
- A distributed multi-agent framework from day one
- Complex state machines for tool orchestration

Instead, they followed a principle: “Use the simplest architecture that works.”

v1.0 → Core loop (while + tool calls)
         │
         ▼ (months of iteration)
v2.1.16 → Tasks system (persistence + dependencies)
         │
         ▼ (when parallelism needed)
v2.1.71 → Agent Teams (experimental multi-agent)

Each evolution added only what was necessary:

Phase       | Problem Solved                      | Complexity Added
Core loop   | Basic agent functionality           | Minimal
Tasks       | Persistence, dependencies, recovery | File I/O
Agent Teams | Parallel work on large tasks        | 5x token cost

Common Misconceptions

I held several wrong beliefs about Claude Code before digging deeper:

Misconception 1: “Claude Code uses complex multi-agent by default”

No. Single-threaded execution is the default. Agent Teams is opt-in and experimental.

Misconception 2: “Tasks are just renamed TODOs”

No. Tasks have DAG dependencies, file persistence, and cross-agent sharing. TODOs were in-memory lists.

Misconception 3: “Agent Teams is production-ready”

No. The docs explicitly mark it experimental with 5x cost overhead.

Misconception 4: “The architecture is hidden/proprietary”

No. Anthropic has been transparent about the design. It’s a simple loop with progressive enhancement.

Why This Matters for Your Own Agents

If you’re building your own AI agents, Claude Code’s architecture offers a blueprint:

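class SimpleAgent:
    """Minimal agent that works."""

    def __init__(self, model, tools):
        self.model = model
        self.tools = tools  # dict mapping tool name -> callable
        self.messages = []

    def run(self, user_input: str) -> str:
        self.messages.append({"role": "user", "content": user_input})

        while True:
            response = self.model.chat(
                messages=self.messages,
                tools=self.tools
            )

            # Record the assistant turn so the model sees its own tool calls
            self.messages.append({
                "role": "assistant",
                "content": response.content,
                "tool_calls": response.tool_calls
            })

            if not response.tool_calls:
                return response.content

            for tool_call in response.tool_calls:
                result = self.execute(tool_call)
                self.messages.append({
                    "role": "tool_result",
                    "content": result
                })

    def execute(self, tool_call) -> str:
        tool = self.tools.get(tool_call.name)
        if tool is None:
            # Report unknown tools to the model instead of crashing the loop
            return f"Error: unknown tool {tool_call.name!r}"
        return str(tool(**tool_call.arguments))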

Add complexity only when you hit real limits:

- Need persistence? Add file-based task storage
- Need dependencies? Add DAG tracking
- Need parallelism? Add multi-agent coordination
- Need memory? Add retrieval or summarization

But start with the simple loop. It handles 90% of use cases.
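As an example of adding one of these pieces only when needed, here is a rough sketch of the summarization option. It is my own illustration, not Claude Code’s compaction logic: `compact_history` and `naive_summary` are hypothetical, and a real agent would ask the model to write the summary. Note the trade-off: compacting rewrites the middle of the history, so the cached prefix after the first message is lost for the next request.

```python
from typing import Callable, List


def compact_history(messages: List[dict], max_messages: int, keep_recent: int,
                    summarize: Callable[[List[dict]], str]) -> List[dict]:
    """Replace older messages with one summary once the history grows too long."""
    if keep_recent < 1:
        raise ValueError("keep_recent must be at least 1")
    if len(messages) <= max_messages:
        return messages  # untouched, so any cached prefix stays valid
    head = messages[:1]                # the original request
    old = messages[1:-keep_recent]     # everything that will be summarized
    recent = messages[-keep_recent:]   # the latest steps, kept verbatim
    summary = {"role": "user", "content": "Summary of earlier steps: " + summarize(old)}
    return head + [summary] + recent


def naive_summary(old: List[dict]) -> str:
    # Placeholder: a real agent would ask the model to write this
    return f"{len(old)} earlier messages omitted"


history = [{"role": "user", "content": "Refactor utils.py"}] + [
    {"role": "tool_result", "content": f"step {i}"} for i in range(8)
]
compacted = compact_history(history, max_messages=6, keep_recent=3, summarize=naive_summary)
print(len(history), len(compacted))  # 9 5
print(compacted[1]["content"])       # Summary of earlier steps: 5 earlier messages omitted
```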

Summary

In this post, I explained Claude Code’s architecture and why its simplicity is a feature, not a limitation. The key point is that Anthropic deliberately chose a simple single-threaded loop, evolving it with persistent Tasks and optional Agent Teams only when needed. The core loop with flat message history maximizes prompt caching. Tasks add persistence and dependencies without breaking simplicity. Agent Teams provide parallelism for extreme cases, but at roughly 5x the token cost. The lesson: start simple, add complexity only when you hit real limits.

Final Words + More Resources

My intention with this article was to help others by sharing my knowledge and experience. If you want to get in touch, you can contact me by email.

Here are also the most important links from this article along with some further resources that will help you in this scope:

👨‍💻 Anthropic Agent Documentation
👨‍💻 Claude Code GitHub Discussions

Oh, and if you found these resources useful, don’t forget to support me by starring the repo on GitHub!