
How Do You Design an AI Organization System with Departments, Managers, and Employees?

I was drowning in AI tasks. Multiple Claude Code sessions running in different terminals, each with its own context. One for research, one for product specs, another for coding. Tasks were getting lost. Context was bleeding between domains. My API bills were spiraling, and I had no idea which “department” was burning through tokens.

The breaking point came when I accidentally sent a research task to my engineering context and spent 30 minutes debugging why Claude was giving me implementation details instead of literature reviews.

I needed organization. Not just a task list—an actual organizational structure.

The Problem with Flat AI Contexts

Most people use AI agents like this:

Problem: Flat AI Usage
Terminal 1: claude (research stuff, code stuff, random stuff)
Terminal 2: claude (more random stuff)
Terminal 3: claude (even more stuff mixed together)

This creates several issues:

  1. Context pollution: Every task accumulates in the same context window
  2. Task collision: Two terminals working on related things without coordination
  3. Zero visibility: No idea what’s happening across all your AI workstreams
  4. No cost tracking: API usage is a black box until you get the bill

I needed something more structured. Something that mirrored how actual organizations work.

Borrowing from Corporate Structure

Real organizations don’t have everyone reporting to one person. They have:

  • Departments with focused domains
  • Managers who coordinate within departments
  • Workers who execute specific tasks
  • Task boards that track progress and dependencies

What if I built my AI system the same way?

Here’s the structure I ended up with:

AI Organization Hierarchy
Executive Agent (Strategy & Planning)
|
+-- Research Department
|   +-- Manager Agent (Literature Review, Synthesis)
|   +-- Worker Agents (Paper Analysis, Citation Mining)
|
+-- Product Department
|   +-- Manager Agent (Roadmap, Prioritization)
|   +-- Worker Agents (Feature Specs, User Stories)
|
+-- Engineering Department
|   +-- Manager Agent (Code Review, Architecture)
|   +-- Worker Agents (Implementation, Testing)
|
+-- Operations Department
    +-- Manager Agent (Monitoring, Incidents)
    +-- Worker Agents (Alerts, Maintenance)

Each department maintains its own context. Managers coordinate within their domain. Workers focus on execution.

The Core Components

1. Task Board (Kanban)

First, I needed a way to track tasks with dependencies. A task shouldn’t be “ready” until its dependencies are done.

models/task.py
from datetime import datetime
from enum import Enum
from typing import Optional

from pydantic import BaseModel


class TaskStatus(str, Enum):
    BACKLOG = "backlog"
    READY = "ready"
    IN_PROGRESS = "in_progress"
    REVIEW = "review"
    DONE = "done"


class Department(str, Enum):
    RESEARCH = "research"
    PRODUCT = "product"
    ENGINEERING = "engineering"
    OPERATIONS = "operations"


class Task(BaseModel):
    id: str
    title: str
    description: str
    department: Department
    dependencies: list[str] = []  # ids of tasks that must be done first
    priority: int = 0
    status: TaskStatus = TaskStatus.BACKLOG
    assigned_agent: Optional[str] = None
    created_at: datetime
    updated_at: datetime

The key insight: status isn’t just manually set. It’s computed based on dependencies.

services/dependency_resolver.py
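# Import path assumed from the file name shown above (models/task.py).
from models.task import Task, TaskStatus


def resolve_task_status(task: Task, all_tasks: list[Task]) -> TaskStatus:
    """Check if all dependencies are done before marking ready."""
    # Only backlog tasks are re-evaluated; later states are left untouched.
    if task.status != TaskStatus.BACKLOG:
        return task.status
    for dep_id in task.dependencies:
        dep = next((t for t in all_tasks if t.id == dep_id), None)
        # A missing or unfinished dependency keeps the task in the backlog.
        if not dep or dep.status != TaskStatus.DONE:
            return TaskStatus.BACKLOG
    return TaskStatus.READY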

This means I can add a task with dependencies and it moves to “ready” the next time the resolver runs after those dependencies complete. No manual tracking.
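Here is a minimal sketch of that behavior end to end. It uses plain dataclasses instead of pydantic so it runs with the standard library alone, and it trims the models to the fields the resolver needs. The promote_ready helper and the task ids are illustrative assumptions, not part of the code above.

```python
from dataclasses import dataclass, field
from enum import Enum


class TaskStatus(str, Enum):
    BACKLOG = "backlog"
    READY = "ready"
    DONE = "done"


@dataclass
class Task:
    id: str
    dependencies: list[str] = field(default_factory=list)
    status: TaskStatus = TaskStatus.BACKLOG


def resolve_task_status(task: Task, all_tasks: list[Task]) -> TaskStatus:
    """Same rule as above: a backlog task is ready once every dependency is done."""
    if task.status != TaskStatus.BACKLOG:
        return task.status
    by_id = {t.id: t for t in all_tasks}
    for dep_id in task.dependencies:
        dep = by_id.get(dep_id)
        if dep is None or dep.status != TaskStatus.DONE:
            return TaskStatus.BACKLOG
    return TaskStatus.READY


def promote_ready(all_tasks: list[Task]) -> list[str]:
    """Hypothetical helper: re-resolve every task and return the ids that changed."""
    promoted = []
    for task in all_tasks:
        new_status = resolve_task_status(task, all_tasks)
        if new_status != task.status:
            task.status = new_status
            promoted.append(task.id)
    return promoted


board = [
    Task(id="lit-review"),
    Task(id="feature-spec", dependencies=["lit-review"]),
]
first_pass = promote_ready(board)   # only the task without dependencies moves
board[0].status = TaskStatus.DONE   # the research task finishes
second_pass = promote_ready(board)  # now the spec is unblocked
```

Running a pass like this before each scheduler poll is one way to keep the board current; triggering it whenever a task is marked done would work as well.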

2. Agent Registry

Each agent needs to know its role, capabilities, and current workload:

models/agent.py
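from pydantic import BaseModel

# Import path assumed from the file name shown earlier (models/task.py).
from models.task import Department


class Agent(BaseModel):
    id: str
    name: str
    department: Department
    role: str  # "executive", "manager", "worker"
    capabilities: list[str]
    current_tasks: list[str] = []
    cost_budget: float
    cost_used: float = 0.0
    backend: str  # "claude", "codex", "gemini"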

The cost_budget field is critical. Without it, API costs spiral out of control.

3. The Scheduler (The Real Magic)

The scheduler is a cron job that polls for ready tasks and dispatches them:

services/scheduler.py
import asyncio

# Import path assumed from the file name shown earlier (models/task.py).
from models.task import Task, TaskStatus


class AIScheduler:
    def __init__(self, task_repo, agent_registry, dispatcher):
        self.task_repo = task_repo
        self.agent_registry = agent_registry
        self.dispatcher = dispatcher

    async def poll_and_dispatch(self):
        """One polling pass: find ready tasks and dispatch to available agents."""
        ready_tasks = self.task_repo.find_by_status(TaskStatus.READY)
        for task in ready_tasks:
            agent = self.agent_registry.find_available_agent(
                department=task.department,
                required_capabilities=self._infer_capabilities(task),
            )
            if agent and agent.cost_used < agent.cost_budget:
                # Mark the task before dispatching so a second poll cannot pick it up again.
                task.status = TaskStatus.IN_PROGRESS
                task.assigned_agent = agent.id
                self.task_repo.save(task)
                # dispatch() is synchronous (it blocks on a subprocess), so run it in a
                # worker thread instead of awaiting it directly.
                await asyncio.to_thread(self.dispatcher.dispatch, task, agent)

    def _infer_capabilities(self, task: Task) -> list[str]:
        """Map task requirements to agent capabilities."""
        # Simple keyword matching for now; the first matching keyword wins
        keywords = {
            "review": ["code_review", "analysis"],
            "implement": ["coding", "testing"],
            "research": ["web_search", "synthesis"],
            "spec": ["writing", "planning"],
        }
        for keyword, caps in keywords.items():
            if keyword in task.title.lower() or keyword in task.description.lower():
                return caps
        return []
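If you would rather keep the scheduler in one long-running process than trigger each pass from cron, a small loop around poll_and_dispatch is one option. This is a sketch under assumptions: run_polling_loop, its parameters, and the CountingScheduler stand-in are hypothetical names introduced here so the example runs on its own.

```python
import asyncio
from typing import Optional


async def run_polling_loop(
    scheduler,
    interval_seconds: float,
    max_cycles: Optional[int] = None,
) -> int:
    """Call scheduler.poll_and_dispatch() repeatedly; return the number of cycles run."""
    cycles = 0
    while max_cycles is None or cycles < max_cycles:
        try:
            await scheduler.poll_and_dispatch()
        except Exception as exc:
            # Keep the loop alive if a single pass fails.
            print(f"poll failed: {exc!r}")
        cycles += 1
        await asyncio.sleep(interval_seconds)
    return cycles


class CountingScheduler:
    """Stand-in for AIScheduler that only counts how often it is polled."""

    def __init__(self):
        self.calls = 0

    async def poll_and_dispatch(self):
        self.calls += 1


demo = CountingScheduler()
cycles_run = asyncio.run(run_polling_loop(demo, interval_seconds=0.01, max_cycles=3))
```

Leaving max_cycles as None makes the loop run until the process is stopped; the bounded form is mainly useful for tests.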


4. Claude Code CLI Integration

Here’s how I dispatch tasks to actual AI agents:

services/claude_dispatcher.py
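import subprocess

# Import paths assumed from the file names shown earlier.
from models.agent import Agent
from models.task import Task


class ClaudeCodeDispatcher:
    def dispatch(self, task: Task, agent: Agent) -> str:
        """Invoke Claude Code CLI with task context."""
        prompt = self._build_prompt(task, agent)
        result = subprocess.run(
            # -p runs the CLI non-interactively and prints the response.
            # --model expects a model name or alias, so agent.backend must hold one.
            ["claude", "-p", "--model", agent.backend, prompt],
            capture_output=True,
            text=True,
        )
        if result.returncode != 0:
            raise RuntimeError(f"claude exited with {result.returncode}: {result.stderr}")
        # Track cost from the CLI output, or estimate it
        self._update_cost_tracking(agent, result)
        return result.stdout

    def _update_cost_tracking(self, agent: Agent, result) -> None:
        """Placeholder: this method is not shown in the post."""
        # Assumption: parse or estimate the call's cost and add it to agent.cost_used.

    def _build_prompt(self, task: Task, agent: Agent) -> str:
        """Construct context-aware prompt for the agent."""
        return f"""
You are a {agent.role} in the {agent.department.value} department.
Your capabilities: {', '.join(agent.capabilities)}
Task: {task.title}
Description: {task.description}
Priority: {task.priority}
Complete this task and provide your output.
"""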
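The dispatcher leaves the cost-tracking step open. One possible approach is to have the CLI print a JSON result and read a cost value from it. I believe Claude Code's JSON output mode reports a total_cost_usd field, but treat the field name as an assumption and check it against your installed version. AgentBudget, extract_cost, and update_cost_tracking are hypothetical names for this sketch, and the sample string is made up rather than real CLI output.

```python
import json
from dataclasses import dataclass


@dataclass
class AgentBudget:
    """Stand-in for the cost fields on Agent."""
    cost_budget: float
    cost_used: float = 0.0


def extract_cost(stdout: str, cost_field: str = "total_cost_usd") -> float:
    """Return the cost reported in a JSON result, or 0.0 if it cannot be read."""
    try:
        payload = json.loads(stdout)
    except json.JSONDecodeError:
        return 0.0
    if not isinstance(payload, dict):
        return 0.0
    value = payload.get(cost_field, 0.0)
    return float(value) if isinstance(value, (int, float)) else 0.0


def update_cost_tracking(agent: AgentBudget, stdout: str) -> float:
    """Add the reported cost to the running total and return the remaining budget."""
    agent.cost_used += extract_cost(stdout)
    return agent.cost_budget - agent.cost_used


# Made-up sample payload, assumed shape only.
sample_stdout = '{"result": "done", "total_cost_usd": 0.25}'
agent = AgentBudget(cost_budget=1.0)
remaining = update_cost_tracking(agent, sample_stdout)
```

Falling back to 0.0 on unreadable output keeps dispatch from crashing, at the price of under-counting; logging those cases or estimating from prompt length would be a stricter alternative.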

What I Got Wrong Initially

Mistake 1: Flat Architecture

At first, I had all agents at the same level. No departments. This caused immediate problems:

  • Tasks would get assigned to the wrong context
  • Research tasks ended up in engineering sessions
  • Context windows filled with irrelevant information

Mistake 2: No Dependency Tracking

I’d dispatch tasks in random order. Then I’d have to redo work because a prerequisite wasn’t done. The dependency resolution logic fixed this completely.

Mistake 3: Manual Dispatch

I was hand-picking which agent to send which task. Now the scheduler does it automatically based on:

  • Department match
  • Capability match
  • Current workload
  • Cost budget remaining
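The registry lookup that applies those four criteria is not shown in the post. A minimal sketch of what find_available_agent could look like follows, assuming a simplified Agent dataclass, a hypothetical max_tasks_per_agent limit, and a tie-break that prefers the lightest workload and then the largest remaining budget.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Agent:
    id: str
    department: str
    capabilities: list[str]
    cost_budget: float
    cost_used: float = 0.0
    current_tasks: list[str] = field(default_factory=list)


def find_available_agent(
    agents: list[Agent],
    department: str,
    required_capabilities: list[str],
    max_tasks_per_agent: int = 1,  # assumed workload limit
) -> Optional[Agent]:
    """Return the best matching agent, or None if nobody qualifies."""
    candidates = [
        a for a in agents
        if a.department == department                               # department match
        and set(required_capabilities) <= set(a.capabilities)       # capability match
        and len(a.current_tasks) < max_tasks_per_agent              # current workload
        and a.cost_used < a.cost_budget                             # budget remaining
    ]
    if not candidates:
        return None
    # Prefer the lightest workload, then the largest remaining budget.
    return min(
        candidates,
        key=lambda a: (len(a.current_tasks), -(a.cost_budget - a.cost_used)),
    )


agents = [
    Agent("eng-1", "engineering", ["coding", "testing"], cost_budget=5.0, cost_used=5.0),
    Agent("eng-2", "engineering", ["coding", "testing"], cost_budget=5.0, cost_used=1.0),
    Agent("res-1", "research", ["web_search", "synthesis"], cost_budget=5.0),
]
chosen = find_available_agent(agents, "engineering", ["coding"])
nobody = find_available_agent(agents, "research", ["coding"])
```

Here eng-1 is skipped because its budget is spent, and the research lookup returns None because no research agent has the coding capability.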

Mistake 4: Single AI Backend

I started with only Claude. But different tasks have different optimal backends:

  • Claude for coding and analysis
  • Gemini for research and synthesis
  • Codex for pure code generation

Because each agent in the registry carries its own backend field, the same role can be filled by agents running on different backends.

The Web Dashboard

After implementing the core system, I added a dashboard for visibility:

Dashboard Features
+------------------+------------------------+
| Org Map | Kanban Board |
| (visual tree) | (drag & drop tasks) |
+------------------+------------------------+
| Chat Interface | Cost Tracking |
| (per dept) | (per agent budgets) |
+------------------+------------------------+

The cost tracking alone has saved me hundreds of dollars. I can see exactly which department is consuming API credits and set budgets accordingly.
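The per-department view can be derived from the per-agent fields with a simple rollup. This is a sketch with made-up numbers; cost_by_department is a hypothetical helper, and agents are represented as plain dicts to keep it self-contained.

```python
from collections import defaultdict


def cost_by_department(agents: list[dict]) -> dict[str, dict[str, float]]:
    """Sum cost_used and cost_budget per department."""
    totals = defaultdict(lambda: {"used": 0.0, "budget": 0.0})
    for agent in agents:
        bucket = totals[agent["department"]]
        bucket["used"] += agent["cost_used"]
        bucket["budget"] += agent["cost_budget"]
    return dict(totals)


# Made-up numbers purely for illustration.
agents = [
    {"id": "eng-manager", "department": "engineering", "cost_used": 1.5, "cost_budget": 5.0},
    {"id": "eng-worker", "department": "engineering", "cost_used": 2.0, "cost_budget": 5.0},
    {"id": "res-worker", "department": "research", "cost_used": 0.5, "cost_budget": 2.0},
]
report = cost_by_department(agents)
for department, totals in sorted(report.items()):
    print(f"{department}: {totals['used']:.2f} of {totals['budget']:.2f} used")
```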

Why This Architecture Works

  1. Scalability: I can add dozens of concurrent tasks without losing track
  2. Context Isolation: Each department maintains focused context, reducing hallucinations
  3. Cost Control: Per-agent budgets, rolled up by department, prevent runaway API costs
  4. Auditability: Full history of decisions and outputs per department
  5. Flexibility: I can add/remove departments as my needs evolve

The key insight is that organizational patterns from human teams translate well to AI systems. Departments, managers, task boards, and schedulers aren’t just corporate bureaucracy—they’re information architecture patterns that help manage complexity.

Getting Started

If you’re struggling with multiple AI contexts, start simple:

  1. Define 2-3 departments based on your actual workstreams
  2. Build a basic task board with status tracking
  3. Add dependency resolution
  4. Implement a simple scheduler (even cron + shell scripts work)
  5. Add cost tracking before your API bill surprises you

The full organizational structure can evolve incrementally. Don’t over-engineer it upfront—let your actual usage patterns guide the architecture.

Final Words + More Resources

My intention with this article was to share my knowledge and experience with others. If you want to contact me, you can reach me by email.

Here are also the most important links from this article along with some further resources that will help you in this scope:

Oh, and if you found these resources useful, don’t forget to support me by starring the repo on GitHub!
