How AI Agent Task Systems Work: Persistent Goals with Dependencies

I lost my work. My agent was making great progress on a complex refactoring task. It had broken down the work into 7 steps, completed 3 of them, and was deep into step 4. Then context compression kicked in.

After context compression
User: What's the status of the refactoring task?
Agent: I don't have context on any previous refactoring task. Could you remind me what we were working on?

All the planning, all the progress tracking, gone. The agent had been using an in-memory todo list (s03’s TodoManager), and context compression wiped it clean.

The real problem? Agents without persistent task management lose goals between sessions. They have no memory that spans conversations. This is where a proper task system comes in.

Here’s what I learned about building a file-based task graph with dependencies.

Why In-Memory Planning Fails

s03’s TodoManager is a flat checklist in memory. It has three critical weaknesses:

  1. No ordering: Tasks are just a list, no sequence
  2. No dependencies: Can’t express “task B depends on task A”
  3. No persistence: Lost on context compression or restart
The problem with flat checklists
Task: Refactor authentication module
[ ] Update password hashing
[ ] Add MFA support
[ ] Update session management
[ ] Write tests
[ ] Deploy to staging
What order? What depends on what? Can tests run before the implementation?
The agent has to figure it out every single time.

When you’re working on complex, multi-step goals, you need structure. You need dependencies. And you need persistence.

The Solution: File-Based Task Graph

The s07 task system solves all three problems by promoting the checklist into a task graph persisted to disk.

Each task is a JSON file:

Task directory structure
.tasks/
  task_1.json  {"id": 1, "status": "completed", ...}
  task_2.json  {"id": 2, "blockedBy": [1], "status": "pending", ...}
  task_3.json  {"id": 3, "blockedBy": [1], "status": "pending", ...}
  task_4.json  {"id": 4, "blockedBy": [2, 3], "status": "pending", ...}

This gives you three things immediately:

  1. What’s ready? Tasks with pending status and an empty blockedBy list
  2. What’s blocked? Tasks waiting on unfinished dependencies
  3. What’s done? Completed tasks, whose completion automatically unblocks dependents
Visual task graph (DAG)
                 +----------+
            +--> | task 2   | --+
            |    | pending  |   |
+----------+     +----------+   +--> +----------+
| task 1   |                         | task 4   |
| completed| --> +----------+   +--> | blocked  |
+----------+     | task 3   | --+    +----------+
                 | pending  |
                 +----------+
Ordering: task 1 must finish before 2 and 3
Parallelism: tasks 2 and 3 can run at the same time
Dependencies: task 4 waits for both 2 and 3
Status: pending -> in_progress -> completed
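
The three questions above can be answered by scanning the task directory. Here is a minimal sketch, assuming the file layout shown earlier. `classify_tasks` is a hypothetical helper, not part of s07_task_system.py. It treats a blocker as resolved once that blocker is completed, so it gives the same answer whether or not completed IDs have already been removed from blockedBy.

```python
import json
from pathlib import Path


def classify_tasks(tasks_dir: Path) -> dict:
    """Group task IDs into ready / blocked / in_progress / done."""
    tasks = [json.loads(p.read_text()) for p in tasks_dir.glob("task_*.json")]
    done = {t["id"] for t in tasks if t["status"] == "completed"}
    groups = {"ready": [], "blocked": [], "in_progress": [], "done": sorted(done)}
    for t in sorted(tasks, key=lambda t: t["id"]):
        if t["status"] == "completed":
            continue
        if t["status"] == "in_progress":
            groups["in_progress"].append(t["id"])
        elif any(dep not in done for dep in t.get("blockedBy", [])):
            # At least one dependency is still unfinished
            groups["blocked"].append(t["id"])
        else:
            groups["ready"].append(t["id"])
    return groups
```

For the four-task directory above, this would report tasks 2 and 3 as ready, task 4 as blocked, and task 1 as done.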

Implementing the TaskManager

Here’s the core implementation from s07_task_system.py:

TaskManager class
import json
from pathlib import Path


class TaskManager:
    def __init__(self, tasks_dir: Path):
        self.dir = tasks_dir
        self.dir.mkdir(exist_ok=True)
        # Resume numbering after the highest ID already on disk
        self._next_id = self._max_id() + 1

    def _max_id(self) -> int:
        ids = [int(f.stem.split("_")[1]) for f in self.dir.glob("task_*.json")]
        return max(ids) if ids else 0

    def _load(self, task_id: int) -> dict:
        path = self.dir / f"task_{task_id}.json"
        if not path.exists():
            raise ValueError(f"Task {task_id} not found")
        return json.loads(path.read_text())

    def _save(self, task: dict):
        path = self.dir / f"task_{task['id']}.json"
        path.write_text(json.dumps(task, indent=2))

Each task has this structure:

Task JSON structure
{
  "id": 1,
  "subject": "Update password hashing",
  "description": "Migrate from SHA256 to bcrypt",
  "status": "pending",
  "blockedBy": [],
  "blocks": [2],
  "owner": ""
}

Key fields:

  • status: pending, in_progress, or completed
  • blockedBy: list of task IDs this task depends on
  • blocks: list of task IDs that depend on this task
  • owner: which agent is working on this (for multi-agent teams)
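
Because task files live on disk and can be edited by hand or by other tools, it can help to sanity-check a record before trusting it. The following is a minimal sketch based only on the fields listed above. `validate_task` and `VALID_STATUSES` are hypothetical names, not part of the s07 code.

```python
VALID_STATUSES = ("pending", "in_progress", "completed")


def validate_task(task: dict) -> list:
    """Return a list of problems; an empty list means the record looks well-formed."""
    problems = []
    if not isinstance(task.get("id"), int):
        problems.append("id must be an integer")
    if not task.get("subject"):
        problems.append("subject must be a non-empty string")
    if task.get("status") not in VALID_STATUSES:
        problems.append(f"unknown status: {task.get('status')!r}")
    for field in ("blockedBy", "blocks"):
        ids = task.get(field, [])
        if not isinstance(ids, list) or not all(isinstance(i, int) for i in ids):
            problems.append(f"{field} must be a list of integer task IDs")
        elif task.get("id") in ids:
            # A task that blocks itself can never become ready
            problems.append(f"{field} must not contain the task's own id")
    return problems
```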

CRUD Operations

The task system provides four core operations:

1. Create a Task

Creating a task
def create(self, subject: str, description: str = "") -> str:
    task = {
        "id": self._next_id,
        "subject": subject,
        "description": description,
        "status": "pending",
        "blockedBy": [],
        "blocks": [],
        "owner": "",
    }
    self._save(task)
    self._next_id += 1
    return json.dumps(task, indent=2)

2. Update a Task

Updating a task with status and dependencies
def update(self, task_id: int, status: str = None,
           add_blocked_by: list = None, add_blocks: list = None) -> str:
    task = self._load(task_id)
    if status:
        if status not in ("pending", "in_progress", "completed"):
            raise ValueError(f"Invalid status: {status}")
        task["status"] = status
        # When completed, remove this task from all blockedBy lists
        if status == "completed":
            self._clear_dependency(task_id)
    if add_blocked_by:
        task["blockedBy"] = list(set(task["blockedBy"] + add_blocked_by))
    if add_blocks:
        task["blocks"] = list(set(task["blocks"] + add_blocks))
        # Bidirectional: update the blocked tasks' blockedBy lists
        for blocked_id in add_blocks:
            try:
                blocked = self._load(blocked_id)
                if task_id not in blocked["blockedBy"]:
                    blocked["blockedBy"].append(task_id)
                    self._save(blocked)
            except ValueError:
                pass
    self._save(task)
    return json.dumps(task, indent=2)

3. List All Tasks

Listing tasks with status markers
def list_all(self) -> str:
    # Sort by numeric ID so task_10 does not come before task_2
    files = sorted(self.dir.glob("task_*.json"),
                   key=lambda f: int(f.stem.split("_")[1]))
    tasks = [json.loads(f.read_text()) for f in files]
    if not tasks:
        return "No tasks."
    lines = []
    for t in tasks:
        marker = {
            "pending": "[ ]",
            "in_progress": "[>]",
            "completed": "[x]"
        }.get(t["status"], "[?]")
        # The suffix is only added when blockedBy is non-empty
        blocked = f" (blocked by: {t['blockedBy']})" if t.get("blockedBy") else ""
        lines.append(f"{marker} #{t['id']}: {t['subject']}{blocked}")
    return "\n".join(lines)

Output looks like this:

Task list output
[x] #1: Setup project structure
[>] #2: Implement core modules
[ ] #3: Write unit tests (blocked by: [2])
[ ] #4: Integration tests (blocked by: [2])
[ ] #5: Deploy to staging (blocked by: [3, 4])

4. Get Task Details

Getting task details
def get(self, task_id: int) -> str:
    return json.dumps(self._load(task_id), indent=2)

Dependency Resolution: The Magic

The most powerful part is automatic dependency resolution. When a task is completed, it is removed from the blockedBy list of every task waiting on it, and a waiting task becomes ready as soon as its list is empty:

Automatic dependency clearing
def _clear_dependency(self, completed_id: int):
    """Remove completed_id from all other tasks' blockedBy lists."""
    for f in self.dir.glob("task_*.json"):
        task = json.loads(f.read_text())
        if completed_id in task.get("blockedBy", []):
            task["blockedBy"].remove(completed_id)
            self._save(task)

This means:

Before completing task 1
[ ] #1: Setup project structure
[ ] #2: Implement core modules (blocked by: [1])
[ ] #3: Write unit tests (blocked by: [2])
After: update task 1 status to "completed"
[x] #1: Setup project structure
[ ] #2: Implement core modules
[ ] #3: Write unit tests (blocked by: [2])

Task 2 is now unblocked automatically. The graph updates itself.
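
One possible variation is to report which tasks just became ready, so the agent knows what to pick up next. The sketch below uses an in-memory dict as a stand-in for the JSON files. `complete` is a hypothetical function, not the s07 implementation.

```python
def complete(tasks: dict, task_id: int) -> list:
    """Mark task_id completed, clear it from blockedBy lists,
    and return the IDs of tasks that became ready as a result."""
    tasks[task_id]["status"] = "completed"
    unblocked = []
    for other in tasks.values():
        if task_id in other["blockedBy"]:
            other["blockedBy"].remove(task_id)
            # Ready means: still pending and nothing left to wait for
            if not other["blockedBy"] and other["status"] == "pending":
                unblocked.append(other["id"])
    return sorted(unblocked)


board = {
    1: {"id": 1, "status": "pending", "blockedBy": []},
    2: {"id": 2, "status": "pending", "blockedBy": [1]},
    3: {"id": 3, "status": "pending", "blockedBy": [2]},
}
newly_ready = complete(board, 1)  # task 2 becomes ready, task 3 still waits on 2
```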

Connecting Tasks to the Agent

Four task tools go into the dispatch map:

Task tools in dispatch map
TOOL_HANDLERS = {
    # ...base tools...
    "task_create": lambda **kw: TASKS.create(kw["subject"], kw.get("description", "")),
    "task_update": lambda **kw: TASKS.update(
        kw["task_id"],
        kw.get("status"),
        kw.get("addBlockedBy"),
        kw.get("addBlocks")
    ),
    "task_list": lambda **kw: TASKS.list_all(),
    "task_get": lambda **kw: TASKS.get(kw["task_id"]),
}

The tool definitions tell the model what’s available:

Task tool definitions
TOOLS = [
    # ...base tools...
    {
        "name": "task_create",
        "description": "Create a new task.",
        "input_schema": {
            "type": "object",
            "properties": {
                "subject": {"type": "string"},
                "description": {"type": "string"}
            },
            "required": ["subject"]
        }
    },
    {
        "name": "task_update",
        "description": "Update a task's status or dependencies.",
        "input_schema": {
            "type": "object",
            "properties": {
                "task_id": {"type": "integer"},
                "status": {"type": "string", "enum": ["pending", "in_progress", "completed"]},
                "addBlockedBy": {"type": "array", "items": {"type": "integer"}},
                "addBlocks": {"type": "array", "items": {"type": "integer"}}
            },
            "required": ["task_id"]
        }
    },
    {
        "name": "task_list",
        "description": "List all tasks with status summary.",
        "input_schema": {"type": "object", "properties": {}}
    },
    {
        "name": "task_get",
        "description": "Get full details of a task by ID.",
        "input_schema": {
            "type": "object",
            "properties": {"task_id": {"type": "integer"}},
            "required": ["task_id"]
        }
    },
]
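
The agent loop that calls these handlers is not shown in this article. As a rough sketch, assuming a tool call arrives as a tool name plus an input dict, routing could look like this. `dispatch` and `demo_handlers` are hypothetical stand-ins so the example runs without the real TaskManager.

```python
import json


def dispatch(handlers: dict, name: str, tool_input: dict) -> str:
    """Route one tool call to its handler and always return a string for the model."""
    handler = handlers.get(name)
    if handler is None:
        return f"Error: unknown tool {name!r}"
    try:
        return handler(**tool_input)
    except (KeyError, ValueError) as exc:
        # Missing arguments and invalid IDs/statuses are reported back, not raised
        return f"Error: {exc}"


# Stand-in handler with the same keyword-argument shape as the map above
demo_handlers = {
    "task_get": lambda **kw: json.dumps({"id": kw["task_id"], "status": "pending"}),
}

result = dispatch(demo_handlers, "task_get", {"task_id": 3})
```

Returning errors as strings lets the model see what went wrong and retry with corrected arguments instead of crashing the loop.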

Real-World Example

I gave my agent this prompt:

Create a task board for refactoring the auth module:
1. Update password hashing (must finish first)
2. Add MFA support (can run parallel with session management)
3. Update session management (can run parallel with MFA)
4. Write integration tests (waits for MFA and session management)
5. Deploy to staging (waits for everything)

The agent created this graph:

Agent-created task graph
.tasks/
  task_1.json  {"id": 1, "subject": "Update password hashing", "status": "pending", "blocks": [2, 3]}
  task_2.json  {"id": 2, "subject": "Add MFA support", "blockedBy": [1], "blocks": [4]}
  task_3.json  {"id": 3, "subject": "Update session management", "blockedBy": [1], "blocks": [4]}
  task_4.json  {"id": 4, "subject": "Write integration tests", "blockedBy": [2, 3], "blocks": [5]}
  task_5.json  {"id": 5, "subject": "Deploy to staging", "blockedBy": [4]}

Visual representation:

Task dependency graph
                    +----------+
               +--> | task 2   | --+
               |    | MFA      |   |
+----------+   |    +----------+   |    +----------+
| task 1   |   |                   +--> | task 4   |
| password | --+                        | tests    |
+----------+   |                   +--> +----------+
               |    +----------+   |         |
               +--> | task 3   | --+         v
                    | session  |        +----------+
                    +----------+        | task 5   |
                                        | deploy   |
                                        +----------+

The agent understood parallelism: tasks 2 and 3 can run concurrently after task 1.

Why Persistence Matters

After context compression, the agent still knows what it was working on:

After context compression - tasks survive
User: What were we working on?
Agent: Let me check the task list...
[x] #1: Update password hashing
[>] #2: Add MFA support
[ ] #3: Update session management
[ ] #4: Write integration tests (blocked by: [2, 3])
[ ] #5: Deploy to staging (blocked by: [4])
Agent: We completed the password hashing update and are currently working on MFA support. Session management is ready to start in parallel if needed.

The tasks are on disk. They survive context compression. They survive session restarts. They survive crashes.
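
Surviving a crash also depends on each write landing completely. A plain write_text call can leave a truncated file if the process dies mid-write. One common mitigation, sketched here as an assumption rather than something s07 does, is to write to a temporary file in the same directory and rename it into place. `save_task_atomic` is a hypothetical helper.

```python
import json
import os
import tempfile
from pathlib import Path


def save_task_atomic(tasks_dir: Path, task: dict) -> Path:
    """Write the task to a temp file in the same directory, then rename it into place."""
    tasks_dir.mkdir(parents=True, exist_ok=True)
    target = tasks_dir / f"task_{task['id']}.json"
    # The ".tmp_" prefix keeps the temp file out of the task_*.json glob
    fd, tmp_name = tempfile.mkstemp(dir=tasks_dir, prefix=".tmp_task_", suffix=".json")
    try:
        with os.fdopen(fd, "w") as tmp:
            tmp.write(json.dumps(task, indent=2))
            tmp.flush()
            os.fsync(tmp.fileno())  # push the bytes to disk before renaming
        os.replace(tmp_name, target)  # replaces any existing file in one step
    except BaseException:
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
    return target
```

With this approach a reader sees either the old version of the file or the new one, never a half-written mix.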

The Motto: Break Big Goals into Small Tasks

The learn-claude-code project has a motto for s07:

“Break big goals into small tasks, order them, persist to disk”

This is the foundation for everything that comes after:

  • s08 Background Tasks: Execute tasks asynchronously while the agent keeps thinking
  • s09 Agent Teams: Multiple agents coordinate via the shared task graph
  • s11 Autonomous Agents: Agents claim tasks from the board without being assigned
  • s12 Worktree Isolation: Each task gets its own isolated working directory

Without the task system, none of these advanced features would work. The task graph becomes the coordination backbone for multi-agent collaboration.

Task System vs Todo List

When should you use which?

Feature      | Todo (s03)                      | Tasks (s07)
Persistence  | Memory only                     | Disk
Dependencies | No                              | Yes (DAG)
Status       | done/not done                   | pending/in_progress/completed (blocked is derived from blockedBy)
Parallelism  | Can’t express                   | Explicit via dependencies
Multi-agent  | No                              | Yes (owner field)
Use case     | Quick single-session checklists | Complex multi-session goals

Use Todo for quick lists within a session. Use Tasks for goals that matter.

Common Patterns

Pattern 1: Sequential Pipeline

task 1 --> task 2 --> task 3 --> task 4
Use case: Build -> Test -> Deploy

Pattern 2: Parallel Execution

           +--> task 2 --+
task 1 ----+             +--> task 4
           +--> task 3 --+
Use case: Setup complete, then implement frontend and backend in parallel

Pattern 3: Diamond Dependency

           +--> task 2 --+
task 1 ----+             +--> task 4
           +--> task 3 --+
Use case: Split work, merge results (like git branching)

Pattern 4: Multiple Blocking

task 1 --+
         +--> task 3
task 2 --+
Use case: Task 3 needs both task 1 and task 2 to complete first

Error Handling

The task system handles edge cases gracefully:

Error handling examples
# Invalid task ID
>>> TASKS.get(999)
ValueError: Task 999 not found
# Invalid status
>>> TASKS.update(1, status="done")
ValueError: Invalid status: done
# Circular dependency
# Note: update() as shown earlier does not check for cycles or self-blocking,
# so keeping a task from becoming its own blocker needs an explicit guard
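
A cycle guard can be sketched as a reachability check run before a new blockedBy edge is added. This is an illustration under stated assumptions, not code from s07. `would_create_cycle` is a hypothetical helper, and `blocked_by` is assumed to map each task ID to the IDs it depends on.

```python
def would_create_cycle(blocked_by: dict, task_id: int, new_blocker: int) -> bool:
    """Return True if making new_blocker a dependency of task_id would close a cycle."""
    # A cycle appears if task_id is already reachable from new_blocker
    # by following blockedBy edges (this also covers task_id == new_blocker).
    stack, seen = [new_blocker], set()
    while stack:
        current = stack.pop()
        if current == task_id:
            return True
        if current in seen:
            continue
        seen.add(current)
        stack.extend(blocked_by.get(current, []))
    return False


graph = {1: [], 2: [1], 3: [2]}        # 3 waits on 2, 2 waits on 1
cyclic = would_create_cycle(graph, 1, 3)   # 1 waiting on 3 would loop back
safe = would_create_cycle(graph, 3, 1)     # 3 waiting on 1 adds no loop
```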

Status transitions follow a simple state machine:

Status state machine
   start
     |
     v
+-----------+
|  pending  | <---------+
+-----------+           |
     |                  |
(start work)            |
     v                  |
+-----------+           |
|in_progress| ----------+
+-----------+       (blocked)
     |
 (complete)
     v
+-----------+
| completed |
+-----------+
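
The update() method shown earlier validates the status value but not the transition itself. If you want the diagram enforced, a small lookup table is one way to do it. This is a sketch: `ALLOWED_TRANSITIONS` and `check_transition` are hypothetical names, and treating completed as a terminal state is an assumption read off the diagram.

```python
ALLOWED_TRANSITIONS = {
    "pending": {"in_progress"},
    "in_progress": {"pending", "completed"},  # back to pending when blocked
    "completed": set(),                       # assumed terminal
}


def check_transition(current: str, new: str) -> None:
    """Raise ValueError if the diagram has no arrow from current to new."""
    if new not in ALLOWED_TRANSITIONS.get(current, set()):
        raise ValueError(f"Illegal transition: {current} -> {new}")


check_transition("pending", "in_progress")    # allowed, returns None
check_transition("in_progress", "completed")  # allowed, returns None
```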

What Changed From s06

Component       | Before (s06)               | After (s07)
Tools           | 5                          | 8 (task_create/update/list/get)
Planning model  | Flat checklist (in-memory) | Task graph with dependencies (on disk)
Relationships   | None                       | blockedBy + blocks edges
Status tracking | Done or not                | pending -> in_progress -> completed
Persistence     | Lost on compression        | Survives compression and restarts

DAG (Directed Acyclic Graph): The structure of task dependencies. No cycles allowed - task A can’t depend on task B if task B depends on task A.

Topological Sort: An ordering of the tasks in which every task comes after all of its dependencies. Any such ordering is a valid execution order.
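
Python's standard library can compute such an ordering directly. The sketch below uses graphlib.TopologicalSorter (Python 3.9+) to group tasks into batches that could run in parallel. `execution_batches` is a hypothetical helper, and `blocked_by` is assumed to map each task ID to the IDs it depends on.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def execution_batches(blocked_by: dict) -> list:
    """Group tasks into batches; every task's dependencies sit in earlier batches."""
    sorter = TopologicalSorter(blocked_by)
    sorter.prepare()  # raises graphlib.CycleError if the graph contains a cycle
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        batches.append(ready)
        sorter.done(*ready)
    return batches


# The auth-module graph from the example above
auth_graph = {1: [], 2: [1], 3: [1], 4: [2, 3], 5: [4]}
batches = execution_batches(auth_graph)
```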

Critical Path: The longest chain of dependent tasks through the graph, measured by total duration. Even with unlimited parallelism, the work cannot finish faster than this chain.
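
The task records in this article carry no duration field, so the following is purely illustrative. Assuming you had an estimated duration per task, the critical path is the longest-duration chain in the acyclic graph. `critical_path` and the `duration` mapping are hypothetical.

```python
from functools import lru_cache


def critical_path(blocked_by: dict, duration: dict) -> tuple:
    """Return (total_duration, path) for the longest dependency chain.

    Assumes the graph is acyclic; blocked_by maps task ID -> dependency IDs.
    """
    @lru_cache(maxsize=None)
    def longest_to(task_id):
        # Longest chain that ends at task_id, including task_id itself
        best_len, best_path = 0, ()
        for dep in blocked_by.get(task_id, []):
            dep_len, dep_path = longest_to(dep)
            if dep_len > best_len:
                best_len, best_path = dep_len, dep_path
        return best_len + duration[task_id], best_path + (task_id,)

    total, path = max((longest_to(t) for t in blocked_by), key=lambda r: r[0])
    return total, list(path)


deps = {1: [], 2: [1], 3: [1], 4: [2, 3], 5: [4]}
hours = {1: 2, 2: 5, 3: 3, 4: 2, 5: 1}  # made-up estimates
total, path = critical_path(deps, hours)
```

With these made-up estimates the chain runs through task 2 rather than task 3, because task 2 takes longer.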

Task Ownership: The owner field enables multi-agent coordination. When an agent claims a task, it sets itself as the owner.
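
The claiming protocol itself is not covered in this article. As a rough sketch, a claim could check that the task is pending, unblocked, and unowned before setting the owner. `claim_task` is a hypothetical helper, and moving the task to in_progress on claim is an assumption.

```python
def claim_task(task: dict, agent_name: str) -> bool:
    """Try to claim a task record for agent_name; return True on success."""
    if task["status"] != "pending" or task.get("blockedBy") or task.get("owner"):
        return False
    task["owner"] = agent_name
    task["status"] = "in_progress"  # assumption: claiming also starts the work
    return True


record = {"id": 2, "status": "pending", "blockedBy": [], "owner": ""}
first = claim_task(record, "agent-a")   # succeeds
second = claim_task(record, "agent-b")  # fails, already owned
```

With several agents sharing files on disk, the check and the write are separate steps, so a real implementation would also need some form of locking to keep two agents from claiming the same task.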

Final Words

Tasks are goals that survive. The file-based dependency graph lets agents remember, order, and track work across sessions. This is the foundation for teams. Without persistence, agents lose context. With the task system, agents have memory that spans conversations - goals persist on disk, dependencies encode relationships, and the graph structure enables complex workflows.
