How AI Agent Task Systems Work: Persistent Goals with Dependencies

Mar 18, 2026

I lost my work. My agent was making great progress on a complex refactoring task. It had broken down the work into 7 steps, completed 3 of them, and was deep into step 4. Then context compression kicked in.

User: What's the status of the refactoring task?
Agent: I don't have context on any previous refactoring task. Could you remind me what we were working on?

All the planning, all the progress tracking, gone. The agent had been using an in-memory todo list (s03’s TodoManager), and context compression wiped it clean.

The real problem? Agents without persistent task management lose goals between sessions. They have no memory that spans conversations. This is where a proper task system comes in.

Here’s what I learned about building a file-based task graph with dependencies.

Why In-Memory Planning Fails

s03’s TodoManager is a flat checklist in memory. It has three critical weaknesses:

No ordering: Tasks are just a list, no sequence
No dependencies: Can’t express “task B depends on task A”
No persistence: Lost on context compression or restart

Task: Refactor authentication module
  [ ] Update password hashing
  [ ] Add MFA support
  [ ] Update session management
  [ ] Write tests
  [ ] Deploy to staging

What order? What depends on what? Can tests run before the implementation?
The agent has to figure it out every single time.

When you’re working on complex, multi-step goals, you need structure. You need dependencies. And you need persistence.

The Solution: File-Based Task Graph

The s07 task system solves all three problems by promoting the checklist into a task graph persisted to disk.

Each task is a JSON file:

.tasks/
  task_1.json  {"id": 1, "status": "completed", ...}
  task_2.json  {"id": 2, "blockedBy": [1], "status": "pending", ...}
  task_3.json  {"id": 3, "blockedBy": [1], "status": "pending", ...}
  task_4.json  {"id": 4, "blockedBy": [2, 3], "status": "pending", ...}

This gives you three things immediately:

What’s ready? — tasks with pending status and empty blockedBy
What’s blocked? — tasks waiting on unfinished dependencies
What’s done? — completed tasks, whose completion automatically unblocks dependents

                +----------+
           +--> | task 2   | --+
           |    | pending  |   |
+----------+     +----------+    +--> +----------+
| task 1   |                          | task 4   |
| completed| --> +----------+    +--> | blocked  |
+----------+     | task 3   | --+     +----------+
                 | pending  |
                 +----------+

Ordering:     task 1 must finish before 2 and 3
Parallelism:  tasks 2 and 3 can run at the same time
Dependencies: task 4 waits for both 2 and 3
Status:       pending -> in_progress -> completed
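
The three questions above can be answered with a single pass over the files. The following is a minimal sketch under stated assumptions: the function name classify_tasks and the group names are made up for illustration and are not part of s07; only the task_*.json layout and the status and blockedBy fields come from the listing above.

```python
import json
from pathlib import Path


def classify_tasks(tasks_dir: Path) -> dict:
    """Group task IDs into ready, blocked, in_progress, and done."""
    groups = {"ready": [], "blocked": [], "in_progress": [], "done": []}
    for path in tasks_dir.glob("task_*.json"):
        task = json.loads(path.read_text())
        if task["status"] == "completed":
            groups["done"].append(task["id"])
        elif task["status"] == "in_progress":
            groups["in_progress"].append(task["id"])
        elif task.get("blockedBy"):
            # Pending, but still waiting on at least one other task
            groups["blocked"].append(task["id"])
        else:
            # Pending with an empty blockedBy list: safe to start
            groups["ready"].append(task["id"])
    return {name: sorted(ids) for name, ids in groups.items()}
```

Note that "blocked" here is derived from blockedBy rather than stored as a status value, which matches the three statuses listed above.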

Implementing the TaskManager

Here’s the core implementation from s07_task_system.py:

import json
from pathlib import Path


class TaskManager:
    def __init__(self, tasks_dir: Path):
        self.dir = tasks_dir
        self.dir.mkdir(exist_ok=True)
        # Resume numbering after the highest ID already on disk
        self._next_id = self._max_id() + 1

    def _max_id(self) -> int:
        # File names look like task_<id>.json, so the ID is the second "_" field
        ids = [int(f.stem.split("_")[1]) for f in self.dir.glob("task_*.json")]
        return max(ids) if ids else 0

    def _load(self, task_id: int) -> dict:
        path = self.dir / f"task_{task_id}.json"
        if not path.exists():
            raise ValueError(f"Task {task_id} not found")
        return json.loads(path.read_text())

    def _save(self, task: dict):
        # One file per task, named after its ID
        path = self.dir / f"task_{task['id']}.json"
        path.write_text(json.dumps(task, indent=2))

Each task has this structure:

{
  "id": 1,
  "subject": "Update password hashing",
  "description": "Migrate from SHA256 to bcrypt",
  "status": "pending",
  "blockedBy": [],
  "blocks": [2],
  "owner": ""
}

Key fields:

status: pending, in_progress, or completed
blockedBy: list of task IDs this task depends on
blocks: list of task IDs that depend on this task
owner: which agent is working on this (for multi-agent teams)
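
Because the task files can be edited by hand or by other agents, it may help to check a loaded task against the fields listed above. This is a sketch, not something s07 ships: the Task type and the validate_task helper are hypothetical names introduced here.

```python
from typing import List, TypedDict

VALID_STATUSES = ("pending", "in_progress", "completed")


class Task(TypedDict):
    """Shape of one task file, mirroring the JSON structure shown above."""
    id: int
    subject: str
    description: str
    status: str
    blockedBy: List[int]
    blocks: List[int]
    owner: str


def validate_task(task: dict) -> None:
    """Raise ValueError if a loaded task dict does not match the Task shape."""
    missing = set(Task.__annotations__) - set(task)
    if missing:
        raise ValueError(f"Missing fields: {sorted(missing)}")
    if task["status"] not in VALID_STATUSES:
        raise ValueError(f"Invalid status: {task['status']}")
    for field in ("blockedBy", "blocks"):
        if not all(isinstance(i, int) for i in task[field]):
            raise ValueError(f"{field} must be a list of task IDs")
```

Calling validate_task right after json.loads would surface a malformed file early instead of failing later inside update or list_all.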

CRUD Operations

The task system provides four core operations:

1. Create a Task

def create(self, subject: str, description: str = "") -> str:
    task = {
        "id": self._next_id,
        "subject": subject,
        "description": description,
        "status": "pending",
        "blockedBy": [],
        "blocks": [],
        "owner": "",
    }
    self._save(task)
    self._next_id += 1
    return json.dumps(task, indent=2)

2. Update a Task

def update(self, task_id: int, status: str = None,
           add_blocked_by: list = None, add_blocks: list = None) -> str:
    task = self._load(task_id)

    if status:
        if status not in ("pending", "in_progress", "completed"):
            raise ValueError(f"Invalid status: {status}")
        task["status"] = status
        # When completed, remove this task from all blockedBy lists
        if status == "completed":
            self._clear_dependency(task_id)

    if add_blocked_by:
        task["blockedBy"] = list(set(task["blockedBy"] + add_blocked_by))

    if add_blocks:
        task["blocks"] = list(set(task["blocks"] + add_blocks))
        # Bidirectional: update the blocked tasks' blockedBy lists
        for blocked_id in add_blocks:
            try:
                blocked = self._load(blocked_id)
                if task_id not in blocked["blockedBy"]:
                    blocked["blockedBy"].append(task_id)
                    self._save(blocked)
            except ValueError:
                pass

    self._save(task)
    return json.dumps(task, indent=2)

3. List All Tasks

def list_all(self) -> str:
    tasks = [json.loads(f.read_text()) for f in self.dir.glob("task_*.json")]
    # Sort numerically by ID; sorting file names would put task_10 before task_2
    tasks.sort(key=lambda t: t["id"])

    if not tasks:
        return "No tasks."

    lines = []
    for t in tasks:
        marker = {
            "pending": "[ ]",
            "in_progress": "[>]",
            "completed": "[x]"
        }.get(t["status"], "[?]")
        blocked = f" (blocked by: {t['blockedBy']})" if t.get("blockedBy") else ""
        lines.append(f"{marker} #{t['id']}: {t['subject']}{blocked}")

    return "\n".join(lines)

Output looks like this:

[x] #1: Setup project structure
[>] #2: Implement core modules
[ ] #3: Write unit tests (blocked by: [2])
[ ] #4: Integration tests (blocked by: [2])
[ ] #5: Deploy to staging (blocked by: [3, 4])

4. Get Task Details

def get(self, task_id: int) -> str:
    return json.dumps(self._load(task_id), indent=2)

Dependency Resolution: The Magic

The most useful part is automatic dependency resolution. When a task is completed, its ID is removed from the blockedBy list of every task waiting on it, and a task becomes ready once its list is empty:

def _clear_dependency(self, completed_id: int):
    """Remove completed_id from all other tasks' blockedBy lists."""
    for f in self.dir.glob("task_*.json"):
        task = json.loads(f.read_text())
        if completed_id in task.get("blockedBy", []):
            task["blockedBy"].remove(completed_id)
            self._save(task)

This means:

[ ] #1: Setup project structure
[ ] #2: Implement core modules (blocked by: [1])
[ ] #3: Write unit tests (blocked by: [2])

After: update task 1 status to "completed"

[x] #1: Setup project structure
[ ] #2: Implement core modules
[ ] #3: Write unit tests (blocked by: [2])

Task 2 is now unblocked automatically. The graph updates itself.
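
To try the before/after above outside the agent, here is a standalone variant of the same idea. It is a sketch under assumptions: the free function clear_dependency and its return value (the IDs that just became ready) are additions for illustration and do not appear in s07.

```python
import json
from pathlib import Path


def clear_dependency(tasks_dir: Path, completed_id: int) -> list:
    """Remove completed_id from every blockedBy list; return IDs that became ready."""
    newly_ready = []
    for path in sorted(tasks_dir.glob("task_*.json")):
        task = json.loads(path.read_text())
        if completed_id in task.get("blockedBy", []):
            task["blockedBy"].remove(completed_id)
            path.write_text(json.dumps(task, indent=2))
            # Ready means pending with nothing left to wait for
            if not task["blockedBy"] and task["status"] == "pending":
                newly_ready.append(task["id"])
    return newly_ready
```

With the three tasks above on disk, completing task 1 would report task 2 as newly ready, while task 3 keeps waiting on task 2.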

Connecting Tasks to the Agent

Four task tools go into the dispatch map:

TOOL_HANDLERS = {
    # ...base tools...
    "task_create": lambda **kw: TASKS.create(kw["subject"], kw.get("description", "")),
    "task_update": lambda **kw: TASKS.update(
        kw["task_id"],
        kw.get("status"),
        kw.get("addBlockedBy"),
        kw.get("addBlocks")
    ),
    "task_list":   lambda **kw: TASKS.list_all(),
    "task_get":    lambda **kw: TASKS.get(kw["task_id"]),
}

The tool definitions tell the model what’s available:

TOOLS = [
    # ...base tools...
    {
        "name": "task_create",
        "description": "Create a new task.",
        "input_schema": {
            "type": "object",
            "properties": {
                "subject": {"type": "string"},
                "description": {"type": "string"}
            },
            "required": ["subject"]
        }
    },
    {
        "name": "task_update",
        "description": "Update a task's status or dependencies.",
        "input_schema": {
            "type": "object",
            "properties": {
                "task_id": {"type": "integer"},
                "status": {"type": "string", "enum": ["pending", "in_progress", "completed"]},
                "addBlockedBy": {"type": "array", "items": {"type": "integer"}},
                "addBlocks": {"type": "array", "items": {"type": "integer"}}
            },
            "required": ["task_id"]
        }
    },
    {
        "name": "task_list",
        "description": "List all tasks with status summary.",
        "input_schema": {"type": "object", "properties": {}}
    },
    {
        "name": "task_get",
        "description": "Get full details of a task by ID.",
        "input_schema": {
            "type": "object",
            "properties": {"task_id": {"type": "integer"}},
            "required": ["task_id"]
        }
    },
]
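
The agent loop that consumes these two structures is not shown in this article. A minimal sketch of the dispatch step, assuming each tool call arrives as a tool name plus a dict of arguments, might look like this. The dispatch function and the demo_handlers map are hypothetical stand-ins.

```python
def dispatch(tool_handlers: dict, name: str, tool_input: dict) -> str:
    """Look up a handler by tool name and call it with the model's arguments."""
    handler = tool_handlers.get(name)
    if handler is None:
        return f"Error: unknown tool {name}"
    try:
        return handler(**tool_input)
    except (KeyError, ValueError) as exc:
        # Return the failure as text so the model can correct its next call
        return f"Error: {exc}"


# Stand-in handler for the demo; the real map would point at the TASKS instance
demo_handlers = {
    "task_get": lambda **kw: f"task {kw['task_id']}",
}
```

Returning errors as strings rather than raising is a design choice in this sketch: a missing task_id or an invalid status becomes feedback in the conversation instead of ending the loop.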

Real-World Example

I gave my agent this prompt:

Create a task board for refactoring the auth module:
1. Update password hashing (must finish first)
2. Add MFA support (can run parallel with session management)
3. Update session management (can run parallel with MFA)
4. Write integration tests (waits for MFA and session management)
5. Deploy to staging (waits for everything)

The agent created this graph:

.tasks/
  task_1.json  {"id": 1, "subject": "Update password hashing", "status": "pending", "blocks": [2, 3]}
  task_2.json  {"id": 2, "subject": "Add MFA support", "blockedBy": [1], "blocks": [4]}
  task_3.json  {"id": 3, "subject": "Update session management", "blockedBy": [1], "blocks": [4]}
  task_4.json  {"id": 4, "subject": "Write integration tests", "blockedBy": [2, 3], "blocks": [5]}
  task_5.json  {"id": 5, "subject": "Deploy to staging", "blockedBy": [4]}

Visual representation:

                      +----------+
                 +--> | task 2   | --+
                 |    | MFA      |   |
+----------+     |    +----------+    |    +----------+
| task 1   |     |                    +--> | task 4   |
| password | --+                          | tests    |
+----------+     |                    +--> +----------+
                 |    +----------+    |          |
                 +--> | task 3   | --+          v
                      | session  |          +----------+
                      +----------+          | task 5   |
                                            | deploy   |
                                            +----------+

The agent understood parallelism: tasks 2 and 3 can run concurrently after task 1.
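
The same parallelism can be derived mechanically from the blockedBy edges. The sketch below uses graphlib from the Python standard library (3.9+) to group tasks into waves that could run concurrently. The function name execution_waves and the in-memory dict are assumptions for illustration; the task system itself stores these edges in the JSON files.

```python
from graphlib import TopologicalSorter


def execution_waves(blocked_by: dict) -> list:
    """Group task IDs into waves; every task in a wave can start at the same time."""
    sorter = TopologicalSorter(blocked_by)  # maps task ID -> IDs it waits on
    sorter.prepare()  # raises graphlib.CycleError if the edges contain a cycle
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)  # mark the whole wave finished to release the next one
    return waves


# blockedBy edges from the auth refactoring board above
AUTH_REFACTOR = {1: [], 2: [1], 3: [1], 4: [2, 3], 5: [4]}
print(execution_waves(AUTH_REFACTOR))  # [[1], [2, 3], [4], [5]]
```

Tasks 2 and 3 land in the same wave, which is the parallel step in the diagram.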

Why Persistence Matters

After context compression, the agent still knows what it was working on:

User: What were we working on?
Agent: Let me check the task list...

[x] #1: Update password hashing
[>] #2: Add MFA support
[ ] #3: Update session management
[ ] #4: Write integration tests (blocked by: [2, 3])
[ ] #5: Deploy to staging (blocked by: [4])

Agent: We completed the password hashing update and are currently working on MFA support. Session management is ready to start in parallel if needed.

The tasks are on disk. They survive context compression. They survive session restarts. They survive crashes.
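
One caveat on crashes: a plain write_text can leave a truncated file if the process dies in the middle of the write. A common way to reduce that risk is to write to a temporary file and rename it into place. The helper below is a sketch of that pattern, not the s07 implementation, and the name save_task_atomically is hypothetical.

```python
import json
import os
import tempfile
from pathlib import Path


def save_task_atomically(tasks_dir: Path, task: dict) -> Path:
    """Write the task to a temp file, then rename it over the real file."""
    target = tasks_dir / f"task_{task['id']}.json"
    # The ".tmp_" prefix keeps half-written files out of the task_*.json glob
    fd, tmp_name = tempfile.mkstemp(dir=tasks_dir, prefix=".tmp_task_", suffix=".json")
    try:
        with os.fdopen(fd, "w") as tmp:
            json.dump(task, tmp, indent=2)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the bytes to disk before renaming
        os.replace(tmp_name, target)  # readers see the old file or the new one
    except BaseException:
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
    return target
```

The temporary file is created in the same directory as the target so that the rename stays on one filesystem.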

The Motto: Break Big Goals into Small Tasks

The learn-claude-code project has a motto for s07:

“Break big goals into small tasks, order them, persist to disk”

This is the foundation for everything that comes after:

s08 Background Tasks: Execute tasks asynchronously while the agent keeps thinking
s09 Agent Teams: Multiple agents coordinate via the shared task graph
s11 Autonomous Agents: Agents claim tasks from the board without being assigned
s12 Worktree Isolation: Each task gets its own isolated working directory

Without the task system, none of these advanced features would work. The task graph becomes the coordination backbone for multi-agent collaboration.

Task System vs Todo List

When should you use which?

Feature	Todo (s03)	Tasks (s07)
Persistence	Memory only	Disk
Dependencies	No	Yes (DAG)
Status	done/not done	pending/in_progress/completed (blocked is derived from blockedBy)
Parallelism	Can’t express	Explicit via dependencies
Multi-agent	No	Yes (owner field)
Use case	Quick single-session checklists	Complex multi-session goals

Use Todo for quick lists within a session. Use Tasks for goals that matter.

Common Patterns

Pattern 1: Sequential Pipeline

task 1 --> task 2 --> task 3 --> task 4

Use case: Build -> Test -> Deploy

Pattern 2: Parallel Execution

            +--> task 2 --+
task 1 ----+              +--> task 4
            +--> task 3 --+

Use case: Setup complete, then implement frontend and backend in parallel

Pattern 3: Diamond Dependency

            +--> task 2 --+
task 1 ----+              +--> task 4
            +--> task 3 --+

Use case: Split work, merge results (like git branching)

Pattern 4: Multiple Blocking

task 1 --+
         +--> task 3
task 2 --+

Use case: Task 3 needs both task 1 and task 2 to complete first

Error Handling

The task system handles edge cases gracefully:

# Invalid task ID
>>> TASKS.get(999)
ValueError: Task 999 not found

# Invalid status
>>> TASKS.update(1, status="done")
ValueError: Invalid status: done

# Circular dependency (not checked by the update() shown above)
# Adding a task as its own blocker, or closing a cycle, is accepted unless edges are validated first
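
The update() listing earlier in this article contains no explicit cycle check, so here is a minimal sketch of one that could run before a new blockedBy edge is saved. The function would_create_cycle and the in-memory blocked_by dict (task ID mapped to the IDs it waits on) are hypothetical; a real integration would build that dict from the task files.

```python
def would_create_cycle(blocked_by: dict, task_id: int, new_blocker: int) -> bool:
    """Return True if making task_id wait on new_blocker would close a cycle."""
    if task_id == new_blocker:
        return True  # a task cannot block itself
    # A cycle appears if new_blocker already waits on task_id, directly or indirectly
    stack, seen = [new_blocker], set()
    while stack:
        current = stack.pop()
        if current == task_id:
            return True
        if current in seen:
            continue
        seen.add(current)
        stack.extend(blocked_by.get(current, []))
    return False
```

For edges {2: [1], 3: [2]}, asking whether task 1 may wait on task 3 returns True, because task 3 already waits on task 1 through task 2.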

Status transitions follow a simple state machine:

                    start
                      |
                      v
                 +----------+
                 | pending  | <--------+
                 +----------+          |
                      |                |
              (start work)             |
                      v                |
                 +----------+          |
                 |in_progress| --------+
                 +----------+   (blocked)
                      |
                (complete)
                      v
                 +----------+
                 |completed |
                 +----------+
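
The update() shown earlier only checks that the new status is one of the three valid values; it does not look at the current status. If you want the diagram enforced, a transition table is one way to do it. This is a sketch with hypothetical names (ALLOWED_TRANSITIONS, check_transition), not part of s07.

```python
# Edges of the state machine in the diagram above
ALLOWED_TRANSITIONS = {
    "pending": {"in_progress"},
    "in_progress": {"pending", "completed"},
    "completed": set(),
}


def check_transition(current: str, new: str) -> None:
    """Raise ValueError if moving from current to new is not in the diagram."""
    if new == current:
        return  # setting the same status again is treated as a no-op
    if new not in ALLOWED_TRANSITIONS.get(current, set()):
        raise ValueError(f"Invalid transition: {current} -> {new}")
```

Whether a completed task should ever be reopened is a policy decision; this table follows the diagram and forbids it.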

What Changed From s06

Component	Before (s06)	After (s07)
Tools	5	8 (`task_create/update/list/get`)
Planning model	Flat checklist (in-memory)	Task graph with dependencies (on disk)
Relationships	None	`blockedBy` + `blocks` edges
Status tracking	Done or not	`pending` -> `in_progress` -> `completed`
Persistence	Lost on compression	Survives compression and restarts

DAG (Directed Acyclic Graph): The structure of task dependencies. No cycles are allowed: task A can’t depend on task B if task B depends on task A, directly or through other tasks.

Topological Sort: An ordering of tasks in which every task comes after all of its dependencies. Any such ordering is a valid execution order.

Critical Path: The longest chain of dependent tasks through the graph. It sets the minimum time to complete all tasks, no matter how many run in parallel.

Task Ownership: The owner field enables multi-agent coordination. When an agent claims a task, it sets itself as the owner.
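
The critical path defined above can be computed directly from the blockedBy edges. The sketch below counts every task as one unit of work, which is an assumption: the task files carry no duration field. The function name critical_path is hypothetical, and the code expects an acyclic graph.

```python
def critical_path(blocked_by: dict) -> list:
    """Return one longest dependency chain, counting each task as one unit."""
    best = {}  # memo: task ID -> longest chain that ends at that task

    def chain(task_id: int) -> list:
        if task_id not in best:
            deps = blocked_by.get(task_id, [])
            longest_prefix = max((chain(d) for d in deps), key=len, default=[])
            best[task_id] = longest_prefix + [task_id]
        return best[task_id]

    return max((chain(t) for t in blocked_by), key=len, default=[])


# blockedBy edges from the auth refactoring example
AUTH_REFACTOR = {1: [], 2: [1], 3: [1], 4: [2, 3], 5: [4]}
print(critical_path(AUTH_REFACTOR))  # [1, 2, 4, 5]
```

The chain through task 3 is equally long; when chains tie, this sketch returns the first one it finds.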

References

learn-claude-code GitHub Repository - Source code and documentation
Dependency Graph - Wikipedia - Theory behind task dependencies
Topological Sorting - Algorithm for ordering tasks

Final Words + More Resources

Tasks are goals that survive. The file-based dependency graph lets agents remember, order, and track work across sessions, and it is the foundation for agent teams. Without persistence, agents lose their goals whenever context is compressed. With the task system, goals persist on disk, dependencies encode relationships, and the graph structure enables complex workflows.