How GSD's Multi-Agent Orchestration Runs Entire Phases Without Filling Your Context Window

Apr 21, 2026


I ran a full GSD project recently and watched something interesting happen. The research phase spawned four parallel agents. The planning phase spawned a planner and a checker. The execution phase ran multiple implementation agents simultaneously. My main context window stayed around 35% full throughout.

This isn’t magic. It’s the orchestrator/agent split built into GSD’s architecture. Understanding this pattern changed how I think about AI-assisted development.

The Orchestrator/Agent Split

GSD workflow files are thin coordinators. They don’t do heavy lifting. Instead, they:

Load context via SDK queries
Spawn specialized agents with focused prompts
Collect results and route to the next step
Update state between steps

The actual work happens in spawned agents. Each agent gets a fresh 200k-token context window. Your main session never absorbs their output. It only receives summaries.

┌─────────────────────────────────────────────────────────────────┐
│                      ORCHESTRATOR                                │
│  (runs in your main context)                                     │
│                                                                  │
│  • Load PROJECT.md, REQUIREMENTS.md, CONTEXT.md                 │
│  • Spawn agents with focused prompts                             │
│  • Collect results (summaries only)                              │
│  • Route to next phase                                           │
│  • Track state and progress                                      │
│                                                                  │
│  Context usage: 30-40% (stays lean)                              │
└─────────────────────────────────────────────────────────────────┘
                         │
                         │ spawns
                         ▼
┌─────────────────────────────────────────────────────────────────┐
│                       AGENTS                                      │
│  (each gets fresh 200k context)                                  │
│                                                                  │
│  • Deep research on specific topic                               │
│  • Create detailed implementation plans                          │
│  • Write and verify code                                         │
│  • Debug and diagnose issues                                     │
│                                                                  │
│  Context usage: Can fill entire 200k                             │
│  Results: Summarized back to orchestrator                        │
└─────────────────────────────────────────────────────────────────┘

I observed this during the research phase. My orchestrator loaded the project files, then spawned four research agents in parallel. Each agent dove deep into its topic—stack analysis, feature requirements, architecture patterns, potential pitfalls. They produced detailed reports. My main context received one-page summaries.
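To make the pattern concrete, here is a minimal sketch of a research phase in Python. This is not GSD's actual code: `run_research_agent`, `AgentResult`, and `research_phase` are hypothetical names I made up for illustration, and the "agent" is a stub that fabricates a long report. What it shows is the shape of the split: agents run concurrently, each produces a large report, and the orchestrator keeps only the short summary.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class AgentResult:
    topic: str
    full_report: str  # the detailed output, produced inside the agent
    summary: str      # the only part the orchestrator holds on to


async def run_research_agent(topic: str) -> AgentResult:
    """Hypothetical stand-in for a spawned agent with its own fresh context."""
    await asyncio.sleep(0)  # placeholder for the long-running research work
    report = f"Detailed findings on {topic}. " * 200
    return AgentResult(topic, report, summary=f"{topic}: key findings in brief")


async def research_phase(topics: list[str]) -> list[str]:
    """Orchestrator: start all agents concurrently, return summaries only."""
    results = await asyncio.gather(*(run_research_agent(t) for t in topics))
    return [r.summary for r in results]


TOPICS = ["stack", "features", "architecture", "pitfalls"]
summaries = asyncio.run(research_phase(TOPICS))
for line in summaries:
    print(line)
```

The full reports are discarded once `research_phase` returns, which is the point: the caller never holds more than four short strings.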

Wave Execution for Parallelism

Plans are grouped into waves based on their dependencies. A wave starts only after every plan in the previous wave has completed.

Wave 1: Independent plans (no dependencies)
┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐
│   Plan A        │ │   Plan B        │ │   Plan C        │
│   (parallel)    │ │   (parallel)    │ │   (parallel)    │
└─────────────────┘ └─────────────────┘ └─────────────────┘
                         │
                         │ all complete
                         ▼
Wave 2: Plans that depend on Wave 1
┌─────────────────┐ ┌─────────────────┐
│   Plan D        │ │   Plan E        │
│ (depends on A)  │ │ (depends on B,C)│
└─────────────────┘ └─────────────────┘
                         │
                         │ all complete
                         ▼
Wave 3: Plans that depend on Wave 2
┌─────────────────┐
│   Plan F        │
│ (depends on D,E)│
└─────────────────┘

Independent plans run in parallel within a wave. Dependent plans wait in later waves. This maximizes throughput while respecting dependencies.
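The grouping in the diagram above can be sketched in a few lines of Python. I am assuming each plan simply declares the plans it depends on; `group_into_waves` is a name I chose for this sketch, and GSD's real implementation may work differently.

```python
def group_into_waves(dependencies: dict[str, set[str]]) -> list[list[str]]:
    """Put each plan in the first wave where all its dependencies are done."""
    remaining = {plan: set(deps) for plan, deps in dependencies.items()}
    completed: set[str] = set()
    waves: list[list[str]] = []
    while remaining:
        # A plan is ready when everything it depends on has completed.
        ready = sorted(p for p, deps in remaining.items() if deps <= completed)
        if not ready:
            raise ValueError(
                "circular or unknown dependency among: "
                + ", ".join(sorted(remaining))
            )
        waves.append(ready)
        completed.update(ready)
        for plan in ready:
            del remaining[plan]
    return waves


# Same dependency graph as the diagram: D needs A, E needs B and C, F needs D and E.
PLANS = {
    "A": set(), "B": set(), "C": set(),
    "D": {"A"}, "E": {"B", "C"},
    "F": {"D", "E"},
}
waves = group_into_waves(PLANS)
for number, wave in enumerate(waves, start=1):
    print(f"Wave {number}: {', '.join(wave)}")
```

Every plan inside one inner list can be handed to its own executor agent at the same time.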

Why Vertical Slices Parallelize Better

I learned this distinction during execution planning. Vertical slices have fewer cross-plan dependencies than horizontal layers.

WRONG: Horizontal Layers (many dependencies)
─────────────────────────────────────────────
Plan 01: All models (User, Post, Comment, Tag)
Plan 02: All routes (auth, posts, comments, tags)
Plan 03: All views (login, dashboard, posts, comments)

Problem: Plan 02 depends on Plan 01
         Plan 03 depends on Plan 02
         Everything must be sequential

CORRECT: Vertical Slices (fewer dependencies)
─────────────────────────────────────────────
Plan 01: User feature end-to-end (model + route + view)
Plan 02: Post feature end-to-end (model + route + view)
Plan 03: Comment feature end-to-end (model + route + view)

Benefit: Plans 01, 02, 03 can run in parallel
         Each slice is self-contained

Vertical slices reduce dependencies. A user feature doesn’t need the post feature to exist before it can work. This lets GSD run multiple slices simultaneously.
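One way to put a number on this is the longest dependency chain, which is the minimum number of sequential steps no matter how many agents run in parallel. The sketch below uses the two plan layouts from the example above. The helper `longest_chain` is my own illustration, and the vertical slices are modeled as fully independent, as in the example; a real project may still have a few cross-slice dependencies.

```python
from functools import lru_cache


def longest_chain(dependencies: dict[str, tuple[str, ...]]) -> int:
    """Longest dependency chain; assumes the graph has no cycles."""

    @lru_cache(maxsize=None)
    def depth(plan: str) -> int:
        # A plan with no dependencies has depth 1.
        return 1 + max((depth(dep) for dep in dependencies[plan]), default=0)

    return max(depth(plan) for plan in dependencies)


# Horizontal layers: each layer waits for the one below it.
HORIZONTAL = {
    "models": (),
    "routes": ("models",),
    "views": ("routes",),
}

# Vertical slices: each feature carries its own model, route, and view.
VERTICAL = {
    "user": (),
    "post": (),
    "comment": (),
}

horizontal_steps = longest_chain(HORIZONTAL)
vertical_steps = longest_chain(VERTICAL)
print(f"horizontal: {horizontal_steps} sequential steps")
print(f"vertical:   {vertical_steps} sequential step")
```

Both layouts contain three plans, but the horizontal one needs three sequential steps while the vertical one needs a single step.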

Specialized Agents Per Stage

Each phase uses different agent types. The orchestrator coordinates them.

Stage           Orchestrator does              Agents do
─────────────────────────────────────────────────────────────
Research        Coordinates, presents          4 parallel researchers:
                findings                       - Stack researcher
                                               - Features researcher
                                               - Architecture researcher
                                               - Pitfalls researcher

Planning        Validates, manages             Planner creates plans
                iteration                      Checker verifies (up to 3 loops)

Execution       Groups into waves,             Executors implement
                tracks progress                in parallel

Verification    Presents results,              Verifier checks
                routes next                    Debug agents diagnose
                                               if issues found

I ran through a planning phase that demonstrated the loop mechanism. The planner agent created implementation plans. The checker agent verified them. When the checker found issues, the orchestrator looped back and spawned the planner again with the checker feedback attached. This happened three times until all plans passed.

┌──────────────┐     ┌──────────────┐
│   Planner    │────▶│   Checker    │
│   Agent      │     │   Agent      │
└──────────────┘     └──────┬───────┘
      ▲                     │
      │                     │
      │ pass?               │ fail (with feedback)
      │                     ▼
      │              ┌──────────────┐
      │              │   Loop back  │
      │              │   (max 3x)   │
      │              └──────────────┘
      │                     │
      └─────────────────────┘

Max loops: 3
Exit condition: Checker passes OR loop limit reached
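As a rough sketch, the loop and its exit condition could look like this in Python. The planner and checker here are hypothetical callables standing in for spawned agents, and I am assuming the limit of 3 counts total planner/checker rounds; GSD may count it differently.

```python
from typing import Callable, Optional

MAX_LOOPS = 3


def plan_with_verification(
    planner: Callable[[Optional[str]], str],
    checker: Callable[[str], Optional[str]],
    max_loops: int = MAX_LOOPS,
) -> tuple[str, bool, int]:
    """Alternate planner and checker until the checker passes or the limit hits.

    Returns (last plan, whether it passed, number of rounds used).
    """
    feedback: Optional[str] = None
    plan = ""
    for attempt in range(1, max_loops + 1):
        plan = planner(feedback)   # first round has no feedback
        feedback = checker(plan)   # None means the plan passed
        if feedback is None:
            return plan, True, attempt
    return plan, False, max_loops


# Hypothetical stand-ins for the planner and checker agents.
def demo_planner(feedback: Optional[str]) -> str:
    return "plan v1" if feedback is None else f"plan v2 ({feedback} addressed)"


def demo_checker(plan: str) -> Optional[str]:
    return None if "v2" in plan else "missing tests"


plan, passed, loops = plan_with_verification(demo_planner, demo_checker)
print(plan, passed, loops)
```

If the checker never passes, the function still returns after `max_loops` rounds with `passed` set to `False`, so the orchestrator can decide what to do next instead of looping forever.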

The Result: Your Context Stays Lean

An entire phase—deep research, multiple verified plans, thousands of lines of code, automated verification—runs with your main context at 30-40%. The heavy work happens in fresh subagent contexts. Your session stays fast and responsive.

Traditional AI session (no subagents):
─────────────────────────────────────
Research:    +20% context
Planning:    +15% context
Execution:   +30% context
Verification: +10% context
─────────────────────────────────────
Total:       75% context used
Result:      Slower responses, degraded quality

GSD with subagent spawning:
─────────────────────────────────────
Research:    +5% (summaries only)
Planning:    +5% (summaries only)
Execution:   +8% (summaries + progress)
Verification: +5% (results only)
─────────────────────────────────────
Total:       23% context used
Result:      Fast responses, maintained quality

The difference is where the work happens. Subagents absorb the heavy context. Your main session only tracks progress and receives results.
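The totals in the two tables above are plain sums of the per-stage figures. Here is a small Python check of that arithmetic, using exactly the percentages from the tables; the dictionary and function names are mine.

```python
# Context added per stage, in percent of the main window (from the tables above).
SINGLE_SESSION = {"research": 20, "planning": 15, "execution": 30, "verification": 10}
WITH_SUBAGENTS = {"research": 5, "planning": 5, "execution": 8, "verification": 5}


def total_context(percent_by_stage: dict[str, int]) -> int:
    """Sum the per-stage context growth into a total percentage."""
    return sum(percent_by_stage.values())


single_total = total_context(SINGLE_SESSION)
subagent_total = total_context(WITH_SUBAGENTS)
saved = single_total - subagent_total

print(f"single session: {single_total}%")
print(f"with subagents: {subagent_total}%")
print(f"difference:     {saved} percentage points")
```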

What I Observed in Practice

Running a GSD project showed me the mechanics:

Research phase: Orchestrator spawned four agents. Each spent 10-15 minutes in deep analysis. My context received one-page summaries from each. Total context impact: 5%.
Planning phase: Planner agent created 12 implementation plans. Checker agent verified each. Two loops happened before passing. My context only saw the final approved plans. Total context impact: 8%.
Execution phase: Plans grouped into three waves. Wave 1 ran three parallel executor agents. Wave 2 ran two. Wave 3 ran one. My context tracked progress, not implementation details. Total context impact: 12%.
Verification phase: Verifier agent ran tests. Two debug agents spawned for issues found. My context received pass/fail results. Total context impact: 5%.

At the end of a full project cycle, my context was at 30% capacity. I could have run another project immediately.

Summary

In this post, I explained how GSD’s multi-agent orchestration keeps your context lean. The orchestrator/agent split means heavy work happens in fresh subagent contexts. Wave execution enables parallelism where possible. Vertical slices reduce dependencies, allowing more parallel execution. Your main session coordinates, receives summaries, and stays responsive.

The key insight: orchestrators coordinate, agents execute. This separation lets you run entire phases—research, planning, execution, verification—without quality degradation from context bloat.

Final Words + More Resources

My intention with this article was to help others by sharing my knowledge and experience. If you want to get in touch, you can email me.

Here are the most important links from this article, along with some further resources on this topic:

Oh, and if you found these resources useful, don’t forget to support me by starring the repo on GitHub!