How GSD's Multi-Agent Orchestration Runs Entire Phases Without Filling Your Context Window
I ran a full GSD project recently and watched something interesting happen. The research phase spawned four parallel agents. The planning phase spawned a planner and a checker. The execution phase ran multiple implementation agents simultaneously. My main context window stayed around 35% full throughout.
This isn’t magic. It’s the orchestrator/agent split built into GSD’s architecture. Understanding this pattern changed how I think about AI-assisted development.
The Orchestrator/Agent Split
GSD workflow files are thin coordinators. They don’t do heavy lifting. Instead, they:
- Load context via SDK queries
- Spawn specialized agents with focused prompts
- Collect results and route to the next step
- Update state between steps
The actual work happens in spawned agents. Each agent gets a fresh 200k-token context window. Your main session never absorbs their output. It only receives summaries.
┌─────────────────────────────────────────────────────────────────┐
│                          ORCHESTRATOR                           │
│                   (runs in your main context)                   │
│                                                                 │
│  • Load PROJECT.md, REQUIREMENTS.md, CONTEXT.md                 │
│  • Spawn agents with focused prompts                            │
│  • Collect results (summaries only)                             │
│  • Route to next phase                                          │
│  • Track state and progress                                     │
│                                                                 │
│  Context usage: 30-40% (stays lean)                             │
└─────────────────────────────────────────────────────────────────┘
                                 │
                                 │ spawns
                                 ▼
┌─────────────────────────────────────────────────────────────────┐
│                             AGENTS                              │
│                 (each gets fresh 200k context)                  │
│                                                                 │
│  • Deep research on specific topic                              │
│  • Create detailed implementation plans                         │
│  • Write and verify code                                        │
│  • Debug and diagnose issues                                    │
│                                                                 │
│  Context usage: Can fill entire 200k                            │
│  Results: Summarized back to orchestrator                       │
└─────────────────────────────────────────────────────────────────┘

I observed this during the research phase. My orchestrator loaded the project files, then spawned four research agents in parallel. Each agent dove deep into its topic: stack analysis, feature requirements, architecture patterns, potential pitfalls. They produced detailed reports. My main context received one-page summaries.
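To make the split concrete, here is a minimal Python sketch of the pattern, not GSD's actual implementation. GSD's workflow files are not Python, and every name here (`AgentResult`, `run_research_agent`, `orchestrate`) is hypothetical. The point it illustrates is that each agent produces a large report, but the orchestrator only keeps the short summary.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class AgentResult:
    topic: str
    full_report: str  # large; lives only in the agent's own context
    summary: str      # small; the only part the orchestrator keeps


async def run_research_agent(topic: str) -> AgentResult:
    """Hypothetical stand-in for a spawned agent with a fresh context."""
    await asyncio.sleep(0)  # placeholder for the real, long-running work
    report = f"Detailed findings about {topic}. " * 200
    return AgentResult(topic, report, summary=f"{topic}: key findings in brief")


async def orchestrate(topics: list[str]) -> list[str]:
    # Spawn all agents in parallel and wait for every one to finish.
    results = await asyncio.gather(*(run_research_agent(t) for t in topics))
    # Keep only the summaries; the full reports are discarded here.
    return [r.summary for r in results]


TOPICS = ["stack", "features", "architecture", "pitfalls"]
summaries = asyncio.run(orchestrate(TOPICS))
orchestrator_chars = sum(len(s) for s in summaries)
```

In this toy version each report runs to several thousand characters, while `orchestrator_chars` stays in the low hundreds. That is the same asymmetry the diagram shows, just at a much smaller scale.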
Wave Execution for Parallelism
Plans are grouped into waves based on dependencies. Each wave runs when previous waves complete.
Wave 1: Independent plans (no dependencies)
┌─────────────────┐  ┌─────────────────┐  ┌─────────────────┐
│     Plan A      │  │     Plan B      │  │     Plan C      │
│   (parallel)    │  │   (parallel)    │  │   (parallel)    │
└─────────────────┘  └─────────────────┘  └─────────────────┘
                              │
                              │ all complete
                              ▼
Wave 2: Plans that depend on Wave 1
┌─────────────────┐  ┌─────────────────┐
│     Plan D      │  │     Plan E      │
│ (depends on A)  │  │ (depends on B,C)│
└─────────────────┘  └─────────────────┘
                    │
                    │ all complete
                    ▼
Wave 3: Plans that depend on Wave 2
┌─────────────────┐
│     Plan F      │
│ (depends on D,E)│
└─────────────────┘

Independent plans run in parallel within a wave. Dependent plans wait in later waves. This maximizes throughput while respecting dependencies.
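The wave grouping above can be sketched as a simple layering of a dependency graph. This is an illustrative Python sketch under my own assumptions, not GSD's code: the function name `group_into_waves` and the dictionary format are hypothetical, and the plan names are the ones from the diagram.

```python
def group_into_waves(deps: dict[str, set[str]]) -> list[list[str]]:
    """Layer plans so that each wave depends only on earlier waves."""
    remaining = {plan: set(d) for plan, d in deps.items()}
    done: set[str] = set()
    waves: list[list[str]] = []
    while remaining:
        # A plan is ready once all of its dependencies have completed.
        ready = sorted(p for p, d in remaining.items() if d <= done)
        if not ready:
            raise ValueError("dependency cycle or unknown plan")
        waves.append(ready)
        done.update(ready)
        for p in ready:
            del remaining[p]
    return waves


# Dependencies taken from the diagram: plan -> plans it depends on.
PLANS = {
    "A": set(), "B": set(), "C": set(),
    "D": {"A"}, "E": {"B", "C"},
    "F": {"D", "E"},
}
waves = group_into_waves(PLANS)
```

For the diagram's dependencies this yields three waves: A, B, C first, then D and E, then F. Plans inside one wave have no dependencies on each other, so they are the ones that can run in parallel.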
Why Vertical Slices Parallelize Better
I learned this distinction during execution planning. Vertical slices have fewer cross-plan dependencies than horizontal layers.
WRONG: Horizontal Layers (many dependencies)
─────────────────────────────────────────────
Plan 01: All models (User, Post, Comment, Tag)
Plan 02: All routes (auth, posts, comments, tags)
Plan 03: All views (login, dashboard, posts, comments)

Problem: Plan 02 depends on Plan 01
         Plan 03 depends on Plan 02
         Everything must be sequential

CORRECT: Vertical Slices (fewer dependencies)
─────────────────────────────────────────────
Plan 01: User feature end-to-end (model + route + view)
Plan 02: Post feature end-to-end (model + route + view)
Plan 03: Comment feature end-to-end (model + route + view)

Benefit: Plans 01, 02, 03 can run in parallel
         Each slice is self-contained

Vertical slices reduce dependencies. A user feature doesn’t need the post feature to exist before it can work. This lets GSD run multiple slices simultaneously.
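A small Python sketch makes the difference measurable by counting how many sequential waves each decomposition needs. The helper `count_waves` and the plan labels are hypothetical, and the sketch assumes the dependency graph has no cycles. The dependencies simply encode the two examples above.

```python
def count_waves(deps: dict[str, set[str]]) -> int:
    """Number of sequential waves needed (assumes no dependency cycles)."""
    wave: dict[str, int] = {}

    def wave_of(plan: str) -> int:
        # A plan runs one wave after its latest dependency (wave 1 if none).
        if plan not in wave:
            wave[plan] = 1 + max((wave_of(d) for d in deps[plan]), default=0)
        return wave[plan]

    return max(wave_of(p) for p in deps)


# Horizontal layers: each layer waits for the one below it.
horizontal = {"models": set(), "routes": {"models"}, "views": {"routes"}}
# Vertical slices: each feature is self-contained.
vertical = {"user": set(), "post": set(), "comment": set()}

horizontal_waves = count_waves(horizontal)
vertical_waves = count_waves(vertical)
```

With these inputs the horizontal plans need three waves of one plan each, while the vertical slices fit into a single wave of three parallel plans.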
Specialized Agents Per Stage
Each phase uses different agent types. The orchestrator coordinates them.
Stage         Orchestrator does        Agents do
─────────────────────────────────────────────────────────────
Research      Coordinates, presents    4 parallel researchers:
              findings                 - Stack researcher
                                       - Features researcher
                                       - Architecture researcher
                                       - Pitfalls researcher

Planning      Validates, manages       Planner creates plans
              iteration                Checker verifies (loops up to 3x)

Execution     Groups into waves,       Executors implement
              tracks progress          in parallel

Verification  Presents results,        Verifier checks
              routes next              Debug agents diagnose
                                       if issues found

I ran through a planning phase that demonstrated the loop mechanism. The planner agent created implementation plans. The checker agent verified them. When the checker found issues, the orchestrator looped back and spawned the planner again with the checker's feedback. This happened three times until all plans passed.
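The loop described above can be sketched in a few lines of Python. This is a sketch under assumptions, not GSD's implementation: `plan_with_checks`, the toy planner, and the toy checker are all hypothetical, and I count the limit of 3 as planner/checker rounds, which is one possible reading of "max 3x".

```python
from typing import Callable, Optional

MAX_LOOPS = 3  # loop limit, read here as planner/checker rounds


def plan_with_checks(
    make_plan: Callable[[Optional[str]], str],
    check_plan: Callable[[str], Optional[str]],
    max_loops: int = MAX_LOOPS,
) -> tuple[str, bool, int]:
    """Return (plan, passed, rounds_used)."""
    feedback: Optional[str] = None
    plan = ""
    for attempt in range(1, max_loops + 1):
        plan = make_plan(feedback)   # planner sees the checker's last feedback
        feedback = check_plan(plan)  # None means the checker passed the plan
        if feedback is None:
            return plan, True, attempt
    # Loop limit reached without a pass.
    return plan, False, max_loops


# Toy stand-ins for the planner and checker agents.
def toy_planner(feedback: Optional[str]) -> str:
    return "plan v1" if feedback is None else "plan v2 (with tests)"


def toy_checker(plan: str) -> Optional[str]:
    return None if "tests" in plan else "missing test step"


result = plan_with_checks(toy_planner, toy_checker)
never_passes = plan_with_checks(toy_planner, lambda plan: "still not good")
```

Both exit conditions are visible here: `result` passes on the second round, while `never_passes` stops at the loop limit and reports that the plan did not pass.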
┌──────────────┐     ┌──────────────┐
│   Planner    │────▶│   Checker    │
│    Agent     │     │    Agent     │
└──────────────┘     └──────┬───────┘
       ▲                    │
       │                    │
       │    pass?           │ fail (with feedback)
       │                    ▼
       │             ┌──────────────┐
       │             │  Loop back   │
       │             │   (max 3x)   │
       │             └──────────────┘
       │                    │
       └────────────────────┘

Max loops: 3
Exit condition: Checker passes OR loop limit reached

The Result: Your Context Stays Lean
An entire phase—deep research, multiple verified plans, thousands of lines of code, automated verification—runs with your main context at 30-40%. The heavy work happens in fresh subagent contexts. Your session stays fast and responsive.
Traditional AI session (no subagents):
─────────────────────────────────────
Research:     +20% context
Planning:     +15% context
Execution:    +30% context
Verification: +10% context
─────────────────────────────────────
Total:        75% context used
Result:       Slower responses, degraded quality

GSD with subagent spawning:
─────────────────────────────────────
Research:     +5% (summaries only)
Planning:     +5% (summaries only)
Execution:    +8% (summaries + progress)
Verification: +5% (results only)
─────────────────────────────────────
Total:        23% context used
Result:       Fast responses, maintained quality

The difference is where the work happens. Subagents absorb the heavy context. Your main session only tracks progress and receives results.
What I Observed in Practice
Running a GSD project showed me the mechanics:
- Research phase: Orchestrator spawned four agents. Each spent 10-15 minutes in deep analysis. My context received one-page summaries from each. Total context impact: 5%.
- Planning phase: Planner agent created 12 implementation plans. Checker agent verified each. Two loops happened before passing. My context only saw the final approved plans. Total context impact: 8%.
- Execution phase: Plans grouped into three waves. Wave 1 ran three parallel executor agents. Wave 2 ran two. Wave 3 ran one. My context tracked progress, not implementation details. Total context impact: 12%.
- Verification phase: Verifier agent ran tests. Two debug agents spawned for issues found. My context received pass/fail results. Total context impact: 5%.
At the end of a full project cycle, my context was at 30% capacity. I could have run another project immediately.
Summary
In this post, I explained how GSD’s multi-agent orchestration keeps your context lean. The orchestrator/agent split means heavy work happens in fresh subagent contexts. Wave execution enables parallelism where possible. Vertical slices reduce dependencies, allowing more parallel execution. Your main session coordinates, receives summaries, and stays responsive.
The key insight: orchestrators coordinate, agents execute. This separation lets you run entire phases—research, planning, execution, verification—without quality degradation from context bloat.
Final Words + More Resources
My intention with this article was to share my knowledge and experience to help others. If you want to get in touch, you can contact me by email.
Here are the most important links from this article, along with some further resources on this topic:
Oh, and if you found these resources useful, don’t forget to support me by starring the repo on GitHub!