How to Write Implementation Plans That AI Agents Can Execute Reliably
Problem
I handed an AI agent an implementation plan I’d written. The plan said things like:
1. Add input validation to the form
2. Create the API endpoint
3. Write tests
The agent ran with it. Ten minutes later, I had validation on the wrong fields, an API endpoint with incorrect authentication, and tests that passed but tested nothing useful.
The problem wasn’t the agent’s intelligence - it was my plan. I’d written for a human reader who could fill in gaps with context and judgment. AI agents need something completely different.
The Core Insight
Here’s what I learned from the Superpowers writing-plans skill:
Write comprehensive implementation plans assuming the engineer has ZERO CONTEXT for your codebase and QUESTIONABLE TASTE.
This changes everything about how you write a plan. Every step must be explicit. Every file path must be exact. Every command must be verifiable.
What Makes a Plan Agent-Executable?
The skill defines a strict structure:
Each step is ONE ACTION (2-5 minutes):
- "Write the failing test"
- "Run it to make sure it fails"
- "Implement minimal code"
NOT: "Add validation to the form" (too vague, multiple actions)
NOT: "Implement the feature" (way too large)
This granularity forces you to think through the actual implementation sequence. You can’t hand-wave.
The Plan Document Structure
Every plan must start with this header:
# [Feature Name] Implementation Plan
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development
**Goal:** [One sentence describing what this builds]
**Architecture:** [2-3 sentences about approach]
**Tech Stack:** [Key technologies/libraries]
This header does more than document - it sets context for agents that may know nothing about your project.
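One way to keep that header honest is a small check that runs before the plan is handed off. What follows is a minimal sketch, not something the Superpowers skill ships: the function name `missing_header_fields` and the rule that a field counts as present only when text follows `**Field:**` on the same line are my own assumptions based on the template above.

```python
import re

# Header fields taken from the plan template; the check itself is hypothetical.
REQUIRED_FIELDS = ("Goal", "Architecture", "Tech Stack")


def missing_header_fields(plan_text: str) -> list[str]:
    """Return the required header fields that are absent or left empty."""
    missing = []
    for field in REQUIRED_FIELDS:
        # Match "**Goal:** some text" at the start of a line, with
        # at least one non-space character after the label.
        pattern = rf"^\*\*{re.escape(field)}:\*\*[ \t]*\S"
        if not re.search(pattern, plan_text, flags=re.MULTILINE):
            missing.append(field)
    return missing
```

Calling it on a draft that has a goal but an empty architecture line and no tech stack would return `["Architecture", "Tech Stack"]`, which is a cheap signal to finish the header before any agent sees the plan.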
Task Structure: The Exact Format
Each task follows this template:
### Task N: [Component Name]
**Files:**
- Create: `exact/path/to/file.py`
- Modify: `exact/path/to/existing.py:123-145`
- Test: `tests/exact/path/to/test.py`
- [ ] **Step 1: Write the failing test**
```python title="Example Test"
def test_specific_behavior():
    result = function(input)
    assert result == expected
```
- [ ] **Step 2: Run test to verify it fails**
Run: pytest tests/path/test.py::test_name -v
Expected: FAIL with “function not defined”
Notice what’s included:
- Exact file paths - Full paths from the repository root, no “in the appropriate file”
- Complete code - Not “add validation” but the actual validation code
- Exact commands - The exact shell command with expected output
- Line numbers for modifications - Where to make the change
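The checklist above can be turned into a rough lint pass over a single task. This is a sketch under assumptions: `lint_task` and `VAGUE_PHRASES` are hypothetical names, the phrase list is only a starting point, and the regular expressions assume the `- Create:` / `Run:` / `Expected:` layout from the template. It is not how the skill itself validates plans.

```python
import re

# Hypothetical phrases that suggest a step relies on reader judgment.
VAGUE_PHRASES = ("appropriate file", "as needed", "etc.")


def lint_task(task_text: str) -> list[str]:
    """Return a list of problems found in one task of a plan."""
    problems = []

    # File entries look like "- Create: `path`" or "- Modify: `path:123-145`".
    file_entries = re.findall(
        r"^- (?:Create|Modify|Test): `([^`]+)`", task_text, flags=re.MULTILINE
    )
    if not file_entries:
        problems.append("no exact file paths listed")

    # Every command should come with the output the agent should expect.
    runs = re.findall(r"^Run: .+", task_text, flags=re.MULTILINE)
    expected = re.findall(r"^Expected: .+", task_text, flags=re.MULTILINE)
    if len(runs) != len(expected):
        problems.append("every Run: needs a matching Expected:")

    lowered = task_text.lower()
    for phrase in VAGUE_PHRASES:
        if phrase in lowered:
            problems.append(f"vague phrase: {phrase!r}")
    return problems
```

A lint like this cannot tell whether the code in a step is complete or correct. It only catches the mechanical omissions, which in my experience were the ones I made most often.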
The Architecture Decision: Before Tasks
Before writing any tasks, map the file structure:
Before defining tasks:
1. Design units with clear boundaries
2. Define interfaces between units
3. Prefer smaller, focused files over large ones
4. Files that change together should live together
This prevents the “where does this go?” confusion that leads agents to make bad architectural decisions.
A Concrete Example
Let me show you the difference. Here’s how I used to write a plan:
## Add User Authentication
1. Create the user model with password hashing
2. Add login/logout routes
3. Create session management
4. Write tests for the auth flow
Here’s how the Superpowers approach would structure it:
### Task 1: User Model
**Files:**
- Create: `src/models/user.py`
- Create: `tests/models/test_user.py`
- [ ] **Step 1: Write the failing test for password hashing**
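```python title="tests/models/test_user.py"
# tests/models/test_user.py
from src.models.user import User


def test_password_is_hashed_on_creation():
    # Build a user first; the constructor is expected to hash the password.
    user = User(email="user@example.com", password="plaintext")
    assert user.password != "plaintext"
    assert user.verify_password("plaintext") is True
```
- [ ] **Step 2: Run test to verify it fails**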
Run: pytest tests/models/test_user.py -v
Expected: FAIL with “ModuleNotFoundError: No module named ‘src.models.user’”
- [ ] **Step 3: Create the User model**
Create file src/models/user.py:
from passlib.context import CryptContext
pwd_context = CryptContext(schemes=["bcrypt"], deprecated="auto")
class User:
    def __init__(self, email: str, password: str):
        self.email = email
        self.password = pwd_context.hash(password)

    def verify_password(self, plain_password: str) -> bool:
        return pwd_context.verify(plain_password, self.password)

- [ ] **Step 4: Run test to verify it passes**
Run: pytest tests/models/test_user.py -v
Expected: PASS
The second version leaves nothing to interpretation. An agent with zero context can execute it.
Key Principles Throughout
The skill emphasizes that these principles must appear throughout the plan:
- DRY: Don't Repeat Yourself
- YAGNI: You Ain't Gonna Need It
- TDD: Test-Driven Development
- Frequent commits after each task
- Reference relevant skills with @ syntax
These aren’t afterthoughts - they’re baked into every step.
The Review Loop: Quality Assurance
Plans go through a review loop before execution:
┌─────────────────────────────────────────────────────────────┐
│                      PLAN REVIEW LOOP                       │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│  Write Plan ──► Dispatch plan-document-reviewer ──► Review  │
│      │                         │                            │
│      │                         ▼                            │
│      │                    ┌────────┐                        │
│      │                    │Issues? │                        │
│      │                    └────┬───┘                        │
│      │                         │                            │
│      │           ┌─────────────┴─────┐                      │
│      │           │                   │                      │
│      │          Yes                 No                      │
│      │           │                   │                      │
│      │           ▼                   ▼                      │
│      │      Fix & Retry           Proceed                   │
│      │           │                   │                      │
│      └───────────┘                   │                      │
│                                      │                      │
│                              Max 3 iterations               │
│                                      │                      │
│                         ┌────────────┴────────────┐         │
│                         │                         │         │
│                      Success               Surface to Human  │
│                         │                         │         │
│                         └─────────────────────────┘         │
└─────────────────────────────────────────────────────────────┘
This catches vague steps, missing file paths, and unclear commands before an agent wastes time.
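In code, the loop in the diagram might look roughly like this. Treat it as a sketch under assumptions: `run_review_loop` is a name I made up, and `review` and `fix` are hypothetical callables standing in for the plan-document-reviewer and for whatever revises the plan. Only the cap of three iterations and the two outcomes come from the skill.

```python
from typing import Callable


def run_review_loop(
    plan: str,
    review: Callable[[str], list[str]],
    fix: Callable[[str, list[str]], str],
    max_iterations: int = 3,
) -> tuple[str, str]:
    """Review a plan up to max_iterations times.

    Returns ("approved", plan) once a review finds no issues, or
    ("needs_human", plan) if issues remain after the last allowed review.
    """
    for _ in range(max_iterations):
        issues = review(plan)
        if not issues:
            return "approved", plan
        # Revise the plan using the reviewer's feedback, then review again.
        plan = fix(plan, issues)
    # Out of iterations: hand the latest revision to a human.
    return "needs_human", plan
```

The design choice worth noting is the hard cap. Without it, a reviewer and a fixer that disagree can bounce a plan back and forth indefinitely, so surfacing to a human after the third pass keeps the loop bounded.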
Scope Check: Breaking Down Large Plans
One critical check happens before writing tasks:
If the spec covers multiple independent subsystems, it should have been broken into sub-project specs during brainstorming.
If it wasn't:
- Suggest breaking into separate plans
- Each plan produces working, testable software
This prevents the “one giant plan” anti-pattern that fails because it’s too complex to execute reliably.
The Execution Handoff
After saving the plan, you offer a choice:
Plan complete. Two execution options:
1. Subagent-Driven (recommended)
   - Fresh subagent per task
   - Review between tasks
   - Better isolation, easier debugging
2. Inline Execution
   - Batch execution with checkpoints
   - Faster for simple plans
   - Less visibility into intermediate states
Which approach?
This acknowledges that different situations call for different execution strategies.
Common Mistakes I Made
Vague steps: “Add error handling” became 15 specific steps when I was forced to be explicit.
Missing file paths: I assumed the agent would know where to put things. It didn’t.
Incomplete code: I wrote “validate the input” instead of writing the validation code. The agent wrote validation for the wrong inputs.
Skipping expected output: I didn’t specify what commands should output. The agent ran commands, saw output, and assumed success even when the output was an error.
No TDD enforcement: My plans said “write tests” after implementation. The Superpowers approach enforces test-first with verification that tests fail.
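The expected-output mistake is the easiest one to guard against mechanically. Here is a minimal sketch of the idea, assuming a hypothetical helper called `verify_step`: it checks both the exit status and a piece of expected text, so a command that prints an error cannot be mistaken for a success. None of this is part of the Superpowers skill.

```python
import subprocess
import sys


def verify_step(command: list[str], expect_success: bool, expected_text: str) -> bool:
    """Run a command and check its exit status and its output together."""
    result = subprocess.run(command, capture_output=True, text=True)
    output = result.stdout + result.stderr
    succeeded = result.returncode == 0
    # Both conditions must hold: the right outcome AND the right message.
    return succeeded == expect_success and expected_text in output


if __name__ == "__main__":
    # Stand-in for a real test run: a command that fails with a known message.
    failing = [
        sys.executable,
        "-c",
        "import sys; print('ModuleNotFoundError'); sys.exit(1)",
    ]
    print(verify_step(failing, expect_success=False, expected_text="ModuleNotFoundError"))
```

For the "run test to verify it fails" step in the User model example, the call would be along the lines of `verify_step(["pytest", "tests/models/test_user.py", "-v"], expect_success=False, expected_text="ModuleNotFoundError")`.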
Why This Approach Works
1. Eliminates ambiguity - Agents can't read between lines
2. Enables verification - Every step has expected output
3. Supports recovery - When a step fails, you know exactly where
4. Allows parallelization - Different agents can take different tasks
5. Creates documentation - The plan IS documentation
Summary
In this post, I explained how to write implementation plans that AI agents can execute reliably. The key insight from the Superpowers writing-plans skill is to assume zero context and questionable taste - every step must be one action (2-5 minutes), with exact file paths, complete code, and specific commands with expected output.
The structured approach - header with goal and architecture, tasks with files and steps, embedded TDD principles, and a review loop - transforms vague plans into executable specifications. This dramatically improves agent execution reliability.
For teams working with AI agents on implementation, adopting this format is essential. The upfront investment in detailed planning pays off in reliable execution and reduced debugging time.