
High for Planning, Medium for Implementation: The GPT-5.4 Workflow That Actually Works

I kept burning through my token budget. Every project started with grand architectural plans from GPT-5.4, then somewhere in the implementation phase, everything went sideways. Either the code didn’t match the plan, or I’d run out of context before finishing.

Then a Reddit comment changed everything: “I use High for planning and Medium for implementing the detailed plan.”

Simple. Obvious in hindsight. But I’d been doing it wrong for weeks.

The Problem With Single-Level Workflows

I started with a simple assumption: Higher reasoning = Better results. So I used High (or XHigh) for everything.

Here’s what happened on a typical feature request:

Task: Add user authentication to a FastAPI backend
Attempt with High throughout:
- Planning: Comprehensive, 40 lines of architecture decisions
- Implementation: Over-detailed code, excessive error handling
- Result: 3 hours, 200K tokens, over-engineered solution

The implementation phase was where things went wrong. High reasoning kept trying to re-architect during coding. It added layers I didn’t ask for. It second-guessed decisions from the planning phase.

Same task, Medium throughout:
- Planning: Missed several edge cases, no security considerations
- Implementation: Fast, but followed incomplete plan
- Result: 45 minutes, 50K tokens, broken auth flow

Neither approach worked well. I needed the deep thinking for planning, but not for execution.

Why This Happens

GPT-5.4’s reasoning levels aren’t just about “smartness.” They’re about cognitive style.

High reasoning excels at:

  • Strategic thinking and comprehensive analysis
  • Following instructions very closely
  • Producing solid, predictable results
  • Cross-system impact assessment

Medium reasoning excels at:

  • Efficient execution of well-defined tasks
  • Speed without overthinking
  • Practical implementation

When you use High for implementation, it brings strategic thinking to tactical tasks. That’s misaligned. It’s like hiring a chief architect to lay individual bricks.

When you use Medium for planning, it brings tactical thinking to strategic tasks. That’s also misaligned. It’s like asking a bricklayer to design the building.

The Two-Tier Workflow

I now split every project into three phases that alternate between two reasoning tiers:

┌─────────────────────────────────────────────────────────────────┐
│                        PROJECT LIFECYCLE                        │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  PHASE 1: PLANNING (High Reasoning)          ~10-20% of effort  │
│  ─────────────────────────────────────────────────────────────  │
│  • Architecture design                                          │
│  • Requirements analysis                                        │
│  • Risk assessment                                              │
│  • Create detailed implementation roadmap                       │
│                                                                 │
│                                ▼                                │
│                                                                 │
│  PHASE 2: IMPLEMENTATION (Medium Reasoning)  ~70-80% of effort  │
│  ─────────────────────────────────────────────────────────────  │
│  • Execute the plan step by step                                │
│  • Write code according to specifications                       │
│  • Run tests and fix issues                                     │
│  • Refine implementation details                                │
│                                                                 │
│                                ▼                                │
│                                                                 │
│  PHASE 3: REVIEW (High Reasoning)               ~10% of effort  │
│  ─────────────────────────────────────────────────────────────  │
│  • Validate implementation against plan                         │
│  • Identify deviations or issues                                │
│  • Recommend improvements                                       │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘

This workflow matches reasoning style to task requirements.
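
The lifecycle above can be sketched as a small driver that pins a reasoning level to each phase. This is a minimal sketch under assumptions, not a real client: `run_model` is a hypothetical callable standing in for whatever API or CLI wrapper you use, and the level names are just the labels used in this article.

```python
from typing import Callable, Dict, List

# Hypothetical callable: (reasoning_level, prompt) -> model output.
# Swap in whatever client or CLI wrapper you actually use.
RunModel = Callable[[str, str], str]

# Reasoning level per phase, using the labels from this article.
PHASE_LEVELS: Dict[str, str] = {
    "planning": "high",
    "implementation": "medium",
    "review": "high",
}


def run_lifecycle(task: str, steps: List[str], run_model: RunModel) -> dict:
    """Plan once, implement each step separately, then review everything."""
    plan = run_model(PHASE_LEVELS["planning"], f"Create a detailed plan for: {task}")

    outputs = []
    for number, step in enumerate(steps, start=1):
        prompt = (
            f"Implement step {number} from the plan: {step}\n"
            f"Follow the plan specifications exactly.\n\nPlan:\n{plan}"
        )
        outputs.append(run_model(PHASE_LEVELS["implementation"], prompt))

    review_prompt = (
        "Review this implementation against the original plan.\n\n"
        f"Plan:\n{plan}\n\nImplementation:\n" + "\n\n".join(outputs)
    )
    review = run_model(PHASE_LEVELS["review"], review_prompt)
    return {"plan": plan, "outputs": outputs, "review": review}
```

Keeping the level in one mapping means a deviation (say, High for a security-critical implementation step) is a one-line change rather than something to remember per prompt.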

A Real Example

Task: Add JWT authentication to an existing API

Phase 1: Planning with High Reasoning

I prompted with High reasoning:

Analyze our FastAPI backend and create a detailed plan for adding
JWT authentication. Include architecture decisions, implementation
steps, risk assessment, and testing strategy.

High reasoning output:

## Architecture Decision
- JWT tokens with refresh token rotation
- Rate limiting on auth endpoints
- bcrypt password hashing (cost factor 12)
- Email verification flow
## Implementation Steps
1. Create User model with email/password fields
2. Implement password hashing middleware
3. Create /auth/register endpoint
4. Create /auth/login endpoint
5. Implement JWT generation/validation
6. Add refresh token logic
7. Create auth middleware for protected routes
8. Add rate limiting middleware
9. Write tests for each endpoint
10. Add email verification service
## Risk Assessment
- SQL injection: Use parameterized queries
- Brute force: Implement rate limiting
- Token theft: Short-lived access tokens + refresh rotation
## Testing Strategy
- Unit tests: Password hashing, token generation
- Integration tests: Auth flow end-to-end
- Security tests: Rate limiting, injection attempts

Comprehensive. Covered edge cases I wouldn’t have thought of. Now the implementation phase had a solid foundation.

Phase 2: Implementation with Medium Reasoning

For each step, I switched to Medium:

Implement step 3 from the plan: Create /auth/register endpoint.
Follow the plan specifications exactly.

Medium reasoning output:

from fastapi import APIRouter, HTTPException
from pydantic import BaseModel, EmailStr
from passlib.context import CryptContext

router = APIRouter()
pwd_context = CryptContext(schemes=["bcrypt"], deprecated="auto")

class UserRegister(BaseModel):
    email: EmailStr
    password: str

@router.post("/auth/register")
async def register(user: UserRegister):
    # Reject short passwords before doing any hashing work
    if len(user.password) < 12:
        raise HTTPException(
            status_code=400,
            detail="Password must be at least 12 characters"
        )
    hashed_password = pwd_context.hash(user.password)
    # create_user is assumed to be defined elsewhere (persistence layer, not shown)
    db_user = await create_user(
        email=user.email,
        password_hash=hashed_password
    )
    return {"message": "User registered successfully", "user_id": db_user.id}

Clean, focused, followed the plan. No over-engineering. No re-architecting. Just execution.
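
Writing one prompt per step by hand gets tedious, so the per-step prompts can be generated from the plan text. The sketch below assumes the plan keeps the layout shown earlier (a `## Implementation Steps` heading followed by a numbered list); `extract_steps` and `step_prompt` are illustrative helper names, not part of any tool.

```python
import re
from typing import List


def extract_steps(plan: str) -> List[str]:
    """Return the numbered items listed under '## Implementation Steps'."""
    steps = []
    in_section = False
    for line in plan.splitlines():
        stripped = line.strip()
        if stripped.startswith("## "):
            # Track whether we are inside the steps section
            in_section = stripped == "## Implementation Steps"
            continue
        match = re.match(r"^\d+\.\s+(.*)$", stripped)
        if in_section and match:
            steps.append(match.group(1))
    return steps


def step_prompt(number: int, step: str) -> str:
    """Build the Medium-level prompt for a single plan step."""
    return (
        f"Implement step {number} from the plan: {step}.\n"
        "Follow the plan specifications exactly."
    )
```

For the plan above, `step_prompt(3, extract_steps(plan)[2])` reproduces the prompt used for the `/auth/register` endpoint.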

Phase 3: Review with High Reasoning

After implementation:

Review this implementation against the original plan. Identify
deviations, missing pieces, and potential improvements.

High reasoning caught what Medium missed:

## Adherence to Plan
✅ User model created with correct fields
✅ Password hashing uses bcrypt cost factor 12
✅ Rate limiting implemented
⚠️ Email verification not yet implemented (Step 10)
## Identified Issues
1. Missing password complexity validation
2. No duplicate email check before registration
3. Missing CSRF protection for cookies
## Recommendations
1. Add regex validation for password complexity
2. Add unique constraint check on email field
3. Implement CSRF token for state-changing requests
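
Recommendations 1 and 2 are small enough to sketch without the framework. The password policy below (12+ characters with lowercase, uppercase, digit, and symbol) is an assumed policy for illustration, and `InMemoryUserStore` is a hypothetical stand-in for the real database layer.

```python
import re

# Assumed policy: at least 12 characters, with one lowercase letter,
# one uppercase letter, one digit, and one non-alphanumeric character.
PASSWORD_PATTERN = re.compile(
    r"^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[^A-Za-z0-9]).{12,}$"
)


def is_strong_password(password: str) -> bool:
    """Check the password against the assumed complexity policy."""
    return PASSWORD_PATTERN.fullmatch(password) is not None


class DuplicateEmailError(Exception):
    """Raised when an email address is already registered."""


class InMemoryUserStore:
    """Hypothetical stand-in for the database; emails compare case-insensitively."""

    def __init__(self) -> None:
        self._emails = set()

    def add(self, email: str) -> None:
        key = email.strip().lower()
        if key in self._emails:
            raise DuplicateEmailError(email)
        self._emails.add(key)
```

In a real service the duplicate check also belongs in the database as a unique constraint, since an application-level check alone can race under concurrent requests.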

Token Usage Comparison

I measured this approach against my previous single-level workflow:

Approach            Planning   Implementation   Review   Total
High throughout     45K        180K             25K      250K
Medium throughout   15K        60K              10K      85K
High→Medium→High    40K        70K              20K      130K

At 130K tokens, the two-tier approach costs about 53% more than Medium-only (85K) but produces far better results. It also costs about 48% less than High-only (250K) while avoiding over-engineering.
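
The comparison is easy to check with a few lines of arithmetic. The figures below are copied from the table above; the helper names are illustrative.

```python
# Token counts per phase, copied from the comparison table.
USAGE = {
    "High throughout": {"planning": 45_000, "implementation": 180_000, "review": 25_000},
    "Medium throughout": {"planning": 15_000, "implementation": 60_000, "review": 10_000},
    "High→Medium→High": {"planning": 40_000, "implementation": 70_000, "review": 20_000},
}


def total(approach: str) -> int:
    """Sum the tokens used across all phases of one approach."""
    return sum(USAGE[approach].values())


def percent_change(new: int, old: int) -> float:
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100


mixed = total("High→Medium→High")
print(mixed)                                                      # 130000
print(round(percent_change(mixed, total("High throughout"))))     # -48
print(round(percent_change(mixed, total("Medium throughout"))))   # 53
```

Most of the saving comes from the implementation phase: 70K tokens against 180K when High is used throughout.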

When to Deviate

This isn’t a rigid rule. There are exceptions:

Use High for implementation when:

  • Implementing a critical security feature
  • The plan has ambiguities that need interpretation
  • You’re in uncharted territory without clear specs

Use Medium for planning when:

  • The task is straightforward and well-understood
  • You have existing templates or patterns to follow
  • Speed matters more than comprehensive coverage

Use XHigh for planning when:

  • Large codebase with complex file relationships
  • Cross-module dependencies need tracking
  • High reasoning fails to capture all considerations

Common Anti-Patterns

Anti-Pattern 1: High for simple implementation

Task: Write a simple debounce function
High reasoning output:
- Generic type constraints
- Multiple overload signatures
- Extensive JSDoc comments
- A wrapper class "for extensibility"
- 47 lines for a 5-line function

Fix: Use Medium for straightforward implementation tasks.

Anti-Pattern 2: Medium for complex planning

Task: Plan authentication system architecture
Medium reasoning output:
- Basic JWT approach
- Missed refresh token rotation
- No rate limiting consideration
- Incomplete risk assessment

Fix: Use High or XHigh for architectural decisions unless the task is simple and well understood.

Anti-Pattern 3: Never switching levels

Some developers pick one level and stick with it regardless of the task. This sacrifices either quality (Medium for everything) or efficiency (High for everything).

The Decision Matrix

After months of experimentation, here’s my reference:

Task Type               Level         Reason
Initial architecture    High/XHigh    Strategic decisions need depth
Requirements analysis   High          Complex reasoning for edge cases
Code implementation     Medium        Efficiency for well-defined tasks
Bug fixing              Medium-High   Depends on bug complexity
Code review             High          Need comprehensive analysis
Testing                 Low-Medium    Routine execution
Refactoring             Medium        Following clear patterns
Documentation           Low-Medium    Straightforward writing
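
If you script your workflow, the matrix can live in code as a lookup table. This is a sketch under one assumption of mine: where the matrix gives a range (for example Medium-High), a `complex_task` flag picks the upper end. The flag and function name are illustrative.

```python
# (default level, level for complex cases), taken from the matrix above.
DECISION_MATRIX = {
    "initial architecture": ("high", "xhigh"),
    "requirements analysis": ("high", "high"),
    "code implementation": ("medium", "medium"),
    "bug fixing": ("medium", "high"),
    "code review": ("high", "high"),
    "testing": ("low", "medium"),
    "refactoring": ("medium", "medium"),
    "documentation": ("low", "medium"),
}


def pick_level(task_type: str, complex_task: bool = False) -> str:
    """Return the reasoning level for a task type; raises KeyError if unknown."""
    default, upper = DECISION_MATRIX[task_type.strip().lower()]
    return upper if complex_task else default
```

For example, `pick_level("Bug fixing")` gives `"medium"`, while `pick_level("Bug fixing", complex_task=True)` gives `"high"`.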

How to Start

  1. Before your next project, decide which reasoning level you will use for each phase
  2. Start with High for planning, generate a detailed plan
  3. Switch to Medium for implementation, execute the plan
  4. End with High for review, catch what Medium missed
  5. Track results and adjust based on your specific use cases

The key insight: Different phases of work require different cognitive styles. Match the reasoning level to the task, not the project.

Final Words

My intention with this article was to share my knowledge and experience with others. If you want to get in touch, you can reach me by email.
