High for Planning, Medium for Implementation: The GPT-5.4 Workflow That Actually Works

Mar 15, 2026

I kept burning through my token budget. Every project started with grand architectural plans from GPT-5.4, then somewhere in the implementation phase, everything went sideways. Either the code didn’t match the plan, or I’d run out of context before finishing.

Then a Reddit comment changed everything: “I use High for planning and Medium for implementing the detailed plan.”

Simple. Obvious in hindsight. But I’d been doing it wrong for weeks.

The Problem With Single-Level Workflows

I started with a simple assumption: Higher reasoning = Better results. So I used High (or XHigh) for everything.

Here’s what happened on a typical feature request:

Task: Add user authentication to a FastAPI backend

Attempt with High throughout:
- Planning: Comprehensive, 40 lines of architecture decisions
- Implementation: Over-detailed code, excessive error handling
- Result: 3 hours, 200K tokens, over-engineered solution

The implementation phase was where things went wrong. High reasoning kept trying to re-architect during coding. It added layers I didn’t ask for. It second-guessed decisions from the planning phase.

Same task, Medium throughout:
- Planning: Missed several edge cases, no security considerations
- Implementation: Fast, but followed incomplete plan
- Result: 45 minutes, 50K tokens, broken auth flow

Neither approach worked well. I needed the deep thinking for planning, but not for execution.

Why This Happens

GPT-5.4’s reasoning levels aren’t just about “smartness.” They’re about cognitive style.

High reasoning excels at:

Strategic thinking and comprehensive analysis
Following instructions very closely
Producing solid, predictable results
Cross-system impact assessment

Medium reasoning excels at:

Efficient execution of well-defined tasks
Speed without overthinking
Practical implementation

When you use High for implementation, it brings strategic thinking to tactical tasks. That’s misaligned. It’s like hiring a chief architect to lay individual bricks.

When you use Medium for planning, it brings tactical thinking to strategic tasks. That’s also misaligned. It’s like asking a bricklayer to design the building.

The Two-Tier Workflow

I now split every project into two distinct phases:

┌─────────────────────────────────────────────────────────────────┐
│                      PROJECT LIFECYCLE                          │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  PHASE 1: PLANNING (High Reasoning)        ~10-20% of effort    │
│  ─────────────────────────────────────────────────────────────  │
│  • Architecture design                                          │
│  • Requirements analysis                                        │
│  • Risk assessment                                              │
│  • Create detailed implementation roadmap                       │
│                                                                 │
│                          ▼                                      │
│                                                                 │
│  PHASE 2: IMPLEMENTATION (Medium Reasoning) ~70-80% of effort  │
│  ─────────────────────────────────────────────────────────────  │
│  • Execute the plan step by step                                │
│  • Write code according to specifications                       │
│  • Run tests and fix issues                                     │
│  • Refine implementation details                                │
│                                                                 │
│                          ▼                                      │
│                                                                 │
│  PHASE 3: REVIEW (High Reasoning)          ~10% of effort      │
│  ─────────────────────────────────────────────────────────────  │
│  • Validate implementation against plan                        │
│  • Identify deviations or issues                                │
│  • Recommend improvements                                       │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘

This workflow matches reasoning style to task requirements.

A Real Example

Task: Add JWT authentication to an existing API

Phase 1: Planning with High Reasoning

I prompted with High reasoning:

Analyze our FastAPI backend and create a detailed plan for adding
JWT authentication. Include architecture decisions, implementation
steps, risk assessment, and testing strategy.

High reasoning output:

## Architecture Decision
- JWT tokens with refresh token rotation
- Rate limiting on auth endpoints
- bcrypt password hashing (cost factor 12)
- Email verification flow

## Implementation Steps
1. Create User model with email/password fields
2. Implement password hashing middleware
3. Create /auth/register endpoint
4. Create /auth/login endpoint
5. Implement JWT generation/validation
6. Add refresh token logic
7. Create auth middleware for protected routes
8. Add rate limiting middleware
9. Write tests for each endpoint
10. Add email verification service

## Risk Assessment
- SQL injection: Use parameterized queries
- Brute force: Implement rate limiting
- Token theft: Short-lived access tokens + refresh rotation

## Testing Strategy
- Unit tests: Password hashing, token generation
- Integration tests: Auth flow end-to-end
- Security tests: Rate limiting, injection attempts

Comprehensive. Covered edge cases I wouldn’t have thought of. Now the implementation phase had a solid foundation.

Phase 2: Implementation with Medium Reasoning

For each step, I switched to Medium:

Implement step 3 from the plan: Create /auth/register endpoint.
Follow the plan specifications exactly.

Medium reasoning output:

from fastapi import APIRouter, HTTPException
from pydantic import BaseModel, EmailStr
from passlib.context import CryptContext

router = APIRouter()
pwd_context = CryptContext(schemes=["bcrypt"], deprecated="auto")

class UserRegister(BaseModel):
    email: EmailStr
    password: str

@router.post("/auth/register")
async def register(user: UserRegister):
    if len(user.password) < 12:
        raise HTTPException(
            status_code=400,
            detail="Password must be at least 12 characters"
        )

    hashed_password = pwd_context.hash(user.password)

    db_user = await create_user(
        email=user.email,
        password_hash=hashed_password
    )

    return {"message": "User registered successfully", "user_id": db_user.id}

Clean, focused, followed the plan. No over-engineering. No re-architecting. Just execution.

Phase 3: Review with High Reasoning

After implementation:

Review this implementation against the original plan. Identify
deviations, missing pieces, and potential improvements.

High reasoning caught what Medium missed:

## Adherence to Plan
✅ User model created with correct fields
✅ Password hashing uses bcrypt cost factor 12
✅ Rate limiting implemented
⚠️ Email verification not yet implemented (Step 10)

## Identified Issues
1. Missing password complexity validation
2. No duplicate email check before registration
3. Missing CSRF protection for cookies

## Recommendations
1. Add regex validation for password complexity
2. Add unique constraint check on email field
3. Implement CSRF token for state-changing requests

Token Usage Comparison

I measured this approach against my previous single-level workflow:

Approach	Planning	Implementation	Review	Total
High throughout	45K	180K	25K	250K
Medium throughout	15K	60K	10K	85K
High→Medium→High	40K	70K	20K	130K

The two-tier approach costs more than Medium-only but produces far better results. And it costs significantly less than High-only while avoiding over-engineering.

When to Deviate

This isn’t a rigid rule. There are exceptions:

Use High for implementation when:

Implementing a critical security feature
The plan has ambiguities that need interpretation
You’re in uncharted territory without clear specs

Use Medium for planning when:

The task is straightforward and well-understood
You have existing templates or patterns to follow
Speed matters more than comprehensive coverage

Use XHigh for planning when:

Large codebase with complex file relationships
Cross-module dependencies need tracking
High reasoning fails to capture all considerations

Common Anti-Patterns

Anti-Pattern 1: High for simple implementation

Task: Write a simple debounce function

High reasoning output:
- Generic type constraints
- Multiple overload signatures
- Extensive JSDoc comments
- A wrapper class "for extensibility"
- 47 lines for a 5-line function

Fix: Use Medium for straightforward implementation tasks.

Anti-Pattern 2: Medium for complex planning

Task: Plan authentication system architecture

Medium reasoning output:
- Basic JWT approach
- Missed refresh token rotation
- No rate limiting consideration
- Incomplete risk assessment

Fix: Always use High or XHigh for architectural decisions.

Anti-Pattern 3: Never switching levels

Some developers pick one level and stick with it regardless of task. This wastes either quality (Medium for everything) or efficiency (High for everything).

The Decision Matrix

After months of experimentation, here’s my reference:

Task Type	Level	Reason
Initial architecture	High/XHigh	Strategic decisions need depth
Requirements analysis	High	Complex reasoning for edge cases
Code implementation	Medium	Efficiency for well-defined tasks
Bug fixing	Medium-High	Depends on bug complexity
Code review	High	Need comprehensive analysis
Testing	Low-Medium	Routine execution
Refactoring	Medium	Following clear patterns
Documentation	Low-Medium	Straightforward writing

How to Start

Before your next project, explicitly define which reasoning level for each phase
Start with High for planning, generate a detailed plan
Switch to Medium for implementation, execute the plan
End with High for review, catch what Medium missed
Track results and adjust based on your specific use cases

The key insight: Different phases of work require different cognitive styles. Match the reasoning level to the task, not the project.

Final Words + More Resources

My intention with this article was to help others share my knowledge and experience. If you want to contact me, you can contact by email: Email me

Here are also the most important links from this article along with some further resources that will help you in this scope:

👨‍💻 Reddit Discussion on GPT-5.4 Reasoning Levels
👨‍💻 OpenAI GPT-5.4 Documentation

Oh, and if you found these resources useful, don’t forget to support me by starring the repo on GitHub!