
Claude Code Effort Levels: Why Medium Often Beats Max (And When to Use Each)

Problem

When I first started using Claude Code, I assumed max effort would always produce better results. More thinking = better answers, right?

After burning through thousands of tokens watching Claude spiral on simple debugging tasks, I realized I was wrong. Here’s what happened when I ran a straightforward bug fix with max effort:

Claude is thinking deeply...
[2 minutes pass]
[More thinking...]
[Scope expands to "improve the entire codebase"]
[400 lines of unrelated changes proposed]
[Original bug still present]

I switched to medium effort and got the fix in 30 seconds with exactly one line changed.

What happened?

I tested Claude Code’s effort levels across dozens of tasks over several weeks. The pattern became clear: medium effort consistently outperformed max for routine development work.

Here’s what I found:

┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐
│ Low Effort      │ │ Medium Effort   │ │ Max Effort      │
│                 │ │                 │ │                 │
│ Fast, shallow,  │ │ Balanced,       │ │ Deep, prone to  │
│ good for        │ │ focused,        │ │ overthinking,   │
│ simple tasks    │ │ best for most   │ │ best for        │
│                 │ │ tasks           │ │ complex         │
│                 │ │                 │ │ reasoning       │
└─────────────────┘ └─────────────────┘ └─────────────────┘

The community evidence backs this up. Here’s what developers report:

"Opus 4.6 on extra effort is a nightmare for debugging - model spirals,
overthinks, goes down rabbit holes" (Score: 81)
"I found medium or even low effort much better. The big effort would
more often derail and spend much longer on simple things" (Score: 13)
"Only people that don't know what they're doing put Claude on max
thinking and leave it there" (Score: 5)
"Max effort don't always give better output I find medium is the most
optimal even though I am on max 200 plan." (Score: 2)

These aren’t isolated opinions. They reflect a consistent pattern: more thinking doesn’t always mean better output.

Why does max effort backfire?

The problem isn’t the model’s capability. It’s how extended reasoning affects output quality.

The Overthinking Trap

At max effort, three problems tend to show up:

1. Scope Creep

The model starts “helping” in ways you didn’t ask for:

Your request: "Fix the null pointer in UserService"
Max effort response: "I noticed your entire auth module could be
refactored. I've also added caching, improved error messages, and
created three new utility files..."
Result: 47 file changes, original bug hidden in the noise

2. Spiraling

The model gets stuck in reasoning loops:

Thinking about the problem...
Considering approach A...
Actually, approach B might be better...
Wait, let me reconsider approach A...
[5 minutes later]
Actually, there's a third option...

Each reconsideration burns tokens without progress.

3. Quality Drop

Here’s the counterintuitive part: more reasoning can lead to more basic mistakes, not fewer:

"I find that the model tends to get a bit dumber -- more
'scatter-brained'? -- and make more simple mistakes in code if
it reasons too hard."

When the model overthinks, it loses focus on the actual task and starts making errors it wouldn’t make with less deliberation.

When medium effort wins

I compared medium and max effort on these common tasks:

| Task Type                      | Medium Effort Result   | Max Effort Result                        |
|--------------------------------|------------------------|------------------------------------------|
| Simple bug fixes               | Clean, focused fix     | Scope expansion, unrelated changes       |
| CRUD operations                | Working code quickly   | Over-engineering, premature abstraction  |
| Code refactoring (clear goals) | Direct improvements    | Excessive restructuring                  |
| Feature implementation         | Focused implementation | Feature creep, unnecessary complexity    |
| Documentation updates          | Concise updates        | Verbose explanations, tangential content |

Medium effort produces focused output because the model has less room to second-guess itself or explore unnecessary paths.

Why Medium Works Better

Input: "Add input validation to this endpoint"
Medium effort:
- Identifies required fields
- Adds validation
- Returns clean code
Max effort:
- Identifies required fields
- Considers 5 validation libraries
- Debates Zod vs Yup vs custom solution
- Adds validation
- Adds error handling you didn't ask for
- Suggests rate limiting
- Returns 3x more code than needed

The medium effort result is what you wanted. The max effort result contains what you wanted plus noise.

When max effort actually helps

Max effort isn’t useless. It shines for specific problem types:

Complex Logical Reasoning

Tasks with many interacting constraints:

Problem: Design a caching strategy that:
- Handles 10K requests/second
- Supports eventual consistency
- Works across 3 regions
- Must handle network partitions
- Budget: $500/month
Max effort helps here because:
- Multiple valid approaches exist
- Trade-offs are non-obvious
- Wrong choice is expensive

Mathematical or Algorithmic Challenges

Problem: Optimize this query that joins 7 tables
and currently takes 45 seconds
Max effort helps because:
- Query optimization is genuinely complex
- Multiple index strategies to consider
- Need to reason about execution plans

Deeply Nested Debugging

When the bug hides in complex async flows:

Problem: Race condition in a distributed system
with 12 microservices
Max effort helps because:
- Need to trace through multiple systems
- Timing issues are subtle
- Wrong fix could cause worse bugs

Architectural Decisions

Problem: Choose between PostgreSQL, MongoDB, and
DynamoDB for a new service with specific requirements
Max effort helps because:
- Long-term consequences matter
- Trade-offs are numerous
- Documentation needed for team buy-in

A simple decision framework

┌────────────────────────────────────────────────────────────────┐
│                      Which Effort Level?                       │
└────────────────────────────────────────────────────────────────┘
                    ┌────────────────────────┐
                    │ Is this a routine task │
                    │ you've done before?    │
                    └────────────────────────┘
                        │                │
                       Yes               No
                        │                │
                        ▼                ▼
                 ┌─────────────┐  ┌────────────────────────┐
                 │   MEDIUM    │  │ Does it require deep   │
                 │   EFFORT    │  │ reasoning with many    │
                 └─────────────┘  │ constraints/tradeoffs? │
                                  └────────────────────────┘
                                      │              │
                                     Yes             No
                                      │              │
                                      ▼              ▼
                                ┌──────────┐   ┌──────────┐
                                │   MAX    │   │  MEDIUM  │
                                │  EFFORT  │   │  EFFORT  │
                                └──────────┘   └──────────┘

In practice, I use this rule of thumb:

Default: Medium effort
Switch to max effort when:
- The problem keeps you up at night
- Multiple solutions exist with unclear trade-offs
- A wrong answer has significant cost
- You're genuinely stuck after a medium-effort attempt
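The decision tree and the rule of thumb can be written down as a small function. This is only a sketch of how I read them: the `Task` fields and the `choose_effort` name are hypothetical, the flags are judgment calls you make yourself, and the `Effort` values are plain labels rather than any Claude Code setting. Folding the rule-of-thumb triggers into the "deep reasoning" branch is my own assumption.

```python
from dataclasses import dataclass
from enum import Enum


class Effort(Enum):
    # Plain labels for this sketch, not tied to any tool configuration
    MEDIUM = "medium"
    MAX = "max"


@dataclass
class Task:
    """Hypothetical task description; every field is a judgment call."""
    routine: bool                     # a routine task you've done before?
    many_constraints: bool = False    # deep reasoning with many constraints/trade-offs?
    costly_if_wrong: bool = False     # would a wrong answer have significant cost?
    stuck_after_medium: bool = False  # did a medium-effort attempt already fail?


def choose_effort(task: Task) -> Effort:
    """Default to medium; escalate only when the task calls for deep reasoning."""
    # Routine work stays on medium unless a medium attempt already got stuck
    if task.routine and not task.stuck_after_medium:
        return Effort.MEDIUM
    # Assumption: any rule-of-thumb trigger counts as needing deep reasoning
    needs_deep_reasoning = (
        task.many_constraints or task.costly_if_wrong or task.stuck_after_medium
    )
    return Effort.MAX if needs_deep_reasoning else Effort.MEDIUM


if __name__ == "__main__":
    print(choose_effort(Task(routine=True)).value)
    print(choose_effort(Task(routine=False, many_constraints=True)).value)
```

The point of the structure is that max effort is never the fall-through case: a task has to earn it through at least one explicit flag.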

Real examples from my workflow

Example 1: Simple Bug Fix (Medium Wins)

Task: Fix NullPointerException in UserValidator
Medium effort:
- Found the null check missing
- Added check
- 1 line changed, bug fixed
- Time: 30 seconds
Max effort (previous attempt):
- Analyzed entire validation chain
- Proposed new validation framework
- Added logging throughout
- 15 files changed
- Bug still present (hidden in diff)
- Time: 8 minutes

Example 2: Complex Query Optimization (Max Wins)

Task: Reduce report generation from 2 minutes to under 30 seconds
Max effort:
- Analyzed execution plan
- Identified missing indexes
- Considered denormalization
- Proposed materialized view
- Found caching opportunity
- Result: 18 seconds
- Time investment: worth it

Example 3: Feature Implementation (Medium Wins)

Task: Add password reset functionality
Medium effort:
- Standard flow: request email, token, reset
- Generated working code in 2 minutes
- Clean, maintainable
Max effort:
- Suggested rate limiting (not requested)
- Proposed multi-factor reset (not requested)
- Added audit logging (not requested)
- Discussed 3 email providers
- Generated 3x more code
- Team rejected as over-engineered

Token cost comparison

Beyond quality, there’s a real cost difference:

Typical medium effort task: ~5,000 tokens
Typical max effort task: ~25,000 tokens
Difference: 5x cost for often worse output
Monthly budget impact:
- Medium-only usage: $50/month
- Max-only usage: $250/month (with similar or worse results)

The cost compounds when max effort triggers spiraling or scope creep.
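To see what a mixed workflow might cost, the arithmetic above can be sketched in a few lines. The token counts and monthly figures are the rough numbers from this section; the function names are hypothetical, and the assumption that monthly cost scales linearly with the share of tasks run at max effort is mine.

```python
# Rough figures from the comparison above
MEDIUM_TOKENS = 5_000        # typical medium-effort task
MAX_TOKENS = 25_000          # typical max-effort task
MEDIUM_ONLY_MONTHLY = 50.0   # USD per month, medium-only usage
MAX_ONLY_MONTHLY = 250.0     # USD per month, max-only usage


def cost_ratio(medium_tokens: int = MEDIUM_TOKENS, max_tokens: int = MAX_TOKENS) -> float:
    """How many times more tokens a max-effort task uses than a medium one."""
    return max_tokens / medium_tokens


def blended_monthly_cost(max_share: float) -> float:
    """Estimate monthly cost when `max_share` of tasks run at max effort.

    Assumes cost is a linear mix of the two monthly figures.
    """
    if not 0.0 <= max_share <= 1.0:
        raise ValueError("max_share must be between 0 and 1")
    return (1.0 - max_share) * MEDIUM_ONLY_MONTHLY + max_share * MAX_ONLY_MONTHLY


if __name__ == "__main__":
    print(f"Token ratio: {cost_ratio():.0f}x")
    print(f"25% of tasks at max: ${blended_monthly_cost(0.25):.2f}/month")
```

Under these assumptions, running a quarter of your tasks at max effort already doubles the medium-only bill, which is why the default matters more than the occasional escalation.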

My current approach

After months of experimentation, here’s my workflow:

1. Start with medium effort
2. Review output
3. If output is insufficient, consider:
- Is the task genuinely complex? → Switch to max
- Is my prompt unclear? → Refine prompt, stay medium
- Did I provide enough context? → Add context, stay medium
4. Only use max effort when the problem genuinely requires it

This approach keeps output quality high while keeping token costs down.
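The review step of this workflow can be sketched as a small triage function. The names (`NextStep`, `triage`) are hypothetical and the inputs are your own judgments about the result. Checking the cheap fixes (prompt, context) before escalating is my assumption about ordering, and so is falling back to another prompt revision when nothing else applies.

```python
from enum import Enum


class NextStep(Enum):
    ACCEPT = "accept the medium-effort output"
    REFINE_PROMPT = "refine the prompt, stay on medium"
    ADD_CONTEXT = "add context, stay on medium"
    SWITCH_TO_MAX = "switch to max effort"


def triage(
    output_ok: bool,
    prompt_clear: bool,
    context_sufficient: bool,
    genuinely_complex: bool,
) -> NextStep:
    """Decide what to do after reviewing a medium-effort result."""
    if output_ok:
        return NextStep.ACCEPT
    # Cheap fixes first: an unclear prompt or missing context is not
    # something more reasoning will repair
    if not prompt_clear:
        return NextStep.REFINE_PROMPT
    if not context_sufficient:
        return NextStep.ADD_CONTEXT
    # Escalate only when the task itself is the hard part
    if genuinely_complex:
        return NextStep.SWITCH_TO_MAX
    # Assumed fallback: try another prompt revision before paying for max
    return NextStep.REFINE_PROMPT


if __name__ == "__main__":
    step = triage(
        output_ok=False,
        prompt_clear=True,
        context_sufficient=True,
        genuinely_complex=True,
    )
    print(step.value)
```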

Summary

In this post, I showed why medium effort consistently outperforms max for most Claude Code tasks. Max effort triggers overthinking that causes scope creep, reasoning spirals, and paradoxically worse output on simple problems.

The key point is that more thinking doesn’t automatically mean better results. Use medium effort as your default, and reserve max effort for genuinely complex problems that require deep reasoning with multiple constraints and non-obvious trade-offs.

The best developers know when to think hard and when to just code. The same principle applies to AI effort levels.

Final Words + More Resources

My intention with this article was to help others by sharing my knowledge and experience. If you want to get in touch, you can reach me by email.
