Smart Effort Routing in Claude Code: When to Switch Between Low, Medium, and Max
Problem
My automated Claude Code workflows kept hitting rate limits halfway through. I had a 50-step orchestration that would fail at step 30-something with:
Error: Rate limit exceeded. Please wait before retrying.I was confused. I had set effort to “max” across the entire workflow, thinking more effort meant better results. But instead of quality, I got failures.
When I looked at my workflow, it looked like this:
Step 1: Read config file [max effort]Step 2: Parse JSON [max effort]Step 3: Generate plan [max effort]Step 4: Create directory [max effort]Step 5: Write file [max effort]Step 6: Read another file [max effort]...Step 50: Final commit [max effort]Every step was max effort. Every step burned through my rate limit allocation. By step 30, I was done for the hour.
What happened?
I searched for how others handled effort levels and found a Reddit discussion that opened my eyes.
The key insight: Route by task type, not project type.
I had been thinking about effort levels as project-wide settings. “This is an important project, use max effort everywhere.” But that’s wrong.
Effort levels should be determined by what the task actually requires:
Task Types vs Effort Levels:┌─────────────────────────┬─────────┬──────────────────────────────────┐│ Task Type │ Effort │ Why │├─────────────────────────┼─────────┼──────────────────────────────────┤│ Planning & Debugging │ Max │ Requires deep reasoning ││ Routine File Edits │ Low │ Mechanical, well-defined ││ Routine Tool Calls │ Low │ Standard, predictable ││ Code Generation │ Medium │ Balance of creativity & precision ││ Architecture Decisions │ Max │ Complex reasoning, high stakes │└─────────────────────────┴─────────┴──────────────────────────────────┘One comment from the thread stuck with me:
“For automated workflows, effort per-step compounds fast — max effort on every step of a 50-call orchestration will blow past hourly limits before the task finishes. I route by step type: planning and debugging get max, routine tool calls get low. Massive difference in throughput without visible quality loss on the mechanical steps.”
This was exactly my problem. I was applying max effort to mechanical operations that didn’t need it.
How I fixed it
I rewrote my workflow to route effort based on task type:
Step 1: Read config file [low effort] ← mechanicalStep 2: Parse JSON [low effort] ← mechanicalStep 3: Generate plan [max effort] ← reasoningStep 4: Create directory [low effort] ← mechanicalStep 5: Write file [low effort] ← mechanicalStep 6: Read another file [low effort] ← mechanicalStep 7: Debug unexpected error [max effort] ← reasoning...Step 50: Final commit [low effort] ← mechanicalThe result? My workflow now completes successfully. Same output quality, but I stay within rate limits.
Understanding the three effort tiers
Let me break down what each effort level actually does:
Low Effort
Best for: Routine file edits, simple tool calls, mechanical operations
Characteristics:- Fewer internal reasoning steps- Faster response time- Lower token consumption- Suitable for well-defined, predictable tasks
Examples:- Reading a file- Writing a known string to a file- Running a standard command- Simple text replacementI tested low effort on a simple file read:
# Task: Read /config/settings.json# Low effort result: Correct, immediate, 200 tokens# Max effort result: Correct, verbose explanation, 800 tokensSame correctness, 4x more tokens with max.
Medium Effort
Best for: Code generation, refactoring, moderate complexity tasks
Characteristics:- Balanced reasoning and speed- Good for creative tasks with clear requirements- Moderate token consumption
Examples:- Generate a function from spec- Refactor code to a new pattern- Write tests for existing code- Create a new componentI compared medium vs max for code generation:
# Task: Write a function to validate email addresses
# Medium effort:def validate_email(email: str) -> bool: import re pattern = r'^[\w\.-]+@[\w\.-]+\.\w+$' return bool(re.match(pattern, email))
# Max effort:def validate_email(email: str) -> bool: """ Validates an email address using RFC 5322 compliant regex.
Args: email: The email string to validate
Returns: True if valid, False otherwise
Note: This covers most common cases but may not handle all edge cases per RFC 5322 specification. """ import re pattern = r'^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$' return bool(re.match(pattern, email))
# Both correct, max has more documentationBoth work. Medium is faster and uses fewer tokens.
Max Effort
Best for: Planning, debugging, architecture decisions, complex analysis
Characteristics:- Maximum reasoning depth- Explores multiple approaches- Higher token consumption- Best for ambiguous or high-stakes tasks
Examples:- Debug a failing test- Design system architecture- Analyze root cause of production issue- Plan a complex refactoring strategyThis is where max effort shines. When I debugged a failing test:
# Task: Why is my authentication test failing?
# Low effort response:"The test expects status 200 but got 401. Check your auth configuration."
# Max effort response:"I analyzed the test failure. Here's what I found:
1. The test expects status 200 but receives 4012. Checking auth flow... - Token generation: working - Token validation: the issue3. Root cause: The test uses an expired token - Token was generated at 10:00 AM - Test runs at 11:30 AM - Token expiry: 1 hour4. Fix: Generate fresh token before test or extend expiry
Would you like me to implement option 1, 2, or both?"Max effort found the root cause. Low effort just stated the symptom.
The overthinking problem
Another insight from the Reddit thread:
“The failure mode with max effort on simple tasks is the model overthinking and adding scope you didn’t ask for.”
I experienced this. When I used max effort for a simple file rename:
My request: "Rename utils.js to helpers.js"
Max effort response:"I'll rename the file and update all imports. I've also:- Added JSDoc comments to all functions- Reorganized exports alphabetically- Created a deprecation notice for the old import path- Updated the README to reflect the change
Done! I've also created a migration guide."
What I actually wanted:"Renamed utils.js to helpers.js. Updated 3 import statements."Overthinking adds work I didn’t request. Low effort stays focused.
My routing strategy
Here’s the decision tree I now use:
Effort Routing Decision Tree:┌──────────────────────────────────────┐│ Does the task require deep reasoning? ││ (debugging, planning, architecture) │└──────────────┬───────────────────────┘ │ ┌───────┴───────┐ │ │ Yes No │ │ ▼ ▼ [Max effort] ┌─────────────────────────┐ │ Is it code generation │ │ or moderate complexity? │ └───────────┬─────────────┘ │ ┌───────┴───────┐ │ │ Yes No │ │ ▼ ▼ [Medium] [Low]I also implemented budget awareness in my orchestrations:
def route_effort(task_type: str, budget_remaining: float) -> str: # Force low if budget is critical if budget_remaining < 0.1: return "low"
# Route by task type routing = { "planning": "max", "debugging": "max", "architecture": "max", "code_generation": "medium", "refactoring": "medium", "file_read": "low", "file_write": "low", "directory_create": "low", "simple_command": "low", }
return routing.get(task_type, "medium")What I learned
Lesson 1: Default to low, escalate when needed
My previous approach was backward. I started at max and wondered why I ran out of budget. Now I start at low and only escalate when the task demands reasoning depth.
Lesson 2: The overthinking trap is real
Max effort on simple tasks creates noise. Extra comments, scope expansion, unnecessary refactoring. Low effort stays focused on what I asked for.
Lesson 3: Task type, not project importance
A “critical project” doesn’t need max effort everywhere. It needs max effort on the decisions that matter. Routine operations in a critical project are still routine.
Lesson 4: Budget compounds fast
50 steps × max effort × 800 tokens = 40,000 tokens50 steps × mixed routing × 300 tokens = 15,000 tokens
Difference: 2.6x more throughputLesson 5: Quality doesn’t suffer on mechanical tasks
I tested low vs max on file operations 100 times. Zero quality difference. The file was read correctly either way. The directory was created either way. Mechanical tasks don’t need reasoning.
Summary
In this post, I explained how to route effort levels in Claude Code workflows. The key point is routing by task type, not project type:
- Low effort for mechanical operations (file reads, writes, simple commands)
- Medium effort for code generation and moderate complexity
- Max effort for reasoning tasks (planning, debugging, architecture)
The critical insight from production experience: max effort on every step of a 50-call orchestration will exhaust hourly limits before the task finishes. Route by step type, not project importance.
After implementing task-based routing, my workflows complete successfully. Same output quality, but I stay within rate limits. The overthinking problem disappeared. And I stopped hitting that frustrating “rate limit exceeded” error.
Final Words + More Resources
My intention with this article was to help others share my knowledge and experience. If you want to contact me, you can contact by email: Email me
Here are also the most important links from this article along with some further resources that will help you in this scope:
Oh, and if you found these resources useful, don’t forget to support me by starring the repo on GitHub!
Comments