When Should I Use Claude Opus vs Sonnet in Claude Code? A Practical Guide
Problem
When I started using Claude Code, I had no idea which model to use. I’d pick one and stick with it for everything—planning, coding, debugging, refactoring. Sometimes I’d burn through my budget on simple tasks. Other times I’d get shallow architectural decisions because I picked the wrong model for complex problems.
Here’s what happened when I used Sonnet for everything:
Me: Design a microservices architecture for our e-commerce platform
Sonnet: Here's a simple microservices setup with 3 services:
- User service
- Product service
- Order service
[Missing: security considerations, data consistency patterns, failure handling, scalability analysis...]

The response was technically correct but dangerously incomplete. No discussion of trade-offs, no analysis of failure modes, no consideration of operational complexity.
Then I tried using Opus for everything:
Me: Add a validation function for email addresses
Opus:
[15 seconds of thinking]
[Detailed analysis of email validation approaches]
[Consideration of RFC 5322 compliance vs pragmatic validation]
[Discussion of internationalization concerns]
[Implementation with multiple test cases]
[Total cost: $0.12 for a 5-line function]

I spent $0.12 on a task Sonnet could handle for $0.01.
What I Found
I searched for best practices on model selection and found a Reddit discussion about Claude Code workflows. The key insight was this: use Opus for planning and Sonnet for execution.
One commenter explained it this way:
“Sonnet is faster and cheaper. Excellent for execution tasks where the path is already clear: writing boilerplate, refactoring based on a specific plan, implementing features where the architectural decisions are already made.”
Another added:
“Opus is slower and more expensive. Better for complex reasoning, planning, and tasks where you need Claude to think carefully about tradeoffs.”
This made sense. I was treating these models as interchangeable when they’re actually designed for different phases of work.
The Two-Phase Workflow
I restructured my approach into two distinct phases:
┌─────────────────────────────────────────────────────────────┐
│ Phase 1: Opus                                               │
│ ┌─────────────────────────────────────────────────────┐     │
│ │ Input: Ambiguous problem                            │     │
│ │ Output: Clear plan, architecture decisions          │     │
│ │ Cost: Higher per message                            │     │
│ │ Time: Slower, deeper reasoning                      │     │
│ └─────────────────────────────────────────────────────┘     │
│                            │                                │
│                            ▼                                │
│                 [Handoff via CLAUDE.md]                     │
│                            │                                │
│                            ▼                                │
│ ┌─────────────────────────────────────────────────────┐     │
│ │ Input: Clear plan from Opus                         │     │
│ │ Output: Implemented code                            │     │
│ │ Cost: ~1/3 of Opus                                  │     │
│ │ Time: Faster execution                              │     │
│ └─────────────────────────────────────────────────────┘     │
│                                                             │
└─────────────────────────────────────────────────────────────┘

Phase 1: Opus for Planning
I use Opus when I need:
- Architectural decisions: “Should we use a monolith or microservices?”
- Complex refactoring strategies: “How do we migrate from REST to GraphQL without breaking existing clients?”
- Debugging intricate interactions: “Why does this race condition only happen under load?”
- Trade-off analysis: “What’s the cost-benefit of using PostgreSQL vs MongoDB for our use case?”
Here’s a real example. I asked Opus to plan an authentication system:
Me: Plan an authentication system for our app. Consider security, UX, and scalability.
Opus: Let me analyze this systematically...
## Security Considerations
- Password storage: Argon2id with minimum 64MB memory, 3 iterations
- Session management: Short-lived JWTs (15 min) with refresh tokens
- Rate limiting: 5 attempts per 15 minutes, exponential backoff
- MFA: Support TOTP as baseline, WebAuthn for enhanced security

## UX Trade-offs
- Password requirements: 12+ chars vs user friction
- Session persistence: "Remember me" extends refresh token to 30 days
- Social login: Reduces friction but adds dependency risk

## Scalability Implications
- Token validation: Stateless JWTs vs database lookups
- Session storage: Redis cluster for horizontal scaling
- Rate limiting: Distributed counter in Redis

## Recommended Architecture
[Diagram showing auth flow with specific technology choices]

## Implementation Phases
1. Core auth (Week 1): Password hashing, JWT generation
2. Security hardening (Week 2): Rate limiting, MFA
3. Social login (Week 3): OAuth providers

This took about 30 seconds and cost me ~$0.40. But it saved me hours of backtracking later.
Phase 2: Sonnet for Execution
With a clear plan from Opus, I switch to Sonnet for implementation:
Me: Implement the core authentication system according to the plan in AUTH_PLAN.md
Sonnet: I'll implement the core authentication with:
- Argon2id password hashing
- JWT generation and validation
- Refresh token rotation

[Creates files:]
- src/auth/password.ts (Argon2id implementation)
- src/auth/jwt.ts (JWT utilities)
- src/auth/session.ts (Session management)
- tests/auth.test.ts (Unit tests)

All implementations follow the security parameters from the plan.

Sonnet executed the plan in half the time at 1/3 the cost.
The Handoff
The critical piece I was missing: how to pass context from Opus to Sonnet.
I use a CLAUDE.md file in my project root:
# Project Constraints
## Authentication
- See AUTH_PLAN.md for architecture decisions
- Use Argon2id for password hashing (not bcrypt)
- JWTs expire after 15 minutes
- Refresh tokens rotate on each use

## Code Style
- Functional components with hooks
- Tests required for all new modules
- No console.log in production code

As one Reddit commenter noted:
“Your CLAUDE.md ensures both models operate under the same constraints, so the handoff is clean.”
This file persists across sessions and models. When I start a Sonnet session after Opus planning, Sonnet reads this file and follows the established constraints.
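The handoff step can be scripted so that decisions from a planning session always land in the file. The following is a minimal sketch, not part of Claude Code itself: `record_decisions` is a hypothetical helper name, and the section layout simply mirrors the CLAUDE.md example above.

```python
from pathlib import Path


def record_decisions(claude_md: Path, heading: str, decisions: list[str]) -> bool:
    """Append a '## <heading>' section with one bullet per decision.

    Hypothetical helper for illustration. Returns False and leaves the file
    untouched when a section with the same heading already exists, so
    running it twice does not duplicate constraints.
    """
    existing = claude_md.read_text(encoding="utf-8") if claude_md.exists() else ""
    section_header = f"## {heading}"

    # Skip if the section is already present (exact heading match).
    if any(line.strip() == section_header for line in existing.splitlines()):
        return False

    # Separate the new section from earlier content with one blank line.
    prefix = existing.rstrip("\n") + "\n\n" if existing.strip() else ""
    bullets = "\n".join(f"- {decision}" for decision in decisions)
    claude_md.write_text(f"{prefix}{section_header}\n{bullets}\n", encoding="utf-8")
    return True
```

Calling `record_decisions(Path("CLAUDE.md"), "Authentication", ["JWTs expire after 15 minutes"])` at the end of a planning session would add the section once; a second call with the same heading is a no-op.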
Decision Matrix
I created a quick reference for choosing models:
| Task Type | Model | Why |
|---|---|---|
| New feature architecture | Opus | Requires trade-off analysis |
| Implementing planned feature | Sonnet | Clear execution path |
| Complex bug investigation | Opus | Deep reasoning needed |
| Simple bug fix (known cause) | Sonnet | Direct implementation |
| Large-scale refactoring | Opus -> Sonnet | Plan first, execute second |
| Adding tests | Sonnet | Routine implementation |
| System design | Opus | Architectural reasoning |
| Documentation updates | Sonnet | Clear execution |
| Debugging race conditions | Opus | Complex interaction analysis |
| CSS styling changes | Sonnet | Straightforward implementation |
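The matrix boils down to two questions: does the task still need trade-offs weighed, and does it include implementation work? A small sketch of that rule follows. The `Task` fields and the `choose_model` function are illustrative names introduced here, not anything Claude Code provides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    description: str
    needs_tradeoff_analysis: bool        # several valid approaches to weigh
    has_written_plan: bool = False       # e.g. a plan file already exists
    includes_implementation: bool = False


def choose_model(task: Task) -> str:
    """Return 'opus', 'sonnet', or 'opus -> sonnet' following the matrix."""
    needs_planning = task.needs_tradeoff_analysis and not task.has_written_plan
    if not needs_planning:
        return "sonnet"  # the path is clear: execute directly
    # Planning comes first; hand off to Sonnet if there is code to write.
    return "opus -> sonnet" if task.includes_implementation else "opus"


print(choose_model(Task("New feature architecture", True)))                       # opus
print(choose_model(Task("Large-scale refactoring", True, False, True)))           # opus -> sonnet
print(choose_model(Task("Implementing planned feature", True, True, True)))       # sonnet
print(choose_model(Task("CSS styling changes", False, False, True)))              # sonnet
```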
Common Mistakes
I made all of these mistakes before I understood the workflow:
Mistake 1: Using Opus for Everything
Me (Opus): Fix the typo in README.md
Opus:
[5 seconds of analysis about documentation standards]
[Consideration of typo impact on user comprehension]
[Implementation of fix]
[Cost: $0.08 for a 2-second task]

Fix: Reserve Opus for tasks requiring trade-off analysis.
Mistake 2: Using Sonnet for Architecture
Me (Sonnet): Design our API rate limiting strategy
Sonnet: Use Redis for rate limiting. Set 100 requests per minute.
[Missing: What about distributed systems? What about different rate limits per endpoint? What about burst handling?]

Fix: Always use Opus for architectural decisions with multiple valid approaches.
Mistake 3: Missing Handoff Documentation
Opus: We should use optimistic locking for concurrency...
[No documentation created]
[Later, in Sonnet session]
Sonnet: I'll implement pessimistic locking because it's simpler...
[Architectural decision from Opus ignored]

Fix: Document decisions in CLAUDE.md or a plan file.
Mistake 4: No Clear Decision Criteria
I used to pick models on gut feeling, with no consistent rule. This wasted mental energy and led to inconsistent results.
Fix: Use the decision matrix above. If the task requires weighing trade-offs, use Opus. If the path is clear, use Sonnet.
Real Workflow Example
Here’s how I approach a feature now:
Step 1: Opus Planning (cost ~$0.30)
Me (Opus): We need to add real-time notifications. Plan the architecture.
Opus:
[Analyzes requirements]
[Compares WebSocket vs Server-Sent Events vs Long Polling]
[Considers scalability implications]
[Recommends SSE with Redis pub/sub]
[Creates implementation phases]

Output: NOTIFICATIONS_PLAN.md

Step 2: Sonnet Implementation (cost ~$0.10)
Me (Sonnet): Implement the notification system according to NOTIFICATIONS_PLAN.md
Sonnet:
[Implements SSE server]
[Creates Redis pub/sub layer]
[Builds client-side hooks]
[Writes tests]

Output: Working notification system in 15 minutes

Total cost: ~$0.40
Previous cost (Opus only): ~$1.20
That’s one third of the cost, a saving of about 67% on a single feature.
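The two steps can also be expressed as command lines. The sketch below only builds and prints the commands and does not run anything. It assumes the Claude Code CLI accepts a `--model` flag with the aliases `opus` and `sonnet` and a `-p` flag for a one-shot, non-interactive prompt; check `claude --help` for your installed version before relying on either. `MODEL_FOR_PHASE` and `build_command` are names made up for this example.

```python
import shlex

# Assumed model aliases; confirm them against your Claude Code version.
MODEL_FOR_PHASE = {"plan": "opus", "execute": "sonnet"}


def build_command(phase: str, prompt: str) -> list[str]:
    """Build an argv list for one phase of the plan/execute workflow."""
    model = MODEL_FOR_PHASE[phase]  # raises KeyError for an unknown phase
    return ["claude", "--model", model, "-p", prompt]


plan_cmd = build_command(
    "plan",
    "We need to add real-time notifications. Plan the architecture.",
)
exec_cmd = build_command(
    "execute",
    "Implement the notification system according to NOTIFICATIONS_PLAN.md",
)

# shlex.join quotes the prompt so the line can be pasted into a shell.
print(shlex.join(plan_cmd))
print(shlex.join(exec_cmd))
```

Keeping the phase-to-model mapping in one dictionary means the choice is made once, by rule, rather than per prompt.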
Cost Comparison
Based on actual usage patterns:
Opus-Only Approach:
- All tasks at Opus pricing
- Average session: $0.50-$2.00
- Monthly cost: $50-$200

Hybrid Approach:
- Planning (20% of time): Opus
- Execution (80% of time): Sonnet
- Average session: $0.15-$0.60
- Monthly cost: $15-$60

Savings: ~70%

The key insight from the Reddit thread:
“Sonnet delivers 90% of Opus capability at 1/3 the cost for execution tasks.”
The 10% gap matters for planning and architecture. It doesn’t matter for implementing a planned feature.
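The savings figures can be checked with a few lines of arithmetic. This sketch uses only the numbers quoted above; the `blended_cost` model is an added simplification that assumes cost scales linearly with the share of time spent on each model and that Sonnet costs exactly 1/3 of Opus.

```python
def savings(before: float, after: float) -> float:
    """Fraction of cost saved when spend drops from `before` to `after`."""
    return 1 - after / before


def blended_cost(opus_share: float, sonnet_price_ratio: float) -> float:
    """Hybrid cost relative to Opus-only, under a linear time-share assumption."""
    return opus_share + (1 - opus_share) * sonnet_price_ratio


# Reported ranges: sessions $0.50-$2.00 -> $0.15-$0.60, monthly $50-$200 -> $15-$60
print(f"{savings(0.50, 0.15):.0%}")  # 70%
print(f"{savings(200, 60):.0%}")     # 70%

# Notification feature: ~$1.20 Opus-only vs ~$0.40 hybrid
print(f"{savings(1.20, 0.40):.0%}")  # 67%

# Naive estimate from the 20/80 split and a 1/3 price ratio
print(f"{1 - blended_cost(0.20, 1 / 3):.0%}")  # 53%
```

The naive estimate comes out near 53%, lower than the ~70% reported, so the 70% figure reflects the observed session costs rather than following from the 20/80 split alone.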
Summary
In this post, I explained how to choose between Claude Opus and Sonnet in Claude Code.
The key principle: Opus plans, Sonnet executes. Use Opus for tasks requiring trade-off analysis, architectural reasoning, and complex debugging. Use Sonnet for implementing clear plans, writing tests, and straightforward coding tasks.
The workflow is:
- Start complex features with Opus planning
- Document decisions in CLAUDE.md or plan files
- Switch to Sonnet for implementation
- Reference the plan to maintain consistency
By the Reddit commenter’s estimate, this approach delivers about 90% of Opus capability at roughly 1/3 the cost for execution tasks. In my case, I went from spending $150/month on AI assistance to under $50/month without sacrificing quality on complex architectural decisions.