How to Use Haiku, Sonnet, and Opus Together for AI Coding in 2026
My AI coding costs were out of control. Every time I ran my coding agent, I burned through expensive tokens. I was using Opus for everything—simple file renames, cron job scripts, even basic refactoring. There had to be a better way.
After months of trial and error, I discovered a tiered model strategy that reduced my costs by 70-90% while maintaining quality. The key insight? Match model capability to task complexity.
The Cost Problem
Let me show you what I was doing wrong:
Task: Rename a variable across 5 files
Model: Opus
Cost: $0.15

Task: Write a cron job script
Model: Opus
Cost: $0.08

Task: Fix a complex bug in auth flow
Model: Opus
Cost: $0.45

Total: $0.68 per day
Monthly: ~$20 for routine work

The bug fix justified Opus. But renaming variables? Writing cron scripts? Overkill.
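As a quick sanity check on those numbers, here is a small Python tally. It is only a sketch: the costs are the three figures quoted above, while the `routine` flag and the 30-day month are my own assumptions for illustration.

```python
# Costs in cents, copied from the three example tasks above.
# The "routine" flag is an illustrative label for tasks that did not need Opus.
tasks = [
    {"name": "Rename a variable across 5 files", "cents": 15, "routine": True},
    {"name": "Write a cron job script", "cents": 8, "routine": True},
    {"name": "Fix a complex bug in auth flow", "cents": 45, "routine": False},
]

DAYS_PER_MONTH = 30  # assumption: the agent runs every day of the month

# Working in whole cents avoids floating-point rounding when summing.
daily_cents = sum(t["cents"] for t in tasks)
routine_cents = sum(t["cents"] for t in tasks if t["routine"])
monthly_dollars = daily_cents * DAYS_PER_MONTH / 100

print(f"Daily total: ${daily_cents / 100:.2f}")
print(f"Monthly total: ${monthly_dollars:.2f}")
print(f"Spent on routine tasks: ${routine_cents / 100:.2f} per day")
```

Tracking costs per task like this is what made the pattern visible: about a third of the daily spend went to work that never needed the top tier.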
The Tiered Strategy
Here’s what I learned from the community and my own experiments:
+----------+---------------------------+---------------------------+
| Model    | Best For                  | Avoid For                 |
+==========+===========================+===========================+
| Haiku    | - Default operations      | - Complex architecture    |
|          | - Cron jobs               | - Bug fixing              |
|          | - Simple refactors        | - Creation tasks          |
|          | - File operations         | - Multi-file reasoning    |
+----------+---------------------------+---------------------------+
| Sonnet   | - Investigation           | - Complex creation        |
|          | - Planning                | - Deep debugging          |
|          | - Medium complexity       | - Architecture decisions  |
+----------+---------------------------+---------------------------+
| Opus     | - Creating new features   | - Simple tasks            |
|          | - Fixing bugs             | - Cron jobs               |
|          | - Complex reasoning       | - File operations         |
+----------+---------------------------+---------------------------+
Why This Works: The Staff Engineer Analogy
The best way to think about this came from a Reddit comment that changed my perspective:
“Opus acts as the staff engineer that guides the junior engineers (cheaper models), oversees and tests their work.”
This was a breakthrough moment. I wasn’t supposed to use Opus for everything. Opus should:
- Guide the cheaper models on what to do
- Review their work
- Handle only the complex problems
Here’s how I restructured my workflow:
Phase 1: Planning
Model: Sonnet
Task: "Investigate the auth bug and create a plan"
Cost: $0.05

Phase 2: Execution
Model: Haiku
Task: "Follow the plan: rename variables, update imports"
Cost: $0.01

Phase 3: Review
Model: Opus
Task: "Review the changes, test auth flow, fix any issues"
Cost: $0.10

Total: $0.16 (vs $0.68 old approach)
What Happens When You Use the Wrong Model
I learned this the hard way through several failed experiments.
Using Haiku for Complex Tasks
When I tried using Haiku for everything, two problems emerged:
- Over-engineering: Haiku adds unnecessary abstractions
- Poor reasoning: Multi-file changes often miss dependencies
# What I asked:
"Add error handling to the database module"

# What Haiku created:
- Created 5 new wrapper classes
- Added a custom error type hierarchy
- Implemented a retry system I didn't ask for
- Missed the actual error cases I needed

# Result: More code, more problems, less value
Using Opus for Simple Tasks
This was just wasteful. Opus did the job well, but at 10x the cost:
Task: Update a config file
Haiku cost: $0.002
Opus cost: $0.02
Difference: 10x for the same result
Using Sonnet for Creation
Sonnet is great for investigation, but when I asked it to create a new feature:
# What I asked:
"Create a new payment integration"

# What Sonnet did:
- Good planning and file structure
- Struggled with edge cases
- Missed important error scenarios
- Required multiple iterations

# Same task with Opus:
- Handled edge cases upfront
- Considered security implications
- Got it right in fewer iterations
The Anti-Patterns I Avoid Now
Anti-Pattern 1: “Just Use Opus for Everything”
This seems safe, but it's expensive and can actually hurt quality, because Opus sometimes overthinks simple tasks.
Anti-Pattern 2: “Cheapest Model Always”
Haiku for everything leads to over-engineering and poor results on complex tasks. You’ll spend more time fixing its output.
Anti-Pattern 3: “Skip the Planning Phase”
Rushing to execution without Sonnet planning leads to:
- More iterations
- Higher costs overall
- Poor architecture decisions
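To make the plan-first idea concrete, the three phases described earlier (plan with Sonnet, execute with Haiku, review with Opus) could be wired together roughly like this. This is a minimal sketch, not my actual agent setup: `run_tiered_pipeline`, `PipelineResult`, and `fake_call_model` are hypothetical names, the tier strings are plain labels rather than real API model IDs, and the model call is stubbed out so the example runs offline.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical tier labels, not real API model IDs.
PLANNER, EXECUTOR, REVIEWER = "sonnet", "haiku", "opus"

# A model call is any function taking (tier, prompt) and returning text.
ModelCall = Callable[[str, str], str]


@dataclass
class PipelineResult:
    plan: str
    changes: str
    review: str


def run_tiered_pipeline(task: str, call_model: ModelCall) -> PipelineResult:
    """Plan with the mid tier, execute with the cheap tier, review with the top tier."""
    plan = call_model(PLANNER, f"Investigate and write a step-by-step plan: {task}")
    changes = call_model(EXECUTOR, f"Follow this plan exactly:\n{plan}")
    review = call_model(REVIEWER, f"Review these changes and fix any issues:\n{changes}")
    return PipelineResult(plan, changes, review)


def fake_call_model(tier: str, prompt: str) -> str:
    """Stand-in for a real API client so the sketch runs without network access."""
    return f"[{tier}] handled: {prompt.splitlines()[0]}"


result = run_tiered_pipeline("auth bug in login flow", fake_call_model)
print(result.review)
```

Passing the model call in as a function keeps the phase order fixed while leaving the choice of client open, so skipping the planning step becomes a deliberate change rather than an accident.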
Memory Optimization: Extending the Strategy
A bonus tip from the community: optimize your context window too.
+------------------+-------------------------------+-----------------+
| Memory Type      | Solution                      | Benefit         |
+==================+===============================+=================+
| Long-term        | QMD (Query Model Directory)   | Persistent      |
|                  |                               | knowledge       |
+------------------+-------------------------------+-----------------+
| Session/Short    | Compaction or                 | Token savings   |
|                  | lossless-clawd                | within context  |
+------------------+-------------------------------+-----------------+
These tools compress your context, letting you run longer sessions with fewer tokens.
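To illustrate the general idea of compaction, here is a generic sketch. It does not show how QMD or lossless-clawd work internally; `compact_history` and `naive_summary` are hypothetical helpers, and a real summarizer would presumably call a cheap model instead of returning a placeholder string.

```python
from typing import Callable


def compact_history(
    messages: list[str],
    summarize: Callable[[list[str]], str],
    keep_recent: int = 4,
) -> list[str]:
    """Replace all but the last `keep_recent` messages with one summary message."""
    if keep_recent < 1:
        raise ValueError("keep_recent must be at least 1")
    if len(messages) <= keep_recent:
        return list(messages)  # nothing old enough to compact
    older, recent = messages[:-keep_recent], messages[-keep_recent:]
    return [summarize(older)] + recent


def naive_summary(older: list[str]) -> str:
    """Placeholder summarizer; stands in for a call to a cheap model."""
    return f"(summary of {len(older)} earlier messages)"


history = [f"message {i}" for i in range(10)]
compacted = compact_history(history, naive_summary)
print(compacted)
```

The trade-off is the usual one: the recent turns stay verbatim, while older detail survives only as well as the summarizer captures it.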
My Current Workflow
Here’s how I approach different tasks now:
            +-------------------+
            | What's the task?  |
            +-------------------+
                      |
          +-----------+-----------+
          |                       |
    +-----v-----+            +----v----+
    | Routine?  |            | Complex?|
    +-----------+            +---------+
          |                       |
    +-----v-----+           +-----v----+
    |   Haiku   |           |  Sonnet  |
    +-----------+           +----------+
                                  |
                          +-------v-------+
                          | Needs deep    |
                          | reasoning?    |
                          +---------------+
                                  |
                          +-------v-------+
                          |     Opus      |
                          +---------------+
Routine Tasks → Haiku
- File renames
- Config updates
- Cron jobs
- Simple refactors
Investigation → Sonnet
- Understanding codebases
- Planning features
- Medium complexity debugging
Creation & Bugs → Opus
- New feature implementation
- Complex bug fixing
- Architecture decisions
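The three lists above amount to a lookup table, which could be sketched in Python as follows. Treat this as an illustration under assumptions: `Tier`, `ROUTES`, `pick_tier`, and the task-kind strings are hypothetical names I made up for the example, and the tier values are labels rather than real API model IDs. Falling back to Haiku for unknown task kinds is also an assumption, following the "Haiku as default" rule.

```python
from enum import Enum


class Tier(Enum):
    HAIKU = "haiku"
    SONNET = "sonnet"
    OPUS = "opus"


# Task categories taken from the lists above; the string keys are hypothetical labels.
ROUTES = {
    "file_rename": Tier.HAIKU,
    "config_update": Tier.HAIKU,
    "cron_job": Tier.HAIKU,
    "simple_refactor": Tier.HAIKU,
    "codebase_investigation": Tier.SONNET,
    "feature_planning": Tier.SONNET,
    "medium_debugging": Tier.SONNET,
    "new_feature": Tier.OPUS,
    "complex_bug": Tier.OPUS,
    "architecture_decision": Tier.OPUS,
}


def pick_tier(task_kind: str, default: Tier = Tier.HAIKU) -> Tier:
    """Return the tier for a task kind, falling back to `default` for unknown kinds."""
    return ROUTES.get(task_kind, default)


for kind in ("cron_job", "feature_planning", "complex_bug", "something_else"):
    print(f"{kind} -> {pick_tier(kind).value}")
```

A plain dictionary is enough here because the routing rule is a fixed mapping; anything smarter, such as asking a model to classify the task first, would add cost to the very step that is supposed to save it.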
The Results
After three months of this approach:
+-------------------+---------------+---------------+
| Metric            | Before        | After         |
+===================+===============+===============+
| Daily cost        | ~$0.68        | ~$0.15        |
| Monthly cost      | ~$20          | ~$4.50        |
| Iteration count   | Higher        | Lower         |
| Code quality      | Variable      | Consistent    |
+-------------------+---------------+---------------+
Savings: 77% cost reduction with better quality.
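The savings figure follows from the before and after costs in the table. A short check in Python, using only those approximate numbers (`percent_reduction` is just a helper name for this example):

```python
def percent_reduction(before: float, after: float) -> float:
    """Percentage saved when a cost drops from `before` to `after`."""
    return (before - after) / before * 100


# Approximate figures from the results table above.
daily = percent_reduction(0.68, 0.15)
monthly = percent_reduction(20.0, 4.50)

print(f"Daily: {daily:.1f}% saved, monthly: {monthly:.1f}% saved")
```

Both rows land between 77 and 78 percent. Since the inputs are rounded estimates, the result is best read as "roughly three quarters" rather than a precise measurement.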
Key Takeaways
- Use Haiku as default. It’s fast and cheap for routine work.
- Use Sonnet for planning. It’s great at investigation and structuring work.
- Reserve Opus for complexity. Creation, bugs, and deep reasoning.
- Think staff engineer pattern. Opus guides and reviews, doesn’t do everything.
- Optimize memory too. QMD and compaction save tokens on long sessions.
The tiered strategy isn’t about being cheap. It’s about being smart. You get better results by using each model for what it does best.
Final Words + More Resources
My intention with this article was to share my knowledge and experience to help others. If you want to contact me, you can reach me by email.