How to Use Haiku, Sonnet, and Opus Together for AI Coding in 2026

My AI coding costs were out of control. Every time I ran my coding agent, I burned through expensive tokens. I was using Opus for everything—simple file renames, cron job scripts, even basic refactoring. There had to be a better way.

After months of trial and error, I discovered a tiered model strategy that reduced my costs by 70-90% while maintaining quality. The key insight? Match model capability to task complexity.

The Cost Problem

Let me show you what I was doing wrong:

My old workflow (expensive)
Task: Rename a variable across 5 files
Model: Opus
Cost: $0.15

Task: Write a cron job script
Model: Opus
Cost: $0.08

Task: Fix a complex bug in auth flow
Model: Opus
Cost: $0.45

Total: $0.68 per day
Monthly: ~$20 for routine work

The bug fix justified Opus. But renaming variables? Writing cron scripts? Overkill.
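To make the arithmetic explicit, here is a minimal sketch that recomputes the totals from the per-task figures in the listing. The 30-day month and the variable names are my own assumptions for illustration; only the three dollar amounts come from the listing.

```python
# Per-task costs in USD, copied from the old-workflow listing.
old_workflow = {
    "rename variable across 5 files": 0.15,
    "write cron job script": 0.08,
    "fix complex auth bug": 0.45,
}

DAYS_PER_MONTH = 30  # assumption: one such batch of tasks every day

daily_total = round(sum(old_workflow.values()), 2)
monthly_total = round(daily_total * DAYS_PER_MONTH, 2)

# Share of the daily spend that went to the two tasks that did not need Opus.
routine_cost = (
    old_workflow["rename variable across 5 files"]
    + old_workflow["write cron job script"]
)
routine_share = routine_cost / daily_total

print(f"daily: ${daily_total:.2f}, monthly: ${monthly_total:.2f}")
print(f"routine share of spend: {routine_share:.0%}")
```

With these numbers, about a third of the daily spend went to the rename and the cron script, which is the part a cheaper model could have handled.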

The Tiered Strategy

Here’s what I learned from the community and my own experiments:

Model tiers by task type
+--------+-------------------------+--------------------------+
| Model  | Best For                | Avoid For                |
+========+=========================+==========================+
| Haiku  | - Default operations    | - Complex architecture   |
|        | - Cron jobs             | - Bug fixing             |
|        | - Simple refactors      | - Creation tasks         |
|        | - File operations       | - Multi-file reasoning   |
+--------+-------------------------+--------------------------+
| Sonnet | - Investigation         | - Complex creation       |
|        | - Planning              | - Deep debugging         |
|        | - Medium complexity     | - Architecture decisions |
+--------+-------------------------+--------------------------+
| Opus   | - Creating new features | - Simple tasks           |
|        | - Fixing bugs           | - Cron jobs              |
|        | - Complex reasoning     | - File operations        |
+--------+-------------------------+--------------------------+
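One way to put the table to work is a plain lookup from task category to tier. The sketch below is only an illustration: the category keys, the `Tier` enum, and `pick_tier` are hypothetical names I made up, and the tier values are labels, not real API model identifiers.

```python
from enum import Enum


class Tier(Enum):
    HAIKU = "haiku"
    SONNET = "sonnet"
    OPUS = "opus"


# Hypothetical task categories, taken from the "Best For" column of the table.
ROUTES = {
    "cron_job": Tier.HAIKU,
    "simple_refactor": Tier.HAIKU,
    "file_operation": Tier.HAIKU,
    "investigation": Tier.SONNET,
    "planning": Tier.SONNET,
    "new_feature": Tier.OPUS,
    "bug_fix": Tier.OPUS,
    "complex_reasoning": Tier.OPUS,
}


def pick_tier(task_type: str) -> Tier:
    """Return the tier for a task category.

    Unknown categories fall back to Haiku, mirroring the table's
    "Default operations" row. That fallback is an assumption; a more
    cautious setup might escalate unknown tasks to Sonnet instead.
    """
    return ROUTES.get(task_type, Tier.HAIKU)


print(pick_tier("cron_job").value)
print(pick_tier("bug_fix").value)
```

Keeping the mapping in one dictionary makes it easy to adjust when a category turns out to need a stronger or cheaper model.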

Why This Works: The Staff Engineer Analogy

The best way to think about this came from a Reddit comment that changed my perspective:

“Opus acts as the staff engineer that guides the junior engineers (cheaper models), oversees and tests their work.”

This was a breakthrough moment. I wasn’t supposed to use Opus for everything. Opus should:

  1. Guide the cheaper models on what to do
  2. Review their work
  3. Handle only the complex problems

Here’s how I restructured my workflow:

New workflow with staff engineer pattern
Phase 1: Planning
Model: Sonnet
Task: "Investigate the auth bug and create a plan"
Cost: $0.05

Phase 2: Execution
Model: Haiku
Task: "Follow the plan: rename variables, update imports"
Cost: $0.01

Phase 3: Review
Model: Opus
Task: "Review the changes, test auth flow, fix any issues"
Cost: $0.10

Total: $0.16 (vs $0.68 old approach)
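The three phases can be sketched as a small pipeline where each phase's output feeds the next prompt. This is a rough illustration under stated assumptions: `Phase`, `run_pipeline`, and `fake_call` are hypothetical names, and `fake_call` is a stand-in that only labels which tier handled a prompt. A real setup would replace it with whatever client your coding agent uses.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Phase:
    name: str
    tier: str    # tier label only, not a real API model ID
    prompt: str


# Prompts copied from the three-phase listing.
PHASES = [
    Phase("planning", "sonnet", "Investigate the auth bug and create a plan"),
    Phase("execution", "haiku", "Follow the plan: rename variables, update imports"),
    Phase("review", "opus", "Review the changes, test auth flow, fix any issues"),
]


def run_pipeline(call_model: Callable[[str, str], str]) -> list[str]:
    """Run the phases in order, passing each output into the next prompt."""
    outputs = []
    context = ""
    for phase in PHASES:
        prompt = phase.prompt
        if context:
            prompt = f"{phase.prompt}\n\nPrevious phase output:\n{context}"
        context = call_model(phase.tier, prompt)
        outputs.append(context)
    return outputs


def fake_call(tier: str, prompt: str) -> str:
    # Stand-in for a real model call: echoes the tier and the first prompt line.
    return f"[{tier}] handled: {prompt.splitlines()[0]}"


results = run_pipeline(fake_call)
for line in results:
    print(line)
```

Passing the model call in as a function keeps the phase order testable without spending any tokens.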

What Happens When You Use the Wrong Model

I learned this the hard way through several failed experiments.

Using Haiku for Complex Tasks

When I tried using Haiku for everything, two problems emerged:

  1. Over-engineering: Haiku adds unnecessary abstractions
  2. Poor reasoning: Multi-file changes often miss dependencies

Haiku over-engineering example
# What I asked:
"Add error handling to the database module"
# What Haiku created:
- Created 5 new wrapper classes
- Added a custom error type hierarchy
- Implemented a retry system I didn't ask for
- Missed the actual error cases I needed
# Result: More code, more problems, less value

Using Opus for Simple Tasks

This was just wasteful. Opus did the job well, but at 10x the cost:

Opus for simple tasks (expensive)
Task: Update a config file
Haiku cost: $0.002
Opus cost: $0.02
Difference: 10x for the same result

Using Sonnet for Creation

Sonnet is great for investigation, but when I asked it to create a new feature:

Sonnet creation limitations
# What I asked:
"Create a new payment integration"
# What Sonnet did:
- Good planning and file structure
- Struggled with edge cases
- Missed important error scenarios
- Required multiple iterations
# Same task with Opus:
- Handled edge cases upfront
- Considered security implications
- Got it right in fewer iterations

The Anti-Patterns I Avoid Now

Anti-Pattern 1: “Just Use Opus for Everything”

This seems safe, but it’s expensive and can even hurt quality: Opus sometimes overthinks simple tasks.

Anti-Pattern 2: “Cheapest Model Always”

Haiku for everything leads to over-engineering and poor results on complex tasks. You’ll spend more time fixing its output.

Anti-Pattern 3: “Skip the Planning Phase”

Rushing to execution without Sonnet planning leads to:

  • More iterations
  • Higher costs overall
  • Poor architecture decisions

Memory Optimization: Extending the Strategy

A bonus tip from the community: optimize your context window too.

Memory optimization layers
+---------------+-----------------------------+----------------+
| Memory Type   | Solution                    | Benefit        |
+===============+=============================+================+
| Long-term     | QMD (Query Model Directory) | Persistent     |
|               |                             | knowledge      |
+---------------+-----------------------------+----------------+
| Session/Short | Compaction or               | Token savings  |
|               | lossless-clawd              | within context |
+---------------+-----------------------------+----------------+

These tools compress your context, letting you run longer sessions with fewer tokens.
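To show the general idea behind session compaction, here is a deliberately naive sketch. It is not how QMD or lossless-clawd work internally, which I have not verified. The 4-characters-per-token estimate, the function names, and the truncation-based "summary" are all assumptions; a real tool would typically ask a model to summarize the older turns.

```python
def estimate_tokens(text: str) -> int:
    # Crude assumption: roughly 4 characters per token.
    return max(1, len(text) // 4)


def compact(messages: list[str], budget: int, keep_recent: int = 2) -> list[str]:
    """Collapse older messages into one short summary when over budget.

    This version is lossy and does not guarantee the result fits the
    budget; it only shows the shape of the technique.
    """
    total = sum(estimate_tokens(m) for m in messages)
    if total <= budget or len(messages) <= keep_recent:
        return messages
    older, recent = messages[:-keep_recent], messages[-keep_recent:]
    # Placeholder summary: keep the first 40 characters of each older message.
    summary = "Summary of earlier turns: " + " | ".join(m[:40] for m in older)
    return [summary] + recent


history = ["a" * 400, "b" * 400, "c" * 40, "d" * 40]
compacted = compact(history, budget=100)
print(len(history), "->", len(compacted), "messages")
```

The most recent turns are kept verbatim because they usually matter most for the next step, while older turns are the cheapest to shrink.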

My Current Workflow

Here’s how I approach different tasks now:

Decision flow for model selection
        +-------------------+
        | What's the task?  |
        +-------------------+
                  |
      +-----------+-----------+
      |                       |
+-----v-----+           +-----v-----+
| Routine?  |           | Complex?  |
+-----------+           +-----------+
      |                       |
+-----v-----+           +-----v-----+
|   Haiku   |           |  Sonnet   |
+-----------+           +-----------+
                              |
                      +-------v-------+
                      | Needs deep    |
                      | reasoning?    |
                      +---------------+
                              |
                      +-------v-------+
                      |     Opus      |
                      +---------------+

Routine Tasks → Haiku

  • File renames
  • Config updates
  • Cron jobs
  • Simple refactors

Investigation → Sonnet

  • Understanding codebases
  • Planning features
  • Medium-complexity debugging

Creation & Bugs → Opus

  • New feature implementation
  • Complex bug fixing
  • Architecture decisions
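The decision flow can be read as a short function. This is one interpretation of the diagram, with hypothetical parameter names: routine work goes to Haiku, everything else starts at Sonnet, and tasks flagged as needing deep reasoning escalate to Opus.

```python
def select_model(routine: bool, needs_deep_reasoning: bool = False) -> str:
    """Pick a tier label following the decision flow.

    The two flags are assumptions about how you classify a task up
    front; the return values are labels, not real API model IDs.
    """
    if routine:
        return "haiku"
    if needs_deep_reasoning:
        return "opus"
    return "sonnet"


print(select_model(routine=True))
print(select_model(routine=False))
print(select_model(routine=False, needs_deep_reasoning=True))
```

In practice the hard part is deciding the flags, not the routing, so it can help to let the Sonnet planning step report whether a task needs escalation.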

The Results

After three months of this approach:

Cost comparison
+-----------------+------------+------------+
| Metric          | Before     | After      |
+=================+============+============+
| Daily cost      | ~$0.68     | ~$0.15     |
| Monthly cost    | ~$20       | ~$4.50     |
| Iteration count | Higher     | Lower      |
| Code quality    | Variable   | Consistent |
+-----------------+------------+------------+

Savings: roughly 78% cost reduction, with better quality.
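The savings figure can be checked directly from the cost rows of the table. A quick sketch, where `percent_reduction` is just a helper name chosen for illustration:

```python
def percent_reduction(before: float, after: float) -> float:
    """Return the cost reduction from `before` to `after` as a percentage."""
    return (before - after) / before * 100


# Figures taken from the cost comparison table.
daily = percent_reduction(0.68, 0.15)
monthly = percent_reduction(20.00, 4.50)

print(f"daily reduction: {daily:.1f}%")
print(f"monthly reduction: {monthly:.1f}%")
```

The daily and monthly rows differ slightly because the monthly "before" value is rounded down from $20.40 to about $20.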

Key Takeaways

  1. Use Haiku as default. It’s fast and cheap for routine work.

  2. Use Sonnet for planning. It’s great at investigation and structuring work.

  3. Reserve Opus for complexity. Creation, bugs, and deep reasoning.

  4. Think in terms of the staff engineer pattern. Opus guides and reviews; it doesn’t do everything.

  5. Optimize memory too. QMD and compaction save tokens on long sessions.

The tiered strategy isn’t about being cheap. It’s about being smart. You get better results by using each model for what it does best.

Final Words + More Resources

My intention with this article was to share my knowledge and experience to help others. If you want to contact me, you can reach me by email.

