
How to Switch Between AI Model Providers in OpenCode to Optimize Costs and Quality

I was burning through my AI budget. Every query went to the most expensive model, whether I needed it or not. A simple syntax question cost the same as a complex architecture review.

Then I discovered OpenCode’s model switching capability. Now I route quick questions to cheap models and save the powerful ones for when they matter. My costs dropped by more than 80%.

The Problem: One Model for Everything

Using a single expensive model for all tasks wastes money. Here’s what my old workflow looked like:

My old workflow - expensive and wasteful
Task: "What's the syntax for Python list comprehension?"
  → Model: Claude Opus (~$0.75)
  → Cost: Overkill for a simple lookup

Task: "Review this architecture decision"
  → Model: Claude Opus (~$0.75)
  → Cost: Justified for complex reasoning

Both tasks used the same expensive model. The first task could have been handled by a model costing 1/50th as much.

The Solution: Route by Task Type

OpenCode lets me switch between AI model providers during a session. I can use:

  • Cheap models for simple queries (syntax, quick lookups)
  • Mid-tier models for code generation and debugging
  • Premium models for architecture planning and critical reviews

Here’s my current routing strategy:

Task-based model routing
| Task Type              | Model             | Cost Range | Why                     |
|------------------------|-------------------|------------|-------------------------|
| Quick syntax questions | Kimi 2.5 Mini     | $0.01-0.05 | Fast, cheap, sufficient |
| Code implementation    | MiniMax M2.7      | $0.10-0.20 | Good code generation    |
| Architecture planning  | GLM-5             | $0.15-0.30 | Strong reasoning        |
| Critical code review   | Claude via GitHub | $0.50-1.00 | Premium quality         |

Setting Up Multiple Providers

OpenCode supports multiple AI providers out of the box. I configured mine in ~/.opencode/config.json:

Multi-provider configuration
{
  "providers": {
    "kimi": {
      "type": "openai-compatible",
      "baseURL": "https://api.moonshot.cn/v1",
      "apiKey": "${KIMI_API_KEY}",
      "models": ["moonshot-v1-8k", "moonshot-v1-32k"]
    },
    "minimax": {
      "type": "openai-compatible",
      "baseURL": "https://api.minimax.chat/v1",
      "apiKey": "${MINIMAX_API_KEY}",
      "models": ["abab6.5s-chat"]
    },
    "glm": {
      "type": "openai-compatible",
      "baseURL": "https://open.bigmodel.cn/api/paas/v4",
      "apiKey": "${GLM_API_KEY}",
      "models": ["glm-4"]
    }
  }
}

The key is using environment variables for API keys. I never hardcode them in the config file.
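
To make the placeholder idea concrete, here is a minimal Python sketch of how a `${NAME}` value like the ones in the config above could be filled in from the environment. The `expand_env` helper is a hypothetical name of my own; how OpenCode itself resolves these placeholders is not covered in this post, so treat this as an illustration of the idea, not its implementation.

```python
import os
import re

# Matches placeholders of the form ${NAME}, as written in the config above.
_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_env(value: str, env=None) -> str:
    """Replace each ${NAME} in value with env[NAME]; fail loudly if unset."""
    env = os.environ if env is None else env

    def substitute(match: re.Match) -> str:
        name = match.group(1)
        if name not in env:
            raise KeyError(f"environment variable {name} is not set")
        return env[name]

    return _PLACEHOLDER.sub(substitute, value)
```

Failing on an unset variable is a deliberate choice in this sketch: an empty API key tends to surface later as a confusing authentication error, so it is cheaper to stop early.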

Switching Models During Conversation

OpenCode makes switching simple. During a conversation, I can switch models with a command:

Switching models mid-conversation
# Start with a cheap model for initial query
> What's the difference between Promise.all and Promise.allSettled?
# Switch to a stronger model for follow-up
> switch-model glm-4
> Given that context, design an error handling strategy for our API layer
# Switch back for implementation
> switch-model minimax
> Implement the error handler based on that design

This switching capability is what sold me on OpenCode. A Reddit user described it well:

“I especially like how I’m able to easily switch between model providers so I can route requests to the best model and save costs.”

My Workflow: Planning vs Implementation

The biggest cost savings come from matching model capability to task complexity. Here’s how I work:

Phase 1: Planning with GLM-5

GLM-5 has strong reasoning capability. I use it for:

Planning phase - GLM-5 tasks
- Architecture decisions
- Breaking down complex problems
- Creating implementation plans
- Reviewing trade-offs

Cost for a typical planning session: $0.15-0.25

Phase 2: Implementation with MiniMax M2.7

MiniMax M2.7 generates good code at a lower cost. I use it for:

Implementation phase - MiniMax tasks
- Writing code from plans
- Implementing features
- Writing tests
- Routine debugging

Cost for a typical implementation session: $0.10-0.20

Phase 3: Quick Queries with Kimi 2.5 Mini

Kimi’s mini model handles simple questions cheaply:

Quick queries - Kimi tasks
- Syntax lookups
- Documentation questions
- Quick explanations
- Simple transformations

Cost per query: $0.01-0.05

Real Cost Comparison

I tracked my costs for a week before and after implementing model routing:

Weekly cost comparison
| Task Type | Before (All Claude) | After (Routed) | Savings |
|--------------------|---------------------|----------------|---------|
| 50 quick queries | $37.50 | $1.50 | 96% |
| 20 implementations | $15.00 | $3.00 | 80% |
| 10 planning | $7.50 | $2.00 | 73% |
| 5 critical reviews | $3.75 | $4.00 | -7%* |
|--------------------|---------------------|----------------|---------|
| Total | $63.75 | $10.50 | 84% |
*Critical reviews slightly more expensive due to using premium model

The total savings: 84%. That’s real money over a month.
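
The percentages in that table are easy to recompute. The following Python sketch copies the dollar figures from the table and derives the Savings column and the total; the helper name `savings_pct` is my own.

```python
# (task type, cost before, cost after) in USD, copied from the table above.
ROWS = [
    ("50 quick queries", 37.50, 1.50),
    ("20 implementations", 15.00, 3.00),
    ("10 planning", 7.50, 2.00),
    ("5 critical reviews", 3.75, 4.00),
]


def savings_pct(before: float, after: float) -> float:
    """Percentage saved relative to the original cost (negative = cost rose)."""
    return (before - after) / before * 100


total_before = sum(before for _, before, _ in ROWS)
total_after = sum(after for _, _, after in ROWS)

if __name__ == "__main__":
    for name, before, after in ROWS:
        print(f"{name:20} {savings_pct(before, after):6.1f}%")
    print(f"{'Total':20} {savings_pct(total_before, total_after):6.1f}%")
```

Note that the total is computed from the summed dollar amounts, not by averaging the per-row percentages, which would overweight the small rows.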

Chinese Models: Quality at Lower Cost

I’ve found Chinese AI models offer excellent value:

Chinese model cost comparison
| Model | Provider | Context | Input Cost | Output Cost |
|-----------------|----------|----------|-----------------|-----------------|
| Kimi 2.5 Mini | Moonshot | 128K | $0.15/1M tokens | $0.60/1M tokens |
| MiniMax M2.7 | MiniMax | 245K | $0.17/1M tokens | $0.67/1M tokens |
| GLM-4 | Zhipu | 128K | $0.14/1M tokens | $0.14/1M tokens |
| Claude Opus 4 | Anthropic| 200K | $15/1M tokens | $75/1M tokens |
| GPT-4o | OpenAI | 128K | $2.50/1M tokens | $10/1M tokens |

The cost difference is dramatic. GLM-4 costs about 1% of Claude Opus for input tokens. For tasks that don’t require the absolute best reasoning, Chinese models deliver excellent results.
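
Per-million-token prices are hard to compare at a glance, so here is a small Python sketch that turns the table into a per-request dollar figure. The prices are copied from the table above; the request size of 10,000 input and 2,000 output tokens in the usage note is an assumption chosen for illustration.

```python
# Prices in USD per 1M tokens as (input, output), copied from the table above.
PRICES = {
    "Kimi 2.5 Mini": (0.15, 0.60),
    "MiniMax M2.7": (0.17, 0.67),
    "GLM-4": (0.14, 0.14),
    "Claude Opus 4": (15.00, 75.00),
    "GPT-4o": (2.50, 10.00),
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one request at the listed per-million-token prices."""
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


if __name__ == "__main__":
    for model in PRICES:
        print(f"{model:15} ${request_cost(model, 10_000, 2_000):.5f}")
```

At that assumed request size, Claude Opus 4 comes to about $0.30 per request while GLM-4 stays well under a cent.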

When to Use Premium Models

I don’t cheap out on everything. Premium models earn their keep for:

Premium model use cases
- Critical production code reviews
- Security-sensitive architecture decisions
- Complex multi-system integrations
- When the cost of getting it wrong exceeds the model cost

A Reddit user noted:

“I’ve had good luck using anthropic models with GitHub as the provider”

This suggests Anthropic models work well in OpenCode with GitHub as the provider. I reserve them for high-stakes decisions.

Common Mistakes to Avoid

Mistake 1: Always Using the Cheapest Model

Wrong: Using Kimi Mini for everything
Planning: "Design our authentication system"
→ Model: Kimi Mini ($0.01)
→ Result: Superficial design, missed edge cases
→ Cost: $0.01 but 3 hours fixing issues later

The cheapest model isn’t always the most cost-effective. Factor in your time.
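
One way to factor in your time is to add it to the bill. In this Python sketch, the $0.01 model cost and the 3 hours of rework come from the example above; the hourly rate, the $0.30 premium cost, and the assumption that the stronger model needs no rework are illustrative values of my own.

```python
def effective_cost(model_cost: float, rework_hours: float, hourly_rate: float) -> float:
    """Model spend plus the cost of developer time spent fixing the result."""
    return model_cost + rework_hours * hourly_rate


HOURLY_RATE = 50.0  # assumption for illustration; substitute your own rate

# Cheapest model: $0.01 for the query, then 3 hours fixing the design.
cheap = effective_cost(0.01, rework_hours=3, hourly_rate=HOURLY_RATE)

# Stronger model: assumed $0.30 and, optimistically, no rework at all.
premium = effective_cost(0.30, rework_hours=0, hourly_rate=HOURLY_RATE)

if __name__ == "__main__":
    print(f"cheap model:   ${cheap:.2f}")
    print(f"premium model: ${premium:.2f}")
```

Even at a much lower hourly rate, a few hours of rework outweigh the difference in model price.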

Mistake 2: Never Switching Models

Wrong: Sticking with one model all session
Implementation: "Write the authentication middleware"
→ Model: GLM-5 ($0.20)
→ Cost: Up to 2x what MiniMax M2.7 would charge ($0.10-0.20) for the same job

If you’re not switching models, you’re leaving money on the table.

Mistake 3: Hardcoding API Keys

WRONG: Never do this
{
  "providers": {
    "minimax": {
      "apiKey": "sk-abc123def456" // DON'T DO THIS
    }
  }
}

Always use environment variables:

Set keys in your shell profile
# In ~/.zshrc or ~/.bashrc
export KIMI_API_KEY="sk-xxx"
export MINIMAX_API_KEY="sk-xxx"
export GLM_API_KEY="sk-xxx"
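
Before starting a session it can help to confirm those variables are actually set in the current shell. This is a minimal Python sketch of such a preflight check; the variable names match the exports above, and `missing_keys` is a hypothetical helper, not part of OpenCode.

```python
import os

# The three variables exported in the shell profile above.
REQUIRED_KEYS = ("KIMI_API_KEY", "MINIMAX_API_KEY", "GLM_API_KEY")


def missing_keys(env=None) -> list[str]:
    """Names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_KEYS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_keys()
    if missing:
        print("Missing API keys:", ", ".join(missing))
    else:
        print("All API keys are set.")
```

The check only reports names, never values, so its output is safe to paste into a bug report.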

Model Selection Decision Tree

Here’s the decision process I use:

Model selection flowchart
[Task received]
      |
      v
[Is it a quick syntax/lookup question?]
      |-- YES --> Kimi Mini (~$0.01-0.05)
      |-- NO  --> Continue
      |
      v
[Is it code implementation or debugging?]
      |-- YES --> MiniMax M2.7 (~$0.10-0.20)
      |-- NO  --> Continue
      |
      v
[Is it architecture or design planning?]
      |-- YES --> GLM-5 (~$0.15-0.30)
      |-- NO  --> Continue
      |
      v
[Is it critical for production/security?]
      |-- YES --> Claude/GPT-4 (~$0.50-1.00)
      |-- NO  --> MiniMax M2.7 (default)
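
The flowchart above translates directly into a chain of checks. This Python sketch assumes someone (you, or a classifier) has already labeled the task; the `Task` flags and the `select_model` function are hypothetical names for illustration.

```python
from dataclasses import dataclass


@dataclass
class Task:
    """Hypothetical task flags; something upstream would have to set these."""
    is_quick_lookup: bool = False
    is_implementation: bool = False
    is_planning: bool = False
    is_critical: bool = False


def select_model(task: Task) -> str:
    """Walk the flowchart top to bottom and return the first matching model."""
    if task.is_quick_lookup:
        return "Kimi Mini"
    if task.is_implementation:
        return "MiniMax M2.7"
    if task.is_planning:
        return "GLM-5"
    if task.is_critical:
        return "Claude/GPT-4"
    return "MiniMax M2.7"  # default branch of the flowchart
```

Because the checks run in order, a task flagged as both implementation and critical lands on MiniMax M2.7, exactly as the flowchart is drawn. If critical work should always get the premium model, move that check to the top.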

Configuring Default Models

OpenCode lets me set defaults so I don’t switch manually every time:

Default model configuration
{
  "defaultModel": "minimax",
  "modelDefaults": {
    "quick-query": "kimi-mini",
    "planning": "glm-4",
    "implementation": "minimax-m2.7",
    "review": "claude-opus"
  }
}

This way, OpenCode automatically selects the right model based on task type.
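
The lookup this configuration implies is a per-task default with a global fallback. The Python sketch below mirrors the JSON above as a dict and shows only that lookup; how a task gets classified as "planning" or "review" in the first place is not described here, and `resolve_model` is a hypothetical name.

```python
# The configuration shown above, mirrored as a Python dict.
CONFIG = {
    "defaultModel": "minimax",
    "modelDefaults": {
        "quick-query": "kimi-mini",
        "planning": "glm-4",
        "implementation": "minimax-m2.7",
        "review": "claude-opus",
    },
}


def resolve_model(task_type: str, config: dict = CONFIG) -> str:
    """Per-task default if one is configured, otherwise the global default."""
    return config.get("modelDefaults", {}).get(task_type, config["defaultModel"])


if __name__ == "__main__":
    for task_type in ("quick-query", "planning", "refactoring"):
        print(f"{task_type:12} -> {resolve_model(task_type)}")
```

An unlisted task type such as "refactoring" falls through to `defaultModel`, so nothing breaks when a new kind of task shows up.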

Verification: Are the Savings Real?

I was skeptical at first. Here’s how I verified the cost reduction:

  1. Week 1: All queries to Claude Opus, tracked costs
  2. Week 2: Implemented routing, tracked costs
  3. Week 3: Refined routing rules, tracked costs

Three-week cost tracking
| Week | Strategy | Queries | Cost | Avg/Query |
|------|-------------------|---------|--------|-----------|
| 1 | All Claude Opus | 127 | $65.40 | $0.51 |
| 2 | Basic routing | 134 | $18.20 | $0.14 |
| 3 | Refined routing | 142 | $11.80 | $0.08 |

The data confirmed it: routing reduced my average query cost from $0.51 to $0.08.
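
The Avg/Query column is just total cost divided by query count. This short Python sketch recomputes it from the weekly totals in the table; the figures are copied from the table and the helper name `avg_cost` is my own.

```python
# (week, strategy, queries, total cost in USD), copied from the table above.
WEEKS = [
    (1, "All Claude Opus", 127, 65.40),
    (2, "Basic routing", 134, 18.20),
    (3, "Refined routing", 142, 11.80),
]


def avg_cost(total: float, queries: int) -> float:
    """Average cost per query, rounded to cents."""
    return round(total / queries, 2)


if __name__ == "__main__":
    for week, strategy, queries, total in WEEKS:
        print(f"Week {week} ({strategy}): ${avg_cost(total, queries):.2f} per query")
```

Comparing the per-query average instead of the weekly total matters here because the query count rose each week, so the raw totals understate the improvement.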

In This Post, I…

I showed you how to cut AI coding costs by more than 80% using OpenCode’s model switching capability. The key insights:

  1. Match model capability to task complexity - Don’t use premium models for simple queries
  2. Chinese models offer excellent value - GLM-5, MiniMax, and Kimi cost 1-5% of Western premium models
  3. Set up routing rules - Automate model selection based on task type
  4. Track your costs - Measure before and after to verify savings
  5. Reserve premium for critical tasks - Use Claude/GPT-4 for security-sensitive work

The Reddit user who recommended this workflow was right:

“Planning with GLM-5 and implementation with Minimax M2.7 is the way to go.”

That workflow cut my costs by 84% without sacrificing quality on the work that matters.

Final Words + More Resources

My intention with this article was to share my knowledge and experience to help others. If you want to contact me, you can reach me by email.

