How to Switch Between AI Model Providers in OpenCode to Optimize Costs and Quality

Apr 4, 2026

I was burning through my AI budget. Every query went to the most expensive model, whether I needed it or not. A simple syntax question cost the same as a complex architecture review.

Then I discovered OpenCode’s model switching capability. Now I route quick questions to cheap models and save the powerful ones for when they matter. My costs dropped by more than 80%.

The Problem: One Model for Everything

Using a single expensive model for all tasks wastes money. Here’s what my old workflow looked like:

Task: "What's the syntax for Python list comprehension?"
→ Model: Claude Opus (~$0.75)
→ Cost: Overkill for a simple lookup

Task: "Review this architecture decision"
→ Model: Claude Opus (~$0.75)
→ Cost: Justified for complex reasoning

Both tasks used the same expensive model. The first task could have been handled by a model costing 1/50th as much.

The Solution: Route by Task Type

OpenCode lets me switch between AI model providers during a session. I can use:

- Cheap models for simple queries (syntax, quick lookups)
- Mid-tier models for code generation and debugging
- Premium models for architecture planning and critical reviews

Here’s my current routing strategy:

| Task Type              | Model             | Cost Range | Why                     |
|------------------------|-------------------|------------|-------------------------|
| Quick syntax questions | Kimi 2.5 Mini     | $0.01-0.05 | Fast, cheap, sufficient |
| Code implementation    | MiniMax M2.7      | $0.10-0.20 | Good code generation    |
| Architecture planning  | GLM-5             | $0.15-0.30 | Strong reasoning        |
| Critical code review   | Claude via GitHub | $0.50-1.00 | Premium quality         |

Setting Up Multiple Providers

OpenCode supports multiple AI providers out of the box. I configured mine in ~/.opencode/config.json:

{
  "providers": {
    "kimi": {
      "type": "openai-compatible",
      "baseURL": "https://api.moonshot.cn/v1",
      "apiKey": "${KIMI_API_KEY}",
      "models": ["moonshot-v1-8k", "moonshot-v1-32k"]
    },
    "minimax": {
      "type": "openai-compatible",
      "baseURL": "https://api.minimax.chat/v1",
      "apiKey": "${MINIMAX_API_KEY}",
      "models": ["abab6.5s-chat"]
    },
    "glm": {
      "type": "openai-compatible",
      "baseURL": "https://open.bigmodel.cn/api/paas/v4",
      "apiKey": "${GLM_API_KEY}",
      "models": ["glm-4"]
    }
  }
}

The key is using environment variables for API keys. I never hardcode them in the config file.
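
To make the `${KIMI_API_KEY}` placeholder idea concrete, here is a minimal Python sketch of how such placeholders could be expanded from the environment. It is an illustration only: the `expand_env` helper is hypothetical, and this is not a description of how OpenCode itself resolves its config.

```python
import json
import os
import re

# Matches placeholders of the form ${NAME}, as used in the config above.
PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_env(value, env=os.environ):
    """Recursively replace ${NAME} placeholders with values from env.

    Hypothetical helper for illustration; raises KeyError if a variable is unset.
    """
    if isinstance(value, dict):
        return {key: expand_env(item, env) for key, item in value.items()}
    if isinstance(value, list):
        return [expand_env(item, env) for item in value]
    if isinstance(value, str):
        def lookup(match):
            name = match.group(1)
            if name not in env:
                raise KeyError(f"environment variable {name} is not set")
            return env[name]
        return PLACEHOLDER.sub(lookup, value)
    return value


# A trimmed copy of the provider config shown above.
config_text = """
{
  "providers": {
    "kimi": {
      "type": "openai-compatible",
      "baseURL": "https://api.moonshot.cn/v1",
      "apiKey": "${KIMI_API_KEY}"
    }
  }
}
"""

if __name__ == "__main__":
    fake_env = {"KIMI_API_KEY": "sk-example"}  # stand-in for os.environ
    resolved = expand_env(json.loads(config_text), fake_env)
    print(resolved["providers"]["kimi"]["apiKey"])
```

Failing loudly on a missing variable is a deliberate choice in this sketch: an empty key would otherwise only surface later as an authentication error.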

Switching Models During Conversation

OpenCode makes switching simple. During a conversation, I can switch models with a command:

# Start with a cheap model for initial query
> What's the difference between Promise.all and Promise.allSettled?

# Switch to a stronger model for follow-up
> switch-model glm-4
> Given that context, design an error handling strategy for our API layer

# Switch back for implementation
> switch-model minimax
> Implement the error handler based on that design

This switching capability is what sold me on OpenCode. A Reddit user described it well:

“I especially like how I’m able to easily switch between model providers so I can route requests to the best model and save costs.”

My Workflow: Planning vs Implementation

The biggest cost savings come from matching model capability to task complexity. Here’s how I work:

Phase 1: Planning with GLM-5

GLM-5 has strong reasoning capability. I use it for:

- Architecture decisions
- Breaking down complex problems
- Creating implementation plans
- Reviewing trade-offs

Cost for a typical planning session: $0.15-0.25

Phase 2: Implementation with MiniMax M2.7

MiniMax M2.7 generates good code at a lower cost. I use it for:

- Writing code from plans
- Implementing features
- Writing tests
- Routine debugging

Cost for a typical implementation session: $0.10-0.20

Phase 3: Quick Queries with Kimi 2.5 Mini

Kimi’s mini model handles simple questions cheaply:

- Syntax lookups
- Documentation questions
- Quick explanations
- Simple transformations

Cost per query: $0.01-0.05

Real Cost Comparison

I tracked my costs for a week before and after implementing model routing:

| Task Type          | Before (All Claude) | After (Routed) | Savings |
|--------------------|---------------------|----------------|---------|
| 50 quick queries   | $37.50              | $1.50          | 96%     |
| 20 implementations | $15.00              | $3.00          | 80%     |
| 10 planning        | $7.50               | $2.00          | 73%     |
| 5 critical reviews | $3.75               | $4.00          | -7%*    |
|--------------------|---------------------|----------------|---------|
| Total              | $63.75              | $10.50         | 84%     |

*Critical reviews stayed on a premium model (Claude via GitHub), so this category cost slightly more rather than less.

The total savings: 84%. That’s real money over a month.
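
The percentages in the table can be checked with a few lines of Python. This sketch uses only the dollar figures from the table; the `savings_percent` helper is a name introduced here for illustration.

```python
# (task type, cost before routing, cost after routing), copied from the table.
rows = [
    ("50 quick queries", 37.50, 1.50),
    ("20 implementations", 15.00, 3.00),
    ("10 planning", 7.50, 2.00),
    ("5 critical reviews", 3.75, 4.00),
]


def savings_percent(before, after):
    """Percentage saved relative to the 'before' cost (negative means pricier)."""
    return (before - after) / before * 100


total_before = sum(before for _, before, _ in rows)
total_after = sum(after for _, _, after in rows)

if __name__ == "__main__":
    for name, before, after in rows:
        print(f"{name:<20} {savings_percent(before, after):6.1f}%")
    print(f"{'Total':<20} {savings_percent(total_before, total_after):6.1f}%")
```

Rounded to whole percentages, the results match the Savings column, including the negative value for critical reviews.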

Chinese Models: Quality at Lower Cost

I’ve found Chinese AI models offer excellent value. Here is how their pricing compares with two Western premium models:

| Model           | Provider  | Context | Input Cost      | Output Cost     |
|-----------------|-----------|---------|-----------------|-----------------|
| Kimi 2.5 Mini   | Moonshot  | 128K    | $0.15/1M tokens | $0.60/1M tokens |
| MiniMax M2.7    | MiniMax   | 245K    | $0.17/1M tokens | $0.67/1M tokens |
| GLM-4           | Zhipu     | 128K    | $0.14/1M tokens | $0.14/1M tokens |
| Claude Opus 4   | Anthropic | 200K    | $15/1M tokens   | $75/1M tokens   |
| GPT-4o          | OpenAI    | 128K    | $2.50/1M tokens | $10/1M tokens   |

The cost difference is dramatic. GLM-4 costs about 1% of Claude Opus for input tokens. For tasks that don’t require the absolute best reasoning, Chinese models deliver excellent results.
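
Per-million-token prices are easier to compare once they are turned into a per-request cost. The sketch below does that with the prices from the table. The request size (20,000 input tokens, 2,000 output tokens) is a made-up example, and the `Pricing` class is a name introduced here, not part of any provider SDK.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """Price in US dollars per one million tokens, as listed in the table."""
    input_per_million: float
    output_per_million: float

    def request_cost(self, input_tokens: int, output_tokens: int) -> float:
        """Cost in dollars of one request with the given token counts."""
        return (
            input_tokens * self.input_per_million
            + output_tokens * self.output_per_million
        ) / 1_000_000


PRICES = {
    "Kimi 2.5 Mini": Pricing(0.15, 0.60),
    "MiniMax M2.7": Pricing(0.17, 0.67),
    "GLM-4": Pricing(0.14, 0.14),
    "Claude Opus 4": Pricing(15.00, 75.00),
    "GPT-4o": Pricing(2.50, 10.00),
}

# Hypothetical request size, chosen only for illustration.
INPUT_TOKENS = 20_000
OUTPUT_TOKENS = 2_000

if __name__ == "__main__":
    for model, pricing in PRICES.items():
        cost = pricing.request_cost(INPUT_TOKENS, OUTPUT_TOKENS)
        print(f"{model:<14} ${cost:.4f}")
```

Because output tokens are priced higher than input tokens for most of these models, the ratio between two models depends on how much text a task generates, not only on the prompt size.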

When to Use Premium Models

I don’t cheap out on everything. Premium models earn their keep for:

- Critical production code reviews
- Security-sensitive architecture decisions
- Complex multi-system integrations
- When the cost of getting it wrong exceeds the model cost

A Reddit user noted:

“I’ve had good luck using anthropic models with GitHub as the provider”

This suggests Anthropic models work well in OpenCode when GitHub is the provider. I reserve them for high-stakes decisions.

Common Mistakes to Avoid

Mistake 1: Always Using the Cheapest Model

Planning: "Design our authentication system"
→ Model: Kimi Mini ($0.01)
→ Result: Superficial design, missed edge cases
→ Cost: $0.01 but 3 hours fixing issues later

The cheapest model isn’t always the most cost-effective. Factor in your time.
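
The point about factoring in time can be put into numbers. This is a rough sketch: the $0.01 and 3 hours come from the example above, while the hourly rate and the assumption that a premium model needs no rework are placeholders to replace with your own figures.

```python
def effective_cost(model_cost, rework_hours, hourly_rate):
    """Model spend plus the cost of the time spent fixing the result."""
    return model_cost + rework_hours * hourly_rate


HOURLY_RATE = 50.0  # hypothetical rate in dollars; substitute your own

# Cheap model: $0.01 for the query, then 3 hours of fixes (from the example above).
cheap = effective_cost(0.01, 3.0, HOURLY_RATE)

# Premium model: about $0.75 per query, assuming no rework (an optimistic assumption).
premium = effective_cost(0.75, 0.0, HOURLY_RATE)

if __name__ == "__main__":
    print(f"cheap model:   ${cheap:.2f}")
    print(f"premium model: ${premium:.2f}")
```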

Mistake 2: Never Switching Models

Implementation: "Write the authentication middleware"
→ Model: GLM-5 ($0.20)
→ Cost: Up to 2x more expensive than MiniMax M2.7 ($0.10-0.20)

If you’re not switching models, you’re leaving money on the table.

Mistake 3: Hardcoding API Keys

{
  "providers": {
    "minimax": {
      "apiKey": "sk-abc123def456"  // DON'T DO THIS
    }
  }
}

Always use environment variables:

# In ~/.zshrc or ~/.bashrc
export KIMI_API_KEY="sk-xxx"
export MINIMAX_API_KEY="sk-xxx"
export GLM_API_KEY="sk-xxx"
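
A small check can catch hardcoded keys before a config file is committed. The sketch below assumes the config shape used in this post (an `apiKey` field inside each provider) and treats anything that is not a `${VAR}` reference as a problem. The `find_hardcoded_keys` function is a hypothetical helper, not an OpenCode feature.

```python
import re

# An apiKey is considered safe only if it is exactly one ${NAME} reference.
ENV_REFERENCE = re.compile(r"^\$\{[A-Za-z_][A-Za-z0-9_]*\}$")


def find_hardcoded_keys(node, path=""):
    """Return the paths of apiKey values that are not ${VAR} references."""
    problems = []
    if isinstance(node, dict):
        for key, value in node.items():
            child_path = f"{path}.{key}" if path else key
            if key == "apiKey" and isinstance(value, str):
                if not ENV_REFERENCE.match(value):
                    problems.append(child_path)
            else:
                problems.extend(find_hardcoded_keys(value, child_path))
    elif isinstance(node, list):
        for index, item in enumerate(node):
            problems.extend(find_hardcoded_keys(item, f"{path}[{index}]"))
    return problems


if __name__ == "__main__":
    config = {
        "providers": {
            "minimax": {"apiKey": "sk-abc123def456"},   # hardcoded, flagged
            "kimi": {"apiKey": "${KIMI_API_KEY}"},      # env reference, fine
        }
    }
    for problem in find_hardcoded_keys(config):
        print(f"hardcoded key at {problem}")
```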

Model Selection Decision Tree

Here’s the decision process I use:

[Task received]
     |
     v
[Is it a quick syntax/lookup question?]
     |-- YES --> Kimi Mini (~$0.01-0.05)
     |-- NO --> Continue
     |
     v
[Is it code implementation or debugging?]
     |-- YES --> MiniMax M2.7 (~$0.10-0.20)
     |-- NO --> Continue
     |
     v
[Is it architecture or design planning?]
     |-- YES --> GLM-5 (~$0.15-0.30)
     |-- NO --> Continue
     |
     v
[Is it critical for production/security?]
     |-- YES --> Claude/GPT-4 (~$0.50-1.00)
     |-- NO --> MiniMax M2.7 (default)
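
The tree above translates directly into an ordered series of checks. Here is a minimal Python sketch of it. The `TaskKind` enum and the `choose_model` function are names made up for this illustration, and classifying a task into a kind is still left to me.

```python
from enum import Enum


class TaskKind(Enum):
    QUICK_LOOKUP = "quick syntax/lookup question"
    IMPLEMENTATION = "code implementation or debugging"
    PLANNING = "architecture or design planning"
    OTHER = "anything else"


def choose_model(kind, critical=False):
    """Walk the decision tree top to bottom and return the first match."""
    if kind is TaskKind.QUICK_LOOKUP:
        return "Kimi Mini"
    if kind is TaskKind.IMPLEMENTATION:
        return "MiniMax M2.7"
    if kind is TaskKind.PLANNING:
        return "GLM-5"
    # Only tasks that matched none of the above reach the criticality check.
    if critical:
        return "Claude/GPT-4"
    return "MiniMax M2.7"  # default branch of the tree


if __name__ == "__main__":
    print(choose_model(TaskKind.QUICK_LOOKUP))
    print(choose_model(TaskKind.OTHER, critical=True))
    print(choose_model(TaskKind.OTHER))
```

Writing it out makes one property of the tree visible: the production/security check comes last, so a planning task marked as critical still goes to GLM-5. If that is not what you want, move the criticality check to the top.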

Configuring Default Models

OpenCode lets me set defaults so I don’t switch manually every time:

{
  "defaultModel": "minimax",
  "modelDefaults": {
    "quick-query": "kimi-mini",
    "planning": "glm-4",
    "implementation": "minimax-m2.7",
    "review": "claude-opus"
  }
}

This way, OpenCode automatically selects the right model based on task type.
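
The idea behind that config is a lookup with a fallback: use the model mapped to the task type, otherwise the default. The Python sketch below shows that logic using the same keys as the config above. It illustrates the concept only and does not describe OpenCode’s internals; `resolve_model` is a hypothetical helper.

```python
import json

# Same content as the config shown above.
CONFIG_TEXT = """
{
  "defaultModel": "minimax",
  "modelDefaults": {
    "quick-query": "kimi-mini",
    "planning": "glm-4",
    "implementation": "minimax-m2.7",
    "review": "claude-opus"
  }
}
"""


def resolve_model(config, task_type):
    """Return the model mapped to task_type, or defaultModel if unmapped."""
    return config.get("modelDefaults", {}).get(task_type, config["defaultModel"])


if __name__ == "__main__":
    config = json.loads(CONFIG_TEXT)
    print(resolve_model(config, "planning"))
    print(resolve_model(config, "refactoring"))  # not mapped, falls back
```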

Verification: Are the Savings Real?

I was skeptical at first. Here’s how I verified the cost reduction:

Week 1: All queries to Claude Opus, tracked costs
Week 2: Implemented routing, tracked costs
Week 3: Refined routing rules, tracked costs

| Week | Strategy          | Queries | Cost   | Avg/Query |
|------|-------------------|---------|--------|-----------|
| 1    | All Claude Opus   | 127     | $65.40 | $0.51     |
| 2    | Basic routing     | 134     | $18.20 | $0.14     |
| 3    | Refined routing   | 142     | $11.80 | $0.08     |

The data confirmed it: routing reduced my average query cost from $0.51 to $0.08.
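
The Avg/Query column and the overall reduction can be recomputed from the weekly totals. A short sketch, using only the numbers from the table:

```python
# (week, strategy, number of queries, total cost in dollars), from the table.
weeks = [
    (1, "All Claude Opus", 127, 65.40),
    (2, "Basic routing", 134, 18.20),
    (3, "Refined routing", 142, 11.80),
]


def average_cost(total_cost, queries):
    """Average cost per query in dollars."""
    return total_cost / queries


averages = {week: average_cost(cost, queries) for week, _, queries, cost in weeks}

# Reduction in average query cost from week 1 to week 3, as a percentage.
reduction = (averages[1] - averages[3]) / averages[1] * 100

if __name__ == "__main__":
    for week, strategy, queries, cost in weeks:
        print(f"Week {week} ({strategy}): ${averages[week]:.2f} per query")
    print(f"Reduction from week 1 to week 3: {reduction:.0f}%")
```

Comparing averages rather than weekly totals matters here because the number of queries was different each week.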

In This Post, I…

I showed you how to cut AI coding costs by more than 80% using OpenCode’s model switching capability. The key insights:

- Match model capability to task complexity: don’t use premium models for simple queries
- Chinese models offer excellent value: GLM-5, MiniMax, and Kimi cost 1-5% of Western premium models
- Set up routing rules: automate model selection based on task type
- Track your costs: measure before and after to verify savings
- Reserve premium for critical tasks: use Claude/GPT-4 for security-sensitive work

The Reddit user who recommended this workflow was right:

“Planning with GLM-5 and implementation with Minimax M2.7 is the way to go.”

That workflow cut my costs by 84% without sacrificing quality on the work that matters.

Final Words + More Resources

My intention with this article was to share my knowledge and experience to help others. If you want to get in touch, you can email me.

Here are the most important links from this article, along with some further resources on this topic:

👨‍💻 OpenCode CLI
👨‍💻 MiniMax API
👨‍💻 GLM API
👨‍💻 Kimi API

Oh, and if you found these resources useful, don’t forget to support me by starring the repo on GitHub!