How to Switch Between AI Model Providers in OpenCode to Optimize Costs and Quality
I was burning through my AI budget. Every query went to the most expensive model, whether I needed it or not. A simple syntax question cost the same as a complex architecture review.
Then I discovered OpenCode’s model switching capability. Now I route quick questions to cheap models and save the powerful ones for when they matter. My costs dropped by more than 80%.
The Problem: One Model for Everything
Using a single expensive model for all tasks wastes money. Here’s what my old workflow looked like:
    Task: "What's the syntax for Python list comprehension?"
    → Model: Claude Opus (~$0.75)
    → Cost: Overkill for a simple lookup

    Task: "Review this architecture decision"
    → Model: Claude Opus (~$0.75)
    → Cost: Justified for complex reasoning

Both tasks used the same expensive model. The first task could have been handled by a model costing 1/50th as much.
The Solution: Route by Task Type
OpenCode lets me switch between AI model providers during a session. I can use:
- Cheap models for simple queries (syntax, quick lookups)
- Mid-tier models for code generation and debugging
- Premium models for architecture planning and critical reviews
Here’s my current routing strategy:
| Task Type              | Model             | Cost Range | Why                     |
|------------------------|-------------------|------------|-------------------------|
| Quick syntax questions | Kimi 2.5 Mini     | $0.01-0.05 | Fast, cheap, sufficient |
| Code implementation    | MiniMax M2.7      | $0.10-0.20 | Good code generation    |
| Architecture planning  | GLM-5             | $0.15-0.30 | Strong reasoning        |
| Critical code review   | Claude via GitHub | $0.50-1.00 | Premium quality         |

Setting Up Multiple Providers
OpenCode supports multiple AI providers out of the box. I configured mine in ~/.opencode/config.json:
    {
      "providers": {
        "kimi": {
          "type": "openai-compatible",
          "baseURL": "https://api.moonshot.cn/v1",
          "apiKey": "${KIMI_API_KEY}",
          "models": ["moonshot-v1-8k", "moonshot-v1-32k"]
        },
        "minimax": {
          "type": "openai-compatible",
          "baseURL": "https://api.minimax.chat/v1",
          "apiKey": "${MINIMAX_API_KEY}",
          "models": ["abab6.5s-chat"]
        },
        "glm": {
          "type": "openai-compatible",
          "baseURL": "https://open.bigmodel.cn/api/paas/v4",
          "apiKey": "${GLM_API_KEY}",
          "models": ["glm-4"]
        }
      }
    }

The key is using environment variables for API keys. I never hardcode them in the config file.
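To make the placeholder idea concrete, here is a minimal Python sketch of how `${VAR}` references in a config like the one above could be resolved from the environment. How OpenCode itself resolves them is not shown in this post, so `expand_env` and the example key value are hypothetical names used only for illustration.

```python
import os
import re

# Matches ${VAR_NAME} placeholders like the ones in the config above.
PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_env(value, env=None):
    """Recursively replace ${VAR} placeholders in strings, lists, and dicts.

    Hypothetical helper: raises KeyError if a referenced variable is unset,
    so a missing key fails loudly instead of being sent as an empty string.
    """
    env = os.environ if env is None else env
    if isinstance(value, dict):
        return {key: expand_env(item, env) for key, item in value.items()}
    if isinstance(value, list):
        return [expand_env(item, env) for item in value]
    if isinstance(value, str):
        def substitute(match):
            name = match.group(1)
            if name not in env:
                raise KeyError(f"environment variable {name} is not set")
            return env[name]
        return PLACEHOLDER.sub(substitute, value)
    return value  # numbers, booleans, None pass through unchanged


# Example with a fake environment so no real key is needed.
config = {"providers": {"kimi": {"apiKey": "${KIMI_API_KEY}"}}}
resolved = expand_env(config, env={"KIMI_API_KEY": "sk-example"})
print(resolved["providers"]["kimi"]["apiKey"])
```

Failing on an unset variable is a deliberate choice in this sketch: it surfaces a forgotten `export` immediately rather than as an authentication error later.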
Switching Models During Conversation
OpenCode makes switching simple. During a conversation, I can switch models with a command:
    # Start with a cheap model for initial query
    > What's the difference between Promise.all and Promise.allSettled?

    # Switch to a stronger model for follow-up
    > switch-model glm-4
    > Given that context, design an error handling strategy for our API layer

    # Switch back for implementation
    > switch-model minimax
    > Implement the error handler based on that design

This switching capability is what sold me on OpenCode. A Reddit user described it well:
“I especially like how I’m able to easily switch between model providers so I can route requests to the best model and save costs.”
My Workflow: Planning vs Implementation
The biggest cost savings come from matching model capability to task complexity. Here’s how I work:
Phase 1: Planning with GLM-5
GLM-5 has strong reasoning capability. I use it for:
- Architecture decisions
- Breaking down complex problems
- Creating implementation plans
- Reviewing trade-offs

Cost for a typical planning session: $0.15-0.25
Phase 2: Implementation with MiniMax M2.7
MiniMax M2.7 generates good code at a lower cost. I use it for:
- Writing code from plans
- Implementing features
- Writing tests
- Routine debugging

Cost for a typical implementation session: $0.10-0.20
Phase 3: Quick Queries with Kimi 2.5 Mini
Kimi’s mini model handles simple questions cheaply:
- Syntax lookups
- Documentation questions
- Quick explanations
- Simple transformations

Cost per query: $0.01-0.05
Real Cost Comparison
I tracked my costs for a week before and after implementing model routing:
| Task Type          | Before (All Claude) | After (Routed) | Savings |
|--------------------|---------------------|----------------|---------|
| 50 quick queries   | $37.50              | $1.50          | 96%     |
| 20 implementations | $15.00              | $3.00          | 80%     |
| 10 planning        | $7.50               | $2.00          | 73%     |
| 5 critical reviews | $3.75               | $4.00          | -7%*    |
| Total              | $63.75              | $10.50         | 84%     |

*Critical reviews are slightly more expensive due to using a premium model.

The total savings: 84%. That’s real money over a month.
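The percentages in the table can be recomputed from the dollar figures. The short Python check below uses only the numbers from the table; `savings_pct` is just a helper name for this example.

```python
# (task, cost before, cost after) in USD, copied from the table above.
rows = [
    ("50 quick queries", 37.50, 1.50),
    ("20 implementations", 15.00, 3.00),
    ("10 planning", 7.50, 2.00),
    ("5 critical reviews", 3.75, 4.00),
]


def savings_pct(before, after):
    """Percentage saved relative to the 'before' cost (negative = costlier)."""
    return (before - after) / before * 100


for task, before, after in rows:
    print(f"{task:<20} {savings_pct(before, after):6.1f}%")

total_before = sum(before for _, before, _ in rows)
total_after = sum(after for _, _, after in rows)
print(f"{'Total':<20} {savings_pct(total_before, total_after):6.1f}%")
```

The total works out to about 83.5%, which the table rounds to 84%.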
Chinese Models: Quality at Lower Cost
I’ve found Chinese AI models offer excellent value:
| Model         | Provider  | Context | Input Cost      | Output Cost     |
|---------------|-----------|---------|-----------------|-----------------|
| Kimi 2.5 Mini | Moonshot  | 128K    | $0.15/1M tokens | $0.60/1M tokens |
| MiniMax M2.7  | MiniMax   | 245K    | $0.17/1M tokens | $0.67/1M tokens |
| GLM-4         | Zhipu     | 128K    | $0.14/1M tokens | $0.14/1M tokens |
| Claude Opus 4 | Anthropic | 200K    | $15/1M tokens   | $75/1M tokens   |
| GPT-4o        | OpenAI    | 128K    | $2.50/1M tokens | $10/1M tokens   |

The cost difference is dramatic. GLM-4 costs about 1% of Claude Opus for input tokens. For tasks that don’t require the absolute best reasoning, Chinese models deliver excellent results.
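To turn per-token prices into a per-request figure, multiply each token count by its price and divide by one million. The sketch below uses the prices from the table; the request size (2,000 input tokens, 500 output tokens) is an assumed example, not a measurement from my sessions.

```python
# USD per 1M tokens as (input, output), copied from the table above.
PRICES = {
    "Kimi 2.5 Mini": (0.15, 0.60),
    "MiniMax M2.7": (0.17, 0.67),
    "GLM-4": (0.14, 0.14),
    "Claude Opus 4": (15.00, 75.00),
    "GPT-4o": (2.50, 10.00),
}


def request_cost(model, input_tokens, output_tokens):
    """Cost in USD of one request, given token counts and table prices."""
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# Assumed request size for illustration only.
INPUT_TOKENS = 2_000
OUTPUT_TOKENS = 500

for model in PRICES:
    cost = request_cost(model, INPUT_TOKENS, OUTPUT_TOKENS)
    print(f"{model:<14} ${cost:.5f}")
```

Swapping in your own typical token counts gives a quick estimate of what a given task would cost on each model.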
When to Use Premium Models
I don’t cheap out on everything. Premium models earn their keep for:
- Critical production code reviews
- Security-sensitive architecture decisions
- Complex multi-system integrations
- When the cost of getting it wrong exceeds the model cost

A Reddit user noted:
“I’ve had good luck using anthropic models with GitHub as the provider”
This suggests Anthropic models work well in OpenCode with GitHub as the provider. I reserve them for high-stakes decisions.
Common Mistakes to Avoid
Mistake 1: Always Using the Cheapest Model
    Planning: "Design our authentication system"
    → Model: Kimi Mini ($0.01)
    → Result: Superficial design, missed edge cases
    → Cost: $0.01 but 3 hours fixing issues later

The cheapest model isn’t always the most cost-effective. Factor in your time.
Mistake 2: Never Switching Models
    Implementation: "Write the authentication middleware"
    → Model: GLM-5 ($0.20)
    → Cost: 3x more expensive than necessary

If you’re not switching models, you’re leaving money on the table.
Mistake 3: Hardcoding API Keys
    {
      "providers": {
        "minimax": {
          "apiKey": "sk-abc123def456" // DON'T DO THIS
        }
      }
    }

Always use environment variables:

    # In ~/.zshrc or ~/.bashrc
    export KIMI_API_KEY="sk-xxx"
    export MINIMAX_API_KEY="sk-xxx"
    export GLM_API_KEY="sk-xxx"

Model Selection Decision Tree
Here’s the decision process I use:
    [Task received]
        |
        v
    [Is it a quick syntax/lookup question?]
        |-- YES --> Kimi Mini (~$0.01-0.05)
        |-- NO  --> Continue
        |
        v
    [Is it code implementation or debugging?]
        |-- YES --> MiniMax M2.7 (~$0.10-0.20)
        |-- NO  --> Continue
        |
        v
    [Is it architecture or design planning?]
        |-- YES --> GLM-5 (~$0.15-0.30)
        |-- NO  --> Continue
        |
        v
    [Is it critical for production/security?]
        |-- YES --> Claude/GPT-4 (~$0.50-1.00)
        |-- NO  --> MiniMax M2.7 (default)

Configuring Default Models
OpenCode lets me set defaults so I don’t switch manually every time:
    {
      "defaultModel": "minimax",
      "modelDefaults": {
        "quick-query": "kimi-mini",
        "planning": "glm-4",
        "implementation": "minimax-m2.7",
        "review": "claude-opus"
      }
    }

This way, OpenCode automatically selects the right model based on task type.
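The decision tree from earlier boils down to a lookup with a fallback. Here is a minimal Python sketch of that logic, assuming the task type is already known. `TaskType`, `ROUTES`, and `pick_model` are hypothetical names for this example and are not part of OpenCode; the model names come from my routing table.

```python
from enum import Enum


class TaskType(Enum):
    """Task categories from the decision tree (names are illustrative)."""
    QUICK_QUERY = "quick-query"
    IMPLEMENTATION = "implementation"
    PLANNING = "planning"
    CRITICAL_REVIEW = "review"
    OTHER = "other"


# One entry per YES branch of the decision tree.
ROUTES = {
    TaskType.QUICK_QUERY: "Kimi 2.5 Mini",
    TaskType.IMPLEMENTATION: "MiniMax M2.7",
    TaskType.PLANNING: "GLM-5",
    TaskType.CRITICAL_REVIEW: "Claude via GitHub",
}

# The final NO branch of the tree falls back to this model.
DEFAULT_MODEL = "MiniMax M2.7"


def pick_model(task_type):
    """Return the model for a task type, or the default if none matches."""
    return ROUTES.get(task_type, DEFAULT_MODEL)


for task_type in TaskType:
    print(f"{task_type.value:<15} -> {pick_model(task_type)}")
```

Keeping the routes in a plain dictionary means a routing change is a one-line edit, which made it easy to refine the rules between weeks.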
Verification: Are the Savings Real?
I was skeptical at first. Here’s how I verified the cost reduction:
- Week 1: All queries to Claude Opus, tracked costs
- Week 2: Implemented routing, tracked costs
- Week 3: Refined routing rules, tracked costs
| Week | Strategy        | Queries | Cost   | Avg/Query |
|------|-----------------|---------|--------|-----------|
| 1    | All Claude Opus | 127     | $65.40 | $0.51     |
| 2    | Basic routing   | 134     | $18.20 | $0.14     |
| 3    | Refined routing | 142     | $11.80 | $0.08     |

The data confirmed it: routing reduced my average query cost from $0.51 to $0.08.
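The Avg/Query column is simply weekly cost divided by query count. A small Python check using the figures from the table (the variable names are only for this example):

```python
# (week, strategy, queries, total cost in USD), copied from the table above.
weeks = [
    (1, "All Claude Opus", 127, 65.40),
    (2, "Basic routing", 134, 18.20),
    (3, "Refined routing", 142, 11.80),
]


def avg_per_query(queries, cost):
    """Average cost of a single query in USD."""
    return cost / queries


for week, strategy, queries, cost in weeks:
    print(f"Week {week} ({strategy}): ${avg_per_query(queries, cost):.2f} per query")

# Reduction in average query cost from week 1 to week 3.
baseline = avg_per_query(weeks[0][2], weeks[0][3])
final = avg_per_query(weeks[-1][2], weeks[-1][3])
reduction = (baseline - final) / baseline * 100
print(f"Reduction in average cost per query: {reduction:.1f}%")
```

Comparing per-query averages rather than weekly totals matters here because the number of queries differed from week to week.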
In This Post, I…
I showed you how to cut AI coding costs by more than 80% using OpenCode’s model switching capability. The key insights:
- Match model capability to task complexity - Don’t use premium models for simple queries
- Chinese models offer excellent value - GLM-5, MiniMax, and Kimi cost 1-5% of Western premium models
- Set up routing rules - Automate model selection based on task type
- Track your costs - Measure before and after to verify savings
- Reserve premium for critical tasks - Use Claude/GPT-4 for security-sensitive work
The Reddit user who recommended this workflow was right:
“Planning with GLM-5 and implementation with Minimax M2.7 is the way to go.”
That workflow cut my costs by 84% without sacrificing quality on the work that matters.
Final Words + More Resources
My intention with this article was to help others by sharing my knowledge and experience. If you want to get in touch, you can contact me by email.
Here are the most important links from this article, along with some further resources on this topic:
- 👨💻 OpenCode CLI
- 👨💻 MiniMax API
- 👨💻 GLM API
- 👨💻 Kimi API
Oh, and if you found these resources useful, don’t forget to support me by starring the repo on GitHub!