OpenViking Multi-Model Support: Use Claude, GPT-4, Qwen, and Ollama Together

Mar 16, 2026

Problem

I was building an AI agent with OpenViking and wanted to switch between GPT-4o for complex reasoning and DeepSeek for cost-sensitive operations. But OpenViking’s documentation showed only Volcengine (Doubao) and OpenAI configurations.

I needed to know: Can I use Claude? Can I run local models? How do I configure different providers?

What I Found

OpenViking supports three VLM provider types: volcengine, openai, and litellm. The litellm provider is the key: it unlocks access to Claude, DeepSeek, Gemini, Qwen, Ollama, vLLM, and 100+ other models.

Here’s the basic structure in ov.conf:

{
  "vlm": {
    "provider": "litellm",
    "model": "your-model-name",
    "api_key": "your-api-key"
  }
}
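
Before pointing OpenViking at a new provider, it can help to sanity-check the file. The following is a minimal sketch, assuming ov.conf is plain JSON shaped like the snippet above. The helper name check_vlm_section is hypothetical and not part of OpenViking; the list of provider names comes from this article.

```python
import json

# Provider names taken from this article; anything else is treated as a typo.
KNOWN_PROVIDERS = {"volcengine", "openai", "litellm"}

def check_vlm_section(config_text: str) -> list[str]:
    """Return a list of problems found in the "vlm" section (empty if none)."""
    config = json.loads(config_text)
    vlm = config.get("vlm")
    if not isinstance(vlm, dict):
        return ['missing "vlm" section']
    problems = []
    if vlm.get("provider") not in KNOWN_PROVIDERS:
        problems.append(f'unknown provider: {vlm.get("provider")!r}')
    if not vlm.get("model"):
        problems.append('missing "model"')
    return problems

example = '{"vlm": {"provider": "litellm", "model": "your-model-name", "api_key": "your-api-key"}}'
print(check_vlm_section(example))  # prints []
```

The check deliberately does not require api_key, since the local-model configurations later in this post omit it.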

How to Configure Each Provider

OpenAI (GPT-4o)

This is the simplest configuration if you’re already using OpenAI:

{
  "vlm": {
    "provider": "openai",
    "model": "gpt-4o",
    "api_key": "your-openai-api-key",
    "api_base": "https://api.openai.com/v1"
  }
}

I tested this first and it worked immediately.

Anthropic (Claude)

To use Claude, switch the provider to litellm:

{
  "vlm": {
    "provider": "litellm",
    "model": "claude-3-5-sonnet-20240620",
    "api_key": "your-anthropic-api-key"
  }
}

LiteLLM automatically detects that this is an Anthropic model from the model name and routes it to the correct API.

DeepSeek

DeepSeek is a cost-effective alternative for many tasks:

{
  "vlm": {
    "provider": "litellm",
    "model": "deepseek-chat",
    "api_key": "your-deepseek-api-key"
  }
}

I tested DeepSeek for simple extraction tasks and it worked well at a fraction of the cost.

Qwen (DashScope)

For Qwen models through Alibaba’s DashScope:

{
  "vlm": {
    "provider": "litellm",
    "model": "dashscope/qwen-turbo",
    "api_key": "your-dashscope-api-key",
    "api_base": "https://dashscope.aliyuncs.com/compatible-mode/v1"
  }
}

Note the dashscope/ prefix and the custom api_base - these are required for Qwen.

Local Models with Ollama

For privacy-sensitive data or offline use, I configured Ollama:

# Start Ollama server
ollama serve

# Pull a model
ollama pull llama3.1

Then configure OpenViking:

{
  "vlm": {
    "provider": "litellm",
    "model": "ollama/llama3.1",
    "api_base": "http://localhost:11434"
  }
}

No API key needed for local models.

vLLM for Production

For production deployments with self-hosted models:

{
  "vlm": {
    "provider": "litellm",
    "model": "hosted_vllm/llama-3.1-8b",
    "api_base": "http://localhost:8000/v1"
  }
}

Model Selection Strategy

I created a decision table for when to use each model:

Use Case	Model	Why
Complex reasoning	Claude 3.5 Sonnet	Best for nuanced analysis
Production semantic processing	GPT-4o	Reliable, well-tested
Cost-sensitive operations	DeepSeek	10x cheaper than GPT-4
Chinese language tasks	Qwen-turbo	Optimized for Chinese
Privacy-sensitive data	Ollama (local)	Data never leaves your machine
High-volume processing	vLLM	No API rate limits
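
The table can also be kept as data, so a script can look up the right "vlm" section for a task. This is a sketch only: the use-case labels and the function name vlm_for are hypothetical, the model names and endpoints are copied from the configurations above, and API keys are left out on purpose.

```python
# Use case -> "vlm" section, mirroring the decision table.
# The keys are hypothetical labels chosen for this sketch.
VLM_BY_USE_CASE = {
    "complex_reasoning": {"provider": "litellm", "model": "claude-3-5-sonnet-20240620"},
    "production": {"provider": "openai", "model": "gpt-4o"},
    "cost_sensitive": {"provider": "litellm", "model": "deepseek-chat"},
    "chinese": {
        "provider": "litellm",
        "model": "dashscope/qwen-turbo",
        "api_base": "https://dashscope.aliyuncs.com/compatible-mode/v1",
    },
    "private": {
        "provider": "litellm",
        "model": "ollama/llama3.1",
        "api_base": "http://localhost:11434",
    },
    "high_volume": {
        "provider": "litellm",
        "model": "hosted_vllm/llama-3.1-8b",
        "api_base": "http://localhost:8000/v1",
    },
}

def vlm_for(use_case: str) -> dict:
    """Return a copy of the "vlm" section for a use case; raise on unknown labels."""
    try:
        return dict(VLM_BY_USE_CASE[use_case])
    except KeyError:
        raise ValueError(f"unknown use case: {use_case!r}") from None

print(vlm_for("private")["model"])  # prints ollama/llama3.1
```

Returning a copy means a caller can add an api_key to the result without changing the shared table.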

How LiteLLM Works

I was curious why LiteLLM can handle so many providers with minimal configuration. The answer is automatic model detection.

When I specify claude-3-5-sonnet-20240620, LiteLLM:

1. Recognizes this is an Anthropic model
2. Routes to Anthropic’s API
3. Formats the request in Anthropic’s format
4. Returns a standardized response

The same happens for DeepSeek, Gemini, and others. No manual routing needed.

┌──────────────┐     ┌──────────────┐     ┌──────────────┐
│  OpenViking  │────▶│   LiteLLM    │────▶│   Provider   │
│  Request     │     │   Router     │     │   API        │
└──────────────┘     └──────────────┘     └──────────────┘
                            │
                            ▼
                     Model Detection
                     claude-*  -> Anthropic
                     deepseek-* -> DeepSeek
                     ollama/* -> Local
                     gpt-* -> OpenAI
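
To make the idea in the diagram concrete, here is a toy version of prefix-based routing. It is illustrative only: the table is copied from the diagram, the function name guess_provider is hypothetical, and this is not LiteLLM's actual detection logic.

```python
# Illustrative only: a toy prefix table copied from the diagram above.
# This is NOT how LiteLLM is implemented internally.
ROUTES = [
    ("claude-", "Anthropic"),
    ("deepseek-", "DeepSeek"),
    ("ollama/", "Local"),
    ("gpt-", "OpenAI"),
]

def guess_provider(model: str) -> str:
    """Return the first provider whose prefix matches the model name."""
    for prefix, provider in ROUTES:
        if model.startswith(prefix):
            return provider
    return "unknown"

for name in ("claude-3-5-sonnet-20240620", "deepseek-chat", "ollama/llama3.1", "gpt-4o"):
    print(name, "->", guess_provider(name))
```

A name that matches no prefix falls through to "unknown", which is roughly the situation behind a "model not found" style error.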

Embedding Models

OpenViking also supports multiple embedding providers:

{
  "embedding": {
    "dense": {
      "provider": "openai",
      "model": "text-embedding-3-large",
      "dimension": 3072,
      "api_key": "your-openai-key"
    }
  }
}

Supported embedding providers:

volcengine: Doubao embeddings
openai: text-embedding-3-large/small
jina: Jina embeddings
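
One detail worth checking in the embedding block is that "dimension" agrees with the model. The sketch below compares it against the default output sizes of OpenAI's text-embedding-3 models (3072 for large, 1536 for small). The helper name is hypothetical, and whether OpenViking accepts a reduced dimension for these models is not something this check assumes either way.

```python
# Default output sizes of OpenAI's text-embedding-3 models.
DEFAULT_DIMENSIONS = {
    "text-embedding-3-large": 3072,
    "text-embedding-3-small": 1536,
}

def dimension_matches_default(dense: dict) -> bool:
    """Return True if "dimension" equals the model's default output size."""
    expected = DEFAULT_DIMENSIONS.get(dense.get("model"))
    # Models not listed above cannot be checked here, so they pass.
    return expected is None or dense.get("dimension") == expected

print(dimension_matches_default({"model": "text-embedding-3-large", "dimension": 3072}))  # prints True
```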

Full Configuration Example

Here’s my complete ov.conf for a multi-model setup:

{
  "storage": {
    "workspace": "/home/user/openviking_workspace"
  },
  "embedding": {
    "dense": {
      "provider": "openai",
      "model": "text-embedding-3-large",
      "dimension": 3072,
      "api_key": "your-openai-key"
    }
  },
  "vlm": {
    "provider": "litellm",
    "model": "claude-3-5-sonnet-20240620",
    "api_key": "your-anthropic-key"
  }
}

Switching Models at Runtime

I sometimes need to switch models without editing the config file by hand. Here’s the small helper I use to rewrite the vlm section of ov.conf:

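import json
from typing import Optional

def switch_model(provider: str, model: str,
                 api_key: Optional[str] = None,
                 api_base: Optional[str] = None) -> None:
    """Rewrite the "vlm" section of ov.conf for a different model."""
    with open('ov.conf', 'r') as f:
        config = json.load(f)

    vlm = config.setdefault('vlm', {})
    vlm['provider'] = provider
    vlm['model'] = model

    # Set or clear optional fields so values from the previous
    # provider (an old API key or endpoint) do not leak through.
    for field, value in (('api_key', api_key), ('api_base', api_base)):
        if value:
            vlm[field] = value
        else:
            vlm.pop(field, None)

    with open('ov.conf', 'w') as f:
        json.dump(config, f, indent=2)

# Switch to DeepSeek for cost savings
switch_model('litellm', 'deepseek-chat', 'your-deepseek-key')

# Switch to local model for privacy (no API key; Ollama needs its api_base)
switch_model('litellm', 'ollama/llama3.1', api_base='http://localhost:11434')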

Common Issues

Issue 1: Model Not Found

When I first tried gpt-4o with the litellm provider, I got:

Error: Model gpt-4o not found

The fix: Check that the model identifier is spelled exactly as the provider expects. For OpenAI models through LiteLLM, use the model name directly:

{
  "vlm": {
    "provider": "litellm",
    "model": "gpt-4o",
    "api_key": "your-openai-key"
  }
}

Or use the OpenAI provider directly:

{
  "vlm": {
    "provider": "openai",
    "model": "gpt-4o",
    "api_key": "your-openai-key"
  }
}

Issue 2: Ollama Connection Refused

When I tried to use Ollama without starting the server:

Error: Connection refused to localhost:11434

The fix: Make sure Ollama is running:

# Check if Ollama is running
curl http://localhost:11434/api/tags

# If not, start it
ollama serve
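
The same check can be done from Python before OpenViking starts. This is a minimal sketch using only the standard library; it assumes the /api/tags endpoint from the curl command above, and the function name ollama_is_up is hypothetical.

```python
import urllib.request

def ollama_is_up(api_base: str = "http://localhost:11434", timeout: float = 2.0) -> bool:
    """Return True if GET {api_base}/api/tags answers with HTTP 200."""
    try:
        with urllib.request.urlopen(f"{api_base}/api/tags", timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # Connection refused, DNS failures, HTTP error statuses, and
        # timeouts all surface as OSError subclasses here.
        return False

if __name__ == "__main__":
    if not ollama_is_up():
        print("Ollama is not reachable; start it with: ollama serve")
```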

Issue 3: Qwen API Base Missing

When I configured Qwen without the custom API base:

Error: Invalid API endpoint

The fix: Always include the DashScope API base:

{
  "vlm": {
    "provider": "litellm",
    "model": "dashscope/qwen-turbo",
    "api_key": "your-key",
    "api_base": "https://dashscope.aliyuncs.com/compatible-mode/v1"
  }
}
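
A small guard can catch this before the first request. The sketch below flags a "vlm" section whose model uses one of the prefixes that carry an explicit api_base in this article. The prefix list and the function name missing_api_base are assumptions for illustration, not an OpenViking or LiteLLM rule.

```python
# Prefixes whose configurations in this article always include api_base.
NEEDS_API_BASE = ("dashscope/", "ollama/", "hosted_vllm/")

def missing_api_base(vlm: dict) -> bool:
    """Return True if the model prefix calls for api_base but none is set."""
    model = vlm.get("model", "")
    return model.startswith(NEEDS_API_BASE) and not vlm.get("api_base")

print(missing_api_base({"provider": "litellm", "model": "dashscope/qwen-turbo"}))  # prints True
```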

Summary

In this post, I showed how to configure OpenViking to work with multiple LLM providers through LiteLLM. The key point is that LiteLLM acts as a unified interface - you specify the model name, and it handles the provider-specific formatting automatically.

I covered configurations for OpenAI, Claude, DeepSeek, Qwen, Ollama, and vLLM. Each has its use case: GPT-4o for production, Claude for complex reasoning, DeepSeek for cost savings, Qwen for Chinese tasks, and Ollama for privacy.

This flexibility helps protect your agent infrastructure against provider lock-in, cost changes, and availability issues. Switching models means changing only a few lines in your config file: the model name, plus the API key and api_base where the provider needs them.

Final Words + More Resources

My intention with this article was to share my knowledge and experience to help others. If you want to get in touch, you can reach me by email.

Here are also the most important links from this article along with some further resources that will help you in this scope:

👨‍💻 OpenViking GitHub Repository
👨‍💻 LiteLLM Documentation

Oh, and if you found these resources useful, don’t forget to support me by starring the repo on GitHub!