
OpenViking Multi-Model Support: Use Claude, GPT-4, Qwen, and Ollama Together

Problem

I was building an AI agent with OpenViking and wanted to switch between GPT-4o for complex reasoning and DeepSeek for cost-sensitive operations. But OpenViking’s documentation showed only Volcengine (Doubao) and OpenAI configurations.

I needed to know: Can I use Claude? Can I run local models? How do I configure different providers?

What I Found

OpenViking supports three VLM provider types: volcengine, openai, and litellm. The litellm provider is the key - it unlocks access to Claude, DeepSeek, Gemini, Qwen, Ollama, vLLM, and 100+ other models.

Here’s the basic structure in ov.conf:

ov.conf
{
  "vlm": {
    "provider": "litellm",
    "model": "your-model-name",
    "api_key": "your-api-key"
  }
}
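
Before pointing OpenViking at a new provider, it can help to sanity-check the vlm section. The sketch below is my own helper, not part of OpenViking: the function name `check_vlm_section` is hypothetical, and the set of provider names is taken only from the three types listed above.

```python
import json

# Provider types named in this article; treat this set as an assumption
# if your OpenViking version documents others.
KNOWN_PROVIDERS = {"volcengine", "openai", "litellm"}


def check_vlm_section(config: dict) -> list:
    """Return a list of human-readable problems found in config['vlm']."""
    vlm = config.get("vlm")
    if not isinstance(vlm, dict):
        return ["missing 'vlm' section"]
    problems = []
    provider = vlm.get("provider")
    if provider not in KNOWN_PROVIDERS:
        problems.append(f"unknown provider: {provider!r}")
    if not vlm.get("model"):
        problems.append("missing 'model'")
    return problems


sample = json.loads("""
{
  "vlm": {
    "provider": "litellm",
    "model": "your-model-name",
    "api_key": "your-api-key"
  }
}
""")
print(check_vlm_section(sample))  # prints []
```

An empty list means the section has the shape shown above; it says nothing about whether the key or model name is actually valid.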

How to Configure Each Provider

OpenAI (GPT-4o)

This is the simplest configuration if you’re already using OpenAI:

ov.conf
{
  "vlm": {
    "provider": "openai",
    "model": "gpt-4o",
    "api_key": "your-openai-api-key",
    "api_base": "https://api.openai.com/v1"
  }
}

I tested this first and it worked immediately.

Anthropic (Claude)

To use Claude, switch the provider to litellm:

ov.conf
{
  "vlm": {
    "provider": "litellm",
    "model": "claude-3-5-sonnet-20240620",
    "api_key": "your-anthropic-api-key"
  }
}

LiteLLM automatically detects that this is an Anthropic model from the model name and routes it to the correct API.

DeepSeek

DeepSeek is a cost-effective alternative for many tasks:

ov.conf
{
  "vlm": {
    "provider": "litellm",
    "model": "deepseek-chat",
    "api_key": "your-deepseek-api-key"
  }
}

I tested DeepSeek for simple extraction tasks and it worked well at a fraction of the cost.

Qwen (DashScope)

For Qwen models through Alibaba’s DashScope:

ov.conf
{
  "vlm": {
    "provider": "litellm",
    "model": "dashscope/qwen-turbo",
    "api_key": "your-dashscope-api-key",
    "api_base": "https://dashscope.aliyuncs.com/compatible-mode/v1"
  }
}

Note the dashscope/ prefix and the custom api_base - these are required for Qwen.

Local Models with Ollama

For privacy-sensitive data or offline use, I configured Ollama:

Terminal
# Start Ollama server
ollama serve
# Pull a model
ollama pull llama3.1

Then configure OpenViking:

ov.conf
{
  "vlm": {
    "provider": "litellm",
    "model": "ollama/llama3.1",
    "api_base": "http://localhost:11434"
  }
}

No API key needed for local models.

vLLM for Production

For production deployments with self-hosted models:

ov.conf
{
  "vlm": {
    "provider": "litellm",
    "model": "hosted_vllm/llama-3.1-8b",
    "api_base": "http://localhost:8000/v1"
  }
}

Model Selection Strategy

I created a decision table for when to use each model:

| Use Case | Model | Why |
|---|---|---|
| Complex reasoning | Claude 3.5 Sonnet | Best for nuanced analysis |
| Production semantic processing | GPT-4o | Reliable, well-tested |
| Cost-sensitive operations | DeepSeek | 10x cheaper than GPT-4 |
| Chinese language tasks | Qwen-turbo | Optimized for Chinese |
| Privacy-sensitive data | Ollama (local) | Data never leaves your machine |
| High-volume processing | vLLM | No API rate limits |
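
The table can also live in code as a lookup. This is a minimal sketch: the use-case keys (`reasoning`, `cost`, and so on) and the `vlm_for` helper are names I made up for illustration, while the model names and API bases are copied from the configs earlier in this post. API keys are left out on purpose.

```python
# Use case -> vlm settings, copied from the configs in this article.
# The dictionary keys are hypothetical labels, not OpenViking options.
VLM_BY_USE_CASE = {
    "reasoning": {"provider": "litellm", "model": "claude-3-5-sonnet-20240620"},
    "production": {"provider": "openai", "model": "gpt-4o"},
    "cost": {"provider": "litellm", "model": "deepseek-chat"},
    "chinese": {
        "provider": "litellm",
        "model": "dashscope/qwen-turbo",
        "api_base": "https://dashscope.aliyuncs.com/compatible-mode/v1",
    },
    "privacy": {
        "provider": "litellm",
        "model": "ollama/llama3.1",
        "api_base": "http://localhost:11434",
    },
    "high_volume": {
        "provider": "litellm",
        "model": "hosted_vllm/llama-3.1-8b",
        "api_base": "http://localhost:8000/v1",
    },
}


def vlm_for(use_case: str, default: str = "production") -> dict:
    """Return a copy of the vlm settings for a use case, or the default entry."""
    return dict(VLM_BY_USE_CASE.get(use_case, VLM_BY_USE_CASE[default]))


print(vlm_for("cost")["model"])     # prints deepseek-chat
print(vlm_for("unknown")["model"])  # prints gpt-4o
```

Returning a copy keeps callers from accidentally mutating the shared table when they add an api_key.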

How LiteLLM Works

I was curious why LiteLLM can handle so many providers with minimal configuration. The answer is automatic model detection.

When I specify claude-3-5-sonnet-20240620, LiteLLM:

  1. Recognizes this is an Anthropic model
  2. Routes to Anthropic’s API
  3. Formats the request in Anthropic’s format
  4. Returns a standardized response

The same happens for DeepSeek, Gemini, and others. No manual routing needed.

┌──────────────┐     ┌──────────────┐     ┌──────────────┐
│  OpenViking  │────▶│   LiteLLM    │────▶│   Provider   │
│   Request    │     │    Router    │     │     API      │
└──────────────┘     └──────────────┘     └──────────────┘

Model Detection
  claude-*   -> Anthropic
  deepseek-* -> DeepSeek
  ollama/*   -> Local
  gpt-*      -> OpenAI
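
To make the detection step concrete, here is a toy version of prefix-based routing. It only mirrors the four patterns in the diagram above. It is not LiteLLM's actual code, which covers far more providers, and `detect_provider` is a hypothetical name.

```python
# Patterns taken from the diagram above. Real LiteLLM routing is more
# involved; this only illustrates the idea of matching on the model name.
ROUTES = [
    ("ollama/", "Local"),
    ("claude-", "Anthropic"),
    ("deepseek-", "DeepSeek"),
    ("gpt-", "OpenAI"),
]


def detect_provider(model: str) -> str:
    """Return the provider label for the first matching prefix."""
    for prefix, provider in ROUTES:
        if model.startswith(prefix):
            return provider
    raise ValueError(f"no route for model {model!r}")


print(detect_provider("claude-3-5-sonnet-20240620"))  # prints Anthropic
print(detect_provider("ollama/llama3.1"))             # prints Local
```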

Embedding Models

OpenViking also supports multiple embedding providers:

ov.conf
{
  "embedding": {
    "dense": {
      "provider": "openai",
      "model": "text-embedding-3-large",
      "dimension": 3072,
      "api_key": "your-openai-key"
    }
  }
}

Supported embedding providers:

  • volcengine: Doubao embeddings
  • openai: text-embedding-3-large/small
  • jina: Jina embeddings
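
One easy mistake is a dimension value that does not match the embedding model. The sketch below compares the configured value against the default output sizes of the two OpenAI models (3072 for large, 1536 for small). The helper name is hypothetical. OpenAI also allows shortened embeddings, and I have not verified how OpenViking handles that, so treat a mismatch as a prompt to double-check, not as a hard error.

```python
# Default output sizes for the OpenAI text-embedding-3 models.
DEFAULT_DIMS = {
    "text-embedding-3-large": 3072,
    "text-embedding-3-small": 1536,
}


def dimension_matches_default(dense: dict) -> bool:
    """True if 'dimension' equals the model's default size, or the model is unknown."""
    expected = DEFAULT_DIMS.get(dense.get("model"))
    if expected is None:
        return True  # unknown model: nothing to compare against
    return dense.get("dimension") == expected


dense = {"provider": "openai", "model": "text-embedding-3-large", "dimension": 3072}
print(dimension_matches_default(dense))  # prints True
```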

Full Configuration Example

Here’s my complete ov.conf for a multi-model setup:

ov.conf
{
  "storage": {
    "workspace": "/home/user/openviking_workspace"
  },
  "embedding": {
    "dense": {
      "provider": "openai",
      "model": "text-embedding-3-large",
      "dimension": 3072,
      "api_key": "your-openai-key"
    }
  },
  "vlm": {
    "provider": "litellm",
    "model": "claude-3-5-sonnet-20240620",
    "api_key": "your-anthropic-key"
  }
}

Switching Models at Runtime

I sometimes need to switch models without editing the config file by hand. This small script rewrites the vlm section of ov.conf for me:

switch_model.py
import json
from typing import Optional


def switch_model(provider: str, model: str, api_key: Optional[str] = None):
    """Rewrite the vlm provider and model (and optionally the key) in ov.conf."""
    with open('ov.conf', 'r') as f:
        config = json.load(f)

    config['vlm']['provider'] = provider
    config['vlm']['model'] = model
    if api_key:
        config['vlm']['api_key'] = api_key

    with open('ov.conf', 'w') as f:
        json.dump(config, f, indent=2)


# Fields that are not passed (api_key, api_base) keep their previous values.

# Switch to DeepSeek for cost savings
switch_model('litellm', 'deepseek-chat', 'your-deepseek-key')

# Switch to local model for privacy
switch_model('litellm', 'ollama/llama3.1')
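
The script above keeps whatever api_key and api_base the previous provider used, and a crash halfway through the write could leave a truncated ov.conf. Here is a more defensive sketch under my own assumptions: `switch_model_atomic` and its `path` and `api_base` parameters are hypothetical additions. It replaces the whole vlm section and writes through a temporary file. It does not address whether a running OpenViking process picks up the change without a restart.

```python
import json
import os
import tempfile
from typing import Optional


def switch_model_atomic(path: str, provider: str, model: str,
                        api_key: Optional[str] = None,
                        api_base: Optional[str] = None) -> dict:
    """Replace the vlm section of the config at `path` and write it atomically."""
    with open(path, "r", encoding="utf-8") as f:
        config = json.load(f)

    # Build the section from scratch so no stale api_key or api_base survives.
    vlm = {"provider": provider, "model": model}
    if api_key:
        vlm["api_key"] = api_key
    if api_base:
        vlm["api_base"] = api_base
    config["vlm"] = vlm

    # Write to a temp file in the same directory, then rename over the original.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            json.dump(config, tmp, indent=2)
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    return config
```

A call such as `switch_model_atomic('ov.conf', 'litellm', 'ollama/llama3.1', api_base='http://localhost:11434')` would then leave no cloud API key behind in the vlm section, while other sections such as storage and embedding stay untouched.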

Common Issues

Issue 1: Model Not Found

When I first used gpt-4o with the litellm provider, I got:

Error: Model gpt-4o not found

The fix: use the exact model identifier. For OpenAI models through LiteLLM, pass the model name directly, with no provider prefix:

{
  "vlm": {
    "provider": "litellm",
    "model": "gpt-4o",
    "api_key": "your-openai-key"
  }
}

Or use the OpenAI provider directly:

{
  "vlm": {
    "provider": "openai",
    "model": "gpt-4o",
    "api_key": "your-openai-key"
  }
}

Issue 2: Ollama Connection Refused

When I tried to use Ollama without starting the server:

Error: Connection refused to localhost:11434

The fix: Make sure Ollama is running:

Terminal
# Check if Ollama is running
curl http://localhost:11434/api/tags
# If not, start it
ollama serve
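
The same check can run from Python before an agent starts, so a stopped server fails fast with a clear message. This is a sketch using only the standard library. It hits the same /api/tags endpoint as the curl command above, and the function names are my own.

```python
import urllib.error
import urllib.request


def ollama_tags_url(api_base: str) -> str:
    """Build the /api/tags URL from an api_base such as http://localhost:11434."""
    return api_base.rstrip("/") + "/api/tags"


def ollama_is_up(api_base: str = "http://localhost:11434", timeout: float = 2.0) -> bool:
    """Return True if the Ollama server answers /api/tags with HTTP 200."""
    try:
        with urllib.request.urlopen(ollama_tags_url(api_base), timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or an HTTP error status.
        return False


if not ollama_is_up():
    print("Ollama is not reachable; start it with: ollama serve")
```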

Issue 3: Qwen API Base Missing

When I configured Qwen without the custom API base:

Error: Invalid API endpoint

The fix: Always include the DashScope API base:

{
  "vlm": {
    "provider": "litellm",
    "model": "dashscope/qwen-turbo",
    "api_key": "your-key",
    "api_base": "https://dashscope.aliyuncs.com/compatible-mode/v1"
  }
}
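
A small lint can catch this before the first request goes out. In the sketch below, the prefix list comes from the configs in this post. Only the dashscope/ case is stated as required here, so including ollama/ and hosted_vllm/ is my assumption, and `missing_api_base` is a hypothetical helper.

```python
# Model prefixes whose configs in this article carry an api_base.
# Only dashscope/ is described as required; the other two are assumptions.
PREFIXES_WITH_API_BASE = ("dashscope/", "ollama/", "hosted_vllm/")


def missing_api_base(vlm: dict) -> bool:
    """True if the model uses one of the prefixes above but has no api_base."""
    model = vlm.get("model", "")
    return model.startswith(PREFIXES_WITH_API_BASE) and not vlm.get("api_base")


print(missing_api_base({"provider": "litellm", "model": "dashscope/qwen-turbo"}))  # prints True
```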

Summary

In this post, I showed how to configure OpenViking to work with multiple LLM providers through LiteLLM. The key point is that LiteLLM acts as a unified interface - you specify the model name, and it handles the provider-specific formatting automatically.

I covered configurations for OpenAI, Claude, DeepSeek, Qwen, Ollama, and vLLM. Each has its use case: GPT-4o for production, Claude for complex reasoning, DeepSeek for cost savings, Qwen for Chinese tasks, and Ollama for privacy.

This flexibility future-proofs your agent infrastructure against provider lock-in, cost changes, and availability issues. Switching models is a matter of changing a few fields in the vlm section of your config file.

Final Words + More Resources

I wrote this article to share my knowledge and experience with others. If you want to get in touch, you can reach me by email.


Oh, and if you found these resources useful, don’t forget to support me by starring the repo on GitHub!
