LiteLLM SDK vs Proxy: Which Deployment Mode is Safer?

After the March 2026 LiteLLM supply chain attack, I needed to understand a critical question: does deployment mode affect security? Versions 1.82.7 and 1.82.8 contained credential-stealing malware, and the impact varied dramatically based on how teams deployed LiteLLM.

The Direct Answer

Using LiteLLM as a proxy server is safer than running it as an SDK locally. When a malicious package executes in your process (SDK mode), it has full access to your environment, files, and credentials. Proxy mode confines the attack surface to a separate server environment that can be sandboxed and firewalled, and that has limited access to your sensitive data.

What the Original Malware Reporter Said

From the Reddit discussion, the person who originally reported the malware shared this insight:

“fwiw using any of these as a proxy layer will isolate you more from attacks vs running it locally as an SDK”

And when asked about their own setup:

“Unfortunately we were using a mix of both”

This “mix of both” created an inconsistent security posture - some services had full access (SDK), others were isolated (proxy).

Attack Surface Comparison

I compared the attack surfaces between SDK and Proxy modes:

Attack Surface Comparison Table
+------------------+------------------------+---------------------------+
| Aspect           | SDK Mode (Local)       | Proxy Mode (Server)       |
+------------------+------------------------+---------------------------+
| Code execution   | In your process        | Isolated server           |
| File access      | All files the process  | Limited to server files   |
|                  | can read               |                           |
| Environment vars | Full access to all     | Only server env vars      |
| Credentials      | SSH keys, cloud creds, | Only proxy API keys       |
|                  | DB passwords           |                           |
| Network          | Full network access    | Limited to API endpoints  |
| Sandboxing       | Hard to sandbox        | Easy to containerize      |
+------------------+------------------------+---------------------------+

Why Deployment Mode Matters After Supply Chain Attacks

The March 2026 LiteLLM supply chain attack exposed a critical security consideration that many teams overlooked: where the code runs matters as much as what the code does.

When versions 1.82.7 and 1.82.8 were compromised with credential-stealing malware, the impact varied dramatically based on deployment mode:

  1. SDK users: Malicious code ran directly in their application process, harvesting SSH keys, cloud credentials, and database passwords from their local environment.

  2. Proxy users: Malicious code ran in a separate server/container with limited access to sensitive credentials.
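
To check whether an environment currently has one of the two affected releases installed, a minimal sketch along these lines could help. It assumes the package is installed under the distribution name `litellm`, and the helper names are illustrative. It only reports the version installed right now, so it cannot tell you whether a bad release was installed and later upgraded.

```python
from importlib import metadata
from typing import Optional

# The two releases named in this article as compromised.
COMPROMISED_VERSIONS = {"1.82.7", "1.82.8"}


def is_compromised(version: str) -> bool:
    """Return True if the version string matches one of the named releases."""
    return version.strip() in COMPROMISED_VERSIONS


def installed_litellm_version() -> Optional[str]:
    """Return the installed litellm version, or None if it is not installed."""
    try:
        return metadata.version("litellm")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_litellm_version()
    if version is None:
        print("litellm is not installed in this environment")
    elif is_compromised(version):
        print(f"litellm {version} is one of the compromised releases")
    else:
        print(f"litellm {version} is not one of the releases named above")
```

Run it with the same interpreter (virtualenv, container image) that your application uses, since each environment has its own installed packages.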

How SDK Mode Works (Higher Risk)

When you use LiteLLM as a Python SDK, it runs inside your application process:

# SDK Mode - runs IN your process
import os

from litellm import completion, acompletion

# LiteLLM code has access to EVERYTHING your app has access to
os.environ["OPENAI_API_KEY"] = "sk-..."
os.environ["AWS_ACCESS_KEY_ID"] = "AKIA..."  # LiteLLM can read this
os.environ["DATABASE_URL"] = "postgres://..."  # LiteLLM can read this

response = completion(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello"}],
)

Security implications of SDK mode:

  1. Full environment access: Malicious code can read all environment variables
  2. File system access: Can read any file your process can access (SSH keys, configs, secrets)
  3. Network access: Full outbound network access for exfiltration
  4. Process context: Runs with your user permissions
  5. No isolation: Compromises your entire application
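
To get a feel for what an in-process dependency could reach, here is a rough audit sketch. The name fragments and file paths are assumptions chosen for illustration, not an exhaustive list, and the script prints only variable names and file paths, never values.

```python
import os
from pathlib import Path

# Assumed name fragments that often indicate a secret; adjust for your setup.
SECRET_HINTS = ("KEY", "TOKEN", "SECRET", "PASSWORD", "DATABASE_URL")

# Common credential locations relative to the home directory (illustrative).
CREDENTIAL_PATHS = (".ssh/id_rsa", ".ssh/id_ed25519", ".aws/credentials", ".netrc")


def secret_like_env_names(environ):
    """Return sorted names of environment variables that look like secrets."""
    return sorted(
        name for name in environ
        if any(hint in name.upper() for hint in SECRET_HINTS)
    )


def readable_credential_files(home):
    """Return the credential files under `home` that this process can read."""
    return [
        str(home / rel) for rel in CREDENTIAL_PATHS
        if (home / rel).is_file() and os.access(home / rel, os.R_OK)
    ]


if __name__ == "__main__":
    print("Secret-like environment variables:", secret_like_env_names(os.environ))
    print("Readable credential files:", readable_credential_files(Path.home()))
```

Anything this script lists is also visible to every library imported into the same process, which is the point of the list above.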

I found this pattern in a local codebase at runtime/tasks/2492-openviking/todo_repos/input_repo/openviking/models/vlm/backends/litellm_vlm.py:

# Real SDK usage with full environment access
import os

os.environ["LITELLM_LOCAL_MODEL_COST_MAP"] = "True"

import litellm
from litellm import acompletion, completion

# SDK runs in-process with full access to:
# - All environment variables
# - All files the process can read
# - Network access
# - Any credentials in the environment

How Proxy Mode Works (Lower Risk)

When you use LiteLLM as a proxy server, it runs in an isolated environment:

# Proxy Mode - runs in SEPARATE process/container
import openai

client = openai.OpenAI(
    base_url="http://litellm-proxy:4000",  # Separate server
    api_key="sk-litellm-proxy-key",  # Only the proxy key
)

# Your sensitive credentials stay in YOUR environment
# The proxy only sees the API key you send it
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello"}],
)

Security benefits of proxy mode:

  1. Credential isolation: Proxy only sees the API key you send, not your SSH keys or DB credentials
  2. Network isolation: Can firewall the proxy to only access LLM APIs
  3. Containerization: Easy to run in Docker with limited capabilities
  4. Least privilege: Proxy can run with minimal permissions
  5. Blast radius: Compromise is limited to the proxy environment
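
The credential-isolation and least-privilege points can be sketched as an allow-list applied to the environment before the proxy is started. This is a minimal illustration. The allow-listed names are assumptions, and filtering the environment does nothing about files, so it complements a container rather than replacing one.

```python
import os

# Assumption: the proxy needs only provider keys plus PATH to start.
ALLOWED_NAMES = frozenset({"OPENAI_API_KEY", "ANTHROPIC_API_KEY", "PATH"})


def build_proxy_env(environ, allowed=ALLOWED_NAMES):
    """Return a new dict containing only the allow-listed variables."""
    return {name: value for name, value in environ.items() if name in allowed}


if __name__ == "__main__":
    proxy_env = build_proxy_env(os.environ)
    # Print names only, so no secret values end up in logs.
    print("Variables passed to the proxy:", sorted(proxy_env))
```

The resulting dict could then be passed as `env=` to `subprocess.run` when launching the proxy, for example with `["litellm", "--config", "litellm-config.yaml"]`, so that AWS keys or database URLs in the parent shell are not inherited.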

Attack Surface Diagram

I created a visual comparison of both modes:

SDK Mode Attack Surface (Higher Risk)
+--------------------------------------------------+
| YOUR APPLICATION PROCESS                         |
| +----------------------------------------------+ |
| | LITELLM SDK (potentially malicious)          | |
| |                                              | |
| | Has access to:                               | |
| | - ALL environment variables                  | |
| | - ALL files (SSH keys, configs, secrets)     | |
| | - Full network access                        | |
| +----------------------------------------------+ |
|                                                  |
| Your AWS keys, DB passwords, SSH keys            |
+--------------------------------------------------+

Proxy Mode Attack Surface (Lower Risk)
+----------------------------+    +----------------------------+
| YOUR APPLICATION PROCESS   |    | LITELLM PROXY (isolated)   |
|                            |    |                            |
| AWS keys, DB passwords,    |    | Limited to:                |
| SSH keys, secrets          |    | - Proxy API key            |
|                            |    | - LLM provider endpoints   |
| Only sends: proxy API key  |--->|                            |
+----------------------------+    +----------------------------+
                Isolated by network/firewall

Why This Matters for MCP Servers

The original reporter’s MCP (Model Context Protocol) server depended on LiteLLM. This is particularly relevant because:

  1. MCP servers run locally, alongside your development environment
  2. They often have access to your code, your file system, and potentially your credentials
  3. An SDK dependency means that malicious code in LiteLLM runs with full MCP server access

If your MCP server uses LiteLLM SDK mode, a compromised package can:

  • Read your source code
  • Access your .env files
  • Steal API keys from your environment
  • Exfiltrate data through the MCP server

Migration from SDK to Proxy

Here’s how I would migrate from SDK to proxy mode:

# Before: SDK mode (runs in your process)
import os

from litellm import completion

os.environ["OPENAI_API_KEY"] = "sk-..."
os.environ["AWS_ACCESS_KEY_ID"] = "AKIA..."  # Exposed to LiteLLM!
response = completion(model="gpt-4", messages=[...])

# After: Proxy mode (runs in isolated container)
import openai

# Only proxy key is exposed
client = openai.OpenAI(
    base_url="http://litellm-proxy:4000",
    api_key="sk-proxy-key",
)
response = client.chat.completions.create(model="gpt-4", messages=[...])
# Your AWS keys, DB passwords, SSH keys stay in YOUR environment

Proxy Server Configuration

I recommend this configuration for running LiteLLM as an isolated proxy:

# litellm-config.yaml
model_list:
  - model_name: gpt-4
    litellm_params:
      model: openai/gpt-4
      api_key: os.environ/OPENAI_API_KEY  # Only this key is in proxy env
general_settings:
  master_key: "sk-your-proxy-master-key"

# docker-compose.yml for isolated deployment
version: "3.8"
services:
  litellm-proxy:
    image: ghcr.io/berriai/litellm:main-latest
    command: ["--config", "/app/config.yaml"]  # Load the mounted config
    ports:
      - "4000:4000"
    environment:
      - OPENAI_API_KEY=${OPENAI_API_KEY}  # Only LLM key
      # NO AWS keys, DB passwords, or SSH keys here
    volumes:
      - ./litellm-config.yaml:/app/config.yaml
    # Network isolation - only access to LLM APIs
    networks:
      - llm-network
# The network referenced above must be declared at the top level
networks:
  llm-network:
    driver: bridge

Common Mistakes to Avoid

Mistake 1: Using “Mix of Both” (Like the Original Reporter)

This creates an inconsistent security posture: some services have full access (SDK) while others are isolated (proxy). Choose one architecture and apply it consistently.
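
One way to find leftover SDK usage when standardizing on the proxy is to scan the codebase for in-process `litellm` imports. The sketch below is an illustrative helper, not part of LiteLLM. It uses Python's `ast` module, so it only sees static imports and will miss dynamic ones such as `importlib.import_module("litellm")`.

```python
import ast
from pathlib import Path


def imports_litellm(source: str) -> bool:
    """Return True if the Python source statically imports the litellm package."""
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            if any(alias.name.split(".")[0] == "litellm" for alias in node.names):
                return True
        elif isinstance(node, ast.ImportFrom):
            # level == 0 excludes relative imports such as `from .litellm import x`
            if node.level == 0 and (node.module or "").split(".")[0] == "litellm":
                return True
    return False


def find_sdk_usage(root: Path):
    """Return paths of .py files under `root` that import litellm in-process."""
    hits = []
    for path in sorted(root.rglob("*.py")):
        try:
            if imports_litellm(path.read_text(encoding="utf-8")):
                hits.append(path)
        except (SyntaxError, UnicodeDecodeError, ValueError):
            continue  # skip files that cannot be read or parsed
    return hits


if __name__ == "__main__":
    for hit in find_sdk_usage(Path(".")):
        print(hit)
```

Each file it reports is a place where LiteLLM still runs with the full privileges of the importing service.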

Mistake 2: Running Proxy with Full Credentials

# WRONG: Proxy has access to all your secrets
environment:
  - OPENAI_API_KEY=${OPENAI_API_KEY}
  - AWS_ACCESS_KEY_ID=${AWS_ACCESS_KEY_ID}  # Don't do this!
  - DATABASE_URL=${DATABASE_URL}  # Don't do this!

# CORRECT: Proxy only has LLM API keys
environment:
  - OPENAI_API_KEY=${OPENAI_API_KEY}
  - ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY}

Mistake 3: Not Network-Isolating the Proxy

# WRONG: Proxy has full network access
network_mode: "host"

# CORRECT: Proxy only accesses LLM APIs
networks:
  llm-network:
    driver: bridge
# Configure firewall rules to only allow:
# - api.openai.com
# - api.anthropic.com
# - generativelanguage.googleapis.com
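
The allow-list in those comments can be written down as a small check, for example to validate a list of configured provider URLs or to unit-test an egress policy. This is only a sketch of the matching logic with an illustrative function name. Real enforcement has to live outside the proxy process (firewall or egress gateway), because compromised code can simply skip an in-process check.

```python
from urllib.parse import urlsplit

# Hosts taken from the firewall comments above.
ALLOWED_HOSTS = frozenset({
    "api.openai.com",
    "api.anthropic.com",
    "generativelanguage.googleapis.com",
})


def is_allowed_egress(url, allowed=ALLOWED_HOSTS):
    """Return True only for HTTPS URLs whose exact hostname is allow-listed."""
    parts = urlsplit(url)
    return parts.scheme == "https" and parts.hostname in allowed


if __name__ == "__main__":
    for candidate in (
        "https://api.openai.com/v1/chat/completions",
        "https://api.openai.com.evil.example/v1",
    ):
        print(candidate, is_allowed_egress(candidate))
```

Matching on the exact hostname, rather than on a prefix or substring, is what keeps lookalike domains from slipping through.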

Mistake 4: Skipping Sandbox/Containerization

Always run the proxy in a container with:

  • Read-only filesystem where possible
  • No privilege escalation
  • Limited capabilities
  • Network egress filtering
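
As a rough sketch, the first three items map onto standard `docker run` flags. The helper below only builds the argument list. The function name is hypothetical, and whether the LiteLLM image starts cleanly with a read-only root filesystem is an assumption you would need to verify. Egress filtering is not covered by these flags and still needs firewall rules.

```python
def hardened_docker_args(image, network, config_path):
    """Build a `docker run` argument list with the restrictions listed above."""
    return [
        "docker", "run", "--rm",
        "--read-only",                          # read-only root filesystem
        "--tmpfs", "/tmp",                      # writable scratch space only
        "--security-opt", "no-new-privileges",  # no privilege escalation
        "--cap-drop", "ALL",                    # drop all Linux capabilities
        "--network", network,                   # attach to a dedicated network
        "-v", f"{config_path}:/app/config.yaml:ro",
        "-e", "OPENAI_API_KEY",                 # pass through only the LLM key
        image,
        "--config", "/app/config.yaml",
    ]


if __name__ == "__main__":
    args = hardened_docker_args(
        "ghcr.io/berriai/litellm:main-latest",
        "llm-network",
        "/srv/litellm/litellm-config.yaml",  # hypothetical absolute path
    )
    print(" ".join(args))
```

Building the command as a list, rather than a shell string, lets you pass it to `subprocess.run` without shell quoting issues.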

Summary

For security-conscious teams, LiteLLM Proxy mode is the safer choice.

Final Comparison
+----------------+--------------------+---------------------------+
| Factor         | SDK Mode           | Proxy Mode                |
+----------------+--------------------+---------------------------+
| Attack surface | Your entire env    | Isolated server           |
| Credential     | All secrets        | Only LLM API keys         |
| exposure       |                    |                           |
| Blast radius   | Full compromise    | Limited to proxy          |
| Isolation      | None               | Container/network         |
| Remediation    | Rotate ALL secrets | Rotate proxy-held keys    |
+----------------+--------------------+---------------------------+

The supply chain attack proved this principle: code running in your process can access everything you can access. By moving LiteLLM to a proxy server, you limit the damage of any future compromise to just the LLM API keys the proxy manages.

Recommendations

  1. Immediate: If you ran LiteLLM in SDK mode, rotate every credential that was reachable from that environment (environment variables, key files, cloud credentials)
  2. Short-term: Deploy LiteLLM as a proxy in a containerized, network-isolated environment
  3. Long-term: Consider a Go-based alternative (like Bifrost) for compiled-binary security benefits

Final Words + More Resources

My intention with this article was to share my knowledge and experience with others. If you want to contact me, you can reach me by email.
