LiteLLM SDK vs Proxy: Which Deployment Mode is Safer?

Mar 26, 2026

After the March 2026 LiteLLM supply chain attack, I needed to understand a critical question: does deployment mode affect security? Versions 1.82.7 and 1.82.8 contained credential-stealing malware, and the impact varied dramatically based on how teams deployed LiteLLM.

The Direct Answer

Using LiteLLM as a proxy server is generally safer than running it as an in-process SDK. When a malicious package executes in your process (SDK mode), it has full access to that process's environment, files, and credentials. Proxy mode confines the attack surface to a separate server environment that can be sandboxed and firewalled, and that has limited access to your sensitive data.

What the Original Malware Reporter Said

From the Reddit discussion, the person who originally reported the malware shared this insight:

“fwiw using any of these as a proxy layer will isolate you more from attacks vs running it locally as an SDK”

And when asked about their own setup:

“Unfortunately we were using a mix of both”

This “mix of both” created an inconsistent security posture: some services had full access (SDK), while others were isolated (proxy).

Attack Surface Comparison

I compared the attack surfaces between SDK and Proxy modes:

+------------------+------------------------+---------------------------+
| Aspect           | SDK Mode (Local)       | Proxy Mode (Server)       |
+------------------+------------------------+---------------------------+
| Code execution   | In your process        | Isolated server           |
| File access      | Files process can read | Limited to server files   |
| Environment vars | Full access to all     | Only server env vars      |
| Credentials      | SSH keys, cloud creds, | Only LLM and proxy keys   |
|                  | DB passwords           |                           |
| Network          | Full network access    | Limited to API endpoints  |
| Sandboxing       | Hard to sandbox        | Easy to containerize      |
+------------------+------------------------+---------------------------+

Why Deployment Mode Matters After Supply Chain Attacks

The March 2026 LiteLLM supply chain attack exposed a critical security consideration that many teams overlooked: where the code runs matters as much as what the code does.

When versions 1.82.7 and 1.82.8 were compromised with credential-stealing malware, the impact varied dramatically based on deployment mode:

SDK users: Malicious code ran directly in their application process, harvesting SSH keys, cloud credentials, and database passwords from their local environment.
Proxy users: Malicious code ran in a separate server/container with limited access to sensitive credentials.
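
A quick first check is whether an environment has one of the two flagged releases installed. The sketch below reads the package metadata, which does not import LiteLLM itself. The helper names are my own, and a clean result says nothing about whether a flagged version ran earlier and was later upgraded.

```python
from importlib import metadata
from typing import Optional

# Versions named in this article as containing the credential stealer.
FLAGGED_VERSIONS = frozenset({"1.82.7", "1.82.8"})


def installed_litellm_version() -> Optional[str]:
    """Return the installed litellm version without importing the package."""
    try:
        return metadata.version("litellm")
    except metadata.PackageNotFoundError:
        return None


def is_flagged(version: Optional[str]) -> bool:
    """True if the version string is one of the flagged releases."""
    return version in FLAGGED_VERSIONS


if __name__ == "__main__":
    version = installed_litellm_version()
    if version is None:
        print("litellm is not installed in this environment")
    elif is_flagged(version):
        print(f"litellm {version} is a flagged release: treat this environment as compromised")
    else:
        print(f"litellm {version} is not one of the flagged releases")
```

Run it once per virtual environment or container image, since each one has its own installed packages.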

How SDK Mode Works (Higher Risk)

When you use LiteLLM as a Python SDK, it runs inside your application process:

from litellm import completion, acompletion
import os

# LiteLLM code has access to EVERYTHING your app has access to
os.environ["OPENAI_API_KEY"] = "sk-..."
os.environ["AWS_ACCESS_KEY_ID"] = "AKIA..."  # LiteLLM can read this
os.environ["DATABASE_URL"] = "postgres://..."  # LiteLLM can read this

response = completion(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello"}]
)

Security implications of SDK mode:

Full environment access: Malicious code can read all environment variables
File system access: Can read any file your process can access (SSH keys, configs, secrets)
Network access: Full outbound network access for exfiltration
Process context: Runs with your user permissions
No isolation: Compromises your entire application
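
To make the list above concrete, here is a small audit sketch you can run inside your own application. It reports which secret-looking variable names and credential files are reachable from the process, which is the same view any imported dependency gets. The hint substrings and candidate file paths are illustrative assumptions, not a complete list, and the script prints names only, never values.

```python
import os
from pathlib import Path
from typing import Iterable, List, Mapping

# Substrings that commonly appear in secret-bearing variable names (illustrative).
SECRET_HINTS = ("KEY", "TOKEN", "SECRET", "PASSWORD", "DATABASE_URL")

# Credential files often readable by a developer's user account (illustrative).
CANDIDATE_FILES = ("~/.ssh/id_rsa", "~/.ssh/id_ed25519", "~/.aws/credentials", ".env")


def secret_like_names(environ: Mapping[str, str]) -> List[str]:
    """Return the sorted names (never the values) of variables that look like secrets."""
    return sorted(
        name for name in environ
        if any(hint in name.upper() for hint in SECRET_HINTS)
    )


def readable_files(paths: Iterable[str]) -> List[str]:
    """Return the subset of paths that exist and are readable by this process."""
    found = []
    for raw in paths:
        path = Path(raw).expanduser()
        if path.is_file() and os.access(path, os.R_OK):
            found.append(str(path))
    return found


if __name__ == "__main__":
    print("Secret-like env vars visible in-process:", secret_like_names(os.environ))
    print("Credential files readable in-process:", readable_files(CANDIDATE_FILES))
```

Anything this prints is something you would have to rotate if an in-process dependency turned out to be malicious.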

I found this pattern in a local codebase at runtime/tasks/2492-openviking/todo_repos/input_repo/openviking/models/vlm/backends/litellm_vlm.py:

import os
os.environ["LITELLM_LOCAL_MODEL_COST_MAP"] = "True"

import litellm
from litellm import acompletion, completion

# SDK runs in-process with full access to:
# - All environment variables
# - All files the process can read
# - Network access
# - Any credentials in the environment

How Proxy Mode Works (Lower Risk)

When you use LiteLLM as a proxy server, it runs in an isolated environment:

import openai

client = openai.OpenAI(
    base_url="http://litellm-proxy:4000",  # Separate server
    api_key="sk-litellm-proxy-key"  # Only the proxy key
)

# Your sensitive credentials stay in YOUR environment
# The proxy only sees the proxy key and the request payload you send it
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello"}]
)

Security benefits of proxy mode:

Credential isolation: Proxy only sees the API key you send, not your SSH keys or DB credentials
Network isolation: Can firewall the proxy to only access LLM APIs
Containerization: Easy to run in Docker with limited capabilities
Least privilege: Proxy can run with minimal permissions
Blast radius: Compromise is limited to the proxy environment
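
The credential-isolation and least-privilege points come down to one idea: the proxy process should start with only the variables it needs. The sketch below shows that idea at the process level with an allowlist. The allowlist contents and function names are assumptions for illustration. Process-level filtering is weaker than a container because the child still shares your filesystem, so I treat it as a complement to containerization, not a replacement.

```python
import subprocess
from typing import Dict, Iterable, List, Mapping

# Variables the proxy is assumed to need; adjust for your providers.
PROXY_ALLOWLIST = ("OPENAI_API_KEY", "ANTHROPIC_API_KEY", "PATH")


def minimal_env(environ: Mapping[str, str], allowlist: Iterable[str]) -> Dict[str, str]:
    """Copy only the allowlisted variables that are actually set."""
    return {name: environ[name] for name in allowlist if name in environ}


def run_with_minimal_env(command: List[str], environ: Mapping[str, str],
                         allowlist: Iterable[str] = PROXY_ALLOWLIST) -> subprocess.CompletedProcess:
    """Run a command so that it sees only the allowlisted variables."""
    return subprocess.run(command, env=minimal_env(environ, allowlist), check=True)


if __name__ == "__main__":
    parent = {
        "OPENAI_API_KEY": "sk-placeholder",
        "AWS_ACCESS_KEY_ID": "AKIA-placeholder",
        "DATABASE_URL": "postgres://placeholder",
    }
    # Only the provider key survives; the AWS and database entries are dropped.
    print(minimal_env(parent, PROXY_ALLOWLIST))
```

In practice you would pass `os.environ` and the command that starts the proxy to `run_with_minimal_env`.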

Attack Surface Diagram

I created a visual comparison of both modes. The first diagram shows SDK mode; the second shows proxy mode:

+--------------------------------------------------+
| YOUR APPLICATION PROCESS                         |
| +----------------------------------------------+ |
| | LITELLM SDK (potentially malicious)          | |
| |                                              | |
| | Has access to:                               | |
| | - ALL environment variables                  | |
| | - ALL files (SSH keys, configs, secrets)     | |
| | - Full network access                        | |
| +----------------------------------------------+ |
|                                                  |
| Your AWS keys, DB passwords, SSH keys            |
+--------------------------------------------------+

+----------------------------+    +----------------------------+
| YOUR APPLICATION PROCESS   |    | LITELLM PROXY (isolated)   |
|                            |    |                            |
| AWS keys, DB passwords,    |    | Limited to:                |
| SSH keys, secrets          |    | - Proxy API key            |
|                            |    | - LLM provider endpoints   |
| Only sends: proxy API key  |--->|                            |
+----------------------------+    +----------------------------+
        Isolated by network/firewall

Why This Matters for MCP Servers

The original reporter’s MCP (Model Context Protocol) server depended on LiteLLM. This is particularly relevant because:

MCP servers run locally with your development environment
They often have access to: Your code, file system, and potentially credentials
SDK dependency means: Malicious code in LiteLLM runs with full MCP server access

If your MCP server uses LiteLLM SDK mode, a compromised package can:

Read your source code
Access your .env files
Steal API keys from your environment
Exfiltrate data through the MCP server

Migration from SDK to Proxy

Here’s how I would migrate from SDK to proxy mode:

Before (SDK mode):

from litellm import completion
import os

os.environ["OPENAI_API_KEY"] = "sk-..."
os.environ["AWS_ACCESS_KEY_ID"] = "AKIA..."  # Exposed to LiteLLM!

response = completion(model="gpt-4", messages=[...])

After (proxy mode):

import openai

# Only proxy key is exposed
client = openai.OpenAI(
    base_url="http://litellm-proxy:4000",
    api_key="sk-proxy-key"
)

response = client.chat.completions.create(model="gpt-4", messages=[...])
# Your AWS keys, DB passwords, SSH keys stay in YOUR environment
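
Because the proxy speaks an OpenAI-compatible HTTP API, the application does not need LiteLLM in its dependency tree at all after the migration. As a sketch of that, the following builds the same chat request with only the standard library. The hostname and key are the placeholders used above, the function names are mine, and I am assuming the proxy serves `/chat/completions` under the base URL, which is the path the OpenAI client in the example would call.

```python
import json
import urllib.request
from typing import Dict, List


def build_chat_request(base_url: str, proxy_key: str, model: str,
                       messages: List[Dict[str, str]]) -> urllib.request.Request:
    """Build (but do not send) a POST to the proxy's OpenAI-compatible chat endpoint."""
    body = json.dumps({"model": model, "messages": messages}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/chat/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {proxy_key}",  # the only credential the app holds
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send_chat_request(request: urllib.request.Request, timeout: float = 30.0) -> str:
    """Send the request and return the first choice's text (needs a reachable proxy)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        payload = json.load(response)
    return payload["choices"][0]["message"]["content"]
```

Usage would look like `send_chat_request(build_chat_request("http://litellm-proxy:4000", "sk-proxy-key", "gpt-4", [{"role": "user", "content": "Hello"}]))`.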

Proxy Server Configuration

I recommend this configuration for running LiteLLM as an isolated proxy:

litellm-config.yaml:

model_list:
  - model_name: gpt-4
    litellm_params:
      model: openai/gpt-4
      api_key: os.environ/OPENAI_API_KEY  # Only this key is in proxy env

general_settings:
  master_key: "sk-your-proxy-master-key"

docker-compose.yml:

version: "3.8"
services:
  litellm-proxy:
    image: ghcr.io/berriai/litellm:main-latest  # Consider pinning a vetted version or digest
    command: ["--config", "/app/config.yaml"]  # Load the mounted config file
    ports:
      - "4000:4000"
    environment:
      - OPENAI_API_KEY=${OPENAI_API_KEY}  # Only LLM key
      # NO AWS keys, DB passwords, or SSH keys here
    volumes:
      - ./litellm-config.yaml:/app/config.yaml:ro
    # Dedicated network; egress rules limiting it to LLM APIs are enforced by your firewall
    networks:
      - llm-network

networks:
  llm-network:
    driver: bridge

Common Mistakes to Avoid

Mistake 1: Using “Mix of Both” (Like the Original Reporter)

This creates an inconsistent security posture: some services have full access (SDK), while others are isolated (proxy). Choose one architecture and apply it consistently.
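
To find where a codebase still uses the SDK directly, a simple import scan is a reasonable starting point. The sketch below parses Python files and flags direct `litellm` imports. The function names are mine, and it does not catch dynamic imports or third-party packages that pull in LiteLLM transitively, so I would pair it with a look at the lockfile.

```python
import ast
from pathlib import Path
from typing import Iterator


def imports_litellm(source: str) -> bool:
    """True if the Python source imports litellm or one of its submodules."""
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            if any(alias.name.split(".")[0] == "litellm" for alias in node.names):
                return True
        elif isinstance(node, ast.ImportFrom):
            # level > 0 is a relative import; node.module is None for "from . import x".
            if node.level == 0 and node.module and node.module.split(".")[0] == "litellm":
                return True
    return False


def files_using_sdk(root: str) -> Iterator[Path]:
    """Yield .py files under root that import litellm directly."""
    for path in Path(root).rglob("*.py"):
        try:
            if imports_litellm(path.read_text(encoding="utf-8")):
                yield path
        except (SyntaxError, UnicodeDecodeError, ValueError):
            continue  # skip files that cannot be parsed as Python


if __name__ == "__main__":
    for hit in files_using_sdk("."):
        print(hit)
```

Every file it prints is a service that still runs LiteLLM in-process.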

Mistake 2: Running Proxy with Full Credentials

Wrong:

environment:
  - OPENAI_API_KEY=${OPENAI_API_KEY}
  - AWS_ACCESS_KEY_ID=${AWS_ACCESS_KEY_ID}  # Don't do this!
  - DATABASE_URL=${DATABASE_URL}            # Don't do this!

Right:

environment:
  - OPENAI_API_KEY=${OPENAI_API_KEY}
  - ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY}
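
This mistake is easy to catch automatically. As a sketch, the check below takes the `environment:` entries from a compose service and reports anything outside an allowlist of provider keys. The allowlist and function name are assumptions for illustration; loading the entries from the YAML file is left out to keep the example dependency-free.

```python
from typing import Iterable, List

# Provider keys the proxy is expected to hold; an assumption for this sketch.
ALLOWED_PROXY_VARS = frozenset({"OPENAI_API_KEY", "ANTHROPIC_API_KEY"})


def unexpected_vars(entries: Iterable[str],
                    allowed: Iterable[str] = ALLOWED_PROXY_VARS) -> List[str]:
    """Return variable names from 'NAME=value' (or bare 'NAME') entries not in the allowlist."""
    allowed_set = set(allowed)
    names = (entry.split("=", 1)[0].strip() for entry in entries)
    return [name for name in names if name and name not in allowed_set]


if __name__ == "__main__":
    compose_env = [
        "OPENAI_API_KEY=${OPENAI_API_KEY}",
        "AWS_ACCESS_KEY_ID=${AWS_ACCESS_KEY_ID}",
        "DATABASE_URL=${DATABASE_URL}",
    ]
    print("Variables that should not be in the proxy:", unexpected_vars(compose_env))
```

Wired into CI, a non-empty result can fail the build before the extra credentials ever reach the proxy.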

Mistake 3: Not Network-Isolating the Proxy

Wrong:

network_mode: "host"

Right:

networks:
  llm-network:
    driver: bridge
    # Configure firewall rules to only allow:
    # - api.openai.com
    # - api.anthropic.com
    # - generativelanguage.googleapis.com
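
The allowlist in those comments can be written down as a small policy function, which is handy for unit-testing firewall or egress-proxy rules. This is only a sketch of the matching logic with a hypothetical function name. Real enforcement has to live outside the proxy process, since compromised code cannot be trusted to police itself.

```python
from typing import Iterable
from urllib.parse import urlsplit

# Hosts taken from the firewall comment above.
ALLOWED_HOSTS = frozenset({
    "api.openai.com",
    "api.anthropic.com",
    "generativelanguage.googleapis.com",
})


def egress_allowed(url: str, allowed_hosts: Iterable[str] = ALLOWED_HOSTS) -> bool:
    """True only for HTTPS URLs whose exact hostname is on the allowlist."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    return parts.scheme == "https" and host in set(allowed_hosts)


if __name__ == "__main__":
    for candidate in (
        "https://api.openai.com/v1/chat/completions",
        "https://api.openai.com.evil.example/upload",
        "https://attacker.example/exfil",
    ):
        print(candidate, "->", "allow" if egress_allowed(candidate) else "deny")
```

Exact hostname matching is deliberate: a suffix or substring match would let a lookalike domain through.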

Mistake 4: Skipping Sandbox/Containerization

Always run the proxy in a container with:

Read-only filesystem where possible
No privilege escalation
Limited capabilities
Network egress filtering
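
Here is how the first three items could map onto `docker run` flags, built as an argument list so it can be reviewed and tested. This is a sketch under assumptions: I have not verified that the proxy runs cleanly with a read-only root filesystem, so the tmpfs mount is a guess at what it may need, and the network name is the one from the compose example. Egress filtering itself has to be configured on that network or at the firewall.

```python
from pathlib import Path
from typing import List


def hardened_docker_args(image: str, config_path: str,
                         network: str = "llm-network") -> List[str]:
    """Build a `docker run` argument list with a restricted container profile."""
    config = Path(config_path).resolve()  # bind mounts are safest with absolute paths
    return [
        "docker", "run", "--rm",
        "--read-only",                          # read-only root filesystem
        "--tmpfs", "/tmp",                      # in-memory scratch space (assumed to be needed)
        "--security-opt", "no-new-privileges",  # block privilege escalation
        "--cap-drop", "ALL",                    # drop all Linux capabilities
        "--network", network,                   # dedicated network; filter its egress separately
        "-v", f"{config}:/app/config.yaml:ro",  # mount the config read-only
        "-p", "4000:4000",
        image,
        "--config", "/app/config.yaml",
    ]


if __name__ == "__main__":
    print(" ".join(hardened_docker_args(
        "ghcr.io/berriai/litellm:main-latest", "./litellm-config.yaml")))
```

The network must already exist (for example via `docker network create llm-network`) before the command is run.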

Summary

For security-conscious teams, LiteLLM Proxy mode is the safer choice.

+----------------+-------------------+----------------------+
| Factor         | SDK Mode          | Proxy Mode           |
+----------------+-------------------+----------------------+
| Attack surface | Your entire env   | Isolated server      |
| Credential     | All secrets       | Only LLM API keys    |
| exposure       |                   |                      |
| Blast radius   | Full compromise   | Limited to proxy     |
| Isolation      | None              | Container/network    |
| Remediation    | Rotate ALL secrets| Rotate proxy+LLM keys|
+----------------+-------------------+----------------------+

The supply chain attack proved this principle: code running in your process can access everything you can access. By moving LiteLLM to a proxy server, you limit the damage of any future compromise to just the LLM API keys the proxy manages.

Recommendations

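Immediate: If you ran version 1.82.7 or 1.82.8 in SDK mode, rotate every credential that was reachable from that process (environment variables, SSH keys, cloud credentials, database passwords)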
Short-term: Deploy LiteLLM as a proxy in a containerized, network-isolated environment
Long-term: Consider a Go-based alternative (like Bifrost) for compiled-binary security benefits

Final Words + More Resources

My intention with this article was to help others by sharing my knowledge and experience. If you want to contact me, you can reach me by email.

Here are also the most important links from this article along with some further resources that will help you in this scope:

👨‍💻 Reddit Discussion: LiteLLM Alternatives
👨‍💻 No Prompt Injection Required - Attack Analysis
👨‍💻 LiteLLM Documentation

Oh, and if you found these resources useful, don’t forget to support me by starring the repo on GitHub!