
How to Secure AI Agents in Production When Deploying OpenClaw and Similar Systems

I deployed my first AI agent to production last year, and within hours I realized I’d made a critical mistake. The agent had access to everything—my email, my calendar, my file system—and while it hadn’t done anything wrong yet, the potential for disaster was keeping me up at night.

When you’re running AI agents like OpenClaw in production, security isn’t optional. It’s the difference between a powerful automation tool and a data breach waiting to happen. Here’s what I’ve learned about securing AI agents the right way.

The Core Problem: AI Agents Are Powerful and Unpredictable

AI agents can execute commands, access files, and make decisions on your behalf. That’s their strength, but it’s also their danger. A compromised or misbehaving agent can cause real damage—and the more capable the model, the more potential for both good and harm.

The OpenClaw documentation makes this clear: weaker, over-quantized models are more vulnerable to prompt injection and unsafe behavior. This isn’t theoretical. Smaller models may miss subtle adversarial prompts or fail to recognize unsafe operations.

Security Best Practice 1: Use Strong, Latest-Generation Models for Security-Sensitive Agents

I learned this the hard way. I initially tried a smaller, cheaper model for my production agent to save on costs. Bad idea. The model occasionally misinterpreted my instructions and once nearly deleted the wrong directory.

Model Selection in OpenClaw Config
# Recommended: Use the strongest available model for security-sensitive operations
ANTHROPIC_MODEL=claude-3-5-sonnet-20241022
# Avoid for production security:
# ANTHROPIC_MODEL=claude-3-haiku-20240307 # Too lightweight for sensitive tasks

The cost savings from cheaper models aren’t worth the security trade-off. For agents with access to sensitive systems, always use the most capable model available.
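
One way to keep a lighter model from slipping into a sensitive deployment is a startup check. The sketch below is only an illustration: it assumes the model is set through the `ANTHROPIC_MODEL` variable shown above, and `APPROVED_MODELS` and `check_model` are hypothetical names, not part of OpenClaw.

```python
from typing import Mapping

# Hypothetical allowlist: models you have vetted for security-sensitive agents.
APPROVED_MODELS = {"claude-3-5-sonnet-20241022"}


def check_model(env: Mapping[str, str]) -> str:
    """Return the configured model, or raise if it is missing or not approved."""
    model = env.get("ANTHROPIC_MODEL", "")
    if model not in APPROVED_MODELS:
        raise RuntimeError(f"model {model!r} is not approved for sensitive agents")
    return model
```

Calling `check_model(os.environ)` before the agent starts turns a silent downgrade into a hard failure.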

Security Best Practice 2: Hardened Authentication with Gateway Tokens

OpenClaw enforces gateway token authentication by default, including on loopback connections. This caught me off guard initially—I expected localhost to be implicitly trusted. It’s not.

For production or multi-user workloads, the OpenClaw documentation explicitly recommends Anthropic API key authentication over subscription-based auth. Here’s how I set up my authentication:

Generate Secure Gateway Token
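# Create the config directory if needed, readable only by the owner
mkdir -p ~/.openclaw && chmod 700 ~/.openclaw
# Generate a cryptographically secure token; umask 077 creates the file as 0600
(umask 077 && openssl rand -hex 32 > ~/.openclaw/gateway_token)
# Enforce owner-only permissions in case the file already existed
chmod 600 ~/.openclaw/gateway_token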
Configure Token Authentication
{
  "auth": {
    "method": "gateway_token",
    "token_path": "~/.openclaw/gateway_token",
    "require_auth_on_loopback": true
  },
  "security": {
    "token_rotation_days": 30,
    "max_failed_attempts": 5,
    "lockout_duration_minutes": 15
  }
}
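
To make the `max_failed_attempts` and `lockout_duration_minutes` settings concrete, here is a minimal sketch of what a token check with lockout could look like. It is not OpenClaw's implementation; `TokenGate` is a hypothetical class, and a real gateway would likely track failures per client rather than globally as this one does.

```python
import hmac
import time
from typing import Optional


class TokenGate:
    """Constant-time token check with a lockout after repeated failures."""

    def __init__(self, token: str, max_failed: int = 5, lockout_s: float = 15 * 60):
        self._token = token.encode()
        self._max_failed = max_failed
        self._lockout_s = lockout_s
        self._failures = 0
        self._locked_until = 0.0

    def verify(self, presented: str, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        if now < self._locked_until:
            return False  # locked out, even if the token is correct
        # compare_digest avoids leaking how many leading bytes matched
        if hmac.compare_digest(presented.encode(), self._token):
            self._failures = 0
            return True
        self._failures += 1
        if self._failures >= self._max_failed:
            self._locked_until = now + self._lockout_s
            self._failures = 0
        return False
```

The `now` parameter exists only so the lockout window can be exercised in tests without waiting.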

For even stronger security, I switched to ephemeral tokens stored in a vault:

Environment Variable Setup
# Using HashiCorp Vault for ephemeral tokens
export OPENCLAW_AUTH_TOKEN=$(vault kv get -field=token secret/openclaw/prod)
# Or using environment-based ephemeral tokens
export OPENCLAW_EPHEMERAL_TOKEN=$(openclaw token generate --ttl=24h)

A Reddit user on r/squareb put it well: “Switch your API to using ephemeral tokens or better yet digital keys in vaults.” This is now my standard practice.
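
The point of an ephemeral token is that it stops working on its own. The following sketch shows the general idea with an HMAC-signed expiry timestamp. It is a generic illustration, not how `openclaw token generate` or Vault works internally; `issue_token`, `is_valid`, and the shared `secret` are assumptions made for the example.

```python
import hashlib
import hmac
import time
from typing import Optional


def issue_token(secret: bytes, ttl_s: int, now: Optional[float] = None) -> str:
    """Return '<expiry>.<signature>' valid for ttl_s seconds."""
    expires = int((time.time() if now is None else now) + ttl_s)
    sig = hmac.new(secret, str(expires).encode(), hashlib.sha256).hexdigest()
    return f"{expires}.{sig}"


def is_valid(secret: bytes, token: str, now: Optional[float] = None) -> bool:
    """Accept the token only if the signature matches and it has not expired."""
    now = time.time() if now is None else now
    expires_str, _, sig = token.partition(".")
    expected = hmac.new(secret, expires_str.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig.encode(), expected.encode()):
        return False  # forged or corrupted token
    return now < int(expires_str)
```

Because the expiry is covered by the signature, a client cannot extend a token's lifetime by editing the timestamp.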

Security Best Practice 3: Network Segmentation and Isolation

One of the most effective security measures I implemented was network isolation. My OpenClaw instance runs in its own VLAN with no direct access to critical services.

Production VPS Setup
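# Create isolated network namespace
sudo ip netns add openclaw-ns
# Create virtual ethernet pair
sudo ip link add veth-openclaw type veth peer name veth-host
# Move one end to the namespace
sudo ip link set veth-openclaw netns openclaw-ns
# Configure the namespace side of the link
sudo ip netns exec openclaw-ns ip addr add 10.200.1.2/24 dev veth-openclaw
sudo ip netns exec openclaw-ns ip link set veth-openclaw up
sudo ip netns exec openclaw-ns ip link set lo up
# Configure the host side (10.200.1.1 is an example address) so the host can reach the namespace
sudo ip addr add 10.200.1.1/24 dev veth-host
sudo ip link set veth-host up
# No default route or NAT is set up here, so the namespace cannot reach
# outside hosts until you add one deliberately
# Run OpenClaw in isolated namespace
sudo ip netns exec openclaw-ns openclaw serve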

I also configured Tailscale for secure identity-based access:

Enable Tailscale Identity
{
  "network": {
    "tailscale": {
      "enabled": true,
      "auth_key_path": "/etc/openclaw/tailscale_key",
      "hostname": "openclaw-prod",
      "advertise_tags": ["tag:openclaw"]
    },
    "allowed_ips": ["100.64.0.0/10"],
    "blocked_ips": ["0.0.0.0/0"]
  }
}
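
The `allowed_ips` and `blocked_ips` pair reads as "allow the Tailscale range, deny everything else". A minimal sketch of that check, assuming the allowlist takes precedence over the catch-all block (an assumption about the semantics, not something the config format guarantees), could look like this. `is_allowed` is a hypothetical helper.

```python
import ipaddress

# 100.64.0.0/10 is the shared address range Tailscale assigns addresses from.
ALLOWED_NETWORKS = [ipaddress.ip_network("100.64.0.0/10")]


def is_allowed(client_ip: str) -> bool:
    """Permit only addresses inside an allowed network; deny everything else."""
    try:
        addr = ipaddress.ip_address(client_ip)
    except ValueError:
        return False  # unparsable input is treated as denied
    return any(addr in net for net in ALLOWED_NETWORKS)
```

Denying by default means a typo in the allowlist locks clients out instead of letting strangers in.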

The Reddit community recommendation was spot-on: “Implement network segmentation between workers” and “Use isolated VLANs with no access to critical services (email, calendar, bank accounts).”

Security Best Practice 4: Strict Access Controls and Allowlists

By default, I configure my agents with minimal permissions. Every tool access is explicitly allowed rather than implicitly granted.

Strict Tool Allowlist Configuration
{
  "tools": {
    "allowlist": [
      "filesystem.read",
      "web.search",
      "code.analyze"
    ],
    "blocked": [
      "filesystem.write",
      "system.run",
      "network.request"
    ],
    "require_confirmation": [
      "filesystem.write",
      "system.run"
    ]
  },
  "sandboxing": {
    "enabled": true,
    "allowed_paths": ["/data/openclaw/workspace"],
    "denied_paths": ["/etc", "/root", "/home"]
  }
}
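
Here is a rough sketch of the two checks this config implies: a tool must be on the allowlist, and a path must resolve to somewhere inside the workspace. The helper names are hypothetical and this is not OpenClaw's code. The detail worth noticing is that the path is resolved before it is compared, so `..` segments cannot escape the sandbox.

```python
from pathlib import Path

ALLOWLIST = {"filesystem.read", "web.search", "code.analyze"}
ALLOWED_ROOTS = [Path("/data/openclaw/workspace").resolve()]


def tool_permitted(tool: str) -> bool:
    """Anything not explicitly allowed is denied."""
    return tool in ALLOWLIST


def path_permitted(candidate: str) -> bool:
    """Resolve '..' and symlinks first, then require an allowed root as ancestor."""
    resolved = Path(candidate).resolve()
    # is_relative_to compares whole path components (Python 3.9+), so
    # "/data/openclaw/workspace-evil" does not match "/data/openclaw/workspace".
    return any(resolved.is_relative_to(root) for root in ALLOWED_ROOTS)
```

A plain string prefix check would accept the `workspace-evil` path, which is why the comparison is done on path components.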

The OpenClaw documentation warns specifically about device pairing: “Only pair devices you trust, as macOS node pairing allows system.run on that machine.” This is critical—a paired device gets execution privileges.

I also implemented a human-in-the-loop confirmation for sensitive operations:

Human-in-Loop Configuration
{
  "human_oversight": {
    "enabled": true,
    "require_approval_for": [
      "any_file_deletion",
      "external_api_calls",
      "system_command_execution",
      "data_export"
    ],
    "timeout_seconds": 300,
    "fallback_action": "deny"
  }
}
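
The important property of this config is that silence means "no". A minimal sketch of such an approval gate, assuming a hypothetical `ask` callback that blocks until a human answers, might look like the following. Neither `approved` nor `ask` is an OpenClaw API.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable

SENSITIVE = {
    "any_file_deletion",
    "external_api_calls",
    "system_command_execution",
    "data_export",
}


def approved(action: str, ask: Callable[[str], bool], timeout_s: float = 300) -> bool:
    """Return True only if the action is non-sensitive or a human said yes in time."""
    if action not in SENSITIVE:
        return True
    pool = ThreadPoolExecutor(max_workers=1)
    future = pool.submit(ask, action)
    try:
        return bool(future.result(timeout=timeout_s))
    except Exception:
        return False  # a timeout or a failing prompt both fall back to deny
    finally:
        pool.shutdown(wait=False)
```

Treating errors the same as timeouts keeps a broken notification channel from turning into an implicit approval.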

Security Best Practice 5: Production Deployment Architecture

For reliable 24/7 operation, I run OpenClaw on a dedicated VPS with appropriate resources:

Production VPS Setup Script
#!/bin/bash
# Minimum recommended specs for production OpenClaw
# - 4 CPU cores
# - 8GB RAM
# - 100GB SSD storage
# Create openclaw user (no shell access)
sudo useradd -r -s /usr/sbin/nologin openclaw
# Create necessary directories
sudo mkdir -p /opt/openclaw/{config,data,logs}
sudo chown -R openclaw:openclaw /opt/openclaw
# Install as systemd service
cat << 'EOF' | sudo tee /etc/systemd/system/openclaw.service
[Unit]
Description=OpenClaw AI Agent
After=network.target
[Service]
Type=simple
User=openclaw
Group=openclaw
WorkingDirectory=/opt/openclaw
ExecStart=/usr/local/bin/openclaw serve --config /opt/openclaw/config/production.json
Restart=on-failure
RestartSec=10
StandardOutput=journal
StandardError=journal
# Security hardening
NoNewPrivileges=true
PrivateTmp=true
ProtectSystem=strict
ProtectHome=true
ReadWritePaths=/opt/openclaw/data
[Install]
WantedBy=multi-user.target
EOF
sudo systemctl daemon-reload
sudo systemctl enable openclaw
sudo systemctl start openclaw

Security Best Practice 6: Monitoring, Session Management, and Maintenance

AI agents can accumulate context over long sessions, which increases both memory usage and security risk. I implemented automatic session pruning:

Configure Context Pruning
{
  "session_management": {
    "max_session_duration_hours": 8,
    "context_pruning": {
      "enabled": true,
      "max_messages": 100,
      "prune_older_than_hours": 24,
      "preserve_recent_messages": 20
    },
    "rate_limiting": {
      "max_requests_per_minute": 60,
      "max_tokens_per_hour": 100000,
      "backoff_strategy": "exponential"
    }
  },
  "monitoring": {
    "log_level": "info",
    "audit_log": true,
    "alert_on": ["auth_failure", "tool_denied", "context_overflow"]
  }
}
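
How the three pruning settings interact is not obvious from the config alone. The sketch below shows one plausible reading: the most recent messages are always kept, older ones are dropped once they pass the age limit, and the total is capped. This ordering is an assumption for illustration, and `Message` and `prune` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Message:
    text: str
    timestamp: float  # seconds since the epoch


def prune(
    messages: List[Message],
    now: float,
    max_messages: int = 100,
    older_than_s: float = 24 * 3600,
    preserve_recent: int = 20,
) -> List[Message]:
    """Prune a list of messages that is sorted oldest first."""
    split = max(0, len(messages) - preserve_recent)
    older, recent = messages[:split], messages[split:]
    # Outside the protected tail, drop anything past the age limit.
    older = [m for m in older if now - m.timestamp <= older_than_s]
    kept = older + recent
    # Enforce the overall cap, keeping the newest messages.
    return kept[max(0, len(kept) - max_messages):]
```

Whatever the exact rules, pruning from the oldest end keeps the context the agent is most likely to still need.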

I also set up alerting for suspicious activity:

Monitoring Setup
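# Prometheus metrics endpoint
# Note: 9090 is also the Prometheus server's default port, so pick a different port if both run on the same host
openclaw config set monitoring.prometheus.enabled true
openclaw config set monitoring.prometheus.port 9090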
# Alert on repeated auth failures
cat << 'EOF' > /etc/prometheus/alerts/openclaw.yml
groups:
  - name: openclaw_security
    rules:
      - alert: OpenClawAuthFailures
        expr: increase(openclaw_auth_failures_total[5m]) > 3
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "Multiple OpenClaw authentication failures"
          description: "{{ $value }} auth failures in the last 5 minutes"
EOF
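
To make the alert rule easier to reason about, here is a small sketch of roughly what it measures: the number of failures inside a sliding five-minute window, compared against a threshold. It is only an approximation, since Prometheus `increase()` extrapolates between samples and the rule's `for: 1m` clause is ignored here. `FailureWindow` is a hypothetical class.

```python
from collections import deque
from typing import Deque


class FailureWindow:
    """Counts events in a sliding time window and flags threshold breaches."""

    def __init__(self, window_s: float = 300, threshold: int = 3):
        self._window_s = window_s
        self._threshold = threshold
        self._events: Deque[float] = deque()

    def record(self, now: float) -> bool:
        """Record one failure; return True if the window now exceeds the threshold."""
        self._events.append(now)
        # Discard failures that have aged out of the window.
        while self._events and now - self._events[0] > self._window_s:
            self._events.popleft()
        return len(self._events) > self._threshold
```

With the default values, the fourth failure within five minutes trips the check, matching the `> 3` in the rule.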

Security Best Practice 7: Gradual Trust Model

This is perhaps the most important lesson I learned. As one Reddit user on r/squareb put it:

“OpenClaw should not be running anything in your business right now… It’s an advisor till I am confident I have…”

I started with read-only access and advisory outputs. Here’s my trust escalation framework:

Trust Escalation Levels
Level 0: Read-Only Advisory
- Agent can only read and analyze
- All outputs require manual implementation
- No file system access
- No external API calls
Level 1: Controlled Write Access
- Agent can write to designated directories
- File operations require confirmation
- No system commands
- Limited API access (read-only)
Level 2: Expanded Capabilities
- Agent can execute approved commands
- Human confirmation for sensitive operations
- Broader API access
- Time-limited sessions
Level 3: Autonomous Operation (Only After Extensive Testing)
- Agent operates independently within bounds
- Audit logging for all operations
- Regular review of actions taken
- Immediate revocation capability

I spent months at Level 0 before gradually increasing permissions. Each escalation required documented testing and security review.
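
The levels above can also be written down as data, which makes it harder to grant a capability by accident. This is a minimal sketch under my own naming: `TrustLevel`, `REQUIRED_LEVEL`, `can`, and the capability strings are hypothetical and simply mirror the list.

```python
from enum import IntEnum


class TrustLevel(IntEnum):
    ADVISORY = 0
    CONTROLLED_WRITE = 1
    EXPANDED = 2
    AUTONOMOUS = 3


# Minimum level at which each capability becomes available.
REQUIRED_LEVEL = {
    "read_and_analyze": TrustLevel.ADVISORY,
    "write_designated_dirs": TrustLevel.CONTROLLED_WRITE,
    "read_only_api": TrustLevel.CONTROLLED_WRITE,
    "run_approved_commands": TrustLevel.EXPANDED,
    "independent_operation": TrustLevel.AUTONOMOUS,
}


def can(level: TrustLevel, capability: str) -> bool:
    """Unknown capabilities are denied at every level."""
    required = REQUIRED_LEVEL.get(capability)
    return required is not None and level >= required
```

Raising an agent's level then becomes a single reviewed change instead of a scattering of individual permission flags.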

Common Mistakes to Avoid

Through trial and error (fortunately without any serious incidents), I’ve identified several anti-patterns:

Mistake 1: Trusting Loopback Connections

OpenClaw requires authentication even on localhost. I initially thought I could skip this for local development—wrong. Always require authentication.

Mistake 2: Overly Permissive Tool Access

I started with allow_all_tools: true because I wanted convenience. This lasted about a day before I realized the risk. Start minimal and add only what’s needed.

Mistake 3: Ignoring Model Selection for Security

Using smaller models to save money is fine for some tasks, but not for agents with system access. The added exposure to prompt injection far outweighs any cost savings.

Mistake 4: No Session Limits

Long-running sessions accumulate context and risk. Implement automatic pruning and session limits from day one.

Summary

Securing AI agents in production requires a defense-in-depth approach. I’ve found success with these core principles:

  1. Strong authentication: use gateway tokens, API keys, or vault-stored ephemeral credentials.
  2. Network isolation: keep agents in separate VLANs with controlled access.
  3. Minimal permissions: start with read-only access and expand gradually.
  4. Human oversight: require approval for all sensitive operations.
  5. Continuous monitoring: alert on suspicious activity.
  6. Regular maintenance: prune sessions and rotate tokens.
  7. Gradual trust escalation: treat agents as advisors until they prove reliable.

The key insight from the community that stuck with me: treat AI agents as powerful tools that require oversight, not as autonomous operators. As I gain confidence in my setup, I can expand capabilities—but the default should always be least privilege with maximum visibility.

Final Words + More Resources

My intention with this article was to share my knowledge and experience with others. If you want to get in touch, you can email me.

Here are also the most important links from this article along with some further resources that will help you in this scope:

Oh, and if you found these resources useful, don’t forget to support me by starring the repo on GitHub!
