agent-safety · guardrails · best-practices · deployment

Code Interpreter Safety Sandboxing

Authensor

Code interpreters let AI agents write and execute code. This is one of the most powerful and most dangerous agent capabilities. Without proper sandboxing, an agent can access the file system, make network requests, consume unlimited resources, and execute arbitrary system commands.

Sandbox Architecture

Run agent-generated code in an isolated environment that limits what the code can access. The sandbox should restrict: file system access (read and write paths), network access (outbound connections), system calls (process creation, signal handling), resource consumption (CPU time, memory, disk), and available libraries.
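The restriction surface above can be captured as one explicit configuration object that the executor enforces. A minimal sketch in Python, with illustrative field names and defaults (not Authensor's actual schema):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SandboxConfig:
    # File system: explicit allow-lists for read and write paths
    readable_paths: tuple = ("/sandbox/input",)
    writable_paths: tuple = ("/sandbox/output",)
    # Network: deny all outbound connections by default (empty allow-list)
    allowed_hosts: tuple = ()
    # Resources: hard caps enforced by the runtime
    cpu_seconds: int = 30
    memory_mb: int = 256
    disk_mb: int = 64
    # Libraries: only these modules may be imported by agent code
    allowed_modules: tuple = ("math", "json", "re")

cfg = SandboxConfig()
```

Making the config a frozen value object means every execution request carries its full restriction set, which the policy engine can inspect and log.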

Container-Based Sandboxing

Docker containers with restricted capabilities provide a practical sandbox. Configure the container with: a read-only root filesystem, no network access (or restricted to specific endpoints), CPU and memory limits, a short execution timeout (30 seconds is typical), dropped Linux capabilities, and a non-root user.

Mount only the specific files the code needs as read-only volumes. Output files go to a dedicated writable directory that is scanned before results are returned to the agent.
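The container settings above map directly onto `docker run` flags. A sketch that assembles one such invocation, assuming a stock Python image and illustrative mount paths:

```python
def build_docker_cmd(image: str, script_path: str, timeout_s: int = 30) -> list:
    """Assemble a locked-down `docker run` command for one execution."""
    return [
        "docker", "run", "--rm",
        "--read-only",                       # read-only root filesystem
        "--network", "none",                 # no outbound network access
        "--cpus", "1", "--memory", "256m",   # CPU and memory limits
        "--cap-drop", "ALL",                 # drop all Linux capabilities
        "--user", "65534:65534",             # non-root user (nobody)
        "--pids-limit", "64",                # cap process creation
        # input mounted read-only; output goes to a dedicated writable dir
        "-v", f"{script_path}:/sandbox/script.py:ro",
        "-v", "/tmp/sandbox-out:/sandbox/output",
        image,
        "timeout", str(timeout_s), "python", "/sandbox/script.py",
    ]

cmd = build_docker_cmd("python:3.12-slim", "/tmp/agent_code.py")
```

The in-container `timeout` wrapper enforces the execution deadline even if the supervising process dies; the caller should still apply its own wall-clock timeout as a backstop.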

Pre-Execution Scanning

Before executing code, scan it for dangerous patterns. Authensor's Aegis scanner checks for: import statements for dangerous modules (os, subprocess, socket), file system operations outside the sandbox, network requests, attempts to access environment variables, and obfuscated code that might hide malicious intent.
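Aegis's internals are not shown here, but the same class of checks can be sketched with Python's `ast` module. The blocked-module list and findings format below are illustrative:

```python
import ast

BLOCKED_MODULES = {"os", "subprocess", "socket", "ctypes", "importlib"}

def scan_code(source: str) -> list:
    """Return a list of findings; an empty list means the scan passed."""
    try:
        tree = ast.parse(source)
    except SyntaxError as e:
        return [f"unparseable code: {e}"]
    findings = []
    for node in ast.walk(tree):
        # Direct imports of dangerous modules
        if isinstance(node, ast.Import):
            for alias in node.names:
                if alias.name.split(".")[0] in BLOCKED_MODULES:
                    findings.append(f"blocked import: {alias.name}")
        elif isinstance(node, ast.ImportFrom):
            if node.module and node.module.split(".")[0] in BLOCKED_MODULES:
                findings.append(f"blocked import: {node.module}")
        # Dynamic execution defeats static analysis (obfuscation vector)
        elif isinstance(node, ast.Call):
            if isinstance(node.func, ast.Name) and node.func.id in {"eval", "exec", "__import__"}:
                findings.append(f"dynamic execution: {node.func.id}")
        # Environment variable access, e.g. `something.environ`
        elif isinstance(node, ast.Attribute) and node.attr == "environ":
            findings.append("environment variable access")
    return findings

findings = scan_code("import subprocess; subprocess.run(['ls'])")
```

Parsing to an AST rather than matching strings resists trivial evasions like extra whitespace or aliased imports, though (per the next paragraph) it remains a first filter, not containment.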

This is not foolproof since code analysis is undecidable in general. Treat it as a first filter, not a complete defense. The sandbox provides the actual containment.

Policy Integration

Authensor's policy engine evaluates code execution requests as action envelopes. The policy can restrict: which languages are allowed, maximum code length, required sandbox configuration, which agents can execute code, and rate limits on execution requests.
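A policy check of this shape can be sketched as a function over the envelope. The field names and limits here are illustrative, not Authensor's actual envelope schema:

```python
ALLOWED_LANGS = {"python"}
MAX_CODE_LEN = 4000
ALLOWED_AGENTS = {"data-analyst", "report-builder"}

def evaluate_envelope(envelope: dict) -> tuple:
    """Approve or reject a code-execution action envelope."""
    if envelope.get("language") not in ALLOWED_LANGS:
        return False, "language not allowed"
    if len(envelope.get("code", "")) > MAX_CODE_LEN:
        return False, "code exceeds maximum length"
    if envelope.get("agent_id") not in ALLOWED_AGENTS:
        return False, "agent not authorized to execute code"
    # Reject unless the envelope commits to the required sandbox settings
    if not envelope.get("sandbox", {}).get("network_disabled", False):
        return False, "sandbox must disable network access"
    return True, "approved"

ok, reason = evaluate_envelope({
    "language": "python", "code": "print(1)",
    "agent_id": "data-analyst", "sandbox": {"network_disabled": True},
})
```

Evaluating before any container is provisioned keeps rejected requests cheap and gives a single audit point for every execution attempt.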

Output Scanning

After execution, scan the output for sensitive information before returning it to the agent. Code might extract environment variables, file contents, or system information that should not be exposed.
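A minimal output filter might redact matches against known secret formats before anything reaches the agent. The patterns below are illustrative, not exhaustive:

```python
import re

# Patterns that commonly indicate leaked secrets in execution output
SENSITIVE_PATTERNS = [
    re.compile(r"(?i)(api[_-]?key|secret|token|password)\s*[=:]\s*\S+"),
    re.compile(r"AKIA[0-9A-Z]{16}"),                      # AWS access key ID format
    re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),    # PEM private key header
]

def redact_output(text: str) -> str:
    """Replace sensitive matches before output is returned to the agent."""
    for pattern in SENSITIVE_PATTERNS:
        text = pattern.sub("[REDACTED]", text)
    return text

clean = redact_output("API_KEY=sk-12345 result=42")
```

Pattern-based redaction catches the common cases; like pre-execution scanning, it should be layered with sandbox-level controls (no secrets mounted into the container in the first place) rather than relied on alone.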

Resource Monitoring

Track resource usage per execution. Authensor's Sentinel engine monitors for patterns like repeated resource-intensive executions that might indicate a denial-of-service attempt or cryptocurrency mining.

Set hard limits and kill executions that exceed them. A runaway loop should not consume cluster resources.
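On POSIX systems, hard limits of this kind can be delegated to the kernel with `resource.setrlimit` applied in the child process before exec, with a wall-clock timeout as the backstop that kills runaway loops. The specific limit values here are illustrative:

```python
import resource
import subprocess
import sys

def limit_resources():
    """Runs in the child before exec: hard caps the kernel enforces."""
    resource.setrlimit(resource.RLIMIT_CPU, (30, 30))          # CPU seconds
    mem = 512 * 1024 * 1024
    resource.setrlimit(resource.RLIMIT_AS, (mem, mem))         # address space
    fsz = 64 * 1024 * 1024
    resource.setrlimit(resource.RLIMIT_FSIZE, (fsz, fsz))      # max file size

def run_limited(script: str, wall_timeout: int = 35):
    """Execute a script with kernel limits plus a wall-clock kill."""
    try:
        return subprocess.run(
            [sys.executable, "-c", script],
            preexec_fn=limit_resources,   # POSIX only
            capture_output=True, text=True, timeout=wall_timeout,
        )
    except subprocess.TimeoutExpired:
        return None  # runaway execution killed at the wall-clock limit
```

The CPU limit catches busy loops, but a process sleeping forever consumes no CPU time, which is why the wall-clock timeout is still required.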
