Tools

Code Sandbox

The code_execute tool lets agents run code in an isolated environment. Three sandbox modes are available, controlled by the CODE_EXEC_SANDBOX environment variable. Mode selection trades isolation strength against operational complexity.

Sandbox modes

Mode | Isolation | Requirements | Use case
docker | Strong: containerized filesystem, no network, ephemeral | Docker daemon accessible to gateway | Production; untrusted user-provided code
process | Moderate: subprocess with timeout and ulimits | None | Trusted environments, CI, internal tools
none | None: direct execution in gateway process | None | Local development only; never use in production

Configuration

bash
# Set sandbox mode via environment variable
CODE_EXEC_SANDBOX=docker    # Recommended for production
CODE_EXEC_SANDBOX=process   # Subprocess with timeout (moderate isolation)
CODE_EXEC_SANDBOX=none      # Direct execution — local dev only, never in production
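
Because an unset or mistyped value can silently weaken isolation, a deployment script may want to validate the variable before the gateway starts. The following is a minimal sketch, not part of the gateway itself: the function name check_sandbox_mode, the APP_ENV variable, and the fallback to docker when the variable is unset are all assumptions made for illustration.

```shell
# Hypothetical pre-start check. APP_ENV and the "docker" fallback are
# assumptions for this sketch, not documented gateway behavior.
check_sandbox_mode() {
  mode="$1"
  env_name="$2"
  case "$mode" in
    docker|process)
      return 0 ;;
    none)
      # Direct execution is acceptable only outside production.
      if [ "$env_name" = "production" ]; then
        echo "refusing CODE_EXEC_SANDBOX=none in production" >&2
        return 1
      fi
      return 0 ;;
    *)
      echo "unknown CODE_EXEC_SANDBOX value: $mode" >&2
      return 2 ;;
  esac
}

check_sandbox_mode "${CODE_EXEC_SANDBOX:-docker}" "${APP_ENV:-development}"
```

A wrapper script could call this and abort on a non-zero exit status before launching the gateway.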

Docker mode

Docker mode is the recommended production configuration. Each code execution runs in a fresh container with:

  • No outbound network access
  • Read-only filesystem except /tmp
  • Container removed immediately after execution
  • CPU and memory limits set via Docker resource constraints
bash
# Docker mode requirements
# 1. Docker daemon must be running and accessible to the gateway process
# 2. The sandbox image must be available to the Docker daemon
#    (it is pulled automatically on first use, or you can pre-pull it manually):
docker pull ghcr.io/open-astra/sandbox:latest

# Each code execution spins up a new container, runs the code, and removes the container.
# Containers have no network access and a read-only filesystem except /tmp.
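
The container properties listed above correspond to standard docker run flags. The sketch below prints one such invocation as a dry run. It illustrates the flags only and is not the gateway's actual command: the helper name sandbox_docker_cmd and the 256m and 0.5 limits are assumed values, and how the code is passed into the container is left out because this page does not describe it.

```shell
# Hypothetical helper: print (not run) a docker invocation with the
# isolation properties described above. Limit values are illustrative.
sandbox_docker_cmd() {
  image="$1"
  # --rm            remove the container when it exits
  # --network none  no network access
  # --read-only     read-only root filesystem
  # --tmpfs /tmp    writable scratch space at /tmp only
  # --memory/--cpus resource limits
  echo docker run --rm \
    --network none \
    --read-only \
    --tmpfs /tmp \
    --memory 256m \
    --cpus 0.5 \
    "$image"
}

sandbox_docker_cmd ghcr.io/open-astra/sandbox:latest
```

Dropping the leading echo would execute the command instead of printing it.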

Process mode

bash
# Process mode: subprocess with configurable timeout
# The code runs in a child process with:
# - A configurable timeout (default 10s)
# - Resource limits via ulimit (file descriptors, memory)
# - No network restriction (unlike Docker mode)
# Use for trusted code in sandboxed environments or testing.
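
The combination of a timeout and ulimits can be approximated in a plain shell. This is a rough sketch of the technique, assuming a Linux host with GNU coreutils timeout; the helper name run_limited and the specific limit values are assumptions, and the gateway's own implementation may differ.

```shell
# Hypothetical helper: run a command in a child process with a
# wall-clock timeout and ulimits. Limit values are illustrative.
run_limited() {
  timeout_s="$1"
  shift
  (
    # Limits set inside the subshell apply only to the child.
    ulimit -n 64        # max open file descriptors
    ulimit -v 524288    # max virtual memory in KiB (512 MiB)
    # timeout exits with status 124 if the command runs too long.
    exec timeout "${timeout_s}s" "$@"
  )
}

run_limited 10 sh -c 'echo "hello from child"'
```

Note that nothing here restricts network access, which matches the process-mode description above.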

None mode

CODE_EXEC_SANDBOX=none executes code directly in the gateway process with no isolation. Any code the agent runs has the same permissions as the gateway. Use this mode only for local development with trusted agents.

Tool usage

The code_execute tool is available to agents via the tool loop. It accepts a language identifier and a code string, and returns the program's standard output:

text
# The code_execute tool is available to agents with the appropriate skill
# Agents call it via the tool loop — it is not directly callable via REST.

# Example: agent turn that triggers code execution
# User: "What is 2^64?"
# Agent calls: code_execute({ language: "python", code: "print(2**64)" })
# Result: "18446744073709551616"

Security considerations

  • Always use docker mode in production for any agent that accepts user-provided code or operates on untrusted input.
  • Enable Approvals for the code_execute tool on sensitive agents — require human approval before any code runs.
  • Combine with Ethical Check to block code that attempts privilege escalation, data exfiltration, or destructive filesystem operations.
  • The Docker sandbox image is minimal — it does not include cloud CLIs, credentials, or network access to internal services.

See also: Security Hardening Checklist for a complete list of controls for production deployments.