# astra.yml

`astra.yml` is the single declarative configuration file that defines your agent teams, system settings, quotas, and integrations. The gateway loads it on startup and seeds agents into the Postgres database. Restart the gateway after any change to reload it, or enable `hotReload` to pick up changes live.
Place `astra.yml` in the root of your project directory (same level as `package.json`). The gateway resolves it relative to the working directory at startup.

## Top-level structure
```yaml
settings:      # Global defaults applied to all agents
quotas:        # Rate and cost limits
approval:      # Human-in-the-loop approval workflows
selfHealing:   # Auto-restart and compaction configuration
dream:         # Overnight simulation mode
research:      # Deep research configuration
evolution:     # Persona A/B testing and auto-adoption
localModels:   # Local router thresholds for Ollama/vLLM
hotReload:     # Watch config files for live updates
channels:      # Messaging integrations (Telegram, Slack, etc.)
agents:        # List of agent definitions
```
## settings

Global defaults that apply to all agents unless overridden at the agent level.
```yaml
settings:
  defaultProvider: openai
  defaultModel: gpt-4o
  maxContextTokens: 128000
  maxOutputTokens: 4096
  temperature: 0.7
  logLevel: info                # debug | info | warn | error
  timezone: UTC
  embeddingModel: text-embedding-3-small
```
## quotas

Token, cost, and spawn rate limits enforced per agent per workspace.
```yaml
quotas:
  tokens:
    maxPerHour: 200000
    maxPerDay: 1000000
  cost:
    maxPerDay: 10.00     # USD
  spawn:
    maxConcurrent: 5     # Max simultaneously running sub-agents
    maxDepth: 2          # Max nesting depth for spawned agents
```
## approval

Configure which tools or agent actions require human approval before execution.
```yaml
approval:
  requireApproval:
    - tool: file_write
    - tool: shell_exec
    - agent: deploy-agent   # Any action by this agent
  timeoutMs: 300000         # 5 minutes
  defaultOnTimeout: deny    # deny | allow
```
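`timeoutMs` and `defaultOnTimeout` interact as in this sketch, where `request_approval` is a hypothetical stand-in for the gateway's approval gate, not a real API:

```python
import asyncio

async def request_approval(action: str, timeout_ms: int = 300_000,
                           default_on_timeout: str = "deny") -> bool:
    """Illustrative approval gate: wait for a human decision, fall back on timeout."""
    decision: asyncio.Future = asyncio.get_running_loop().create_future()
    # In the real gateway the decision would arrive from a UI or chat channel;
    # here nothing resolves the future, so the timeout path always triggers.
    try:
        return await asyncio.wait_for(decision, timeout=timeout_ms / 1000)
    except asyncio.TimeoutError:
        return default_on_timeout == "allow"

print(asyncio.run(request_approval("shell_exec", timeout_ms=50)))  # False: denied on timeout
```

With `defaultOnTimeout: allow` the same timeout would let the action proceed.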
## selfHealing

Configure automatic recovery from agent failures and context overflow.
```yaml
selfHealing:
  enabled: true
  maxConsecutiveFailures: 3   # Failures before agent is paused
  restartDelayMs: 2000        # Initial backoff (doubles on each retry)
  compactionThreshold: 0.85   # Compact context when 85% of token limit used
```
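The two numeric knobs combine as follows. This is an illustrative sketch of the documented behavior (doubling backoff, compaction at 85% of the token limit) using hypothetical helper names:

```python
def restart_delay_ms(attempt: int, initial_ms: int = 2000) -> int:
    """Delay before restart attempt N (1-based); doubles on each retry."""
    return initial_ms * 2 ** (attempt - 1)

def should_compact(used_tokens: int, max_context: int, threshold: float = 0.85) -> bool:
    """True once context usage crosses the compaction threshold."""
    return used_tokens >= max_context * threshold

print([restart_delay_ms(n) for n in (1, 2, 3)])  # [2000, 4000, 8000]
print(should_compact(110_000, 128_000))          # True (85% of 128k is 108,800)
```

After `maxConsecutiveFailures` (3) failed restarts in a row, the agent is paused instead of retried again.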
## dream

Dream mode runs agents overnight in a low-cost simulation to consolidate memory and generate insights.
```yaml
dream:
  enabled: true
  intensity: medium    # light | medium | deep
  scheduleHour: 3      # 3 AM in the configured timezone
  useLocalOnly: true   # Use Ollama/vLLM only (no cloud API costs)
  promptUser: false    # Don't prompt for confirmation before starting
```
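To see what `scheduleHour` plus the global `timezone` setting imply, here is an illustrative computation of the next run time. `next_dream_run` is a hypothetical helper; the gateway's real scheduler may handle DST transitions differently.

```python
from datetime import datetime, timedelta
from zoneinfo import ZoneInfo

def next_dream_run(now: datetime, schedule_hour: int, tz: str) -> datetime:
    """Next occurrence of scheduleHour in the configured timezone."""
    local = now.astimezone(ZoneInfo(tz))
    run = local.replace(hour=schedule_hour, minute=0, second=0, microsecond=0)
    if run <= local:
        run += timedelta(days=1)  # today's slot already passed; schedule tomorrow
    return run

now = datetime(2025, 1, 15, 4, 30, tzinfo=ZoneInfo("UTC"))
print(next_dream_run(now, schedule_hour=3, tz="UTC"))  # 2025-01-16 03:00:00+00:00
```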
## channels

Enable messaging platform integrations. Each channel reads credentials from environment variables.
```yaml
channels:
  telegram:
    enabled: true
    defaultAgent: default-agent
  discord:
    enabled: true
    defaultAgent: default-agent
  slack:
    enabled: false
  whatsapp:
    enabled: false
  linear:
    enabled: true
    webhookSecret: your-webhook-secret   # Overrides LINEAR_WEBHOOK_SECRET
    apiKey: lin_api_xxxxxxxxxxxx         # Overrides LINEAR_API_KEY
```
## agents

Define one or more agent configurations. Each agent can override any setting from the global defaults.
```yaml
agents:
  - id: default-agent
    displayName: Astra
    tier: standard
    model:
      provider: openai
      modelId: gpt-4o
      maxContextTokens: 128000
      maxOutputTokens: 4096
      temperature: 0.7
    systemPromptTemplate: |
      You are Astra, a helpful AI assistant.
      Today is {{date}}. The user is {{user.name}}.
      {{#if memory}}Relevant memory:
      {{memory}}{{/if}}
    skills:
      - git_ops
      - web_search
    tools:
      allow:
        - web_search
        - file_read
        - shell_exec
      deny:
        - file_delete
    spawn:
      enabled: true
      allowedTargets:
        - research-agent
        - code-agent
      maxDepth: 2
    fileAccess:
      restricted: true
      allowedPaths:
        - ./workspace
        - ./src
  - id: research-agent
    displayName: Research Agent
    tier: premium
    model:
      provider: claude
      modelId: claude-opus-4-6
      maxContextTokens: 200000
      maxOutputTokens: 8192
      temperature: 0.3
    systemPromptTemplate: |
      You are a research specialist. Conduct thorough investigations
      and produce structured, cited reports.
    skills:
      - web_research
      - document_analysis
    spawn:
      enabled: false
```
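The `systemPromptTemplate` placeholders use a Handlebars-style syntax (`{{var}}`, `{{a.b}}`, `{{#if}}…{{/if}}`). The gateway's actual template engine is not specified here; this minimal stand-in renderer only shows how those three constructs behave:

```python
import re

def render(template: str, ctx: dict) -> str:
    """Tiny stand-in for a Handlebars-style renderer (illustrative only)."""
    def lookup(path: str):
        val = ctx
        for part in path.split("."):
            val = val.get(part) if isinstance(val, dict) else None
            if val is None:
                return None
        return val

    # Keep or drop {{#if key}}...{{/if}} blocks based on truthiness of key.
    def if_block(m: re.Match) -> str:
        return m.group(2) if lookup(m.group(1)) else ""
    out = re.sub(r"\{\{#if (\w+)\}\}(.*?)\{\{/if\}\}", if_block, template, flags=re.S)

    # Substitute simple {{path}} placeholders, including dotted paths.
    return re.sub(r"\{\{([\w.]+)\}\}", lambda m: str(lookup(m.group(1)) or ""), out)

prompt = render(
    "Today is {{date}}. The user is {{user.name}}. {{#if memory}}Memory: {{memory}}{{/if}}",
    {"date": "2025-01-15", "user": {"name": "Ada"}, "memory": ""},
)
print(prompt)  # Today is 2025-01-15. The user is Ada. (memory block dropped: empty)
```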
## hotReload

Enable live reload of agent configs and workspace files without restarting the gateway.
```yaml
hotReload:
  enabled: true
  watchPaths:
    - ./workspace
    - ./astra.yml
  debounceMs: 500
```
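`debounceMs` collapses a burst of file events (for example, an editor writing several files on save) into a single reload. A sketch of that logic, with a hypothetical `Debouncer` class and explicit timestamps for clarity:

```python
class Debouncer:
    """Illustrative debounce: reload only after debounceMs of quiet."""

    def __init__(self, debounce_ms: int = 500):
        self.window = debounce_ms / 1000
        self.last_event = None  # timestamp of the most recent change, if any

    def on_change(self, now: float) -> None:
        self.last_event = now  # each change restarts the quiet window

    def should_reload(self, now: float) -> bool:
        if self.last_event is None:
            return False
        if now - self.last_event >= self.window:
            self.last_event = None
            return True
        return False

d = Debouncer(500)
d.on_change(0.0)
d.on_change(0.3)             # burst of saves restarts the window
print(d.should_reload(0.6))  # False: only 0.3s of quiet so far
print(d.should_reload(0.9))  # True: 0.6s of quiet since the last change
```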
## localModels

Configure the local router that routes simple queries to Ollama/vLLM and complex queries to cloud providers.
```yaml
localModels:
  enabled: true
  complexityThreshold: 0.6       # Queries above this go to cloud
  contextLengthThreshold: 4096   # Queries longer than this go to cloud
  localProvider: ollama
  localModel: llama3.2
  fallbackProvider: openai
  fallbackModel: gpt-4o
```
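One plausible reading of the two thresholds: a query goes to the cloud fallback when either limit is exceeded, otherwise it runs locally. How the complexity score is computed, and whether the conditions actually combine with OR, are assumptions in this sketch:

```python
def route(complexity: float, context_tokens: int,
          complexity_threshold: float = 0.6,
          context_threshold: int = 4096) -> str:
    """Route to the cloud fallback if the query is complex OR long; else run locally."""
    if complexity > complexity_threshold or context_tokens > context_threshold:
        return "openai/gpt-4o"   # fallbackProvider / fallbackModel
    return "ollama/llama3.2"     # localProvider / localModel

print(route(0.2, 1200))  # ollama/llama3.2
print(route(0.8, 1200))  # openai/gpt-4o: complexity above threshold
print(route(0.2, 9000))  # openai/gpt-4o: context above threshold
```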