Cost & Budget Management

AI inference costs can spiral fast when agents run unsupervised. Open Astra solves this with real-time cost dashboards, per-agent budget caps, and automatic enforcement — so you always know exactly what you're spending and no single agent can blow your budget. Most teams reduce their inference costs 30–50% in the first month just by seeing the breakdown.

Note: Cost management is included in every license, both Solo and Team. There are no usage fees, no per-seat charges, and no surprise invoices from us. You pay only your inference providers, directly.

Why this matters

Predictable spend — hard caps prevent runaway agents from burning through your inference budget overnight
Per-agent attribution — know exactly which agent is costing you money, down to the tool call
Provider visibility — see spend by provider (Claude, OpenAI, Groq, etc.) to optimize your provider mix
Budget inheritance — when agents spawn sub-agents, budgets flow downward with strict limits

Cost dashboard

The cost dashboard aggregates spend across your workspace by agent, provider, and time period. The default lookback is 3 months.

```bash
# Get workspace cost summary (default: last 3 months)
curl http://localhost:3000/costs \
  -H "Authorization: Bearer ${JWT_TOKEN}"
```

Response:

```json
{
  "workspaceId": "ws_abc123",
  "totalCost": 42.17,
  "byAgent": {
    "research-agent": 18.50,
    "code-agent": 14.22,
    "support-agent": 9.45
  },
  "byProvider": {
    "claude": 28.90,
    "openai": 10.12,
    "groq": 3.15
  },
  "byMonth": [
    { "month": "2026-01", "cost": 15.30 },
    { "month": "2026-02", "cost": 14.80 },
    { "month": "2026-03", "cost": 12.07 }
  ]
}
```
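
As a rough illustration of how the summary might be consumed, the Python sketch below ranks agents by their share of total spend. It works on the sample response shown above; the helper name `agent_shares` is hypothetical and not part of Open Astra.

```python
def agent_shares(summary: dict) -> list:
    """Return (agent, percent of total spend) pairs, highest spender first."""
    total = summary["totalCost"]
    if total <= 0:
        return []
    shares = [
        (agent, round(100 * cost / total, 1))
        for agent, cost in summary["byAgent"].items()
    ]
    return sorted(shares, key=lambda pair: pair[1], reverse=True)


# Sample data copied from the response above.
summary = {
    "totalCost": 42.17,
    "byAgent": {
        "research-agent": 18.50,
        "code-agent": 14.22,
        "support-agent": 9.45,
    },
}

for agent, percent in agent_shares(summary):
    print(f"{agent}: {percent}%")
```

The same approach applies to `byProvider` if you want to compare your provider mix instead.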

Detailed breakdown

For finer-grained analysis, use the breakdown endpoint with optional date range filtering.

```bash
# Detailed breakdown with date range
curl "http://localhost:3000/costs/breakdown?startDate=2026-02-01&endDate=2026-02-28" \
  -H "Authorization: Bearer ${JWT_TOKEN}"
```

Response:

```json
{
  "meta": {
    "workspaceId": "ws_abc123",
    "months": 3,
    "startDate": "2026-02-01",
    "endDate": "2026-02-28",
    "generatedAt": "2026-03-07T12:00:00.000Z"
  },
  "totalCost": 14.80,
  "byAgent": { ... },
  "byProvider": { ... },
  "byTool": { ... }
}
```

| Parameter | Type | Default | Description |
|---|---|---|---|
| `months` | integer | 3 | Lookback period in months (used when no date range is specified) |
| `startDate` | string | none | ISO date for range start |
| `endDate` | string | none | ISO date for range end |
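
The parameters above can be assembled into a request URL programmatically. The following is a minimal Python sketch, assuming the same `localhost:3000` host as the curl examples; the function name `breakdown_url` is hypothetical, and passing only one of the two dates is forwarded as-is because the server's behavior in that case is not documented here.

```python
from datetime import date
from urllib.parse import urlencode

BASE_URL = "http://localhost:3000"  # same host as the curl examples


def breakdown_url(months=None, start_date=None, end_date=None):
    """Build a /costs/breakdown URL from the documented query parameters."""
    params = {}
    if start_date is not None or end_date is not None:
        # A date range is given, so `months` is left out (it only applies
        # when no date range is specified).
        if start_date is not None and end_date is not None and start_date > end_date:
            raise ValueError("startDate must not be after endDate")
        if start_date is not None:
            params["startDate"] = start_date.isoformat()
        if end_date is not None:
            params["endDate"] = end_date.isoformat()
    elif months is not None:
        params["months"] = months
    query = urlencode(params)
    return f"{BASE_URL}/costs/breakdown" + (f"?{query}" if query else "")


print(breakdown_url(start_date=date(2026, 2, 1), end_date=date(2026, 2, 28)))
print(breakdown_url(months=6))
```

Using `date` objects rather than raw strings keeps the ISO formatting consistent and catches an inverted range before the request is sent.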

Budget caps

Every agent can have resource constraints defined in astra.yml or via the API. Budgets control tokens, cost, tool calls, execution time, and spawn permissions.

```yaml
# astra.yml: per-agent budget constraints
agents:
  - id: research-agent
    budget:
      maxTotalTokens: 50000
      maxCostCents: 50        # $0.50 per request
      maxToolCalls: 20
      maxDuration: 60000      # 60 seconds (value is in milliseconds)
      maxChildAgents: 0       # cannot spawn sub-agents
```

System hard caps

These limits cannot be exceeded by any agent or swarm, regardless of configuration:

```text
# System hard caps (cannot be exceeded by any agent)
maxPromptTokens:     500,000
maxCompletionTokens: 100,000
maxTotalTokens:      600,000
maxCostCents:        500        # $5.00 per swarm
maxToolCalls:        100
maxDuration:         300,000 ms  # 5 minutes
maxSpawnDepth:       5
maxChildAgents:      20
```
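
To make the relationship between a configured budget and the hard caps concrete, here is a small Python sketch. It assumes that an over-limit value is clamped down to the cap; Open Astra may instead reject such a configuration, so treat the clamping behavior and the name `clamp_to_hard_caps` as illustrative assumptions.

```python
# Values copied from the system hard caps listed above.
SYSTEM_HARD_CAPS = {
    "maxPromptTokens": 500_000,
    "maxCompletionTokens": 100_000,
    "maxTotalTokens": 600_000,
    "maxCostCents": 500,
    "maxToolCalls": 100,
    "maxDuration": 300_000,  # milliseconds
    "maxSpawnDepth": 5,
    "maxChildAgents": 20,
}


def clamp_to_hard_caps(budget: dict) -> dict:
    """Return a copy of `budget` with each known field capped at the system limit.

    Assumption: clamping, not rejection. Unknown keys pass through unchanged.
    """
    return {
        key: min(value, SYSTEM_HARD_CAPS[key]) if key in SYSTEM_HARD_CAPS else value
        for key, value in budget.items()
    }


requested = {"maxCostCents": 900, "maxToolCalls": 20, "maxDuration": 600_000}
print(clamp_to_hard_caps(requested))
```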

Role-based defaults

When an agent has a role but no explicit budget, these defaults apply:

```text
# Role-based defaults (applied when no explicit budget is set)
researcher:  50,000 tokens  |  $0.50  |  20 tool calls  |  60s   |  no spawn
analyst:     80,000 tokens  |  $1.00  |  15 tool calls  |  90s   |  no spawn
writer:      30,000 tokens  |  $0.30  |   5 tool calls  |  30s   |  no spawn
mediator:    40,000 tokens  |  $0.40  |   5 tool calls  |  60s   |  no spawn
```
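
The fallback described above might be modeled as follows. This Python sketch makes three assumptions: the token figure maps to `maxTotalTokens`, "no spawn" maps to `maxChildAgents: 0` (as in the `astra.yml` example), and an explicit budget replaces the role default entirely rather than being merged with it. The function name `resolve_budget` is hypothetical.

```python
from typing import Optional

# Role defaults from the table above, expressed with astra.yml field names.
ROLE_DEFAULTS = {
    "researcher": {"maxTotalTokens": 50_000, "maxCostCents": 50,
                   "maxToolCalls": 20, "maxDuration": 60_000, "maxChildAgents": 0},
    "analyst": {"maxTotalTokens": 80_000, "maxCostCents": 100,
                "maxToolCalls": 15, "maxDuration": 90_000, "maxChildAgents": 0},
    "writer": {"maxTotalTokens": 30_000, "maxCostCents": 30,
               "maxToolCalls": 5, "maxDuration": 30_000, "maxChildAgents": 0},
    "mediator": {"maxTotalTokens": 40_000, "maxCostCents": 40,
                 "maxToolCalls": 5, "maxDuration": 60_000, "maxChildAgents": 0},
}


def resolve_budget(agent: dict) -> Optional[dict]:
    """Pick the explicit budget if present, else the default for the agent's role."""
    if agent.get("budget") is not None:
        return agent["budget"]
    return ROLE_DEFAULTS.get(agent.get("role"))


print(resolve_budget({"id": "summary-agent", "role": "writer"}))
print(resolve_budget({"id": "custom-agent", "role": "writer", "budget": {"maxCostCents": 10}}))
```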

Budget inheritance

When a parent agent spawns a sub-agent, budgets follow strict inheritance rules:

Child budget is always ≤ parent budget (hard constraint)
maxSpawnDepth decrements by 1 at each level
Tool permissions: the child's allowed tools are the intersection of the parent's allowed tools and its own, and its denied tools are the union of both deny lists
Memory permissions use AND logic — child can only have what parent allows
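
The inheritance rules above can be sketched in Python as follows. The `AgentLimits` type, its field names, and the single `can_write_memory` flag are hypothetical stand-ins for illustration; the sketch also assumes the child is capped at the parent's configured limit, whereas the real system may use the parent's remaining budget.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentLimits:
    max_cost_cents: int
    max_total_tokens: int
    max_spawn_depth: int
    allowed_tools: frozenset
    denied_tools: frozenset
    can_write_memory: bool


def derive_child_limits(parent: AgentLimits, requested: AgentLimits) -> AgentLimits:
    """Apply the inheritance rules to what a child agent asks for."""
    if parent.max_spawn_depth <= 0:
        raise PermissionError("spawn depth exhausted")
    return AgentLimits(
        # Child budget is always <= parent budget.
        max_cost_cents=min(requested.max_cost_cents, parent.max_cost_cents),
        max_total_tokens=min(requested.max_total_tokens, parent.max_total_tokens),
        # Spawn depth decrements by 1 at each level.
        max_spawn_depth=parent.max_spawn_depth - 1,
        # Allowed tools intersect; denied tools accumulate.
        allowed_tools=parent.allowed_tools & requested.allowed_tools,
        denied_tools=parent.denied_tools | requested.denied_tools,
        # Memory permissions use AND logic.
        can_write_memory=parent.can_write_memory and requested.can_write_memory,
    )


parent = AgentLimits(100, 80_000, 2, frozenset({"search", "fetch"}), frozenset({"shell"}), True)
wanted = AgentLimits(300, 20_000, 5, frozenset({"fetch", "write_file"}), frozenset({"email"}), True)
print(derive_child_limits(parent, wanted))
```

Because every rule only narrows what the child receives, a chain of spawns can never end up with more authority or budget than the root agent.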

Note: Budget checks run before every inference call via the Budget Pre-Flight system. If a budget is exhausted, the agent receives an "exhausted" status and stops processing.
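
A pre-inference check of this kind could look roughly like the Python sketch below. Only the "exhausted" status comes from the description above; the "ok" status, the usage counter names, and the function `preflight_status` are assumptions made for illustration and do not describe the actual Budget Pre-Flight implementation.

```python
# Maps each budget field from astra.yml to a hypothetical usage counter.
LIMIT_TO_USAGE = {
    "maxTotalTokens": "totalTokens",
    "maxCostCents": "costCents",
    "maxToolCalls": "toolCalls",
    "maxDuration": "elapsedMs",
}


def preflight_status(budget: dict, usage: dict) -> str:
    """Return "exhausted" if any configured limit has been reached, else "ok"."""
    for limit_key, usage_key in LIMIT_TO_USAGE.items():
        limit = budget.get(limit_key)
        if limit is not None and usage.get(usage_key, 0) >= limit:
            return "exhausted"
    return "ok"


budget = {"maxTotalTokens": 50_000, "maxCostCents": 50, "maxToolCalls": 20, "maxDuration": 60_000}
usage = {"totalTokens": 1_200, "costCents": 3, "toolCalls": 2, "elapsedMs": 900}

# The check would run before each inference call; the call is skipped once
# the status is "exhausted".
if preflight_status(budget, usage) == "ok":
    print("budget available, proceed with inference")
```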

Cost ledger

The cost ledger tracks execution metrics at three levels of granularity:

| Ledger | Scope | What it tracks |
|---|---|---|
| `role_ledger` | Global per agent role | Total executions, success rate, latency, tokens, cost, fitness score |
| `user_role_ledger` | Per user per agent role | Same metrics scoped to individual users |
| `execution_log` | Per execution | Individual execution records with latency, tokens, errors |
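
To show how per-execution records relate to the per-role aggregates, the Python sketch below rolls a list of execution records up into role-level metrics. The record field names (`role`, `latencyMs`, `tokens`, `error`) are hypothetical, and the fitness score is omitted because its formula is not described here.

```python
from collections import defaultdict


def summarize_by_role(records: list) -> dict:
    """Aggregate execution records into per-role totals and averages."""
    grouped = defaultdict(list)
    for record in records:
        grouped[record["role"]].append(record)

    summary = {}
    for role, rows in grouped.items():
        count = len(rows)
        summary[role] = {
            "executions": count,
            "successRate": sum(1 for r in rows if r["error"] is None) / count,
            "avgLatencyMs": sum(r["latencyMs"] for r in rows) / count,
            "totalTokens": sum(r["tokens"] for r in rows),
        }
    return summary


# Made-up sample records, for illustration only.
records = [
    {"role": "researcher", "latencyMs": 100, "tokens": 1_000, "error": None},
    {"role": "researcher", "latencyMs": 300, "tokens": 3_000, "error": "timeout"},
    {"role": "writer", "latencyMs": 50, "tokens": 500, "error": None},
]
print(summarize_by_role(records))
```

Adding a user identifier to the grouping key would give the equivalent of the per-user ledger.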

Leaderboard

The leaderboard ranks agents by usage and performance. Useful for identifying which agents consume the most resources and which perform best.

| Endpoint | Description |
|---|---|
| `GET /leaderboard` | Top agents by period (`day`, `week`, `month`). Default limit: 20, max: 100 |
| `GET /leaderboard/:agentId` | Hourly stats for a specific agent |
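
A client might validate its inputs against these constraints before calling the endpoint. In the Python sketch below, the query parameter names `period` and `limit` are assumptions (the table does not say how the values are passed), as is the choice to clamp an out-of-range limit rather than raise an error.

```python
VALID_PERIODS = ("day", "week", "month")
DEFAULT_LIMIT = 20
MAX_LIMIT = 100


def leaderboard_params(period: str, limit: int = DEFAULT_LIMIT) -> dict:
    """Validate the period and clamp the limit to the documented range."""
    if period not in VALID_PERIODS:
        raise ValueError(f"period must be one of {VALID_PERIODS}")
    return {"period": period, "limit": max(1, min(limit, MAX_LIMIT))}


print(leaderboard_params("week", 500))
print(leaderboard_params("month"))
```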

Differential privacy budget

For workspaces with differential privacy enabled, Open Astra tracks a privacy budget that limits the amount of information that can be extracted about any individual user. Owners can reset the budget when needed.

```bash
# Get differential privacy budget status
curl http://localhost:3000/security/dp-budget \
  -H "Authorization: Bearer ${JWT_TOKEN}"

# Reset DP budget (owner only)
curl -X POST http://localhost:3000/security/dp-budget/reset \
  -H "Authorization: Bearer ${JWT_TOKEN}" \
  -H "Content-Type: application/json" \
  -d '{ "uid": "uid_alice" }'
```

Related

Budget Pre-Flight: how budgets are checked before inference
Quotas: per-agent rate limits and token quotas
Cost Tagging: per-tool cost attribution
