Cost & Budget Management

AI inference costs can spiral fast when agents run unsupervised. Open Astra solves this with real-time cost dashboards, per-agent budget caps, and automatic enforcement — so you always know exactly what you're spending and no single agent can blow your budget. Most teams reduce their inference costs 30–50% in the first month just by seeing the breakdown.

Cost management is included in every license — Solo and Team. No usage fees, no per-seat charges, no surprise invoices from us. You only pay your inference providers directly.

Why this matters

  • Predictable spend — hard caps prevent runaway agents from burning through your inference budget overnight
  • Per-agent attribution — know exactly which agent is costing you money, down to the tool call
  • Provider visibility — see spend by provider (Claude, OpenAI, Groq, etc.) to optimize your provider mix
  • Budget inheritance — when agents spawn sub-agents, budgets flow downward with strict limits

Cost dashboard

The cost dashboard aggregates spend across your workspace by agent, provider, and time period. The default lookback is 3 months.

bash
# Get workspace cost summary (default: last 3 months)
curl http://localhost:3000/costs \
  -H "Authorization: Bearer ${JWT_TOKEN}"

# Response
{
  "workspaceId": "ws_abc123",
  "totalCost": 42.17,
  "byAgent": {
    "research-agent": 18.50,
    "code-agent": 14.22,
    "support-agent": 9.45
  },
  "byProvider": {
    "claude": 28.90,
    "openai": 10.12,
    "groq": 3.15
  },
  "byMonth": [
    { "month": "2026-01", "cost": 15.30 },
    { "month": "2026-02", "cost": 14.80 },
    { "month": "2026-03", "cost": 12.07 }
  ]
}
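
The summary can be post-processed client-side to see where spend concentrates. The following is a minimal sketch, assuming the response shape shown above; `cost_shares` is a hypothetical helper for illustration and not part of Open Astra.

```python
import json

# Sample payload copied from the response above (byMonth omitted for brevity).
SUMMARY_JSON = """
{
  "workspaceId": "ws_abc123",
  "totalCost": 42.17,
  "byAgent": {"research-agent": 18.50, "code-agent": 14.22, "support-agent": 9.45},
  "byProvider": {"claude": 28.90, "openai": 10.12, "groq": 3.15}
}
"""


def cost_shares(costs: dict) -> list:
    """Return (name, cost, percent of total) rows, highest cost first."""
    total = sum(costs.values())
    rows = []
    for name, cost in sorted(costs.items(), key=lambda kv: kv[1], reverse=True):
        pct = 100.0 * cost / total if total else 0.0
        rows.append((name, cost, round(pct, 1)))
    return rows


summary = json.loads(SUMMARY_JSON)
for name, cost, pct in cost_shares(summary["byAgent"]):
    print(f"{name:<16} ${cost:>6.2f}  {pct:>5.1f}%")
```

The same helper works on `byProvider`, which makes it easy to compare the provider mix against the per-agent split.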

Detailed breakdown

For finer-grained analysis, use the breakdown endpoint with optional date range filtering.

bash
# Detailed breakdown with date range
curl "http://localhost:3000/costs/breakdown?startDate=2026-02-01&endDate=2026-02-28" \
  -H "Authorization: Bearer ${JWT_TOKEN}"

# Response
{
  "meta": {
    "workspaceId": "ws_abc123",
    "months": 3,
    "startDate": "2026-02-01",
    "endDate": "2026-02-28",
    "generatedAt": "2026-03-07T12:00:00.000Z"
  },
  "totalCost": 14.80,
  "byAgent": { ... },
  "byProvider": { ... },
  "byTool": { ... }
}

Parameter  |  Type     |  Default  |  Description
months     |  integer  |  3        |  Lookback period in months (used when no date range is specified)
startDate  |  string   |           |  ISO date for range start
endDate    |  string   |           |  ISO date for range end
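
A request URL for the breakdown endpoint can be assembled from these parameters. This is a sketch only: `breakdown_url` is a hypothetical helper, the base URL is taken from the curl examples, and the choice to send a date range only when both dates are given is an assumption of this example rather than documented server behavior.

```python
from datetime import date
from urllib.parse import urlencode

BASE_URL = "http://localhost:3000"  # host and port from the curl examples


def breakdown_url(start_date=None, end_date=None, months=None):
    """Build a /costs/breakdown URL from the documented query parameters."""
    params = {}
    if start_date and end_date:
        # date.fromisoformat raises ValueError on malformed ISO dates.
        if date.fromisoformat(start_date) > date.fromisoformat(end_date):
            raise ValueError("startDate must not be after endDate")
        params = {"startDate": start_date, "endDate": end_date}
    elif months is not None:
        # months is only used when no date range is specified.
        params = {"months": months}
    query = urlencode(params)
    return f"{BASE_URL}/costs/breakdown" + (f"?{query}" if query else "")


print(breakdown_url("2026-02-01", "2026-02-28"))
print(breakdown_url(months=6))
```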

Budget caps

Every agent can have resource constraints defined in astra.yml or via the API. Budgets control tokens, cost, tool calls, execution time, and spawn permissions.

yaml
# astra.yml — per-agent budget constraints
agents:
  - id: research-agent
    budget:
      maxTotalTokens: 50000
      maxCostCents: 50        # $0.50 per request
      maxToolCalls: 20
      maxDuration: 60000      # 60 seconds
      maxChildAgents: 0       # cannot spawn sub-agents

System hard caps

These limits cannot be exceeded by any agent or swarm, regardless of configuration:

text
# System hard caps (cannot be exceeded by any agent)
maxPromptTokens:     500,000
maxCompletionTokens: 100,000
maxTotalTokens:      600,000
maxCostCents:        500        # $5.00 per swarm
maxToolCalls:        100
maxDuration:         300,000 ms  # 5 minutes
maxSpawnDepth:       5
maxChildAgents:      20
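
To illustrate how the hard caps bound a configured budget, here is a minimal sketch that limits each field to its cap. The cap values are copied from the list above; `clamp_to_hard_caps` is a hypothetical helper, and whether Open Astra clamps an oversized budget or rejects it is not stated here, so clamping is an assumption of this example.

```python
# Values copied from the system hard caps listed above.
SYSTEM_HARD_CAPS = {
    "maxPromptTokens": 500_000,
    "maxCompletionTokens": 100_000,
    "maxTotalTokens": 600_000,
    "maxCostCents": 500,
    "maxToolCalls": 100,
    "maxDuration": 300_000,  # milliseconds
    "maxSpawnDepth": 5,
    "maxChildAgents": 20,
}


def clamp_to_hard_caps(budget: dict) -> dict:
    """Return a copy of budget with every known field limited to its hard cap."""
    clamped = {}
    for name, value in budget.items():
        cap = SYSTEM_HARD_CAPS.get(name)
        # Fields without a known cap pass through unchanged.
        clamped[name] = value if cap is None else min(value, cap)
    return clamped


requested = {"maxTotalTokens": 1_000_000, "maxCostCents": 50, "maxToolCalls": 250}
print(clamp_to_hard_caps(requested))
```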

Role-based defaults

When an agent has a role but no explicit budget, these defaults apply:

text
# Role-based defaults (applied when no explicit budget is set)
researcher:  50,000 tokens  |  $0.50  |  20 tool calls  |  60s   |  no spawn
analyst:     80,000 tokens  |  $1.00  |  15 tool calls  |  90s   |  no spawn
writer:      30,000 tokens  |  $0.30  |   5 tool calls  |  30s   |  no spawn
mediator:    40,000 tokens  |  $0.40  |   5 tool calls  |  60s   |  no spawn
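
The fallback from explicit budget to role default can be sketched as follows. The mapping of the columns above onto `astra.yml` field names (tokens to `maxTotalTokens`, "no spawn" to `maxChildAgents: 0`) is an assumption based on the `research-agent` example, and `effective_budget` is a hypothetical helper. The sketch treats an explicit budget as replacing the defaults entirely; how partially specified budgets are merged is not described here.

```python
# Role defaults from the table above, expressed with astra.yml field names (assumed).
ROLE_DEFAULTS = {
    "researcher": {"maxTotalTokens": 50_000, "maxCostCents": 50, "maxToolCalls": 20,
                   "maxDuration": 60_000, "maxChildAgents": 0},
    "analyst": {"maxTotalTokens": 80_000, "maxCostCents": 100, "maxToolCalls": 15,
                "maxDuration": 90_000, "maxChildAgents": 0},
    "writer": {"maxTotalTokens": 30_000, "maxCostCents": 30, "maxToolCalls": 5,
               "maxDuration": 30_000, "maxChildAgents": 0},
    "mediator": {"maxTotalTokens": 40_000, "maxCostCents": 40, "maxToolCalls": 5,
                 "maxDuration": 60_000, "maxChildAgents": 0},
}


def effective_budget(role=None, explicit=None) -> dict:
    """Prefer an explicit budget; otherwise fall back to the role default."""
    if explicit:
        return dict(explicit)
    # Unknown or missing roles yield an empty dict in this sketch (assumption).
    return dict(ROLE_DEFAULTS.get(role, {}))


print(effective_budget(role="writer"))
print(effective_budget(role="writer", explicit={"maxCostCents": 10}))
```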

Budget inheritance

When a parent agent spawns a sub-agent, budgets follow strict inheritance rules:

  1. Child budget is always ≤ parent budget (hard constraint)
  2. maxSpawnDepth decrements by 1 at each level
  3. Tool permissions: the child's allowed tools are the intersection of the parent's and child's allowed tools, and its denied tools are the union of both denied lists
  4. Memory permissions use AND logic — child can only have what parent allows

Budget checks run before every inference call via the Budget Pre-Flight system. If a budget is exhausted, the agent receives an "exhausted" status and stops processing.
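
The inheritance rules and the pre-flight check can be sketched as below. Everything here is illustrative: `AgentLimits`, `derive_child`, and `preflight` are hypothetical names, the `"ok"` status is invented for the example (only `"exhausted"` appears in this page), and the sketch does not reflect how Open Astra implements these checks internally.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentLimits:
    budget: dict              # e.g. {"maxTotalTokens": 50000, "maxCostCents": 50}
    max_spawn_depth: int
    allowed_tools: frozenset
    denied_tools: frozenset
    memory: dict              # permission name -> bool


def derive_child(parent: AgentLimits, requested: AgentLimits) -> AgentLimits:
    """Apply the four inheritance rules to a requested child configuration."""
    if parent.max_spawn_depth <= 0:
        raise PermissionError("spawn depth exhausted")
    # Rule 1: every child limit is at most the parent's limit.
    budget = {k: min(requested.budget.get(k, v), v) for k, v in parent.budget.items()}
    # Rule 4: a memory permission holds only if parent AND child both have it.
    memory = {k: v and requested.memory.get(k, False) for k, v in parent.memory.items()}
    return AgentLimits(
        budget=budget,
        max_spawn_depth=parent.max_spawn_depth - 1,                    # Rule 2
        allowed_tools=parent.allowed_tools & requested.allowed_tools,  # Rule 3
        denied_tools=parent.denied_tools | requested.denied_tools,     # Rule 3
        memory=memory,
    )


def preflight(budget: dict, used: dict) -> str:
    """Return "exhausted" once any tracked counter has reached its limit."""
    spent = any(used.get(k, 0) >= limit for k, limit in budget.items())
    return "exhausted" if spent else "ok"


parent = AgentLimits({"maxTotalTokens": 50_000, "maxCostCents": 50}, 2,
                     frozenset({"search", "fetch"}), frozenset({"shell"}),
                     {"read": True, "write": False})
wanted = AgentLimits({"maxTotalTokens": 80_000, "maxCostCents": 20}, 5,
                     frozenset({"fetch", "write_file"}), frozenset({"email"}),
                     {"read": True, "write": True})
child = derive_child(parent, wanted)
print(child)
print(preflight(child.budget, {"maxCostCents": 20}))
```

Note that the child's requested spawn depth is ignored on purpose: the depth is derived only from the parent, so a child can never grant itself more levels than it inherited.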

Cost ledger

The cost ledger tracks execution metrics at three levels:

Ledger            |  Scope                    |  What it tracks
role_ledger       |  Global per agent role    |  Total executions, success rate, latency, tokens, cost, fitness score
user_role_ledger  |  Per user per agent role  |  Same metrics scoped to individual users
execution_log     |  Per execution            |  Individual execution records with latency, tokens, errors

Leaderboard

The leaderboard ranks agents by usage and performance. Useful for identifying which agents consume the most resources and which perform best.

Endpoint                  |  Description
GET /leaderboard          |  Top agents by period (day, week, month). Default limit: 20, max: 100
GET /leaderboard/:agentId |  Hourly stats for a specific agent

Differential privacy budget

For workspaces with differential privacy enabled, Open Astra tracks a privacy budget that limits the amount of information that can be extracted about any individual user. Owners can reset the budget when needed.

bash
# Get differential privacy budget status
curl http://localhost:3000/security/dp-budget \
  -H "Authorization: Bearer ${JWT_TOKEN}"

# Reset DP budget (owner only)
curl -X POST http://localhost:3000/security/dp-budget/reset \
  -H "Authorization: Bearer ${JWT_TOKEN}" \
  -H "Content-Type: application/json" \
  -d '{ "uid": "uid_alice" }'