Agent Ops

Run a team of specialized agents like a well-organized engineering org — with automatic task routing, cost controls, health monitoring, and daily standups. Think of it as having a 24/7 engineering team that costs pennies per task and never burns out.

What you'll have

Specialized agents for different domains (code, data, ops, triage)
Work queues that route tasks to the right agent automatically
Per-agent budgets that prevent runaway costs
Health dashboards showing latency, error rates, and throughput
Daily standups and cost reports

Step 1: Define your agents

yaml

# astra.yml — agent ops setup
agents:
  - id: triage-agent
    systemPromptTemplate: |
      You triage incoming tasks. Assess complexity, required skills,
      and priority. Route to the appropriate work queue.
    providers: [groq]
    budget:
      maxCostCents: 10
      maxToolCalls: 5

  - id: code-agent
    systemPromptTemplate: |
      You are a senior software engineer. Write clean, tested code.
    providers: [claude]
    budget:
      maxCostCents: 100
      maxToolCalls: 30

  - id: data-agent
    systemPromptTemplate: |
      You are a data analyst. Query databases, analyze datasets,
      and produce visualizations.
    providers: [claude, gemini]
    budget:
      maxCostCents: 80
      maxToolCalls: 20

  - id: ops-agent
    systemPromptTemplate: |
      You manage infrastructure. Monitor services, deploy updates,
      and respond to incidents.
    providers: [claude]
    budget:
      maxCostCents: 150
      maxToolCalls: 25

Step 2: Set up work queues

Work queues distribute tasks to agents by domain. Agents dequeue tasks, and idle agents can steal tasks from other queues for load balancing.

bash

# Create work queues for different task types
curl -X POST http://localhost:3000/work-queues \
  -H "Authorization: Bearer ${JWT_TOKEN}" \
  -H "Content-Type: application/json" \
  -d '{ "name": "engineering", "teamId": "eng-team" }'

curl -X POST http://localhost:3000/work-queues \
  -H "Authorization: Bearer ${JWT_TOKEN}" \
  -H "Content-Type: application/json" \
  -d '{ "name": "data-analysis", "teamId": "data-team" }'

Step 3: Dynamic team formation

For complex tasks that need multiple agents, auto-form a team. The system scores agents on skill match (60%) and reputation (40%) to select the best combination.

bash

# Auto-form a team for a complex task
curl -X POST http://localhost:3000/teams/form \
  -H "Authorization: Bearer ${JWT_TOKEN}" \
  -H "Content-Type: application/json" \
  -d '{
    "task": "Migrate user auth from sessions to JWT with zero downtime",
    "maxMembers": 3,
    "requiredSkills": ["security", "backend", "deployment"]
  }'

Step 4: Monitor and optimize

bash

# Check team health
curl "http://localhost:3000/team-health?window=day" \
  -H "Authorization: Bearer ${JWT_TOKEN}"

# Check cost dashboard
curl http://localhost:3000/costs \
  -H "Authorization: Bearer ${JWT_TOKEN}"

# Generate daily standup
curl -X POST http://localhost:3000/standups/generate \
  -H "Authorization: Bearer ${JWT_TOKEN}"

Optimization tips

Use Groq for the triage agent — fast and cheap, perfect for routing decisions
Set tight budgets on triage (low cost, few tool calls) and generous budgets on code agents (they need room to iterate)
Review the reputation scores weekly to identify agents that need prompt tuning
Use parameter sweeps to find the optimal temperature for each agent
Set up escalation chains so blocked agents route to more capable ones instead of spinning