Meta-Controller
The meta-controller manages multiple concurrent swarms. It monitors resource allocation across swarms, rebalances overloaded swarms, detects overflow conditions in which one swarm consumes a disproportionate share of resources, and dynamically delegates tasks between swarms to maintain throughput and fairness.
How it works
The meta-controller operates as a supervisory process that sits above individual swarms. It has three core subsystems:
- Swarm registry — maintains a live index of all active swarms, their root agents, current agent counts, and resource consumption metrics. Every swarm registers itself on creation and deregisters on teardown.
- Resource monitor — polls the registry at a configurable interval and computes per-swarm token spend rate, agent count, and queue depth. If any swarm exceeds the `rebalanceThreshold` on any metric, a rebalance is triggered.
- Rebalancing — when a swarm is overloaded, the meta-controller can pause lower-priority agents within that swarm, migrate queued subtasks to an underutilized swarm, or reject new spawn requests until resource usage falls below the threshold.
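The resource monitor's threshold check can be sketched as follows. This is a minimal illustration, not the meta-controller's actual implementation: the `SwarmMetrics` type, the function names, and the assumption that every metric is already normalized to a fraction of its capacity are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SwarmMetrics:
    """Hypothetical sample for one swarm; each value is assumed to be
    a fraction of that metric's capacity (0.0 to 1.0 and above)."""
    swarm_id: str
    token_spend_rate: float
    agent_count: float
    queue_depth: float


def breached_metrics(sample: SwarmMetrics, rebalance_threshold: float = 0.80) -> list[str]:
    """Return the names of the metrics that exceed the threshold."""
    values = {
        "token_spend_rate": sample.token_spend_rate,
        "agent_count": sample.agent_count,
        "queue_depth": sample.queue_depth,
    }
    # "Exceeds" is read as strictly greater than the threshold.
    return [name for name, value in values.items() if value > rebalance_threshold]


def swarms_to_rebalance(samples: list[SwarmMetrics], rebalance_threshold: float = 0.80) -> list[str]:
    """A swarm needs rebalancing if any single metric is over the threshold."""
    return [s.swarm_id for s in samples if breached_metrics(s, rebalance_threshold)]
```

In this sketch a single breached metric is enough to flag a swarm, which matches the "on any metric" wording above.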
Configuration
metaController:
  enabled: true
  maxConcurrentSwarms: 10            # Hard cap on simultaneously active swarms
  rebalanceThreshold: 0.80           # Fraction of capacity that triggers rebalancing
  overflowStrategy: queue            # queue | reject | migrate
  pollIntervalMs: 5000               # How often the resource monitor samples swarms
  priorityField: metadata.priority   # Agent field used for priority ordering during rebalance

The `overflowStrategy` controls what happens when a swarm exceeds capacity:
| Strategy | Behavior |
|---|---|
| `queue` | New spawn requests are held in a FIFO queue until capacity is available. Default. |
| `reject` | New spawn requests beyond capacity are rejected with a 503 error. |
| `migrate` | Queued subtasks are delegated to another swarm with available capacity. Requires compatible root agents across swarms. |
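The three strategies could be dispatched roughly as shown below. This is a sketch under stated assumptions: `handle_overflow`, `CapacityExceeded`, and the `peer_capacity` mapping are hypothetical names, and the fallback to queueing when no peer swarm has room is an assumption, since the behavior in that case is not specified here.

```python
from collections import deque
from enum import Enum


class OverflowStrategy(Enum):
    QUEUE = "queue"
    REJECT = "reject"
    MIGRATE = "migrate"


class CapacityExceeded(Exception):
    """Hypothetical stand-in for the 503 response returned by `reject`."""
    status_code = 503


def handle_overflow(strategy: OverflowStrategy, request: str,
                    pending: deque, peer_capacity: dict[str, int]) -> str:
    """Apply one overflow strategy to a request that arrived over capacity.

    pending       -- FIFO queue of held requests
    peer_capacity -- free slots per peer swarm (hypothetical shape)
    """
    if strategy is OverflowStrategy.QUEUE:
        pending.append(request)  # held until capacity frees up
        return "queued"
    if strategy is OverflowStrategy.REJECT:
        raise CapacityExceeded(f"no capacity for {request!r}")
    # MIGRATE: hand the work to the first peer swarm with a free slot.
    for swarm_id, free in peer_capacity.items():
        if free > 0:
            peer_capacity[swarm_id] = free - 1
            return f"migrated:{swarm_id}"
    # Assumption: with no peer available, fall back to queueing.
    pending.append(request)
    return "queued"
```

The compatibility check between root agents that `migrate` requires is left out of the sketch.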
Swarm lifecycle
The meta-controller tracks swarms through four lifecycle stages:
- Create — A swarm is registered when its root agent receives its first task. The meta-controller assigns it an execution ID and begins monitoring. If `maxConcurrentSwarms` is already reached, the request is handled per `overflowStrategy`.
- Monitor — The resource monitor periodically samples each active swarm. Metrics are written to a rolling buffer used to compute the health scorecard for each swarm's constituent agents.
- Rebalance — When a swarm breaches `rebalanceThreshold`, the meta-controller pauses the lowest-priority sub-agents (by `priorityField`) and redistributes their queued work. Rebalancing events are emitted on the event bus as `swarm.rebalanced`.
- Teardown — When a swarm's root agent completes synthesis, the swarm is deregistered. Any child agents still running are terminated. Resources are released and the execution ID is archived for audit.
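One way to picture the four stages is as a small state machine. The sketch below is illustrative only: the `SwarmLifecycle` class and the `ALLOWED` transition table are hypothetical, and the exact set of legal transitions (for example, teardown directly from a rebalance) is an assumption.

```python
from enum import Enum


class Stage(Enum):
    CREATE = "create"
    MONITOR = "monitor"
    REBALANCE = "rebalance"
    TEARDOWN = "teardown"


# Assumed transition table: a swarm alternates between monitoring and
# rebalancing until it is torn down; teardown is terminal.
ALLOWED = {
    Stage.CREATE: {Stage.MONITOR},
    Stage.MONITOR: {Stage.REBALANCE, Stage.TEARDOWN},
    Stage.REBALANCE: {Stage.MONITOR, Stage.TEARDOWN},
    Stage.TEARDOWN: set(),
}


class SwarmLifecycle:
    def __init__(self, execution_id: str) -> None:
        self.execution_id = execution_id
        self.stage = Stage.CREATE
        self.events: list[str] = []  # stand-in for the event bus

    def advance(self, target: Stage) -> None:
        """Move to `target`, rejecting transitions the table does not allow."""
        if target not in ALLOWED[self.stage]:
            raise ValueError(f"cannot go from {self.stage.value} to {target.value}")
        self.stage = target
        if target is Stage.REBALANCE:
            # Simplification: record the event when the stage is entered.
            self.events.append("swarm.rebalanced")
```

A swarm that is rebalanced once before finishing would pass through create, monitor, rebalance, monitor, and teardown, leaving a single `swarm.rebalanced` entry in `events`.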
API endpoints
The meta-controller exposes a REST API for managing swarms:
# Create a new swarm (assigns an execution ID and registers with meta-controller)
POST /swarms
{
  "rootAgentId": "orchestrator",
  "task": "Audit the authentication module for security vulnerabilities",
  "metadata": { "priority": 2 }
}
# List all active swarms with resource usage
GET /swarms
# Tear down a swarm and terminate all child agents
DELETE /swarms/:id

The `GET /swarms` response includes per-swarm agent count, token spend rate, queue depth, and current lifecycle stage. Use this endpoint to build dashboards or drive auto-scaling decisions.
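A client for these three endpoints might build its requests as sketched below, using only the Python standard library. The `BASE_URL` value and the helper function names are hypothetical; only the paths, methods, and request body fields come from the listing above.

```python
import json
from urllib.parse import quote
from urllib.request import Request

# Hypothetical address; substitute wherever the meta-controller is served.
BASE_URL = "http://localhost:8080"


def create_swarm_request(root_agent_id: str, task: str, priority: int) -> Request:
    """Build the POST /swarms request with the documented body fields."""
    body = json.dumps({
        "rootAgentId": root_agent_id,
        "task": task,
        "metadata": {"priority": priority},
    }).encode("utf-8")
    return Request(f"{BASE_URL}/swarms", data=body,
                   headers={"Content-Type": "application/json"}, method="POST")


def list_swarms_request() -> Request:
    """Build the GET /swarms request."""
    return Request(f"{BASE_URL}/swarms", method="GET")


def delete_swarm_request(swarm_id: str) -> Request:
    """Build the DELETE /swarms/:id request, escaping the ID for the path."""
    return Request(f"{BASE_URL}/swarms/{quote(swarm_id, safe='')}", method="DELETE")
```

Each helper only constructs the request; passing the result to `urllib.request.urlopen` would send it to a running server.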