Health Scorecard
The health scorecard provides a real-time reliability and performance dashboard for each agent. It tracks three dimensions: reliability (uptime, crash rate), efficiency (token spend, tool call ratio), and memory hygiene (staleness, contradiction rate). Scores are normalized to a 0–1 scale and updated after every agent turn.
Metrics
Each agent receives three sub-scores, each computed from underlying signals:
| Score | Formula | Signals used |
|---|---|---|
| reliability | 1 - (crashRate * 0.6 + downtimeRatio * 0.4) | Consecutive failures, uptime over rolling 24h window |
| efficiency | 1 - clamp((tokenSpend / tokenBudget) * 0.5 + (toolCallRatio - 1) * 0.5, 0, 1) | Token spend vs. budget, ratio of tool calls to successful outcomes |
| memoryHygiene | 1 - (stalenessRatio * 0.5 + contradictionRate * 0.5) | Fraction of stale memory entries, rate of contradiction detections |
The composite health score is the weighted average: reliability * 0.4 + efficiency * 0.35 + memoryHygiene * 0.25.
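The formulas above can be sketched in a few lines of Python. This is an illustrative sketch rather than the runtime's actual implementation: the `Signals` dataclass, the function names, and the snake_case field names are assumptions made for the example; only the formulas and weights come from the table.

```python
from dataclasses import dataclass


def clamp(value: float, low: float, high: float) -> float:
    """Restrict value to the closed interval [low, high]."""
    return max(low, min(high, value))


@dataclass
class Signals:
    """Hypothetical container for the raw signals named in the table."""
    crash_rate: float
    downtime_ratio: float
    token_spend: float
    token_budget: float
    tool_call_ratio: float
    staleness_ratio: float
    contradiction_rate: float


def composite_score(reliability: float, efficiency: float, memory_hygiene: float) -> float:
    """Weighted average of the three sub-scores (weights 0.4 / 0.35 / 0.25)."""
    return reliability * 0.4 + efficiency * 0.35 + memory_hygiene * 0.25


def compute_scores(s: Signals) -> dict[str, float]:
    """Apply the documented formulas and round each score to four decimals."""
    reliability = 1 - (s.crash_rate * 0.6 + s.downtime_ratio * 0.4)
    efficiency = 1 - clamp(
        (s.token_spend / s.token_budget) * 0.5 + (s.tool_call_ratio - 1) * 0.5,
        0,
        1,
    )
    memory_hygiene = 1 - (s.staleness_ratio * 0.5 + s.contradiction_rate * 0.5)
    composite = composite_score(reliability, efficiency, memory_hygiene)
    return {
        "reliability": round(reliability, 4),
        "efficiency": round(efficiency, 4),
        "memoryHygiene": round(memory_hygiene, 4),
        "composite": round(composite, 4),
    }


# Example input values are made up for illustration.
example = Signals(
    crash_rate=0.05,
    downtime_ratio=0.10,
    token_spend=40_000,
    token_budget=100_000,
    tool_call_ratio=1.2,
    staleness_ratio=0.2,
    contradiction_rate=0.1,
)
print(compute_scores(example))
```

Note that only the efficiency formula is clamped. The other two stay within 0–1 as long as their input ratios do.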
Viewing scores
Scorecard data is available via the REST API. Both endpoints return scores rounded to four decimal places along with the raw signal values used to compute them.
Fetch the scorecard for a single agent:
GET /agents/:id/health
# Response
{
  "agentId": "research-agent",
  "scores": {
    "reliability": 0.9400,
    "efficiency": 0.8120,
    "memoryHygiene": 0.7760,
    "composite": 0.8542
  },
  "status": "healthy",
  "updatedAt": "2025-11-14T09:31:00Z"
}
Fetch scorecards for all agents in the workspace:
GET /agents/health
# Returns an array of scorecard objects, sorted by composite score ascending
# so the most degraded agents appear first.
Score thresholds
Each composite score falls into one of three status bands:
| Status | Composite score range | Behavior |
|---|---|---|
| Healthy | > 0.8 | No action taken. Agent operates normally. |
| Degraded | 0.6 – 0.8 (inclusive) | Warning emitted on event bus. Alert webhooks fire if configured. |
| Critical | < 0.6 | Alert webhooks fire if configured, and the agent is flagged for review. A self-healing restart is triggered if selfHealing.enabled is true. |
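As a rough sketch, the band boundaries in the table translate into a small classifier. The function names here are hypothetical. The lowercase status strings follow the `status` field in the sample response, and the example assumes that scores of exactly 0.6 and 0.8 count as degraded, which is what the `>` and `<` comparisons in the table imply.

```python
def status_for(composite: float) -> str:
    """Map a composite score to its status band."""
    if composite > 0.8:
        return "healthy"
    if composite >= 0.6:
        return "degraded"
    return "critical"


def most_degraded_first(scorecards: list[dict]) -> list[dict]:
    """Order scorecards by composite score ascending, like GET /agents/health."""
    return sorted(scorecards, key=lambda card: card["scores"]["composite"])


# Made-up scorecards for illustration.
cards = [
    {"agentId": "agent-a", "scores": {"composite": 0.91}},
    {"agentId": "agent-b", "scores": {"composite": 0.55}},
    {"agentId": "agent-c", "scores": {"composite": 0.72}},
]
for card in most_degraded_first(cards):
    print(card["agentId"], status_for(card["scores"]["composite"]))
```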
Alerts
Configure webhook alerts to be notified when an agent enters the degraded or critical band. Alerts fire at most once per cooldownMinutes window per agent to prevent notification spam:
healthScorecard:
  alerts:
    enabled: true
    cooldownMinutes: 15
    webhooks:
      - url: https://hooks.example.com/openastra
        on:
          - degraded
          - critical
        headers:
          Authorization: Bearer ${WEBHOOK_SECRET}
    # Optional: only alert if a specific sub-score crosses a threshold
    thresholds:
      reliability: 0.7
      efficiency: 0.6
      memoryHygiene: 0.65
How scores are calculated
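Scores are computed at the end of every agent turn using a rolling window of the last 50 turns (configurable via healthScorecard.windowSize). A single failure therefore does not immediately crash the score; its effect is amortized across the window. With the default window, one crashed turn out of 50 contributes 0.02 to crashRate, which lowers reliability by 0.012.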
Raw signals are collected passively from the agent runtime:
- crashRate: fraction of turns in the window that ended in an unhandled error
- downtimeRatio: fraction of the last 24 hours the agent was in a paused or failed state
- tokenSpend: total tokens consumed in the window vs. the agent's configured quotas.tokenBudget
- toolCallRatio: average number of tool calls per turn; values above 1.0 inflate the efficiency penalty
- stalenessRatio: fraction of the agent's memory entries older than memory.stalenessThresholdDays
- contradictionRate: fraction of memory writes in the window that triggered a contradiction detection
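To illustrate how a rolling window amortizes failures, here is a minimal sketch of a window-based crashRate tracker. The `TurnWindow` class and its method names are hypothetical and not part of the runtime. The sketch also assumes that, before the window fills, the rate is computed over the turns recorded so far.

```python
from collections import deque


class TurnWindow:
    """Keeps the crash outcome of the most recent turns (hypothetical helper)."""

    def __init__(self, window_size: int = 50) -> None:
        # A bounded deque silently drops the oldest turn once it is full.
        self._crashed = deque(maxlen=window_size)

    def record_turn(self, crashed: bool) -> None:
        """Record whether a turn ended in an unhandled error."""
        self._crashed.append(crashed)

    def crash_rate(self) -> float:
        """Fraction of turns in the window that crashed (0.0 if empty)."""
        if not self._crashed:
            return 0.0
        return sum(self._crashed) / len(self._crashed)


window = TurnWindow(window_size=50)
for _ in range(49):
    window.record_turn(crashed=False)
window.record_turn(crashed=True)
print(window.crash_rate())
```

Once 50 further turns have been recorded, the crashed turn falls out of the window and no longer affects the rate.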
The window size and the composite weights are set in the scorecard configuration:
healthScorecard:
  enabled: true
  windowSize: 50 # Number of turns to include in rolling window
  weights:
    reliability: 0.40
    efficiency: 0.35
    memoryHygiene: 0.25