Agents

Agent Loop

Every message processed by Open Astra passes through a deterministic 7-step execution cycle defined in agents/loop.ts. Understanding this loop is essential for debugging agent behavior and optimizing performance.

Loop overview

Incoming message
  │
  ▼
1. Resolve session         {uid, surface, surfaceId} → session record
  │
  ▼
2. Persist user message    Save to session_messages table in PostgreSQL
  │
  ▼
3. Compaction check        If context > threshold, summarize old messages
  │
  ▼
4. Assemble context        SOUL.md → workspace files → system prompt → memory → history
  │
  ▼
5. Inference call          Provider client → ChatMessage[] → response
  │
  ▼
6. Tool loop               Execute tool calls → feed results back (max 5 rounds)
  │
  ▼
7. Post-turn save          Persist response + auto-save memory + emit events + fire webhooks

Step 1 — Resolve session

Sessions are identified by the composite key {uid, surface, surfaceId}. The uid is the authenticated user, surface is the channel (e.g. telegram, api, cli), and surfaceId is the channel-specific identifier (e.g. a Telegram chat ID).

If no session exists for this key, one is created. If a session exists, the existing conversation history is loaded from PostgreSQL.
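The find-or-create behavior can be sketched as follows. This is a minimal in-memory illustration, not the real implementation in agents/loop.ts: `Session`, `resolveSession`, and the `Map`-backed store are hypothetical stand-ins for the PostgreSQL-backed lookup.

```typescript
// Hypothetical sketch of composite-key session resolution.
interface SessionKey {
  uid: string;       // authenticated user
  surface: string;   // channel, e.g. "telegram", "api", "cli"
  surfaceId: string; // channel-specific ID, e.g. a Telegram chat ID
}

interface Session {
  key: SessionKey;
  history: string[]; // simplified; the real history is ChatMessage[]
}

const sessions = new Map<string, Session>();

function sessionId(key: SessionKey): string {
  return `${key.uid}:${key.surface}:${key.surfaceId}`;
}

// Find-or-create: return the existing session for this key,
// or create a fresh one with empty history.
function resolveSession(key: SessionKey): Session {
  const id = sessionId(key);
  let session = sessions.get(id);
  if (!session) {
    session = { key, history: [] };
    sessions.set(id, session);
  }
  return session;
}
```

The same `{uid, surface, surfaceId}` triple always maps to the same session, so a user's conversation continues across messages on the same channel.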

Step 2 — Persist user message

The user's message is written to the session_messages table before inference begins. This ensures durability — if the agent or gateway crashes mid-turn, the user message is not lost and can be retried.

Step 3 — Compaction check

If the current context exceeds the configured compaction threshold (default 85% of the model's max context tokens), the oldest messages are summarized using a lightweight model call. The summary replaces the raw messages, keeping context within budget while preserving key information.

This threshold is configured via selfHealing.compactionThreshold in astra.yml. See Self-Healing for details.
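The threshold check itself is a simple fraction comparison. The sketch below uses the documented 85% default; the token-counting and summarization steps are omitted, and the names are illustrative rather than taken from the codebase.

```typescript
// Hypothetical sketch of the compaction decision.
interface CompactionConfig {
  maxContextTokens: number;    // the model's context window
  compactionThreshold: number; // fraction of the window, default 0.85
}

function needsCompaction(currentTokens: number, cfg: CompactionConfig): boolean {
  return currentTokens > cfg.maxContextTokens * cfg.compactionThreshold;
}
```

For a 128k-token model at the default threshold, compaction triggers once the context passes roughly 108,800 tokens.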

Step 4 — Assemble context

The context assembler in context/assembler.ts builds the final ChatMessage[] array. The order is fixed and each layer has a token budget. See Context Assembly for full details.
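The fixed ordering with per-layer budgets can be sketched like this. The layer names follow the order given above; the budget numbers, naive truncation, and function names are illustrative assumptions, not the actual context/assembler.ts logic.

```typescript
// Hypothetical sketch of fixed-order context assembly with token budgets.
interface ContextLayer {
  name: string;         // e.g. "SOUL.md", "workspace", "system", "memory", "history"
  content: string;
  budgetTokens: number; // per-layer token budget
}

function assembleContext(
  layers: ContextLayer[],
  countTokens: (s: string) => number,
): string[] {
  // Layers are emitted in the order given; each is trimmed to its budget.
  // (Naive character truncation here; the real assembler is smarter.)
  return layers.map((layer) => {
    let content = layer.content;
    while (countTokens(content) > layer.budgetTokens) {
      content = content.slice(0, -1);
    }
    return `[${layer.name}] ${content}`;
  });
}
```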

Step 5 — Inference call

The assembled ChatMessage[] is passed to the configured provider client via inference/factory.ts. The factory caches clients by provider:modelId:endpoint key and wraps each in a resilient layer with retry logic and optional fallback provider.
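Caching by the `provider:modelId:endpoint` key can be sketched as below. `ProviderClient` and the placeholder construction are hypothetical stand-ins; the real factory in inference/factory.ts wraps each client with retries and fallback.

```typescript
// Hypothetical sketch of client caching keyed by provider:modelId:endpoint.
interface ProviderClient {
  complete(messages: unknown[]): Promise<string>;
}

const clientCache = new Map<string, ProviderClient>();

function getClient(provider: string, modelId: string, endpoint: string): ProviderClient {
  const key = `${provider}:${modelId}:${endpoint}`;
  let client = clientCache.get(key);
  if (!client) {
    // Placeholder; the real factory builds a resilient client here.
    client = { complete: async () => "" };
    clientCache.set(key, client);
  }
  return client;
}
```

Repeated calls with the same triple reuse one client, so connection setup and retry state are shared across turns.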

The streaming variant (agents/loop-stream.ts) yields an AsyncGenerator<AgentStreamEvent> for SSE and WebSocket delivery, emitting each token delta as it arrives.

Step 6 — Tool loop

If the model returns one or more tool calls, the loop executes them sequentially:

  1. Validates the call against the agent's allow/deny policy
  2. Looks up the tool in the registry by name
  3. Parses parameters against the tool's Zod schema
  4. Calls execute(params, ctx) and collects the result
  5. Appends { role: "tool", ... } messages to context
  6. Calls the model again with the updated context

This repeats for at most 5 rounds, after which the loop stops. Individual tool failures return an error string in ToolResult.error rather than throwing — the agent loop continues instead of crashing.
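The bounded loop above can be sketched as follows. Registry lookup, policy checks, and Zod validation are omitted, and `callModel`/`executeTool` are hypothetical parameters standing in for the real machinery.

```typescript
// Hypothetical sketch of the bounded tool loop (max 5 rounds).
interface ToolCall { name: string; params: unknown; }
interface ModelResponse { text: string; toolCalls: ToolCall[]; }

const MAX_TOOL_ROUNDS = 5;

async function runToolLoop(
  callModel: (context: string[]) => Promise<ModelResponse>,
  executeTool: (call: ToolCall) => Promise<string>,
  context: string[],
): Promise<string> {
  let response = await callModel(context);
  for (let round = 0; round < MAX_TOOL_ROUNDS && response.toolCalls.length > 0; round++) {
    for (const call of response.toolCalls) {
      // A failing tool yields an error string rather than a thrown
      // exception, so the loop continues instead of crashing.
      const result = await executeTool(call).catch((e) => `error: ${e}`);
      context.push(`tool:${call.name} → ${result}`);
    }
    // Call the model again with the tool results appended.
    response = await callModel(context);
  }
  return response.text;
}
```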

Step 7 — Post-turn save

After the final response is obtained, agents/post-turn-save.ts runs these side effects:

  • Persists the response to session_messages
  • Extracts facts, decisions, and entities for memory storage across appropriate tiers
  • Emits an agent.completed event on the typed event bus
  • Fires any registered outbound webhooks with the event payload
  • Records token usage in the billing tables for the cost dashboard
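The ordered side effects can be sketched as a single function over injected dependencies. All interface and function names here are hypothetical stand-ins for agents/post-turn-save.ts.

```typescript
// Hypothetical sketch of the post-turn side effects, run in order.
interface PostTurnDeps {
  persistMessage: (text: string) => void;          // → session_messages
  saveMemory: (text: string) => void;              // facts / decisions / entities
  emitEvent: (event: string, payload: unknown) => void;   // typed event bus
  fireWebhooks: (event: string, payload: unknown) => void;
  recordUsage: (tokens: number) => void;           // billing tables
}

function postTurnSave(deps: PostTurnDeps, response: string, tokensUsed: number): void {
  deps.persistMessage(response);
  deps.saveMemory(response);
  deps.emitEvent("agent.completed", { response });
  deps.fireWebhooks("agent.completed", { response });
  deps.recordUsage(tokensUsed);
}
```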

Streaming variant

The streaming loop (agents/loop-stream.ts) follows the same steps but yields events rather than returning a final result. It emits the following event types:

Event type                  When emitted
agent.stream.delta          Each token chunk from the provider
agent.stream.tool_call      When a tool call is detected
agent.stream.tool_result    After a tool finishes executing
agent.stream.complete       When the full turn is done
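One way to model these events is a discriminated union that consumers can switch on. This is a hedged sketch implied by the event types above; the actual AgentStreamEvent shape in agents/loop-stream.ts may carry different fields.

```typescript
// Hypothetical sketch of the stream event union.
type AgentStreamEvent =
  | { type: "agent.stream.delta"; text: string }
  | { type: "agent.stream.tool_call"; name: string; params: unknown }
  | { type: "agent.stream.tool_result"; name: string; result: string }
  | { type: "agent.stream.complete"; fullText: string };

// Consumers narrow on the discriminant to route events to SSE or WebSocket frames.
function describe(event: AgentStreamEvent): string {
  switch (event.type) {
    case "agent.stream.delta": return `delta: ${event.text}`;
    case "agent.stream.tool_call": return `tool call: ${event.name}`;
    case "agent.stream.tool_result": return `tool result: ${event.name}`;
    case "agent.stream.complete": return "complete";
  }
}
```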