Agents

Agent Loop

Every message processed by Open Astra passes through a deterministic 7-step execution cycle defined in agents/loop.ts. Understanding this loop is essential for debugging agent behavior and optimizing performance.

Loop overview

Incoming message
  │
  ▼
1. Resolve session         {uid, surface, surfaceId} → session record
  │
  ▼
2. Persist user message    Save to session_messages table in PostgreSQL
  │
  ▼
3. Compaction check        If context > threshold, summarize old messages
  │
  ▼
4. Assemble context        SOUL.md → workspace files → system prompt → memory → history
  │
  ▼
5. Inference call          Provider client → ChatMessage[] → response
  │
  ▼
6. Tool loop               Execute tool calls → feed results back (max 5 rounds)
  │
  ▼
7. Post-turn save          Persist response + auto-save memory + emit events + fire webhooks

Step 1 — Resolve session

Sessions are identified by the composite key {uid, surface, surfaceId}. The uid is the authenticated user, surface is the channel (e.g. telegram, api, cli), and surfaceId is the channel-specific identifier (e.g. a Telegram chat ID).

If no session exists for this key, one is created. If a session exists, the existing conversation history is loaded from PostgreSQL.
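The find-or-create behavior can be sketched as follows. This is a minimal in-memory illustration, not the real implementation in agents/loop.ts: `Session`, `resolveSession`, and the `Map`-backed store are hypothetical stand-ins for the PostgreSQL-backed lookup.

```typescript
// Hypothetical sketch of composite-key session resolution.
interface SessionKey {
  uid: string;       // authenticated user
  surface: string;   // channel, e.g. "telegram", "api", "cli"
  surfaceId: string; // channel-specific ID, e.g. a Telegram chat ID
}

interface Session {
  key: SessionKey;
  history: string[]; // simplified; the real history is ChatMessage[]
}

const sessions = new Map<string, Session>();

function sessionId(key: SessionKey): string {
  return `${key.uid}:${key.surface}:${key.surfaceId}`;
}

// Find-or-create: return the existing session for this key,
// or create a fresh one with empty history.
function resolveSession(key: SessionKey): Session {
  const id = sessionId(key);
  let session = sessions.get(id);
  if (!session) {
    session = { key, history: [] };
    sessions.set(id, session);
  }
  return session;
}
```

The same `{uid, surface, surfaceId}` triple always maps to the same session, so a user's conversation continues across messages on the same channel.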

Step 2 — Persist user message

The user's message is written to the session_messages table before inference begins. This ensures durability — if the agent or gateway crashes mid-turn, the user message is not lost and can be retried.

Step 3 — Compaction check

If the current context exceeds the configured compaction threshold (default 85% of the model's max context tokens), the oldest messages are summarized using a lightweight model call. The summary replaces the raw messages, keeping context within budget while preserving key information.

This threshold is configured via selfHealing.compactionThreshold in astra.yml. See Self-Healing for details.
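The threshold check itself is a simple fraction comparison. The sketch below uses the documented 85% default; the token-counting and summarization steps are omitted, and the names are illustrative rather than taken from the codebase.

```typescript
// Hypothetical sketch of the compaction decision.
interface CompactionConfig {
  maxContextTokens: number;    // the model's context window
  compactionThreshold: number; // fraction of the window, default 0.85
}

function needsCompaction(currentTokens: number, cfg: CompactionConfig): boolean {
  return currentTokens > cfg.maxContextTokens * cfg.compactionThreshold;
}
```

For a 128k-token model at the default threshold, compaction triggers once the context passes roughly 108,800 tokens.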

Step 4 — Assemble context

The context assembler in context/assembler.ts builds the final ChatMessage[] array. The order is fixed and each layer has a token budget. See Context Assembly for full details.
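The fixed ordering with per-layer budgets can be sketched like this. The layer names follow the order given above; the budget numbers, naive truncation, and function names are illustrative assumptions, not the actual context/assembler.ts logic.

```typescript
// Hypothetical sketch of fixed-order context assembly with token budgets.
interface ContextLayer {
  name: string;         // e.g. "SOUL.md", "workspace", "system", "memory", "history"
  content: string;
  budgetTokens: number; // per-layer token budget
}

function assembleContext(
  layers: ContextLayer[],
  countTokens: (s: string) => number,
): string[] {
  // Layers are emitted in the order given; each is trimmed to its budget.
  // (Naive character truncation here; the real assembler is smarter.)
  return layers.map((layer) => {
    let content = layer.content;
    while (countTokens(content) > layer.budgetTokens) {
      content = content.slice(0, -1);
    }
    return `[${layer.name}] ${content}`;
  });
}
```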

Step 5 — Inference call

The assembled ChatMessage[] is passed to the configured provider client via inference/factory.ts. The factory caches clients by provider:modelId:endpoint key and wraps each in a resilient layer with retry logic and optional fallback provider.
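Caching by the `provider:modelId:endpoint` key can be sketched as below. `ProviderClient` and the placeholder construction are hypothetical stand-ins; the real factory in inference/factory.ts wraps each client with retries and fallback.

```typescript
// Hypothetical sketch of client caching keyed by provider:modelId:endpoint.
interface ProviderClient {
  complete(messages: unknown[]): Promise<string>;
}

const clientCache = new Map<string, ProviderClient>();

function getClient(provider: string, modelId: string, endpoint: string): ProviderClient {
  const key = `${provider}:${modelId}:${endpoint}`;
  let client = clientCache.get(key);
  if (!client) {
    // Placeholder; the real factory builds a resilient client here.
    client = { complete: async () => "" };
    clientCache.set(key, client);
  }
  return client;
}
```

Repeated calls with the same triple reuse one client, so connection setup and retry state are shared across turns.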

The streaming variant (agents/loop-stream.ts) yields an AsyncGenerator<AgentStreamEvent> for SSE and WebSocket delivery, emitting each token delta as it arrives.

Step 6 — Tool loop

If the model returns one or more tool calls, the loop executes them sequentially:

  1. Validates the call against the agent's allow/deny policy
  2. Looks up the tool in the registry by name
  3. Parses parameters against the tool's Zod schema
  4. Calls execute(params, ctx) and collects the result
  5. Appends { role: "tool", ... } messages to context
  6. Calls the model again with the updated context

This repeats for at most 5 rounds, after which the loop stops. Individual tool failures return an error string in ToolResult.error rather than throwing — the agent loop continues instead of crashing.
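The bounded loop above can be sketched as follows. Registry lookup, policy checks, and Zod validation are omitted, and `callModel`/`executeTool` are hypothetical parameters standing in for the real machinery.

```typescript
// Hypothetical sketch of the bounded tool loop (max 5 rounds).
interface ToolCall { name: string; params: unknown; }
interface ModelResponse { text: string; toolCalls: ToolCall[]; }

const MAX_TOOL_ROUNDS = 5;

async function runToolLoop(
  callModel: (context: string[]) => Promise<ModelResponse>,
  executeTool: (call: ToolCall) => Promise<string>,
  context: string[],
): Promise<string> {
  let response = await callModel(context);
  for (let round = 0; round < MAX_TOOL_ROUNDS && response.toolCalls.length > 0; round++) {
    for (const call of response.toolCalls) {
      // A failing tool yields an error string rather than a thrown
      // exception, so the loop continues instead of crashing.
      const result = await executeTool(call).catch((e) => `error: ${e}`);
      context.push(`tool:${call.name} → ${result}`);
    }
    // Call the model again with the tool results appended.
    response = await callModel(context);
  }
  return response.text;
}
```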

Step 7 — Post-turn save

After the final response is obtained, agents/post-turn-save.ts runs these side effects:

  • Persists the response to session_messages
  • Extracts facts, decisions, and entities for memory storage across appropriate tiers
  • Emits an agent.completed event on the typed event bus
  • Fires any registered outbound webhooks with the event payload
  • Records token usage in the billing tables for the cost dashboard
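The ordered side effects can be sketched as a single function over injected dependencies. All interface and function names here are hypothetical stand-ins for agents/post-turn-save.ts.

```typescript
// Hypothetical sketch of the post-turn side effects, run in order.
interface PostTurnDeps {
  persistMessage: (text: string) => void;          // → session_messages
  saveMemory: (text: string) => void;              // facts / decisions / entities
  emitEvent: (event: string, payload: unknown) => void;   // typed event bus
  fireWebhooks: (event: string, payload: unknown) => void;
  recordUsage: (tokens: number) => void;           // billing tables
}

function postTurnSave(deps: PostTurnDeps, response: string, tokensUsed: number): void {
  deps.persistMessage(response);
  deps.saveMemory(response);
  deps.emitEvent("agent.completed", { response });
  deps.fireWebhooks("agent.completed", { response });
  deps.recordUsage(tokensUsed);
}
```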

Streaming variant

The streaming loop (agents/loop-stream.ts) follows the same steps but yields events rather than returning a final result. It emits the following event types:

Event type                  When emitted
agent.stream.delta          Each token chunk from the provider
agent.stream.tool_call      When a tool call is detected
agent.stream.tool_result    After a tool finishes executing
agent.stream.complete       When the full turn is done
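One way to model these events is a discriminated union that consumers can switch on. This is a hedged sketch implied by the event types above; the actual AgentStreamEvent shape in agents/loop-stream.ts may carry different fields.

```typescript
// Hypothetical sketch of the stream event union.
type AgentStreamEvent =
  | { type: "agent.stream.delta"; text: string }
  | { type: "agent.stream.tool_call"; name: string; params: unknown }
  | { type: "agent.stream.tool_result"; name: string; result: string }
  | { type: "agent.stream.complete"; fullText: string };

// Consumers narrow on the discriminant to route events to SSE or WebSocket frames.
function describe(event: AgentStreamEvent): string {
  switch (event.type) {
    case "agent.stream.delta": return `delta: ${event.text}`;
    case "agent.stream.tool_call": return `tool call: ${event.name}`;
    case "agent.stream.tool_result": return `tool result: ${event.name}`;
    case "agent.stream.complete": return "complete";
  }
}
```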