Agent Loop
Every message processed by Open Astra passes through a deterministic 7-step execution cycle defined in agents/loop.ts. Understanding this loop is essential for debugging agent behavior and optimizing performance.
Loop overview

```text
Incoming message
        │
        ▼
1. Resolve session        {uid, surface, surfaceId} → session record
        │
        ▼
2. Persist user message   Save to session_messages table in PostgreSQL
        │
        ▼
3. Compaction check       If context > threshold, summarize old messages
        │
        ▼
4. Assemble context       SOUL.md → workspace files → system prompt → memory → history
        │
        ▼
5. Inference call         Provider client → ChatMessage[] → response
        │
        ▼
6. Tool loop              Execute tool calls → feed results back (max 5 rounds)
        │
        ▼
7. Post-turn save         Persist response + auto-save memory + emit events + fire webhooks
```

Step 1 — Resolve session
Sessions are identified by the composite key {uid, surface, surfaceId}. The uid is the authenticated user, surface is the channel (e.g. telegram, api, cli), and surfaceId is the channel-specific identifier (e.g. a Telegram chat ID).
If no session exists for this key, one is created. If a session exists, the existing conversation history is loaded from PostgreSQL.
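The find-or-create behavior can be sketched as follows. This is a minimal in-memory stand-in for the PostgreSQL lookup; the `Session` shape and `resolveSession` name are illustrative assumptions, not the actual API.

```typescript
// Composite session key as described above.
type SessionKey = { uid: string; surface: string; surfaceId: string };

// Simplified session record; the real table holds more columns.
interface Session {
  key: SessionKey;
  history: string[];
}

// In-memory stand-in for the sessions store in PostgreSQL.
const sessions = new Map<string, Session>();

function keyOf(k: SessionKey): string {
  return `${k.uid}:${k.surface}:${k.surfaceId}`;
}

// Find-or-create: load the existing session, or start a fresh one.
function resolveSession(key: SessionKey): Session {
  const id = keyOf(key);
  let session = sessions.get(id);
  if (!session) {
    session = { key, history: [] };
    sessions.set(id, session);
  }
  return session;
}
```

Because the key is composite, the same user talking on two different surfaces gets two independent sessions.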
Step 2 — Persist user message
The user's message is written to the session_messages table before inference begins. This ensures durability — if the agent or gateway crashes mid-turn, the user message is not lost and can be retried.
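The write-ahead ordering can be made explicit with a small sketch; the `persist` and `infer` callbacks here are illustrative stand-ins for the real database write and inference call.

```typescript
// Durability first: the user message is saved before any inference,
// so a crash mid-turn never loses it and the turn can be retried.
async function handleTurn(
  persist: (msg: string) => Promise<void>,
  infer: (msg: string) => Promise<string>,
  userMsg: string,
): Promise<string> {
  await persist(userMsg); // must complete before inference begins
  return infer(userMsg);  // may fail and be retried safely
}
```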
Step 3 — Compaction check
If the current context exceeds the configured compaction threshold (default 85% of the model's max context tokens), the oldest messages are summarized using a lightweight model call. The summary replaces the raw messages, keeping context within budget while preserving key information.
This threshold is configured via selfHealing.compactionThreshold in astra.yml. See Self-Healing for details.
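The decision and the replacement step can be sketched as below; the 0.85 default mirrors the documented threshold, while the `keep` parameter and message shapes are illustrative assumptions.

```typescript
type Msg = { role: string; content: string };

// True when the context has crossed the compaction threshold
// (default 85% of the model's max context tokens).
function needsCompaction(contextTokens: number, maxTokens: number, threshold = 0.85): boolean {
  return contextTokens > maxTokens * threshold;
}

// Replace everything but the most recent `keep` messages with a single
// summary message produced by a lightweight model call.
function compact(history: Msg[], summary: string, keep: number): Msg[] {
  if (history.length <= keep) return history;
  return [
    { role: "system", content: `Summary of earlier turns: ${summary}` },
    ...history.slice(-keep),
  ];
}
```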
Step 4 — Assemble context
The context assembler in context/assembler.ts builds the final ChatMessage[] array. The order is fixed and each layer has a token budget. See Context Assembly for full details.
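A rough sketch of fixed-order assembly with per-layer budgets follows. The layer names come from the diagram above; the 4-characters-per-token estimate and tail-truncation strategy are simplifying assumptions, not the assembler's actual algorithm.

```typescript
type ChatMessage = { role: "system" | "user" | "assistant" | "tool"; content: string };

// One layer of the fixed assembly order, with its token budget.
type Layer = { name: string; budget: number; text: string };

// Crude token estimate used only for this sketch.
const approxTokens = (s: string) => Math.ceil(s.length / 4);

function assembleContext(layers: Layer[], history: ChatMessage[]): ChatMessage[] {
  const parts = layers.map((l) => {
    let text = l.text;
    // Enforce each layer's budget by truncating its tail.
    while (text.length > 0 && approxTokens(text) > l.budget) {
      text = text.slice(0, -1);
    }
    return text;
  });
  // Layers join into one system message, followed by the history.
  return [{ role: "system", content: parts.join("\n\n") }, ...history];
}
```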
Step 5 — Inference call
The assembled ChatMessage[] is passed to the configured provider client via inference/factory.ts. The factory caches clients by provider:modelId:endpoint key and wraps each in a resilient layer with retry logic and optional fallback provider.
The streaming variant (agents/loop-stream.ts) yields an AsyncGenerator<AgentStreamEvent> for SSE and WebSocket delivery, emitting each token delta as it arrives.
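The factory's caching behavior can be sketched like this; the `Client` shape and creation logic are illustrative stand-ins (the real factory also wraps clients in retry and fallback logic).

```typescript
type Client = { key: string };

// Clients are cached by their composite provider:modelId:endpoint key,
// so repeated turns reuse the same underlying connection.
const clientCache = new Map<string, Client>();

function getClient(provider: string, modelId: string, endpoint: string): Client {
  const key = `${provider}:${modelId}:${endpoint}`;
  let client = clientCache.get(key);
  if (!client) {
    client = { key }; // real factory constructs a resilient provider client here
    clientCache.set(key, client);
  }
  return client;
}
```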
Step 6 — Tool loop
If the model returns one or more tool calls, the loop executes them sequentially:
- Validates the call against the agent's allow/deny policy
- Looks up the tool in the registry by name
- Parses parameters against the tool's Zod schema
- Calls execute(params, ctx) and collects the result
- Appends { role: "tool", ... } messages to context
- Calls the model again with the updated context
This repeats for up to 5 rounds, after which the loop stops. Individual tool failures return an error string in ToolResult.error — the agent loop is resilient and continues rather than crashing.
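The bounded loop and its error handling can be sketched as follows; the callback shapes and names are assumptions for illustration, not the actual loop.ts signatures.

```typescript
type ToolCall = { name: string; params: unknown };
type ModelTurn = { content: string; toolCalls: ToolCall[] };
type ToolResult = { result?: string; error?: string };

const MAX_ROUNDS = 5;

async function runToolLoop(
  callModel: () => Promise<ModelTurn>,
  executeTool: (c: ToolCall) => Promise<ToolResult>,
  appendToContext: (r: ToolResult) => void,
): Promise<string> {
  let turn = await callModel();
  for (let round = 0; round < MAX_ROUNDS && turn.toolCalls.length > 0; round++) {
    for (const call of turn.toolCalls) {
      let result: ToolResult;
      try {
        result = await executeTool(call);
      } catch (err) {
        // Failures become error strings; the loop never crashes on a tool.
        result = { error: String(err) };
      }
      appendToContext(result);
    }
    // Re-invoke the model with the tool results appended to context.
    turn = await callModel();
  }
  return turn.content;
}
```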
Step 7 — Post-turn save
After the final response is obtained, agents/post-turn-save.ts runs these side effects:
- Persists the response to session_messages
- Extracts facts, decisions, and entities for memory storage across the appropriate tiers
- Emits an agent.completed event on the typed event bus
- Fires any registered outbound webhooks with the event payload
- Records token usage in the billing tables for the cost dashboard
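The typed event bus emit can be sketched as below; the agent.completed payload shape is an assumption for illustration, and the real bus supports multiple event names rather than just this one.

```typescript
// Assumed payload for the agent.completed event.
type AgentCompleted = { sessionId: string; tokensUsed: number };

class TypedBus {
  private handlers: Array<(e: AgentCompleted) => void> = [];

  // Subscribers get compile-time checking of the event payload.
  on(name: "agent.completed", fn: (e: AgentCompleted) => void): void {
    this.handlers.push(fn);
  }

  emit(name: "agent.completed", event: AgentCompleted): void {
    for (const fn of this.handlers) fn(event);
  }
}
```

Webhook delivery and billing writes hang off the same event, so a single emit fans out to every registered side effect.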
Streaming variant
The streaming loop (agents/loop-stream.ts) follows the same steps but yields events rather than returning a final result. It emits the following event types:
| Event type | When emitted |
|---|---|
| agent.stream.delta | Each token chunk from the provider |
| agent.stream.tool_call | When a tool call is detected |
| agent.stream.tool_result | After a tool finishes executing |
| agent.stream.complete | When the full turn is done |
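Consuming the stream looks roughly like the sketch below. The event type names match the table; the payload fields and the `fakeStream` generator are illustrative assumptions standing in for the real loop-stream.ts output.

```typescript
type AgentStreamEvent =
  | { type: "agent.stream.delta"; text: string }
  | { type: "agent.stream.tool_call"; name: string }
  | { type: "agent.stream.tool_result"; result: string }
  | { type: "agent.stream.complete" };

// Stand-in for the AsyncGenerator yielded by the streaming loop.
async function* fakeStream(): AsyncGenerator<AgentStreamEvent> {
  yield { type: "agent.stream.delta", text: "Hel" };
  yield { type: "agent.stream.delta", text: "lo" };
  yield { type: "agent.stream.complete" };
}

// A consumer (e.g. an SSE handler) accumulates delta events into text.
async function collectText(stream: AsyncGenerator<AgentStreamEvent>): Promise<string> {
  let out = "";
  for await (const ev of stream) {
    if (ev.type === "agent.stream.delta") out += ev.text;
  }
  return out;
}
```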