Gateway
The gateway is the single HTTP + WebSocket process that powers Open Astra. It runs an Express server with a shared http.Server for REST, SSE streaming, and full-duplex WebSocket — all on the same port. Every agent conversation, channel webhook, admin operation, and diagnostic check flows through this process.
What the gateway does
- Authenticates users via JWT (HS256 with access + refresh token rotation)
- Routes requests across 17 route groups and 80+ endpoints
- Streams agent responses over SSE (HTTP) and WebSocket (full-duplex)
- Bootstraps all 12 channel adapters (Telegram, Discord, Slack, WhatsApp, etc.) at startup
- Runs DB migrations, seeds agents from astra.yml, and starts background workers
- Manages the plugin/skill hot-reload file watcher
- Starts the cron scheduler, self-healing monitor, and presence tracker
- Optionally starts a gRPC sidecar on a separate port for inter-agent spawn communication
Source layout
| File | Purpose |
|---|---|
| gateway/index.ts | Express app, bootstrap, start/shutdown lifecycle |
| gateway/ws.ts | WebSocket server: auth, connection map, push helpers |
| gateway/ws-handler.ts | Bidirectional WS message handling and agent streaming |
| gateway/commands.ts | In-chat slash commands (/reset, /status, /help, etc.) |
| gateway/presence.ts | Agent presence state: real-time status pushed over WS |
| gateway/errors.ts | Structured error class hierarchy (AppError subclasses) |
| gateway/middleware/ | 15 middleware modules (auth, RBAC, rate limiting, CSRF, brute-force, RLS, etc.) |
| gateway/routes/ | 17 route files, one per domain (chat, memory, agents, etc.) |
Middleware stack
Every request passes through the global middleware in order. Agent-scoped routes add six additional layers for auth, budgets, workspace resolution, and rate limiting.
# Request lifecycle — every HTTP request passes through in order
1. securityHeaders() X-Content-Type-Options, HSTS, CSP, X-Frame-Options, Referrer-Policy
2. correlationId() Reads or generates X-Request-ID, echoes it in response
3. cors() Production: ALLOWED_ORIGINS allowlist; dev: allow all
4. express.json() 1 MB limit, captures rawBody for HMAC verification
5. csrfProtection() Double-submit cookie (Bearer token holders exempt)
6. requestLogger() Structured JSON log: method / path / status / duration / uid
# Agent-scoped routes add these after global middleware:
7. jwtAuth() Validates Bearer JWT, attaches req.uid
8. tokenBudget() Per-user token budget — hard cap → 429, soft cap → override model
9. workspaceResolver() X-Workspace-Id header → validates membership, attaches req.workspace
10. rateLimiter() Postgres-backed sliding window (30 req / 60s), in-memory fallback
11. userRateLimit() In-memory per-uid: USER_RATE_LIMIT_RPM (default 60) + burst (10)
12. requestTimeout() Hard 120s timeout; SSE streams closed gracefully
The rateLimiter() middleware keeps its counters in the Postgres rate_limit_entries table so limits survive restarts and work across horizontal replicas. If Postgres is unreachable, it falls back to an in-memory sliding window transparently.
Dual transport: SSE and WebSocket
The gateway provides the same agent streaming over two transports. Use whichever fits your client.
WebSocket
Connect to ws://host/ws?token=<jwt>. The gateway authenticates via the query-string JWT, supports multi-device (one uid can have N concurrent sockets), and sends a heartbeat ping every 30 seconds — sockets that don't pong back are terminated.
// Inbound: client → server
{ type: "message", id: "req-1", agentId: "chad", message: "Hello",
surface: "chat", surfaceId: "main" }
{ type: "ping" }
// Outbound: server → client
{ type: "connected", timestamp }
{ type: "session_resolved", id, sessionId, isNew }
{ type: "stream_chunk", id, content }
{ type: "tool_call", id, index, toolCallId, name }
{ type: "tool_result", id, name, durationMs, error }
{ type: "stream_done", id, content, sessionId, usage, toolsUsed }
{ type: "presence", agentId, sessionId, status, currentTool }
{ type: "error", id, message }
{ type: "pong" }
Each in-flight request is tied to an AbortController, so closing the socket aborts the request. Back-pressure: the gateway monitors socket.bufferedAmount and pauses streaming when it exceeds 64 KB, polling every 50ms until drained.
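The back-pressure rule above can be sketched as a small polling loop. This is a minimal illustration, not the gateway's actual code: the `BufferedSocket` interface, the `sendWithBackpressure` name, and the shape of the abort flag are hypothetical; only the 64 KB threshold and the 50ms polling interval come from the text.

```typescript
// Hypothetical minimal view of a WebSocket: only the members this sketch needs.
interface BufferedSocket {
  bufferedAmount: number;
  send(data: string): void;
}

const HIGH_WATER_MARK = 64 * 1024; // 64 KB threshold from the text
const POLL_INTERVAL_MS = 50; // polling interval from the text

const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

// Waits until the socket's outgoing buffer is at or below the threshold,
// then sends the frame. Returns false without sending if the request was
// aborted in the meantime (for example, because the socket closed).
async function sendWithBackpressure(
  socket: BufferedSocket,
  frame: object,
  signal?: { readonly aborted: boolean },
): Promise<boolean> {
  while (socket.bufferedAmount > HIGH_WATER_MARK) {
    if (signal?.aborted) return false;
    await sleep(POLL_INTERVAL_MS);
  }
  if (signal?.aborted) return false;
  socket.send(JSON.stringify(frame));
  return true;
}
```

Polling is simpler than event-driven draining because browser-style WebSocket objects expose `bufferedAmount` but no drain event; the cost is up to one polling interval of extra latency per pause.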
SSE (Server-Sent Events)
POST /agent/chat/stream returns an SSE stream with typed events. Stateless — no persistent connection required.
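On the client side, the typed events in this section can be recovered from the raw response body with a small incremental parser. The sketch below is an assumption-laden illustration: `SseParser` and `parseBlock` are hypothetical names, and it handles only the `event:` and `data:` fields, relying on the standard SSE rule that a blank line terminates an event.

```typescript
interface SseEvent {
  event: string;
  data: string;
}

// Parses one event block (the lines between two blank lines).
function parseBlock(block: string): SseEvent | null {
  let event = "message"; // default event name in the SSE format
  const dataLines: string[] = [];
  for (const line of block.split("\n")) {
    if (line.startsWith("event:")) {
      event = line.slice(6).trim();
    } else if (line.startsWith("data:")) {
      // Strip the single optional space after the colon.
      dataLines.push(line.slice(5).replace(/^ /, ""));
    }
  }
  // Blocks without any data line are not dispatched.
  return dataLines.length > 0 ? { event, data: dataLines.join("\n") } : null;
}

// Feed decoded text chunks in; completed events come out.
// Partial events stay buffered until their terminating blank line arrives.
class SseParser {
  private buffer = "";

  push(chunk: string): SseEvent[] {
    this.buffer = (this.buffer + chunk).replace(/\r\n/g, "\n");
    const events: SseEvent[] = [];
    let boundary = this.buffer.indexOf("\n\n");
    while (boundary !== -1) {
      const parsed = parseBlock(this.buffer.slice(0, boundary));
      this.buffer = this.buffer.slice(boundary + 2);
      if (parsed) events.push(parsed);
      boundary = this.buffer.indexOf("\n\n");
    }
    return events;
  }
}
```

A caller would typically pass each decoded chunk of the response body to `push()` and then `JSON.parse` the `data` field of each returned event.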
event: session_resolved
data: { "sessionId": "ses_abc", "isNew": true }
event: content
data: { "content": "Here's what I found..." }
event: tool_call
data: { "index": 0, "name": "web_search", "toolCallId": "tc_1" }
event: tool_call_complete
data: { "index": 0, "name": "web_search", "arguments": "{...}" }
event: tool_result
data: { "name": "web_search", "durationMs": 320 }
event: done
data: { "content": "...", "sessionId": "ses_abc", "usage": {...}, "toolsUsed": [...] }
Presence system
The gateway tracks real-time agent status per user per session. Every state change pushes a presence frame to all of the user's WebSocket connections and emits to the event bus. Stale entries are cleaned up every 60 seconds (5-minute threshold).
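The push-on-change and stale-entry cleanup described above can be sketched as follows. The class and method names (`PresenceTracker`, `update`, `sweep`) and the `uid:sessionId` map key are hypothetical; the frame fields and the 5-minute threshold are taken from this document, and event-bus emission is left out.

```typescript
type PresenceStatus = "starting" | "streaming" | "tool_calling" | "completed" | "error";

interface PresenceEntry {
  agentId: string;
  sessionId: string;
  status: PresenceStatus;
  currentTool?: string;
  updatedAt: number; // epoch milliseconds
}

// Stand-in for "push this frame to all of the user's WebSocket connections".
type PushFn = (uid: string, frame: object) => void;

const STALE_AFTER_MS = 5 * 60 * 1000; // 5-minute threshold from the text

class PresenceTracker {
  private readonly entries = new Map<string, PresenceEntry>();
  private readonly push: PushFn;

  constructor(push: PushFn) {
    this.push = push;
  }

  // Records the new state and pushes a presence frame to the user.
  update(uid: string, entry: Omit<PresenceEntry, "updatedAt">, now: number = Date.now()): void {
    this.entries.set(`${uid}:${entry.sessionId}`, { ...entry, updatedAt: now });
    this.push(uid, { type: "presence", ...entry });
  }

  // Removes entries older than the threshold and returns how many were removed.
  sweep(now: number = Date.now()): number {
    let removed = 0;
    this.entries.forEach((entry, key) => {
      if (now - entry.updatedAt > STALE_AFTER_MS) {
        this.entries.delete(key);
        removed += 1;
      }
    });
    return removed;
  }

  get size(): number {
    return this.entries.size;
  }
}
```

In a running process, `sweep()` would be scheduled once every 60 seconds, for example with `setInterval`.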
starting → streaming → tool_calling → completed
↘ error
Route groups
The gateway organizes endpoints into 17 route groups. Each group has its own auth and middleware requirements.
| Group | Auth | Key endpoints |
|---|---|---|
| Auth | Public | /auth/register, /login, /refresh, /logout |
| Chat | JWT + budget + workspace | /agent/chat, /agent/chat/stream, /agent/reset |
| Memory | JWT + budget + workspace | /agent/memory/profile, /agent/memory/search |
| Agents | JWT + workspace | CRUD /agents, /agents/:id/capabilities, versioning, rollback |
| Workspaces | JWT | CRUD /workspaces, members, stats, grants, restrictions |
| Sessions | JWT | /sessions, /sessions/:id, search, handoff |
| Webhooks | JWT | CRUD /webhooks, deliveries, test |
| Costs | JWT + workspace | /costs, /costs/breakdown |
| Heartbeat | JWT + workspace | CRUD /heartbeat, run history |
| Skills | Public (cache mgmt: JWT) | /skills, /skills/:id, metrics, cache |
| Traces | JWT + workspace | /traces, /traces/:id |
| Jobs | JWT + workspace | CRUD /jobs |
| Memory Profiles | JWT + workspace | CRUD /memory-profiles, agent assignment |
| Onboarding | JWT | /onboarding/setup |
| Autonomous | Internal API key | /autonomous/run, /autonomous/batch |
| Admin | Internal API key | /admin/health, usage, sessions, audit |
| Diagnostics | Internal API key | /diagnostics, circuit breakers |
| GDPR | JWT (self-only) | DELETE /memory/user/:uid — full data purge |
POST /agent/chat accepts an X-Idempotency-Key header. Responses are cached in-process for 5 minutes, so duplicate requests return the cached response without running the agent again.
Health, readiness, and metrics
Three public endpoints for monitoring. No authentication required.
// GET /health — no auth required
{
"status": "ok", // "ok" | "degraded"
"checks": {
"typesense": true,
"postgres": true
},
"wsConnections": 42,
"timestamp": "2026-02-27T12:00:00.000Z"
}
// GET /readiness — Kubernetes probe
{ "ready": true } // 200 if both Postgres + Typesense reachable, 503 otherwise
// GET /metrics — operational dashboard
{
"uptime": 86400,
"wsConnections": 42,
"tools": 106,
"skills": 48,
"sessions": { "active": 15, "total": 1240 },
"messages": 28400,
"billing": { "month": "2026-02", "totalCost": 12.40 }
}
Structured errors
Every domain error extends AppError and serializes to a consistent JSON shape. The errorHandler() middleware catches them automatically.
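A minimal sketch of how such a hierarchy might serialize to the JSON shape used in this section. The constructor signatures, the `toErrorResponse` helper, and the `NOT_FOUND` and `INTERNAL` codes are assumptions for illustration; only the class names, the status codes, and the `RATE_LIMIT` code appear in this document.

```typescript
class AppError extends Error {
  readonly code: string;
  readonly status: number;
  readonly details: Record<string, unknown> | undefined;

  constructor(message: string, code: string, status: number, details?: Record<string, unknown>) {
    super(message);
    // Keep instanceof working even when compiled to older targets.
    Object.setPrototypeOf(this, new.target.prototype);
    this.name = new.target.name;
    this.code = code;
    this.status = status;
    this.details = details;
  }

  // Produces the consistent response shape: error, code, status, optional details.
  toJSON(): Record<string, unknown> {
    return {
      error: this.message,
      code: this.code,
      status: this.status,
      ...(this.details ? { details: this.details } : {}),
    };
  }
}

class RateLimitError extends AppError {
  constructor(retryAfter: number, limit: number, windowLabel: string) {
    super("Rate limit exceeded", "RATE_LIMIT", 429, { retryAfter, limit, window: windowLabel });
  }
}

class NotFoundError extends AppError {
  constructor(what: string) {
    super(`${what} not found`, "NOT_FOUND", 404); // code string is an assumption
  }
}

// The job an errorHandler() middleware performs: map any thrown value
// to an HTTP status and a JSON body.
function toErrorResponse(err: unknown): { status: number; body: Record<string, unknown> } {
  if (err instanceof AppError) return { status: err.status, body: err.toJSON() };
  // Unknown errors are masked so internals do not leak to clients.
  return { status: 500, body: { error: "Internal server error", code: "INTERNAL", status: 500 } };
}
```

Keeping serialization on the base class means route handlers only throw; a single handler at the end of the middleware chain decides what the client sees.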
// All domain errors extend AppError → structured JSON response
{
"error": "Rate limit exceeded",
"code": "RATE_LIMIT",
"status": 429,
"details": {
"retryAfter": 12,
"limit": 30,
"window": "60s"
}
}
// Error subclasses:
// AuthError (401) ForbiddenError (403) ValidationError (400)
// NotFoundError (404) ConflictError (409) RateLimitError (429)
// AgentError (500) ToolError (500) QuotaExceededError (429)
// InferenceError (502) ConfigError (500)
RBAC
Four capability roles form a hierarchy: owner > editor > tool_runner > viewer. They are mapped from workspace_members.role values, and the requireRole(minimum) middleware can be applied to any route.
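The hierarchy check can be sketched as a numeric rank comparison. This is an illustration under stated assumptions: the `RoleRequest` and `RoleResponse` interfaces are framework-agnostic stand-ins for Express types, reading the role from `req.workspace.role` is a guess at where the resolved role lives, and the 403 body is hypothetical.

```typescript
// Higher rank means more capability: owner > editor > tool_runner > viewer.
const ROLE_RANK = { viewer: 0, tool_runner: 1, editor: 2, owner: 3 } as const;
type Role = keyof typeof ROLE_RANK;

function hasRole(actual: Role, minimum: Role): boolean {
  return ROLE_RANK[actual] >= ROLE_RANK[minimum];
}

// Hypothetical stand-ins for the Express request and response objects.
interface RoleRequest {
  workspace?: { role: Role }; // assumed location of the caller's role
}
interface RoleResponse {
  status(code: number): RoleResponse;
  json(body: unknown): void;
}

// Returns a middleware that lets the request through only when the caller's
// role is at least `minimum`.
function requireRole(minimum: Role) {
  return (req: RoleRequest, res: RoleResponse, next: () => void): void => {
    const role = req.workspace?.role;
    if (role === undefined || !hasRole(role, minimum)) {
      // Assumed body; the gateway presumably raises ForbiddenError (403) here.
      res.status(403).json({ error: "Insufficient role", code: "FORBIDDEN", status: 403 });
      return;
    }
    next();
  };
}
```

With this shape, a route that mutates agents could be guarded with `requireRole("editor")`, which admits editors and owners and rejects tool runners and viewers.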
Notable features
| Feature | Detail |
|---|---|
| Session handoff | POST /sessions/:id/handoff transfers a session to a different agent, preserving full history |
| Slash commands | /reset, /new, /compact, /status, /help — intercepted before the agent loop on both HTTP and WS |
| gRPC sidecar | Optional second server on GRPC_PORT (50051) for inter-agent spawn communication. Supports mTLS |
| Hot reload | File watcher on plugins and skills directories — 1000ms debounce, syntax errors roll back |
| Self-healing | Monitors sessions for consecutive failures, restarts with exponential backoff, triggers compaction |
| SSRF guard | DNS-pinning fetch with blocklists for private IPs, cloud metadata, IPv4-mapped IPv6, hex/octal encodings. Optional ALLOWED_OUTBOUND_HOSTS allowlist |
| Intrusion detection | Tracks SSRF, SQLi, path traversal, auth failures, sandbox escapes in a sliding 10-minute window. Escalates to SECURITY_WEBHOOK_URL when thresholds are breached |
| GDPR purge | DELETE /memory/user/:uid — purges all user data across 16 tables (sessions, memory, tokens, traces, billing, audit). Self-only access |
| Brute-force protection | Progressive delay (1–60s) after 5 failed logins, full IP block after 20. Applied on /auth/login |
| WebSocket rate limiting | 10 connections/IP/min, 100 messages/connection/min. Violations close the socket (code 1008) |
| Row-Level Security | Per-request RLS context middleware on sessions, traces, and memory-profiles routes |
| ETag caching | GET /agents and GET /skills return weak ETags — 304 on If-None-Match match |
| HMAC verification | Shared middleware for all channel webhooks — SHA256/SHA1, hex/base64, timingSafeEqual |
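Several limits in this document are sliding-window counts: 30 requests per 60s for rateLimiter() and 100 messages per connection per minute for WebSocket traffic. A minimal in-memory sketch of that technique follows; the `SlidingWindowLimiter` class and its `check` method are hypothetical names, and the Postgres-backed variant described earlier is not shown.

```typescript
class SlidingWindowLimiter {
  // Timestamps (epoch ms) of recent hits, per key (uid, IP, or connection id).
  private readonly hits = new Map<string, number[]>();
  private readonly limit: number;
  private readonly windowMs: number;

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Records a hit for `key` if allowed. When the limit is reached, the hit is
  // not recorded and retryAfter gives the seconds until the oldest hit expires.
  check(key: string, now: number = Date.now()): { allowed: boolean; retryAfter: number } {
    const cutoff = now - this.windowMs;
    const recent = (this.hits.get(key) ?? []).filter((t) => t > cutoff);
    if (recent.length >= this.limit) {
      this.hits.set(key, recent);
      const retryAfter = Math.ceil((recent[0] + this.windowMs - now) / 1000);
      return { allowed: false, retryAfter };
    }
    recent.push(now);
    this.hits.set(key, recent);
    return { allowed: true, retryAfter: 0 };
  }
}
```

For example, `new SlidingWindowLimiter(30, 60000)` models the rateLimiter() numbers and `new SlidingWindowLimiter(100, 60000)` the per-connection message limit. Storing timestamps costs memory proportional to the limit per key, which is acceptable at these sizes.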
Key configuration
| Variable | Default | Purpose |
|---|---|---|
| PORT | 8080 | HTTP listen port |
| ALLOWED_ORIGINS | http://localhost:3000 | Comma-separated CORS allowlist |
| JWT_SECRET | (none) | HMAC-HS256 signing key (min 32 chars in production) |
| JWT_ACCESS_EXPIRES | 15m | Access token TTL |
| JWT_REFRESH_EXPIRES | 7d | Refresh token TTL |
| JWT_SECRET_PREV | (none) | Previous signing key for rotation |
| INTERNAL_API_KEY | (none) | API key for admin, autonomous, diagnostics routes |
| USER_RATE_LIMIT_RPM | 60 | Per-user requests per minute |
| USER_RATE_LIMIT_BURST | 10 | Burst allowance above RPM |
| GRPC_ENABLED | false | Enable gRPC sidecar |
| GRPC_PORT | 50051 | gRPC listen port |
| GATEWAY_URL | (none) | Public URL for webhook self-registration |
| SECURITY_WEBHOOK_URL | (none) | POST target for intrusion detection alerts |
| ALLOWED_OUTBOUND_HOSTS | (none) | Comma-separated allowlist for SSRF guard (if unset, blocklist-only mode) |
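A sketch of how a subset of these variables might be read with their documented defaults. The `GatewayConfig` shape, the `loadConfig` and `intFromEnv` helpers, and the explicit `production` flag are hypothetical, as are two behaviors the table does not specify: treating a missing JWT_SECRET as an error in every environment, and accepting only the literal string "true" for GRPC_ENABLED.

```typescript
interface GatewayConfig {
  port: number;
  allowedOrigins: string[];
  jwtSecret: string;
  jwtAccessExpires: string;
  jwtRefreshExpires: string;
  userRateLimitRpm: number;
  userRateLimitBurst: number;
  grpcEnabled: boolean;
  grpcPort: number;
}

type Env = Record<string, string | undefined>;

// Reads a non-negative integer, falling back to the documented default.
function intFromEnv(env: Env, name: string, fallback: number): number {
  const raw = env[name];
  if (raw === undefined || raw === "") return fallback;
  const value = Number(raw);
  if (!Number.isInteger(value) || value < 0) {
    throw new Error(`${name} must be a non-negative integer, got "${raw}"`);
  }
  return value;
}

function loadConfig(env: Env, production: boolean): GatewayConfig {
  const jwtSecret = env["JWT_SECRET"] ?? "";
  if (jwtSecret === "") throw new Error("JWT_SECRET is required"); // assumption
  if (production && jwtSecret.length < 32) {
    throw new Error("JWT_SECRET must be at least 32 characters in production");
  }
  return {
    port: intFromEnv(env, "PORT", 8080),
    allowedOrigins: (env["ALLOWED_ORIGINS"] ?? "http://localhost:3000")
      .split(",")
      .map((origin) => origin.trim())
      .filter((origin) => origin.length > 0),
    jwtSecret,
    jwtAccessExpires: env["JWT_ACCESS_EXPIRES"] ?? "15m",
    jwtRefreshExpires: env["JWT_REFRESH_EXPIRES"] ?? "7d",
    userRateLimitRpm: intFromEnv(env, "USER_RATE_LIMIT_RPM", 60),
    userRateLimitBurst: intFromEnv(env, "USER_RATE_LIMIT_BURST", 10),
    grpcEnabled: env["GRPC_ENABLED"] === "true", // assumed accepted value
    grpcPort: intFromEnv(env, "GRPC_PORT", 50051),
  };
}
```

In Node this would be called as `loadConfig(process.env, process.env.NODE_ENV === "production")`; passing the environment in as an argument keeps the function easy to test.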
Explore in depth
| Topic | What it covers |
|---|---|
| Middleware | Full middleware reference — auth, CSRF, rate limiting, RBAC, security headers |
| WebSocket | Connection lifecycle, frame types, back-pressure, multi-device |
| Routes | Complete endpoint reference across all 17 route groups |
| Channel Adapters | How channel webhooks register on the gateway, HMAC verification |
| Auth Hardening | JWT rotation, CSRF, device binding, key management |
| API Reference | Full REST API documentation |