A self-hosted agent runtime with vectorized memory across pgvector and Typesense, 10 LLM providers, and the channels your team already uses. Production infrastructure you actually own.
Slack, X, email — or all of them. Agents live where your team already works, no new interface to learn.
5-tier memory fuses pgvector and Typesense via RRF to surface exactly what's relevant before every response.
Tools run, sub-agents spawn, results come back through the same channel. Memory updates. The loop closes.
Ephemeral → daily notes → user profile → knowledge graph with multi-hop traversal → procedural workflows. Reciprocal rank fusion (RRF) search across Typesense and pgvector.
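For readers new to reciprocal rank fusion, here is a minimal Python sketch of how two ranked result lists can be merged. The function name `rrf_fuse`, the sample document IDs, and the constant `k=60` (a conventional default for RRF) are assumptions for illustration, not Astra's actual implementation.

```python
from collections import defaultdict

def rrf_fuse(rankings, k=60):
    """Fuse several ranked lists of document IDs with reciprocal rank fusion.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is a conventional default, assumed here.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by document ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))

# Hypothetical result lists: one from vector search, one from keyword search.
vector_hits = ["note-7", "note-2", "note-9"]
keyword_hits = ["note-2", "note-4", "note-7"]
fused = rrf_fuse([vector_hits, keyword_hits])
```

Documents that rank well in both lists rise to the top, which is why RRF needs no score normalization between the two engines: it only looks at rank positions.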
A root agent decomposes tasks, spawns permission-sandboxed sub-agents, shares state via blackboard, mediates conflicts with a debate protocol, and synthesizes.
Grok, Groq, OpenAI, Gemini, Claude, Ollama, vLLM, Bedrock, Mistral, OpenRouter — all behind one interface with per-provider prompt caching (up to 90% savings).
Drop AGENTS.md, USER.md, IDENTITY.md, or any .md into ./workspace/. Hot-reloaded on save, SHA-256 checksummed, injected before every agent turn.
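One plausible use of the checksums is to tell a real content change apart from a save that left the file untouched. The sketch below shows that idea in Python using only the standard library. The helper names `checksum_workspace` and `changed_files` are hypothetical, and how Astra actually wires checksums into its reload path is not stated here.

```python
import hashlib
from pathlib import Path

def checksum_workspace(root):
    """Map each .md file under root to the SHA-256 hex digest of its bytes."""
    root = Path(root)
    return {
        str(p.relative_to(root)): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(root.rglob("*.md"))
    }

def changed_files(old, new):
    """Return files that are new or whose digest differs from the old snapshot."""
    return sorted(name for name, digest in new.items() if old.get(name) != digest)
```

Taking a snapshot before and after a save and passing both to `changed_files` yields only the files whose bytes actually changed.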
Auto-restart crashed agents with exponential backoff. Auto-compact sessions approaching context limits. Dead session cleanup. All configurable in astra.yml.
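To make the restart behavior concrete, here is a small Python sketch of a supervisor loop with capped exponential backoff. The parameter names (`max_restarts`, `base`, `factor`, `cap`) and their defaults are assumptions for illustration; they are not the actual astra.yml keys.

```python
import time

def run_with_restarts(start_agent, max_restarts=5, base=1.0, factor=2.0,
                      cap=60.0, sleep=time.sleep):
    """Run start_agent(); on an exception, wait and retry with growing delays."""
    for attempt in range(max_restarts + 1):
        try:
            return start_agent()
        except Exception:
            if attempt == max_restarts:
                raise  # give up after the final allowed restart
            # Delay grows as base * factor**attempt, never exceeding cap.
            sleep(min(cap, base * factor ** attempt))
```

With these defaults the waits are 1, 2, 4, 8, and 16 seconds. The `sleep` argument is injectable so the loop can be exercised without real waiting.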
Autonomous multi-day PhD-style research. Spawns sub-agent swarms for parallel investigation, synthesizes findings, delivers structured reports.
Overnight simulation using local models only (Ollama). 100% on-device, privacy-first. Generates morning reports with extracted insights and entity updates.
Per-agent, per-user, per-model token spend with trend visualization. Real-time quota enforcement. Check it with npx astra costs or the /costs API.
12 checks across 6 categories: connectivity, config, migrations, agents, providers, memory. Run with npx astra doctor.
Proactive background agents that monitor triggers — email, RSS, price thresholds, news keywords — and fire an agent when conditions are met. Cron-scheduled, zero manual polling.
Skills and plugins reload instantly when files change on disk — no gateway restart required. Filesystem watcher with 1000ms debounce. Syntax errors roll back to the previous working version.
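The rollback rule can be pictured as "only swap in the new version if it parses." The Python sketch below illustrates that rule alone; the `reload_skill` helper and the dictionary registry are hypothetical, and it does not model the file watcher or the debounce.

```python
def reload_skill(registry, name, source):
    """Swap in new skill source only if it compiles; otherwise keep the old version."""
    try:
        code = compile(source, f"<skill:{name}>", "exec")
    except SyntaxError:
        return False  # the previous working version stays registered
    registry[name] = code
    return True
```

Because the registry is only written after a successful compile, a broken save never replaces a working skill.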
Agents A/B test prompt variants against each other and automatically promote the best performer. Feedback-driven — explicit ratings, task completion, tool efficiency — with safety guards and full rollback.
Vault KV v2, AWS Secrets Manager, or env — swappable via one env var. Per-workspace API key overrides let tenants bring their own credentials. Every secret read is audit-logged.
Bloom filter → turn hash → SWR semantic cache (≥ 0.97 similarity) → negative result cache → graph edge LRU → swarm L1 → embedding batcher. Adaptive TTL, per-workspace budget, and version-pinned invalidation. Cache metrics at GET /cache/stats.
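As an illustration of the semantic layer alone, the Python sketch below returns a cached response only when a query embedding is at least 0.97 similar to a stored one. It assumes cosine similarity and a linear scan, and it leaves out stale-while-revalidate, TTLs, and the other layers. The `SemanticCache` class and the toy two-dimensional embeddings are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

class SemanticCache:
    def __init__(self, threshold=0.97):
        self.threshold = threshold
        self.entries = []  # list of (embedding, response) pairs

    def put(self, embedding, response):
        self.entries.append((embedding, response))

    def get(self, embedding):
        """Return the response of the most similar entry, or None below the threshold."""
        best = max(self.entries, key=lambda e: cosine(embedding, e[0]), default=None)
        if best is not None and cosine(embedding, best[0]) >= self.threshold:
            return best[1]
        return None

cache = SemanticCache()
cache.put([1.0, 0.0], "cached answer")
near = cache.get([0.99, 0.05])  # near-duplicate query: similarity is about 0.999
far = cache.get([0.7, 0.7])     # different query: similarity is about 0.707
```

A threshold this high trades hit rate for safety: only near-identical queries reuse an answer.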
Zero-downtime JWT key rotation via JWT_SECRET_PREV: tokens signed with the previous key stay valid during rollover. Optional device fingerprint binding ties tokens to User-Agent + IP, so a stolen token is rejected when replayed from a different client.
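The rotation idea is simple: verify against the current key first, then fall back to the previous one while it is still configured. Here is a minimal Python sketch using HS256-style signatures from the standard library. The `sign` and `verify` helpers are hypothetical, the algorithm choice is an assumption, and the sketch skips expiry and header validation that a real verifier needs.

```python
import base64
import hashlib
import hmac
import json

def _b64url(data):
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")

def _signature(signing_input, secret):
    return _b64url(hmac.new(secret.encode(), signing_input, hashlib.sha256).digest())

def sign(payload, secret):
    """Build a compact HS256-style token: header.payload.signature."""
    header = _b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    body = _b64url(json.dumps(payload).encode())
    return f"{header}.{body}.{_signature(f'{header}.{body}'.encode(), secret)}"

def verify(token, current, previous=None):
    """Accept a token signed with the current secret or, during rollover, the previous one."""
    parts = token.split(".")
    if len(parts) != 3:
        return False
    signing_input = f"{parts[0]}.{parts[1]}".encode()
    for secret in (current, previous):
        if not secret:
            continue  # no previous secret configured: rollover window is closed
        expected = _signature(signing_input, secret)
        if hmac.compare_digest(expected.encode(), parts[2].encode()):
            return True
    return False
```

Once all tokens issued under the old key have expired, dropping the previous secret closes the rollover window.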
Connect any MCP-compatible server — stdio, SSE, or streamable-HTTP. Tools are auto-registered as mcp:{server}:{tool} and available to all agents. Declare servers in astra.yml, no code required.
Production Helm chart with HPA, PodDisruptionBudget, init-container Postgres readiness check, and external secret support. helm install astra ./helm — one command to k8s.
astra scaffold generates skill, route, and plugin boilerplate. astra bench measures agent latency and cost. astra replay re-runs any session in the terminal. astra migrate manages DB schema — all from the same CLI.
Install with one command and you're running — on your computer or a server.
Quick Start →