Imagine an AI agent that runs on your machine, knows your context through vectorized memory across pgvector and Typesense, and talks to you through the apps you already use. Open Astra is that agent — self-hosted and built for engineering teams who need production infrastructure they actually own.
Slack, X, email — or all of them. Agents live where your team already works, no new interface to learn.
5-tier memory fuses pgvector and Typesense via RRF to surface exactly what's relevant before every response.
Tools run, sub-agents spawn, results come back through the same channel. Memory updates. The loop closes.
Ephemeral → daily notes → user profile → knowledge graph with multi-hop traversal → procedural workflows. Reciprocal rank fusion (RRF) search across Typesense and pgvector.
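To make the fusion step concrete, here is a minimal sketch of reciprocal rank fusion over two ranked result lists. It is an illustration, not Open Astra's implementation: the function name, the document IDs, and the constant k=60 (a common default for RRF) are all assumptions.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Fuse ranked ID lists: each list adds 1 / (k + rank) to a document's score."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID so output is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical hits: one list from keyword search, one from vector search.
keyword_hits = ["note-3", "note-1", "note-7"]
vector_hits = ["note-1", "note-9", "note-3"]
fused = rrf_fuse([keyword_hits, vector_hits])
```

Documents that rank well in both lists rise to the top, and because only ranks are used, the two engines' raw scores never need to be on the same scale.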
A root agent decomposes tasks, spawns permission-sandboxed sub-agents, shares state via blackboard, mediates conflicts with a debate protocol, and synthesizes.
Grok, Groq, OpenAI, Gemini, Claude, Ollama, vLLM, Bedrock, Mistral, OpenRouter — all behind one interface with per-provider prompt caching (up to 90% savings).
Drop AGENTS.md, USER.md, IDENTITY.md, or any .md into ./workspace/. Hot-reloaded on save, SHA-256 checksummed, injected before every agent turn.
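Checksum-based change detection of this kind can be sketched in a few lines. This assumes file contents are hashed and compared against the last seen digest; the helper names and the in-memory dictionaries are hypothetical, not Open Astra's code.

```python
import hashlib


def checksum(content: bytes) -> str:
    """Hex-encoded SHA-256 digest of a file's bytes."""
    return hashlib.sha256(content).hexdigest()


def changed_files(previous, current):
    """Names in `current` that are new or whose digest differs from `previous`.

    previous: {filename: last seen hex digest}
    current:  {filename: file bytes as read from disk}
    """
    return sorted(
        name for name, data in current.items()
        if previous.get(name) != checksum(data)
    )
```

Comparing digests rather than modification times means a save that leaves the content unchanged does not trigger a reload.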
Auto-restart crashed agents with exponential backoff. Auto-compact sessions approaching context limits. Dead session cleanup. All configurable in astra.yml.
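As a rough illustration of exponential backoff, the sketch below computes a capped restart schedule. The base delay, growth factor, and cap are assumed values chosen for the example; the real settings are whatever is configured in astra.yml.

```python
def backoff_delays(base_seconds=1.0, factor=2.0, max_seconds=60.0, attempts=8):
    """Delay before each restart attempt: base * factor**attempt, capped at max_seconds."""
    return [
        min(base_seconds * factor ** attempt, max_seconds)
        for attempt in range(attempts)
    ]


delays = backoff_delays()
```

With these assumed defaults the delay doubles from 1 second up to the 60-second cap, so an agent that keeps crashing is retried less and less often instead of in a tight loop.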
Autonomous multi-day PhD-style research. Spawns sub-agent swarms for parallel investigation, synthesizes findings, delivers structured reports.
Overnight simulation using local models only (Ollama). 100% on-device, privacy-first. Generates morning reports with extracted insights and entity updates.
Per-agent, per-user, per-model token spend with trend visualization. Real-time quota enforcement. Available through npx astra costs or the /costs API.
12 checks across 6 categories: connectivity, config, migrations, agents, providers, memory. Run with npx astra doctor.
Proactive background agents that monitor triggers — email, RSS, price thresholds, news keywords — and fire an agent when conditions are met. Cron-scheduled, zero manual polling.
Skills and plugins reload instantly when files change on disk, with no gateway restart required. Filesystem watcher with a 1000 ms debounce. A file with syntax errors is rolled back to the previous working version.
Agents A/B test prompt variants against each other and automatically promote the best performer. Feedback-driven — explicit ratings, task completion, tool efficiency — with safety guards and full rollback.
Vault KV v2, AWS Secrets Manager, or plain environment variables, swappable via one env var. Per-workspace API key overrides let tenants bring their own credentials. Every secret read is audit-logged.
Bloom filter → turn hash → SWR semantic cache (≥ 0.97 similarity) → negative result cache → graph edge LRU → swarm L1 → embedding batcher. Adaptive TTL, per-workspace budget, and version-pinned invalidation. Cache metrics at GET /cache/stats.
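The semantic-cache tier's threshold check might look like the sketch below. It assumes cosine similarity as the measure (the page only says "similarity") and a plain list as the store; the function names are hypothetical, and the stale-while-revalidate refresh and the other tiers are not modeled.

```python
import math

SIMILARITY_THRESHOLD = 0.97  # threshold stated above


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def lookup(cache, query_embedding):
    """Return the best-matching cached response, or None below the threshold.

    cache: list of (embedding, response) pairs.
    """
    best_score, best_response = 0.0, None
    for embedding, response in cache:
        score = cosine_similarity(embedding, query_embedding)
        if score > best_score:
            best_score, best_response = score, response
    return best_response if best_score >= SIMILARITY_THRESHOLD else None
```

A high threshold like 0.97 trades hit rate for safety: only near-identical questions reuse a cached answer.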
Zero-downtime JWT key rotation via JWT_SECRET_PREV — old tokens stay valid during rollover. Optional device fingerprint binding ties tokens to User-Agent + IP to block replay attacks.
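The rollover idea can be sketched with a simplified HMAC signature check: try the current secret first, then fall back to the previous one. This is not a full JWT implementation and not Open Astra's code; the function names are hypothetical, and the previous secret stands in for the value of JWT_SECRET_PREV.

```python
import hashlib
import hmac


def sign(payload: bytes, secret: bytes) -> str:
    """HMAC-SHA256 signature of the payload, hex-encoded."""
    return hmac.new(secret, payload, hashlib.sha256).hexdigest()


def verify(payload: bytes, signature: str, current_secret: bytes, previous_secret=None) -> bool:
    """Accept a signature made with the current key, or the previous key during rollover."""
    for secret in (current_secret, previous_secret):
        if secret is not None and hmac.compare_digest(sign(payload, secret), signature):
            return True
    return False
```

Once every token issued under the old key has expired, the previous secret can be unset and only the current key is accepted.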
Connect any MCP-compatible server — stdio, SSE, or streamable-HTTP. Tools are auto-registered as mcp:{server}:{tool} and available to all agents. Declare servers in astra.yml, no code required.
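The naming scheme is simple enough to show directly. The sketch below builds a registry keyed by mcp:{server}:{tool}; the function, the registry shape, and the example server and tool names are hypothetical.

```python
def register_tools(servers):
    """Map each tool to a namespaced key of the form mcp:{server}:{tool}.

    servers: {server name: list of tool names exposed by that server}
    """
    registry = {}
    for server, tools in servers.items():
        for tool in tools:
            registry[f"mcp:{server}:{tool}"] = (server, tool)
    return registry


# Hypothetical servers and tools, for illustration only.
registry = register_tools({"github": ["search_issues"], "files": ["read", "write"]})
```

Prefixing with the server name keeps two servers that expose a tool with the same name from colliding.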
Production Helm chart with HPA, PodDisruptionBudget, init-container Postgres readiness check, and external secret support. helm install astra ./helm — one command to k8s.
astra scaffold generates skill, route, and plugin boilerplate. astra bench measures agent latency and cost. astra replay re-runs any session in the terminal. astra migrate manages DB schema — all from the same CLI.
Install with one command and you're running — on your computer or a server.
Quick Start →