Imagine an AI agent that runs on your machine, knows your context through vectorized memory across pgvector and Typesense, and talks to you through the apps you already use. Open Astra is that agent — self-hosted and built for engineering teams who need production infrastructure they actually own.
Slack, X, email — or all of them. Agents live where your team already works, no new interface to learn.
5-tier memory fuses pgvector and Typesense via RRF to surface exactly what's relevant before every response.
Tools run, sub-agents spawn, results come back through the same channel. Memory updates. The loop closes.
Ephemeral → daily notes → user profile → knowledge graph with multi-hop traversal → procedural workflows. RRF fusion search across Typesense and pgvector.
A root agent decomposes tasks, spawns permission-sandboxed sub-agents, shares state via blackboard, mediates conflicts with a debate protocol, and synthesizes.
Grok, Groq, OpenAI, Gemini, Claude, Ollama, vLLM, Bedrock, Mistral, OpenRouter — all behind one interface with per-provider prompt caching (up to 90% savings).
Drop AGENTS.md, USER.md, IDENTITY.md, or any .md into ./workspace/. Hot-reloaded on save, SHA-256 checksummed, injected before every agent turn.
Auto-restart crashed agents with exponential backoff. Auto-compact sessions approaching context limits. Dead session cleanup. All configurable in astra.yml.
Autonomous multi-day PhD-style research. Spawns sub-agent swarms for parallel investigation, synthesizes findings, delivers structured reports.
Overnight simulation using local models only (Ollama). 100% on-device, privacy-first. Generates morning reports with extracted insights and entity updates.
Per-agent, per-user, per-model token spend with trend visualization. Real-time quota enforcement. npx astra costs or the /costs API.
12 checks across 6 categories: connectivity, config, migrations, agents, providers, memory. Run with npx astra doctor.
Proactive background agents that monitor triggers — email, RSS, price thresholds, news keywords — and fire an agent when conditions are met. Cron-scheduled, zero manual polling.
Skills and plugins reload instantly when files change on disk — no gateway restart required. Filesystem watcher with 1000ms debounce. Syntax errors roll back to the previous working version.
Agents A/B test prompt variants against each other and automatically promote the best performer. Feedback-driven — explicit ratings, task completion, tool efficiency — with safety guards and full rollback.
Vault KV v2, AWS Secrets Manager, or env — swappable via one env var. Per-workspace API key overrides let tenants bring their own credentials. Every secret read is audit-logged.
pgvector cosine similarity cache (≥ 0.97 threshold) deduplicates near-identical queries before they hit the model. Backed by a SHA-256 keyed embedding cache that survives restarts.
Zero-downtime JWT key rotation via JWT_SECRET_PREV — old tokens stay valid during rollover. Optional device fingerprint binding ties tokens to User-Agent + IP to block replay attacks.
Install with one command and you're running — on your computer or a server.
Quick Start →