Open Astra

The agent runtime built for engineers.

Imagine an AI agent that runs on your machine, knows your context through vectorized memory across pgvector and Typesense, and talks to you through the apps you already use. Open Astra is that agent — self-hosted and built for engineering teams who need production infrastructure they actually own.

Self-Hosted · Full REST API
Runs locally · Node 20+ · Docker optional
$ npx astra

Zero community plugins. Zero external MCP servers. Every tool, skill, and memory operation is code you can read and audit. If it runs in your agent loop, it is in this repository.

Why this matters →
106 Built-in Skills · 66 Core Tools · 10 LLM Providers · 12 Channels · 5-Tier Memory System
How it works

From message to action in three steps

01

Connect a channel

Slack, X, email, or all of them. Agents live where your team already works; there's no new interface to learn.

02

Agents pull context from memory

5-tier memory fuses pgvector and Typesense results via reciprocal rank fusion (RRF) to surface exactly what's relevant before every response.

03

Act, report, and learn

Tools run, sub-agents spawn, results come back through the same channel. Memory updates. The loop closes.

What's inside

Everything you need to ship agents to production

5-Tier Memory

Ephemeral → daily notes → user profile → knowledge graph with multi-hop traversal → procedural workflows. RRF fusion search across Typesense and pgvector.
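RRF is a standard rank-fusion formula: each document scores the sum of 1/(k + rank) across the rankers that returned it. A minimal sketch of fusing a vector ranking with a keyword ranking (function and document names are illustrative, not Open Astra's internals):

```typescript
// Reciprocal rank fusion: score(d) = Σ 1 / (k + rank_d), k = 60 by convention.
function rrfFuse(rankings: string[][], k = 60): [string, number][] {
  const scores = new Map<string, number>();
  for (const ranked of rankings) {
    ranked.forEach((id, i) => {
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + i + 1));
    });
  }
  return [...scores.entries()].sort((a, b) => b[1] - a[1]);
}

// Fuse a semantic ranking (pgvector) with a keyword ranking (Typesense).
const fused = rrfFuse([
  ["doc-a", "doc-b", "doc-c"], // nearest embeddings
  ["doc-b", "doc-d", "doc-a"], // keyword hits
]);
// doc-b wins: it ranks high in both lists.
```

Documents that appear in both lists accumulate score from each, which is why fusion beats either ranker alone.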

Hierarchical Swarms

A root agent decomposes tasks, spawns permission-sandboxed sub-agents, shares state via a blackboard, mediates conflicts with a debate protocol, and synthesizes the results into a single answer.

10 Inference Providers

Grok, Groq, OpenAI, Gemini, Claude, Ollama, vLLM, Bedrock, Mistral, OpenRouter — all behind one interface with per-provider prompt caching (up to 90% savings).

Workspace Files

Drop AGENTS.md, USER.md, IDENTITY.md, or any .md into ./workspace/. Hot-reloaded on save, SHA-256 checksummed, injected before every agent turn.
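The checksum step means a save that doesn't change content doesn't trigger re-injection. A sketch of that check using Node's crypto module (the helper names are hypothetical; only the SHA-256 mechanism comes from the text above):

```typescript
import { createHash } from "node:crypto";
import { readFileSync } from "node:fs";

const checksums = new Map<string, string>();

function sha256(text: string): string {
  return createHash("sha256").update(text).digest("hex");
}

// Returns true only when the file is new or its content actually changed,
// i.e. when it should be re-injected before the next agent turn.
function hasChanged(path: string): boolean {
  const digest = sha256(readFileSync(path, "utf8"));
  if (checksums.get(path) === digest) return false;
  checksums.set(path, digest);
  return true;
}
```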

Self-Healing

Auto-restart crashed agents with exponential backoff. Auto-compact sessions approaching context limits. Dead session cleanup. All configurable in astra.yml.
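Exponential backoff doubles the wait after each failed restart, up to a ceiling. A sketch with illustrative defaults (the base and cap here are assumptions, not astra.yml's actual keys):

```typescript
// Delay before restart attempt n: base * 2^n, capped at maxMs.
function backoffMs(attempt: number, baseMs = 1000, maxMs = 60_000): number {
  return Math.min(baseMs * 2 ** attempt, maxMs);
}
// attempt 0 → 1s, 1 → 2s, 2 → 4s, … capped at 60s
```

The cap keeps a persistently crashing agent from backing off forever while still easing pressure on shared resources.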

Deep Research

Autonomous multi-day PhD-style research. Spawns sub-agent swarms for parallel investigation, synthesizes findings, delivers structured reports.

Dream Mode

Overnight simulation using local models only (Ollama). 100% on-device, privacy-first. Generates morning reports with extracted insights and entity updates.

Cost Dashboard

Per-agent, per-user, per-model token spend with trend visualization. Real-time quota enforcement. Run npx astra costs or hit the /costs API.

Diagnostics

12 checks across 6 categories: connectivity, config, migrations, agents, providers, memory. Run with npx astra doctor.

Heartbeat Daemon

Proactive background agents that monitor triggers — email, RSS, price thresholds, news keywords — and fire an agent when conditions are met. Cron-scheduled, zero manual polling.
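The core of a heartbeat is a predicate polled on a schedule that fires an agent when it returns true. A minimal sketch; the `Trigger` shape and the fire callback are hypothetical, not Open Astra's API:

```typescript
type Trigger = { name: string; check: () => boolean };

// Example trigger factory: fire when a watched price drops below a threshold.
const priceBelow = (getPrice: () => number, threshold: number): Trigger => ({
  name: `price<${threshold}`,
  check: () => getPrice() < threshold,
});

// One scheduled tick: evaluate every trigger, fire an agent for each hit.
function heartbeatTick(triggers: Trigger[], fire: (name: string) => void): void {
  for (const t of triggers) if (t.check()) fire(t.name);
}
```

A cron scheduler calls `heartbeatTick` on its interval, which is what removes the need for manual polling.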

Hot Reload

Skills and plugins reload instantly when files change on disk — no gateway restart required. Filesystem watcher with 1000ms debounce. Syntax errors roll back to the previous working version.
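The 1000 ms debounce collapses a burst of filesystem events (editors often write a file several times per save) into a single reload. A generic sketch of that pattern, with the wait shortened here only for illustration:

```typescript
// Standard trailing-edge debounce: only the last call in a burst runs,
// after waitMs of quiet.
function debounce<T extends unknown[]>(
  fn: (...args: T) => void,
  waitMs = 1000,
): (...args: T) => void {
  let timer: ReturnType<typeof setTimeout> | undefined;
  return (...args: T) => {
    clearTimeout(timer);
    timer = setTimeout(() => fn(...args), waitMs);
  };
}
```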

Persona Evolution

Agents A/B test prompt variants against each other and automatically promote the best performer. Feedback-driven — explicit ratings, task completion, tool efficiency — with safety guards and full rollback.
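One simple form of feedback-driven promotion: pick the variant with the best mean score, but only among variants with enough samples. A sketch under those assumptions; the types and the minimum-sample guard are illustrative, not Open Astra's actual policy:

```typescript
type Variant = { id: string; scores: number[] };

// Promote the best-scoring variant, or null if no variant has enough
// feedback yet (a safety guard against promoting on noise).
function promote(variants: Variant[], minSamples = 20): string | null {
  const eligible = variants.filter((v) => v.scores.length >= minSamples);
  if (eligible.length === 0) return null;
  const mean = (xs: number[]) => xs.reduce((a, b) => a + b, 0) / xs.length;
  return eligible.sort((a, b) => mean(b.scores) - mean(a.scores))[0].id;
}
```

Keeping the previous prompt around makes the "full rollback" cheap: promotion is just a pointer swap.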

Secrets Management

Vault KV v2, AWS Secrets Manager, or env — swappable via one env var. Per-workspace API key overrides let tenants bring their own credentials. Every secret read is audit-logged.

Semantic Response Cache

pgvector cosine similarity cache (≥ 0.97 threshold) deduplicates near-identical queries before they hit the model. Backed by a SHA-256-keyed embedding cache that survives restarts.
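The lookup logic reduces to: embed the query, compare against stored embeddings, and serve the cached response on a ≥ 0.97 cosine match. An in-memory sketch (the real comparison runs inside pgvector; these names are illustrative):

```typescript
function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// Return a cached response if any stored query embedding is close enough;
// on a miss the caller falls through to the model and inserts the result.
function lookup(
  query: number[],
  cache: { emb: number[]; resp: string }[],
  threshold = 0.97,
): string | null {
  for (const entry of cache) {
    if (cosine(query, entry.emb) >= threshold) return entry.resp;
  }
  return null;
}
```

The high threshold is the point: 0.97 catches paraphrases of the same question while leaving genuinely different queries to reach the model.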

Auth Hardening

Zero-downtime JWT key rotation via JWT_SECRET_PREV — old tokens stay valid during rollover. Optional device fingerprint binding ties tokens to User-Agent + IP to block replay attacks.
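Rotation works because verification tries the current secret first, then falls back to the previous one, so tokens signed before the rollover stay valid. A simplified HMAC sketch (real JWTs carry header/payload/signature segments; this shows only the two-secret fallback):

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

function sign(payload: string, secret: string): string {
  return createHmac("sha256", secret).update(payload).digest("hex");
}

// Accept a signature made with either the current secret or, during
// rollover, the previous one (e.g. JWT_SECRET_PREV).
function verify(payload: string, sig: string, current: string, prev?: string): boolean {
  for (const secret of [current, prev]) {
    if (!secret) continue;
    const expected = Buffer.from(sign(payload, secret), "hex");
    const got = Buffer.from(sig, "hex");
    if (expected.length === got.length && timingSafeEqual(expected, got)) return true;
  }
  return false;
}
```

Once every live token has been reissued under the new secret, the previous secret can be dropped and the window closes.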

Get Started

Get started in minutes

Install with one command and you're running — on your computer or a server.

Quick Start →