Diagnostics

Open Astra includes a built-in diagnostic system that runs 12 checks against your deployment, covering database connectivity, provider key validation, migration status, memory system health, and live inference tests. Results are available through the API and the CLI, and each run is persisted for historical tracking.

All 12 diagnostic checks

All checks run in parallel for speed. Each returns a status of pass, warn, or fail.

text

Check                         What it tests
─────────────────────────────────────────────────────────────
postgres-connectivity         Can we reach PostgreSQL?
typesense-connectivity        Can we reach Typesense?
ollama-connectivity           Is Ollama running? (optional)
config-env-vars               PG_HOST, PG_DATABASE, PG_USER present?
config-provider-keys          At least one inference provider key set?
config-astra-yml              Does astra.yml parse without errors?
migrations                    Any pending database migrations?
agents                        Are registered agents valid + keys present?
provider-inference            Live "Say OK" test per configured provider
memory-typesense-collection   Does memory_chunks collection exist?
memory-embeddings             Can we generate an embedding?
memory-pgvector               Is the pgvector extension installed?
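
As an illustration of the parallel execution described above, the following is a minimal sketch and not Open Astra's actual implementation. The `run_check` and `run_all` helpers and the three stand-in checks are hypothetical; only the result field names (`name`, `status`, `message`, `durationMs`) are taken from the API response documented on this page.

```python
import asyncio
import time
from typing import Awaitable, Callable, Dict, List, Tuple

# Hypothetical check signature: an async callable returning (status, message).
CheckFn = Callable[[], Awaitable[Tuple[str, str]]]


async def run_check(name: str, fn: CheckFn) -> Dict[str, object]:
    """Run one check, time it, and turn an exception into a "fail" result."""
    started = time.perf_counter()
    try:
        status, message = await fn()
    except Exception as exc:  # a crashing check should not abort the whole run
        status, message = "fail", str(exc)
    duration_ms = int((time.perf_counter() - started) * 1000)
    return {"name": name, "status": status, "message": message, "durationMs": duration_ms}


async def run_all(checks: Dict[str, CheckFn]) -> List[Dict[str, object]]:
    """Start every check concurrently; results come back in registration order."""
    return await asyncio.gather(*(run_check(n, f) for n, f in checks.items()))


# Stand-in checks for illustration only (assumptions, not real probes).
async def fake_postgres() -> Tuple[str, str]:
    await asyncio.sleep(0.01)
    return "pass", "Connected"


async def fake_ollama() -> Tuple[str, str]:
    return "warn", "OLLAMA_BASE_URL not set"


async def fake_typesense() -> Tuple[str, str]:
    raise RuntimeError("connection refused")


results = asyncio.run(run_all({
    "postgres-connectivity": fake_postgres,
    "ollama-connectivity": fake_ollama,
    "typesense-connectivity": fake_typesense,
}))
for r in results:
    print(r["name"], r["status"], r["message"])
```

Because the checks are awaited together, the total run time is close to that of the slowest check and not the sum of all of them.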

Overall status

text

// Overall status derived from individual checks:
any check = "fail"  →  "unhealthy"
any check = "warn"  →  "degraded"     (no fails)
all checks = "pass" →  "healthy"
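
The derivation rule above can be sketched as a small function. The name `overall_status` is hypothetical; the status strings are the ones listed on this page.

```python
from typing import Iterable


def overall_status(statuses: Iterable[str]) -> str:
    """Collapse individual check statuses into one overall status."""
    seen = set(statuses)
    if "fail" in seen:
        return "unhealthy"  # a single failure outranks any warnings
    if "warn" in seen:
        return "degraded"   # warnings only, no failures
    return "healthy"        # every check passed


print(overall_status(["pass", "warn", "pass"]))
```

Checking for "fail" first is what makes a run with both warnings and failures report "unhealthy".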

Running diagnostics

CLI

The astra doctor command runs all checks and displays results in the terminal with color-coded status indicators.

bash

# Run diagnostics from the command line
npx astra doctor

# Or via the standalone CLI
astra doctor

API

The /diagnostics endpoint runs all checks and returns structured JSON. This endpoint requires the INTERNAL_API_KEY header — it is not available to regular JWT-authenticated users.

json

// GET /diagnostics (requires INTERNAL_API_KEY header)
{
  "overallStatus": "degraded",
  "durationMs": 1240,
  "checks": [
    { "name": "postgres-connectivity", "status": "pass", "message": "Connected", "durationMs": 12 },
    { "name": "typesense-connectivity", "status": "pass", "message": "Healthy", "durationMs": 45 },
    { "name": "ollama-connectivity", "status": "warn", "message": "OLLAMA_BASE_URL not set", "durationMs": 0 },
    { "name": "config-provider-keys", "status": "pass", "message": "3 provider keys configured", "durationMs": 1 },
    { "name": "migrations", "status": "warn", "message": "2 pending migrations", "durationMs": 89 },
    { "name": "provider-inference", "status": "pass", "message": "claude: OK, openai: OK, gemini: OK", "durationMs": 890 }
  ]
}
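
A client consuming this response might want only the checks that need attention. The sketch below assumes the response shape shown above; `non_passing` and `slowest_check` are hypothetical helper names, and the sample is a shortened copy of the example response.

```python
import json
from typing import Any, Dict, List


def non_passing(report: Dict[str, Any]) -> List[str]:
    """Return one summary line per check whose status is not "pass"."""
    return [
        f'{c["name"]}: {c["status"]} ({c["message"]})'
        for c in report["checks"]
        if c["status"] != "pass"
    ]


def slowest_check(report: Dict[str, Any]) -> str:
    """Return the name of the check with the largest durationMs."""
    return max(report["checks"], key=lambda c: c["durationMs"])["name"]


# Shortened copy of the example response above.
sample = json.loads("""
{
  "overallStatus": "degraded",
  "durationMs": 1240,
  "checks": [
    {"name": "postgres-connectivity", "status": "pass", "message": "Connected", "durationMs": 12},
    {"name": "ollama-connectivity", "status": "warn", "message": "OLLAMA_BASE_URL not set", "durationMs": 0},
    {"name": "migrations", "status": "warn", "message": "2 pending migrations", "durationMs": 89},
    {"name": "provider-inference", "status": "pass", "message": "claude: OK, openai: OK, gemini: OK", "durationMs": 890}
  ]
}
""")

for line in non_passing(sample):
    print(line)
print("slowest:", slowest_check(sample))
```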

Connectivity-only mode. /diagnostics?mode=connectivity runs only the fast checks (Postgres, Typesense, Ollama) — useful for quick health probes without the overhead of live inference tests.
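
The following sketch builds a request for either mode using only the Python standard library. The exact name of the HTTP header that carries the internal key is not stated on this page, so `header_name` is left as a required parameter; the host `astra.example` and the header name used in the demo call are placeholders, not real values.

```python
import urllib.parse
import urllib.request


def build_diagnostics_request(
    base_url: str,
    api_key: str,
    header_name: str,
    connectivity_only: bool = False,
) -> urllib.request.Request:
    """Build (but do not send) a GET request for /diagnostics."""
    url = base_url.rstrip("/") + "/diagnostics"
    if connectivity_only:
        # Fast probe: only the Postgres, Typesense, and Ollama checks run.
        url += "?" + urllib.parse.urlencode({"mode": "connectivity"})
    return urllib.request.Request(url, headers={header_name: api_key}, method="GET")


# Placeholder host and header name; substitute the values for your deployment.
req = build_diagnostics_request(
    "http://astra.example/",
    api_key="secret",
    header_name="X-Example-Internal-Key",
    connectivity_only=True,
)
print(req.get_method(), req.full_url)
```

To send the request, pass it to `urllib.request.urlopen(req, timeout=5)` and decode the body with `json.load`.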

Provider key detection

The config-provider-keys check verifies that at least one inference provider is configured. The agents check goes further — for each registered agent, it verifies the provider key for that agent's configured model is present.

text

// Provider → environment variable mapping:
grok   → GROK_API_KEY
groq   → GROQ_API_KEY
openai → OPENAI_API_KEY
gemini → GEMINI_API_KEY
claude → ANTHROPIC_API_KEY
ollama → OLLAMA_BASE_URL
vllm   → VLLM_API_KEY
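
The two checks can be sketched against the mapping above. This is an illustration under stated assumptions, not the actual check code: the function names are hypothetical, each agent is assumed to resolve directly to a provider name, and a blank value is treated as unset.

```python
from typing import Dict, List, Mapping

# Provider -> environment variable, copied from the mapping above.
PROVIDER_ENV_VARS: Dict[str, str] = {
    "grok": "GROK_API_KEY",
    "groq": "GROQ_API_KEY",
    "openai": "OPENAI_API_KEY",
    "gemini": "GEMINI_API_KEY",
    "claude": "ANTHROPIC_API_KEY",
    "ollama": "OLLAMA_BASE_URL",
    "vllm": "VLLM_API_KEY",
}


def configured_providers(env: Mapping[str, str]) -> List[str]:
    """Providers whose variable is set and non-blank (blank = unset is an assumption)."""
    return [p for p, var in PROVIDER_ENV_VARS.items() if env.get(var, "").strip()]


def agents_missing_keys(agents: Mapping[str, str], env: Mapping[str, str]) -> Dict[str, str]:
    """Map each agent to the variable it lacks.

    `agents` maps agent name -> provider name (an assumed, simplified shape).
    Raises KeyError for a provider that is not in PROVIDER_ENV_VARS.
    """
    available = set(configured_providers(env))
    return {
        agent: PROVIDER_ENV_VARS[provider]
        for agent, provider in agents.items()
        if provider not in available
    }


env = {"OPENAI_API_KEY": "sk-example", "ANTHROPIC_API_KEY": "  "}
print(configured_providers(env))
print(agents_missing_keys({"researcher": "openai", "writer": "claude"}, env))
```

In practice `env` would be `os.environ`; a plain dict is used here so the example runs without touching the real environment.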

Live inference test

The provider-inference check sends a minimal "Say OK" prompt to each configured provider in parallel. This catches issues that key validation alone cannot — expired keys, rate limits, model deprecation, and network-level blocks.

Token cost. Each live inference test consumes a few tokens per provider, so running diagnostics frequently (e.g., every minute) is not recommended. The Health Check endpoint at /health consumes no tokens and is suitable for high-frequency monitoring.

Diagnostic history

Each diagnostic run is persisted to the diagnostic_runs table. Query the history via GET /diagnostics/history?limit=10 to track system health over time.

sql

-- Diagnostic results are persisted for history
CREATE TABLE diagnostic_runs (
  id          UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  status      TEXT NOT NULL,       -- healthy | degraded | unhealthy
  results     JSONB NOT NULL,      -- full check array
  duration_ms INTEGER NOT NULL,
  run_at      TIMESTAMPTZ DEFAULT NOW()
);
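
One use of the history is spotting when overall status changed. The response shape of GET /diagnostics/history is not documented on this page, so the sketch below assumes each run is a dict with `status` and `run_at` keys mirroring the table columns, and that `run_at` values are ISO 8601 strings in one timezone so that they sort as text. `status_transitions` is a hypothetical helper.

```python
from typing import Dict, List, Tuple


def status_transitions(runs: List[Dict[str, str]]) -> List[Tuple[str, str, str]]:
    """Return (run_at, old_status, new_status) for each change between consecutive runs."""
    ordered = sorted(runs, key=lambda r: r["run_at"])  # oldest first
    changes = []
    for prev, cur in zip(ordered, ordered[1:]):
        if prev["status"] != cur["status"]:
            changes.append((cur["run_at"], prev["status"], cur["status"]))
    return changes


# Made-up rows for illustration, deliberately out of order.
history = [
    {"status": "degraded", "run_at": "2024-01-01T12:00:00Z"},
    {"status": "healthy", "run_at": "2024-01-01T10:00:00Z"},
    {"status": "healthy", "run_at": "2024-01-01T11:00:00Z"},
    {"status": "unhealthy", "run_at": "2024-01-01T13:00:00Z"},
]
for when, old, new in status_transitions(history):
    print(f"{when}: {old} -> {new}")
```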

Related

Health Check: lightweight /health and /readiness endpoints for monitoring
Circuit Breaker: provider-level health tracked by the circuit breaker system
Deployment: run astra doctor after deploying to verify your setup
