Backup & Recovery Runbook
This runbook covers backup procedures for PostgreSQL and Typesense, workspace-scoped memory restore, RTO/RPO targets, and platform-specific guidance for DigitalOcean Managed Databases.
RTO / RPO targets
| Scenario | RPO (data loss tolerance) | RTO (downtime tolerance) |
|---|---|---|
| Full instance failure | Up to 24 hours (daily backup) | 30–60 min (restore + verify) |
| Accidental workspace deletion | Up to 24 hours (daily backup) | 15–30 min (workspace restore) |
| Corrupt memory data | Up to 24 hours (daily backup) | 10–20 min (table-level restore) |
| DigitalOcean PITR (Point-in-Time Recovery) | Up to 1 minute (WAL streaming) | 20–40 min (cluster rebuild) |
The default daily backup schedule achieves 24-hour RPO. Enable WAL archiving or use DigitalOcean's PITR to reduce RPO to minutes.
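One way to keep the 24-hour RPO honest is to alert when the newest dump is older than the target. The following is a minimal sketch, not part of the Astra tooling: the function name backup_is_fresh and the example path are hypothetical, and it relies on the -mmin test, which GNU and BSD find support but POSIX does not require.

```shell
# Hypothetical helper: succeeds if FILE ($1) was modified less than MAX_MINUTES ($2) ago.
# Relies on find's -mmin test (GNU and BSD find; not part of POSIX).
backup_is_fresh() {
  # find prints the path only when the file exists and is newer than the limit.
  [ -n "$(find "$1" -mmin "-$2" 2>/dev/null)" ]
}

# Example: warn when the latest dump is older than the 24-hour RPO (1440 minutes).
# The path below is an assumed location, not one defined by this runbook.
# backup_is_fresh /backups/latest.dump 1440 || echo "RPO at risk: backup is stale" >&2
```

A missing file counts as stale, so the same check also catches a backup job that never ran.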
What to back up
| System | Contains | Backup method | Criticality |
|---|---|---|---|
| PostgreSQL | All agent data: sessions, turns, memory, knowledge graph, workspace config | pg_dump / PITR | Critical |
| Typesense | Search index — a derived copy of memory_entries and documents | Collection export (JSONL) | High (can be rebuilt from PG, but takes time) |
| astra.yml | Agent, skill, tool, channel configuration | Git version control | High |
| Environment variables / secrets | API keys, DB credentials | Secrets manager snapshot | Critical |
The Typesense index can be rebuilt from PostgreSQL with npx astra reindex. Back it up to reduce recovery time, but PostgreSQL is the source of truth.
PostgreSQL backup
Use pg_dump for logical backups. Schedule daily runs with a cron job and store the output in object storage (S3, DigitalOcean Spaces):
# Full logical backup (all workspaces)
pg_dump \
--format=custom \
--compress=9 \
--no-acl \
--no-owner \
"$DATABASE_URL" \
-f backup-$(date +%Y%m%d-%H%M%S).dump
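# Table-scoped backup for a single-tenant restore.
# pg_dump has no row-level filter (there is no --where option), so this dump
# holds the listed tables for all workspaces. Filter by workspace_id at restore time.
pg_dump \
--format=custom \
--table='memory_entries' \
--table='knowledge_graph_*' \
--table='sessions' \
--table='turns' \
"$DATABASE_URL" \
-f ws-acme-$(date +%Y%m%d).dump
PostgreSQL restore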
# Full restore to a new database
pg_restore \
--clean \
--if-exists \
--no-acl \
--no-owner \
-d "$DATABASE_URL" \
backup-20260101-120000.dump
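# Restore a single workspace's memory to a running instance
# 1. Restore the tables into an empty scratch database. pg_restore always writes to
#    the table names stored in the dump, so do not point this step at the live database.
#    STAGING_DATABASE_URL is a placeholder for a scratch database with the same
#    extensions installed (for example pgvector).
pg_restore \
--table=memory_entries \
--table=knowledge_graph_nodes \
--table=knowledge_graph_edges \
-d "$STAGING_DATABASE_URL" \
ws-acme-20260101.dump
# 2. Copy the workspace's rows into memory_entries_staging on the live database.
#    Create it first in psql: CREATE TABLE memory_entries_staging (LIKE memory_entries);
psql "$STAGING_DATABASE_URL" -c "\copy (SELECT * FROM memory_entries WHERE workspace_id = 'ws_acme') TO pstdout" \
| psql "$DATABASE_URL" -c "\copy memory_entries_staging FROM pstdin"
# 3. Promote staging rows to live (in psql)
INSERT INTO memory_entries SELECT * FROM memory_entries_staging
WHERE workspace_id = 'ws_acme'
ON CONFLICT (id) DO NOTHING;
Restoring a single workspace's memory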
To restore one workspace without touching others:
- Restore the dump into temporary tables (using --table flags and a staging schema).
- Filter rows by workspace_id and insert into live tables with ON CONFLICT DO NOTHING to avoid overwriting newer data.
- Re-trigger search indexing for the workspace: npx astra reindex --workspace ws_acme.
- Verify memory is accessible: send a test agent turn and check retrieved memory results.
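Because the workspace ID is interpolated into SQL and shell commands in the steps above, it may be worth validating it before use. The sketch below is illustrative only: the function names are hypothetical, and the accepted pattern (ws_ followed by letters, digits, or underscores) is inferred from the example ID ws_acme and may not match the real ID format.

```shell
# Hypothetical guard: accept only IDs shaped like ws_<letters, digits, underscores>.
# The pattern is inferred from the example ID ws_acme; adjust it to the real format.
is_valid_workspace_id() {
  case "$1" in
    ws_*) ;;                          # must start with the ws_ prefix
    *) return 1 ;;
  esac
  case "${1#ws_}" in
    ""|*[!A-Za-z0-9_]*) return 1 ;;   # empty suffix or a disallowed character
  esac
  return 0
}

# Print the promote statement from the restore steps, but only for a valid ID.
promote_sql() {
  is_valid_workspace_id "$1" || { echo "invalid workspace id: $1" >&2; return 1; }
  printf "INSERT INTO memory_entries SELECT * FROM memory_entries_staging WHERE workspace_id = '%s' ON CONFLICT (id) DO NOTHING;\n" "$1"
}
```

For example, promote_sql ws_acme | psql "$DATABASE_URL" would run the promote step, while an ID containing a quote or a space is rejected before any SQL is built.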
Typesense backup
Export each collection as JSONL. Store alongside the PostgreSQL dump so they're from the same point in time:
# Export all collections (run against your Typesense host)
for collection in $(curl -s "http://localhost:8108/collections" \
-H "X-TYPESENSE-API-KEY: $TYPESENSE_API_KEY" | jq -r '.[].name'); do
curl -s "http://localhost:8108/collections/$collection/documents/export" \
-H "X-TYPESENSE-API-KEY: $TYPESENSE_API_KEY" \
> "typesense-$collection-$(date +%Y%m%d).jsonl"
echo "Exported $collection"
done
Typesense restore
# Re-import a collection (collection must exist with correct schema first)
curl -X POST "http://localhost:8108/collections/memory_ws_acme/documents/import?action=upsert" \
-H "X-TYPESENSE-API-KEY: $TYPESENSE_API_KEY" \
-H "Content-Type: text/plain" \
--data-binary @typesense-memory_ws_acme-20260101.jsonl
If the collection schema has changed since the backup was taken, recreate the collection with the new schema before importing. The collection schema is defined in src/search/collections.ts.
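The Typesense import endpoint answers with one JSON object per line, one per document, and reports per-document failures in that body even when the HTTP request itself succeeds. A minimal sketch for counting failures follows; the function name count_import_failures and the file name in the usage comment are hypothetical.

```shell
# Count failed documents in a Typesense import response read from stdin.
# Each response line is a JSON object such as {"success":true} or
# {"success":false,"error":"...","document":"..."}.
count_import_failures() {
  # grep -c prints 0 but exits non-zero when nothing matches, hence "|| true".
  grep -cE '"success"[[:space:]]*:[[:space:]]*false' || true
}

# Example (import_response.jsonl is an assumed file holding the curl output):
# failures=$(count_import_failures < import_response.jsonl)
# [ "$failures" -eq 0 ] || echo "$failures documents failed to import" >&2
```

Saving the curl output to a file and checking it this way makes a partial import visible instead of silently leaving gaps in the search index.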
DigitalOcean Managed Database specifics
DigitalOcean Managed PostgreSQL provides automated daily backups and optional Point-in-Time Recovery (PITR). PITR is available on Business-tier clusters and allows restoring to any second within the last 7 days.
# DigitalOcean Managed DB: list the automated backups for a cluster
doctl databases backups <database-id>
# Set the weekly maintenance window. This does not take a backup; before a
# risky migration, take a manual pg_dump instead.
doctl databases maintenance-window update <database-id> \
--day saturday --hour "02:00"
# Restore to a new cluster from a backup
# (flag names and the timestamp format can differ between doctl versions;
# confirm with: doctl databases create --help)
doctl databases create astra-restore \
--engine pg \
--version 16 \
--restore-from-cluster-name astra-db \
--restore-from-timestamp "2026-01-01T12:00:00Z"
Key DO-specific notes:
- pgvector is pre-installed on DigitalOcean Managed PostgreSQL 14+. You do not need to install it manually after restore.
- PITR restores create a new cluster. Update DATABASE_URL in your environment after the restore cluster is ready.
- Connection pooling (PgBouncer) is a separate service. After restoring to a new cluster, update the pooler to point to the new host.
- Backups are stored in the same region as your cluster. For disaster recovery across regions, enable a cross-region replica or use pg_dump with Spaces in a different region.
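Updating DATABASE_URL after a restore often comes down to swapping the host. The helper below is a sketch under stated assumptions: the name swap_db_host is hypothetical, the URL is assumed to have the form scheme://user:password@host:port/db?params, and the password is assumed to be percent-encoded so that the first @ marks the start of the host. Credentials and port may also differ on the new cluster; this only replaces the host.

```shell
# Hypothetical helper: print URL ($1) with its host replaced by NEW_HOST ($2).
# Assumes the first "@" in the URL directly precedes the host name.
swap_db_host() {
  # Replace everything between "@" and the next ":", "/" or "?" with the new host.
  printf '%s\n' "$1" | sed -E "s#@[^:/?]+#@$2#"
}

# Example (the host name is a placeholder for the restored cluster's host):
# DATABASE_URL=$(swap_db_host "$DATABASE_URL" "astra-restore.example.internal")
```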
Recommended backup schedule
| Frequency | Method | Retention |
|---|---|---|
| Hourly | WAL archiving to S3/Spaces (if enabled) | 7 days |
| Daily | pg_dump + Typesense JSONL export | 30 days |
| Weekly | Full pg_dump snapshot | 90 days |
| Before migrations | Manual pg_dump | Keep until migration verified |
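The retention column can be enforced for dumps kept on local disk with a small pruning step after each backup run. This is a sketch, not an Astra command: the function name prune_old_backups and the directory in the usage comment are assumptions, and -delete is a GNU/BSD find extension. For dumps stored in S3 or Spaces, a bucket lifecycle rule is likely the better fit.

```shell
# Hypothetical helper: delete *.dump files under DIR ($1) older than DAYS ($2).
# find's "-mtime +N" counts whole 24-hour periods, so "+30" matches files
# that are at least 31 days old. Deleted paths are printed for the job log.
prune_old_backups() {
  find "$1" -type f -name '*.dump' -mtime "+$2" -print -delete
}

# Example, matching the 30-day retention for daily dumps
# (/backups/daily is an assumed directory):
# prune_old_backups /backups/daily 30
```

Only files ending in .dump are touched, so Typesense JSONL exports in the same directory would need their own call with a matching pattern.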
Recovery checklist
- Stop the gateway (docker compose down astra) to prevent writes during restore.
- Restore PostgreSQL from backup.
- Run migrations to ensure schema is current: npx drizzle-kit migrate.
- Restore Typesense collections, or rebuild from PG: npx astra reindex.
- Restart the gateway: docker compose up -d astra.
- Run npx astra doctor and verify all agents report healthy.
- Send a test message to a representative agent and verify memory retrieval is working.
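The command-line steps of the checklist can be wrapped in a script that defaults to a dry run, so the sequence can be reviewed before anything is executed. This is a sketch under assumptions: the function names run and recover and the DRY_RUN variable are hypothetical, and the Typesense step uses the rebuild path (npx astra reindex) rather than a JSONL import. The final test message remains a manual step.

```shell
# Print the command in dry-run mode (the default); execute it when DRY_RUN=0.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '+ %s\n' "$*"
  else
    "$@"
  fi
}

# Hypothetical wrapper around the checklist; $1 is the dump file to restore.
# The && chain stops at the first failing step.
recover() {
  run docker compose down astra &&
  run pg_restore --clean --if-exists --no-acl --no-owner \
    -d "${DATABASE_URL:?DATABASE_URL must be set}" "$1" &&
  run npx drizzle-kit migrate &&
  run npx astra reindex &&
  run docker compose up -d astra &&
  run npx astra doctor
}

# Review first, then execute:
# recover backup-20260101-120000.dump
# DRY_RUN=0 recover backup-20260101-120000.dump
```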