Memory

Memory Summarization

Daily notes are automatically compacted into weekly and monthly summaries using LLM-driven consolidation. The job runs nightly at 5 AM UTC by default. Compaction reduces the memory footprint while preserving key insights, so agents can reason over weeks or months of history without hitting context limits or slowing down retrieval.

Summarization schedule

| Run | Trigger | Input | Output |
| --- | --- | --- | --- |
| Nightly compaction | Every night at the configured scheduleTime | Raw daily notes older than retainDays | Compacted note entries with reduced token count |
| Weekly rollup | Every Sunday during the nightly run | All daily notes from the past 7 days | A single weekly summary with key insights extracted |
| Monthly rollup | First of each month during the nightly run | All weekly summaries from the past month | A single monthly summary preserved indefinitely |

Raw daily notes older than retainDays are replaced by their compacted form. Weekly and monthly summaries are retained indefinitely and are included in the tiered memory search alongside other memory tiers.
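The cadence above amounts to a simple date check inside the nightly job. A minimal sketch, assuming a Python scheduler; the function name and pass labels are illustrative, not the actual implementation:

```python
from datetime import date

def rollups_due(today: date) -> list[str]:
    """Decide which consolidation passes run during tonight's job.

    Nightly compaction always runs; the weekly rollup is added on
    Sundays and the monthly rollup on the first of the month.
    (Hypothetical helper for illustration only.)
    """
    passes = ["nightly-compaction"]
    if today.weekday() == 6:   # Sunday
        passes.append("weekly-rollup")
    if today.day == 1:         # first of the month
        passes.append("monthly-rollup")
    return passes

# A Sunday that is also the 1st triggers all three passes.
print(rollups_due(date(2026, 3, 1)))
```

Note that a first-of-month Sunday runs the weekly and monthly rollups in the same pass, which is why monthly rollups consume weekly summaries rather than raw notes.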

How it works

Summarization runs in three stages for each consolidation pass:

  1. LLM consolidation — The configured model receives a batch of raw notes with a system prompt instructing it to merge redundant information, resolve minor contradictions by keeping the most recent value, and produce a condensed version that preserves all distinct facts.
  2. Importance scoring — Each candidate insight from the consolidation output is scored by the LLM on a 0–1 scale for long-term importance. Insights below 0.5 are omitted from the weekly and monthly rollups but retained in the compacted daily notes.
  3. Key insight extraction — The top-scoring insights are written to a structured keyInsights array on the summary record. These are indexed in Typesense for fast keyword retrieval and surfaced prominently during memory assembly.
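The three stages can be wired together roughly as follows. This is a minimal sketch where both LLM calls are stubbed out as callables; the names `consolidate`, `score_importance`, and the `Summary` shape are assumptions for illustration (only the 0.5 cutoff comes from the text above):

```python
from dataclasses import dataclass, field

IMPORTANCE_CUTOFF = 0.5  # insights below this stay in compacted daily notes only

@dataclass
class Summary:
    compacted_notes: list[str]
    key_insights: list[str] = field(default_factory=list)

def summarize(raw_notes: list[str], consolidate, score_importance) -> Summary:
    """Run one consolidation pass.

    `consolidate` and `score_importance` stand in for the two LLM calls:
    the first merges redundant notes into distinct insights, the second
    scores each insight on a 0-1 scale for long-term importance.
    """
    # Stage 1: LLM consolidation merges redundancy, keeps distinct facts.
    insights = consolidate(raw_notes)
    # Stage 2: score each candidate insight for long-term importance.
    scored = [(score_importance(i), i) for i in insights]
    # Stage 3: top-scoring insights are promoted to the structured
    # keyInsights array; everything remains in the compacted notes.
    key = [i for s, i in sorted(scored, reverse=True) if s >= IMPORTANCE_CUTOFF]
    return Summary(compacted_notes=insights, key_insights=key)
```

The key property to preserve in any real implementation is that low-scoring insights are dropped only from the rollups, never from the compacted daily notes themselves.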

Configuration

```yaml
settings:
  memory:
    summarization:
      enabled: true
      scheduleTime: "05:00"     # Nightly compaction time (UTC)
      retainDays: 30            # How many days of raw daily notes to keep before compaction
      model: gpt-4o-mini        # LLM used for consolidation (cheaper models work well here)
```

Using a smaller model like gpt-4o-mini for summarization is intentional — consolidation is a straightforward extraction task that does not require frontier reasoning. This keeps the nightly job inexpensive even at scale.

Manual trigger

Trigger a summarization pass on demand without waiting for the nightly schedule. Set force: true to run even if the notes have already been compacted during the current cycle.

```bash
POST /memory/summarize
{
  "userId": "user_abc123",
  "period": "weekly",
  "force": true
}

# Response:
# { "summaryId": "sum_abc123", "period": "weekly", "notesConsolidated": 7, "status": "complete" }
```
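From application code, the manual trigger is a single POST. A hedged sketch using only the Python standard library; the base URL and the split into a separate payload-builder are assumptions, not part of the documented API:

```python
import json
from urllib import request  # stdlib; the 'requests' library would work equally well

BASE_URL = "https://api.example.com"  # assumption: replace with your deployment's URL

VALID_PERIODS = {"daily", "weekly", "monthly"}

def build_summarize_payload(user_id: str, period: str, force: bool = False) -> dict:
    """Validate inputs and build the /memory/summarize request body."""
    if period not in VALID_PERIODS:
        raise ValueError(f"period must be one of {sorted(VALID_PERIODS)}")
    return {"userId": user_id, "period": period, "force": force}

def trigger_summarization(user_id: str, period: str, force: bool = False) -> dict:
    payload = build_summarize_payload(user_id, period, force)
    req = request.Request(
        f"{BASE_URL}/memory/summarize",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with request.urlopen(req) as resp:  # network call; add auth/retries as needed
        return json.load(resp)
```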

Viewing summaries

Retrieve past summaries filtered by period. Use period=daily, period=weekly, or period=monthly.

```bash
GET /memory/summaries?period=weekly

# Response:
# [
#   {
#     "summaryId": "sum_abc123",
#     "period": "weekly",
#     "startDate": "2026-02-17",
#     "endDate": "2026-02-23",
#     "keyInsights": [
#       "Decided to adopt pgvector for all vector search workloads.",
#       "Alex is leading the migration from Pinecone.",
#       "Sprint velocity improved by 20% after switching to async standups."
#     ],
#     "notesConsolidated": 34
#   }
# ]
```