Payload Audit

The payload audit system records full inference request/response payloads to the inference_payloads table. Every LLM call is captured, including the message array, tool schemas, response content, token usage, and timing, for debugging, compliance, and cost analysis. Secrets are redacted before storage, with one exception described under Secret redaction.

Fire-and-forget. Payload writes happen asynchronously after the inference call completes. They never block or slow down the agent response.
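
The fire-and-forget behavior can be sketched roughly as follows. This is a minimal illustration, not the actual implementation: `recordPayloadInBackground`, `PayloadWriter`, and the `onError` callback are hypothetical names introduced here.

```typescript
// Hypothetical writer type: any async function that persists one payload.
type PayloadWriter<T> = (record: T) => Promise<void>;

// Start the write without awaiting it. Failures are passed to onError and
// otherwise swallowed, so an audit-write error does not surface in the
// agent response path.
function recordPayloadInBackground<T>(
  record: T,
  writePayload: PayloadWriter<T>,
  onError: (err: unknown) => void = (err) => console.warn("payload write failed", err),
): void {
  // Promise.resolve().then(...) defers the write to a later microtask and
  // turns a synchronous throw inside writePayload into a handled rejection.
  void Promise.resolve()
    .then(() => writePayload(record))
    .catch(onError);
}
```

The caller returns immediately after `recordPayloadInBackground(...)`; the write itself runs later, which is why a slow or failing database cannot delay the agent response in this sketch.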

What is recorded

typescript
interface InferencePayloadRecord {
  uid: string
  agentId: string
  sessionId: string
  provider: string
  modelId: string
  requestMessages: unknown[]        // full message array sent to provider
  systemPromptHash?: string         // auto-computed SHA-256 (first 16 hex chars)
  toolSchemas?: unknown[]           // tool definitions included in the call
  responseContent?: string          // raw response text
  finishReason?: string             // stop, length, tool_calls, etc.
  promptTokens: number
  completionTokens: number
  totalTokens: number
  cachedTokens?: number
  durationMs?: number
  traceId?: string                  // links to agent_traces
  ttlDays?: number                  // override per-record TTL
}

System prompt hash

If the first message in the request has role: "system", its content is hashed with SHA-256 and the first 16 hex characters are stored as systemPromptHash. This lets you identify prompt changes over time without storing duplicate system prompts.
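
A minimal sketch of that hashing rule, assuming Node.js and a plain string `content` field. The `ChatMessage` type and the function name `computeSystemPromptHash` are illustrative, and real messages may carry structured content that would need to be serialized first.

```typescript
import { createHash } from "node:crypto";

// Assumed message shape for this sketch only.
interface ChatMessage {
  role: string;
  content: string;
}

// Returns the first 16 hex characters of the SHA-256 digest of the system
// prompt, or undefined when the first message is not a system message.
function computeSystemPromptHash(messages: ChatMessage[]): string | undefined {
  const first = messages[0];
  if (!first || first.role !== "system") return undefined;
  return createHash("sha256").update(first.content, "utf8").digest("hex").slice(0, 16);
}
```

Because the hash depends only on the prompt text, two calls with an identical system prompt yield the same systemPromptHash, and any edit to the prompt changes it.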

Secret redaction

text
// Before writing to the database, all string values in
// request messages and response content are scanned for secrets.
//
// The secret scanner detects:
// - High-entropy strings (potential API keys, tokens)
// - Known patterns (AWS keys, API tokens, etc.)
//
// Detected secrets are replaced with [REDACTED] in the stored payload.
// The original content is never persisted.

Redaction uses the same secret scanner that protects memory writes. If the scanner module is unavailable, payloads are still written, but without redaction, so any secrets they contain are stored as-is.
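
The scan-and-replace step might look roughly like the sketch below. Everything in it is an assumption made for illustration: the actual scanner's rule set is not documented here, so the single AWS access key ID pattern, the 24-character minimum token length, and the 4.0 bits-per-character entropy threshold are placeholder choices, and `redactDeep`, `redactString`, and `shannonEntropy` are hypothetical names.

```typescript
const REDACTED = "[REDACTED]";

// Illustrative pattern list only (assumption, not the real rule set).
const KNOWN_PATTERNS: RegExp[] = [
  /\bAKIA[0-9A-Z]{16}\b/g, // AWS access key ID format
];

// Shannon entropy of a string in bits per UTF-16 code unit.
function shannonEntropy(s: string): number {
  const counts: Record<string, number> = {};
  for (let i = 0; i < s.length; i++) {
    counts[s[i]] = (counts[s[i]] ?? 0) + 1;
  }
  let bits = 0;
  for (const ch of Object.keys(counts)) {
    const p = counts[ch] / s.length;
    bits -= p * (Math.log(p) / Math.LN2);
  }
  return bits;
}

// Redact known patterns first, then long tokens whose entropy is high.
function redactString(text: string): string {
  let out = text;
  for (const pattern of KNOWN_PATTERNS) out = out.replace(pattern, REDACTED);
  return out.replace(/[A-Za-z0-9+\/_=-]{24,}/g, (token) =>
    shannonEntropy(token) >= 4.0 ? REDACTED : token,
  );
}

// Recursively redact every string value in a JSON-like payload.
// Returns a new structure; the input is not mutated.
function redactDeep(value: unknown): unknown {
  if (typeof value === "string") return redactString(value);
  if (Array.isArray(value)) return value.map((item) => redactDeep(item));
  if (value !== null && typeof value === "object") {
    const result: Record<string, unknown> = {};
    for (const key of Object.keys(value)) {
      result[key] = redactDeep((value as Record<string, unknown>)[key]);
    }
    return result;
  }
  return value;
}
```

In this sketch, `redactDeep(requestMessages)` would run before the insert, and only its return value would be written, so the unredacted strings never reach the database.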

Retention and cleanup

text
// Cron job runs daily at 3:30 AM
// Deletes all rows where expires_at < NOW()
schedule: '30 3 * * *'

// Each record's TTL is set at write time:
// expires_at = NOW() + INTERVAL '1 day' * ttlDays
// where ttlDays defaults to INFERENCE_PAYLOAD_TTL_DAYS (30)

Each record has its own expires_at timestamp, computed at write time. The daily cron job at 3:30 AM deletes all expired rows. You can override the TTL per-record (useful for flagging important payloads for longer retention) or globally via the environment variable.

bash
# Default TTL for payload records (default: 30 days)
INFERENCE_PAYLOAD_TTL_DAYS=30
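
The TTL resolution and the cleanup rule can be sketched in TypeScript as follows. The function names are hypothetical, and the arithmetic uses fixed 24-hour days, which only approximates the SQL interval addition when the database session time zone observes daylight saving time.

```typescript
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Resolve the default TTL from the raw environment value, falling back to
// 30 days when it is unset or not a positive number.
function defaultTtlDays(envValue: string | undefined): number {
  const parsed = Number(envValue);
  return envValue !== undefined && isFinite(parsed) && parsed > 0 ? parsed : 30;
}

// expires_at = now + ttlDays days; a per-record ttlDays overrides the default.
function computeExpiresAt(now: Date, ttlDays?: number, envValue?: string): Date {
  const days = ttlDays ?? defaultTtlDays(envValue);
  return new Date(now.getTime() + days * MS_PER_DAY);
}

// The cleanup rule expressed in memory: keep only rows that have not expired,
// mirroring a delete of rows where expires_at < now.
function removeExpired<T extends { expiresAt: Date }>(rows: T[], now: Date): T[] {
  return rows.filter((row) => row.expiresAt.getTime() >= now.getTime());
}
```

A caller would typically pass `process.env.INFERENCE_PAYLOAD_TTL_DAYS` as `envValue`; it is a parameter here so the sketch stays free of environment access.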

Database schema

sql
-- Table: inference_payloads
CREATE TABLE inference_payloads (
  id            UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  uid           TEXT NOT NULL,
  agent_id      TEXT NOT NULL,
  session_id    UUID,
  provider      TEXT NOT NULL,
  model_id      TEXT NOT NULL,
  request_messages  JSONB NOT NULL,
  system_prompt_hash TEXT,
  tool_schemas      JSONB,
  response_content  TEXT,
  finish_reason     TEXT,
  prompt_tokens     INTEGER,
  completion_tokens INTEGER,
  total_tokens      INTEGER,
  cached_tokens     INTEGER,
  duration_ms       INTEGER,
  trace_id          UUID,
  expires_at        TIMESTAMPTZ NOT NULL,
  created_at        TIMESTAMPTZ DEFAULT NOW()
);

Use cases

| Use case | How |
| --- | --- |
| Debug agent behavior | Query payloads by session_id to see exactly what the model received and returned |
| Trace cost spikes | Aggregate total_tokens by agent_id and model_id to find expensive calls |
| Prompt regression | Compare system_prompt_hash over time to detect unintended prompt changes |
| Compliance audit | Link payloads to agent_traces via trace_id for end-to-end audit trails |
| Cache analysis | Query cached_tokens to measure prompt caching effectiveness per provider |
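
As an illustration of the cost and cache analyses listed above, the sketch below groups already-fetched rows by agent and model, in the same way a GROUP BY agent_id, model_id query would. The `UsageRow` and `UsageSummary` types and `summarizeUsage` are hypothetical, and the cached ratio assumes cachedTokens counts a subset of promptTokens, which may not hold for every provider.

```typescript
interface UsageRow {
  agentId: string;
  modelId: string;
  promptTokens: number;
  totalTokens: number;
  cachedTokens?: number;
}

interface UsageSummary {
  agentId: string;
  modelId: string;
  calls: number;
  totalTokens: number;
  cachedRatio: number; // cached tokens / prompt tokens; 0 when there are no prompt tokens
}

interface Group {
  agentId: string;
  modelId: string;
  calls: number;
  total: number;
  prompt: number;
  cached: number;
}

// Sum token usage per (agentId, modelId), most expensive group first.
function summarizeUsage(rows: UsageRow[]): UsageSummary[] {
  const byKey: Record<string, Group> = {};
  const groups: Group[] = [];
  for (const row of rows) {
    const key = JSON.stringify([row.agentId, row.modelId]);
    let group = byKey[key];
    if (!group) {
      group = { agentId: row.agentId, modelId: row.modelId, calls: 0, total: 0, prompt: 0, cached: 0 };
      byKey[key] = group;
      groups.push(group);
    }
    group.calls += 1;
    group.total += row.totalTokens;
    group.prompt += row.promptTokens;
    group.cached += row.cachedTokens ?? 0;
  }
  return groups
    .map((g) => ({
      agentId: g.agentId,
      modelId: g.modelId,
      calls: g.calls,
      totalTokens: g.total,
      cachedRatio: g.prompt > 0 ? g.cached / g.prompt : 0,
    }))
    .sort((a, b) => b.totalTokens - a.totalTokens);
}
```

A group with a high totalTokens and a low cachedRatio is a natural first place to look when investigating a cost spike.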

Related

  • Trace Viewer: per-turn distributed tracing linked via traceId
  • Prompt Caching: understand cache hit rates from cachedTokens data
  • Event System: inference.request events fired alongside payload writes
  • Scheduler: the 3:30 AM cleanup cron job