Reflection
Reflection is an optional post-inference self-critique pass. After the agent generates a response, a separate inference call evaluates the response for accuracy, completeness, actionability, and clarity — flagging issues and optionally requesting a revision.
When to reflect
The `shouldReflect` heuristic determines whether a turn warrants self-evaluation.
```typescript
function shouldReflect(toolResults, responseLength): boolean {
  // Triggers if ANY of these conditions are true:
  return (
    toolResults.some(r => r.error) || // any tool errored
    responseLength > 2000 ||          // long response (may be rambling)
    toolResults.length >= 3           // complex multi-tool turn
  )
}
```

The heuristic targets turns most likely to contain errors: tool failures that may have produced inaccurate data, long responses that may have drifted, and complex multi-tool interactions where synthesis errors are more likely.
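As a usage sketch, the heuristic can be exercised directly. The `ToolResult` shape below is an assumption for illustration, not the framework's actual type:

```typescript
// Assumed minimal tool-result shape for this sketch.
interface ToolResult {
  name: string
  error?: string
}

function shouldReflect(toolResults: ToolResult[], responseLength: number): boolean {
  return (
    toolResults.some(r => r.error !== undefined) || // any tool errored
    responseLength > 2000 ||                        // long response (may be rambling)
    toolResults.length >= 3                         // complex multi-tool turn
  )
}

// A short, clean single-tool turn skips reflection; a tool error triggers it.
console.log(shouldReflect([{ name: 'search' }], 500))                   // false
console.log(shouldReflect([{ name: 'search', error: 'timeout' }], 500)) // true
```

A turn with three or more tool calls triggers reflection even when every call succeeded, since synthesis errors are the risk there rather than bad tool data.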
Result format
```typescript
interface ReflectionResult {
  confidence: number    // 0–1 score across all dimensions
  issues: string[]      // detected problems
  suggestions: string[] // improvement recommendations
  shouldRevise: boolean // true only for significant problems
}
```

`shouldRevise` is only set to `true` for significant problems: wrong facts, missing critical information, or potentially harmful advice. Minor style issues or verbose responses do not trigger revision.
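A hedged sketch of how a model's raw reflection output might be coerced into this shape. The parsing and clamping rules here are assumptions; the framework's actual implementation may differ:

```typescript
interface ReflectionResult {
  confidence: number
  issues: string[]
  suggestions: string[]
  shouldRevise: boolean
}

// Parse the reflection model's raw text into a ReflectionResult,
// clamping confidence to [0, 1] and defaulting missing fields.
function parseReflection(raw: string): ReflectionResult {
  try {
    const parsed = JSON.parse(raw)
    return {
      confidence: Math.min(1, Math.max(0, Number(parsed.confidence) || 0)),
      issues: Array.isArray(parsed.issues) ? parsed.issues : [],
      suggestions: Array.isArray(parsed.suggestions) ? parsed.suggestions : [],
      shouldRevise: parsed.shouldRevise === true
    }
  } catch {
    // Unparseable output: let the response through rather than block it.
    return { confidence: 0.5, issues: ['Reflection output unparseable'], suggestions: [], shouldRevise: false }
  }
}
```

Note that `shouldRevise` only becomes `true` on an explicit boolean `true`, keeping the permissive default when the model's output is malformed.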
Evaluation dimensions
The reflection pass evaluates four dimensions:

1. Accuracy — Are facts and claims correct?
2. Completeness — Does the response fully address the question?
3. Actionability — Can the user act on the advice given?
4. Clarity — Is the response well-structured and easy to follow?

Inference parameters:

```typescript
temperature: 0.1 // low creativity — we want consistent evaluation
maxTokens: 300   // brief assessment, not a rewrite
```

The reflection inference uses a very low temperature (0.1) to ensure consistent, conservative evaluations. The 300-token cap keeps assessments brief — this is a quality gate, not a rewrite step.
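One way the reflection request could be assembled from the dimensions and parameters above. The prompt wording, the JSON reply format, and the request shape are assumptions for illustration:

```typescript
const REFLECTION_DIMENSIONS = [
  'Accuracy: are facts and claims correct?',
  'Completeness: does the response fully address the question?',
  'Actionability: can the user act on the advice given?',
  'Clarity: is the response well-structured and easy to follow?'
]

// Build the evaluation prompt sent in the reflection inference call.
function buildReflectionPrompt(userMsg: string, response: string): string {
  return [
    'Evaluate the assistant response against these dimensions:',
    ...REFLECTION_DIMENSIONS.map((d, i) => `${i + 1}. ${d}`),
    '',
    `User message: ${userMsg}`,
    `Assistant response: ${response}`,
    'Reply as JSON: {"confidence": number, "issues": [], "suggestions": [], "shouldRevise": boolean}'
  ].join('\n')
}

// Assumed request shape, paired with the documented parameters.
const request = {
  prompt: buildReflectionPrompt('How do I revert a commit?', 'Use git revert <sha>.'),
  temperature: 0.1, // consistent, conservative evaluation
  maxTokens: 300    // brief assessment, not a rewrite
}
```

Asking for a structured JSON reply keeps the 300-token budget focused on the assessment itself rather than free-form commentary.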
Failure handling
Reflection errors are silently caught and never block the user's response. If the reflection inference fails, a safe default is returned.
```typescript
// Reflection errors never block the main response
try {
  return await reflectOnResponse(userMsg, response, client, sessionId)
} catch {
  return {
    confidence: 0.5,
    issues: ['Reflection failed'],
    suggestions: [],
    shouldRevise: false // safe default — let the response through
  }
}
```

How to use reflection
Reflection is available as an explicit call from the orchestrator, custom plugins, or the Debate Protocol. To add it to the standard agent loop, call reflectOnResponse after the loop completes and before sending the response to the user.
- If `shouldRevise` is `true`, feed the issues back into a second agent turn with revision instructions
- If `shouldRevise` is `false`, log the confidence score for observability and proceed
- The confidence score can be tracked in Agent Metrics for quality monitoring over time
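Putting the steps above together, a sketch of the post-loop wiring. The `AgentClient` interface and the revision prompt wording are assumptions, not the framework's actual API:

```typescript
interface ReflectionResult {
  confidence: number
  issues: string[]
  suggestions: string[]
  shouldRevise: boolean
}

// Minimal client interface assumed for this sketch.
interface AgentClient {
  complete(prompt: string): Promise<string>
  reflect(userMsg: string, response: string): Promise<ReflectionResult>
}

// After the agent loop produces `response`, reflect once and optionally revise.
async function finalizeTurn(client: AgentClient, userMsg: string, response: string): Promise<string> {
  let reflection: ReflectionResult
  try {
    reflection = await client.reflect(userMsg, response)
  } catch {
    // Reflection errors never block the user's response.
    return response
  }
  if (reflection.shouldRevise) {
    // Feed the detected issues back into a second agent turn.
    const revisionPrompt =
      `Revise your previous answer to fix these issues:\n- ${reflection.issues.join('\n- ')}\n\n` +
      `Original question: ${userMsg}\nPrevious answer: ${response}`
    return client.complete(revisionPrompt)
  }
  console.log(`reflection confidence: ${reflection.confidence}`) // for observability
  return response
}
```

A stub client makes the control flow easy to verify: when `shouldRevise` is `true` the revised completion is returned, otherwise the original response passes through untouched.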