Entity Confidence Scoring

When the auto-extraction pipeline extracts an entity whose new description conflicts with the description already stored in the knowledge graph, it lowers that entity's confidence by 0.05 (a delta of -0.05). The penalty is applied once per conflicting extraction, so frequently contradicted entities accumulate penalties and sink in retrieval ranking over time.

How Conflict Is Detected

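A conflict is flagged when the first 20 characters of the stored entity description do not appear anywhere in the newly extracted description. The comparison is case-insensitive. Both descriptions must be longer than 10 characters for the check to run; if either is missing or has 10 characters or fewer, no penalty is applied. A stored description shorter than 20 characters is compared in full. Because this is a literal substring match, a new description that rewords the opening of the stored one is flagged as a conflict even when the meaning is unchanged.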

Implementation

typescript

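// In auto-extract.ts: runs after every entity extraction pass
if (
  // Both descriptions must exist and be longer than 10 characters
  graphEntity.description &&
  entity.description &&
  graphEntity.description.length > 10 &&
  entity.description.length > 10 &&
  // Case-insensitive check: the first 20 characters of the stored
  // description must appear somewhere in the new description
  !entity.description
    .toLowerCase()
    .includes(graphEntity.description.slice(0, 20).toLowerCase())
) {
  // Conflict detected: lower the stored entity's confidence
  updateEntityConfidence(graphEntity.id, -0.05)
}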
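
To make the behavior of the check easier to see, here is a minimal sketch that pulls the same condition into a standalone function and runs it on a few inputs. The function name `hasDescriptionConflict`, the constant names, and the sample descriptions are illustrative assumptions; only the condition itself comes from the snippet above.

```typescript
// Hypothetical standalone version of the conflict check.
// The two constants mirror the values documented on this page.
const PREFIX_LENGTH = 20;
const MIN_DESCRIPTION_LENGTH = 10;

function hasDescriptionConflict(
  stored: string | undefined,
  extracted: string | undefined
): boolean {
  // Skip the check when either description is missing or too short.
  if (!stored || !extracted) return false;
  if (stored.length <= MIN_DESCRIPTION_LENGTH) return false;
  if (extracted.length <= MIN_DESCRIPTION_LENGTH) return false;

  // Case-insensitive substring match on the stored description's prefix.
  const prefix = stored.slice(0, PREFIX_LENGTH).toLowerCase();
  return !extracted.toLowerCase().includes(prefix);
}

// Made-up sample data for illustration only.
const storedDescription = "PostgreSQL database used for billing records";

// Same opening text with extra detail appended: not a conflict.
const extendedResult = hasDescriptionConflict(
  storedDescription,
  "PostgreSQL database used for billing records and invoices"
);

// Same meaning, different wording: flagged, because the prefix is absent.
const rewordedResult = hasDescriptionConflict(
  storedDescription,
  "Billing records live in a Postgres instance"
);

// Exactly 10 characters: too short, so the check is skipped.
const tooShortResult = hasDescriptionConflict(storedDescription, "Billing DB");

console.log({ extendedResult, rewordedResult, tooShortResult });
```

In this sketch only the reworded description is flagged. The appended description still contains the stored prefix, and the 10-character description falls at the length threshold and is skipped.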

Parameters

Parameter	Value	Description
Conflict check	First 20 chars	Prefix of stored description must appear in new description (case-insensitive)
Minimum length	> 10 chars	Descriptions of 10 characters or fewer skip the conflict check
Confidence delta	-0.05	Applied per conflicting extraction
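
The following sketch illustrates how the per-conflict delta accumulates. It is not the real graph store: `updateEntityConfidenceSketch` is a hypothetical in-memory stand-in with the same call shape as `updateEntityConfidence(id, delta)`, and the starting confidence of 1.0 is an assumption, since this page does not state the initial value.

```typescript
const CONFIDENCE_DELTA = -0.05; // documented per-conflict delta
const INITIAL_CONFIDENCE = 1.0; // assumption: the starting value is not documented here

// Hypothetical in-memory stand-in for the graph store.
const confidenceById = new Map<string, number>();

// Stand-in for updateEntityConfidence(id, delta): adds the delta to the
// stored value and returns the result.
function updateEntityConfidenceSketch(id: string, delta: number): number {
  const current = confidenceById.get(id) ?? INITIAL_CONFIDENCE;
  const next = current + delta; // no clamping, so the value can drop below zero
  confidenceById.set(id, next);
  return next;
}

// Four conflicting extractions for one entity.
for (let i = 0; i < 4; i++) {
  updateEntityConfidenceSketch("entity-a", CONFIDENCE_DELTA);
}

// Twenty-five conflicts push a heavily contested entity below zero.
for (let i = 0; i < 25; i++) {
  updateEntityConfidenceSketch("entity-b", CONFIDENCE_DELTA);
}

const confidenceA = confidenceById.get("entity-a") ?? INITIAL_CONFIDENCE;
const confidenceB = confidenceById.get("entity-b") ?? INITIAL_CONFIDENCE;

// Rounded for display, because repeated floating-point addition is inexact.
console.log(confidenceA.toFixed(2), confidenceB.toFixed(2));
```

Under the assumed starting value, four conflicts leave the entity at roughly 0.80 and twenty-five conflicts leave it at roughly -0.25.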

Relationship to Retrieval

Entity confidence is factored into the RRF (reciprocal rank fusion) score during tiered search, so entities with lower confidence rank lower in graph hints and knowledge-base results. Confidence has no floor: it can go negative for highly contested entities.
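
This page does not specify how confidence enters the fusion score. As one plausible illustration, the sketch below computes standard reciprocal rank fusion (each result list contributes 1 / (k + rank)) and then multiplies the fused score by the entity's confidence. The multiplicative weighting, the constant k = 60, the default confidence of 1.0, and all function and entity names are assumptions made for this example, not a description of the actual tiered search code.

```typescript
const RRF_K = 60; // commonly used RRF constant; assumed, not taken from this page

// Standard reciprocal rank fusion: sum 1 / (k + rank) over every result list.
function rrfScores(rankings: string[][], k: number = RRF_K): Map<string, number> {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      const rank = index + 1; // ranks are 1-based
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  return scores;
}

// Assumption: confidence scales the fused score. Entities without a
// recorded confidence are treated as 1.0 (also an assumption).
function rankWithConfidence(
  rankings: string[][],
  confidence: Map<string, number>
): string[] {
  return Array.from(rrfScores(rankings), ([id, score]) => ({
    id,
    score: score * (confidence.get(id) ?? 1.0),
  }))
    .sort((a, b) => b.score - a.score)
    .map((entry) => entry.id);
}

// Hypothetical result lists, best match first.
const sampleRankings = [
  ["entity-a", "entity-b", "entity-c"],
  ["entity-a", "entity-b"],
];

const unpenalizedOrder = rankWithConfidence(sampleRankings, new Map());
const penalizedOrder = rankWithConfidence(
  sampleRankings,
  new Map([["entity-a", 0.8]])
);

console.log(unpenalizedOrder); // entity-a, entity-b, entity-c
console.log(penalizedOrder); // entity-b, entity-a, entity-c
```

With this weighting, a confidence of 0.8 is enough to move `entity-a` from first to second place even though it tops both result lists. A negative confidence would produce a negative score and place the entity last.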

The confidence penalty is a signal, not a deletion. Conflicted entities remain in the graph and can still be retrieved; they simply rank lower. To reset an entity's confidence, update it directly via the graph API.