Entity Confidence Scoring
When the auto-extraction pipeline encounters an entity whose new description conflicts with what is already stored in the knowledge graph, it lowers that entity's confidence by 0.05 (a delta of -0.05). The penalty is applied on every conflicting extraction, so frequently contradicted entities gradually sink in retrieval ranking.
How Conflict Is Detected
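A conflict is flagged when the first 20 characters of the existing entity description are not found anywhere in the newly extracted description. The comparison is case-insensitive, and if the existing description is shorter than 20 characters, the whole description is used as the prefix. Both descriptions must be longer than 10 characters for the check to run; otherwise no conflict is recorded.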
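To make the rule concrete, here is a minimal sketch of the check as a standalone predicate. The function name `hasDescriptionConflict` and the constants `PREFIX_LENGTH` and `MIN_LENGTH` are hypothetical names chosen for illustration; they are not taken from the pipeline's source.

```typescript
// Hypothetical constants mirroring the documented thresholds.
const PREFIX_LENGTH = 20; // leading characters of the stored description to search for
const MIN_LENGTH = 10;    // both descriptions must be strictly longer than this

// Hypothetical helper: returns true when the documented rule would flag a conflict.
function hasDescriptionConflict(
  stored: string | null | undefined,
  extracted: string | null | undefined
): boolean {
  // Missing or too-short descriptions skip the check entirely.
  if (!stored || !extracted) return false;
  if (stored.length <= MIN_LENGTH || extracted.length <= MIN_LENGTH) return false;

  // Case-insensitive search for the stored prefix anywhere in the new text.
  const prefix = stored.slice(0, PREFIX_LENGTH).toLowerCase();
  return !extracted.toLowerCase().includes(prefix);
}

const stored = "PostgreSQL is a relational database";

// The stored prefix appears inside the new description: no conflict.
console.log(hasDescriptionConflict(stored, "Note: PostgreSQL is a relational database system")); // false

// Same meaning, different wording: the prefix is absent, so a conflict is flagged.
console.log(hasDescriptionConflict(stored, "An open-source SQL database engine")); // true

// Stored description is exactly 10 characters long, so the check is skipped.
console.log(hasDescriptionConflict("A database", "Something completely different")); // false
```

The second call shows that the check is purely textual: a rephrased but compatible description is still treated as a conflict, because only the literal prefix is compared.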
Implementation
// In auto-extract.ts: runs after every entity extraction pass
if (
  graphEntity.description &&
  entity.description &&
  graphEntity.description.length > 10 &&
  entity.description.length > 10 &&
  !entity.description
    .toLowerCase()
    .includes(graphEntity.description.slice(0, 20).toLowerCase())
) {
  updateEntityConfidence(graphEntity.id, -0.05)
}

Parameters
| Parameter | Value | Description |
|---|---|---|
| Conflict check | First 20 chars | Prefix of stored description must appear in new description (case-insensitive); otherwise a conflict is flagged |
| Minimum length | 10 chars | Both descriptions must be longer than this; otherwise the conflict check is skipped |
| Confidence delta | -0.05 | Applied per conflicting extraction |
Relationship to Retrieval
Entity confidence is factored into the RRF fusion score during tiered search. Entities with lower confidence surface lower in graph hints and knowledge-base results. Confidence has no floor, so it can go negative for highly contested entities.
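The arithmetic behind the missing floor can be sketched as follows. This assumes the delta is simply added to the stored value on each conflict and that an entity starts at a confidence of 1.0; the starting value is not stated here, and `confidenceAfterConflicts` is a hypothetical helper used only for illustration.

```typescript
const CONFIDENCE_DELTA = -0.05; // documented per-conflict delta

// Hypothetical helper: confidence after a number of conflicting extractions,
// assuming each conflict adds CONFIDENCE_DELTA and nothing clamps the result.
function confidenceAfterConflicts(initial: number, conflicts: number): number {
  return initial + CONFIDENCE_DELTA * conflicts;
}

// Assumed starting confidence of 1.0 (not confirmed by the documentation).
const afterFour = confidenceAfterConflicts(1.0, 4);         // about 0.8
const afterTwentyFive = confidenceAfterConflicts(1.0, 25);  // about -0.25

console.log(afterFour.toFixed(2), afterTwentyFive.toFixed(2));
```

Under these assumptions an entity reaches zero after 20 conflicts and turns negative on the 21st, which is why a heavily contested entity can end up with a negative score.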
Confidence decay is a signal, not a deletion. Conflicted entities remain in the graph and can still be retrieved; they simply rank lower. To reset an entity's confidence, update it directly via the graph API.