Skip to content

Feature: temporal validity on facts + recency weighting in query (living corpora return stale facts at full weight) #1650

Description

@Ns2384-star

Version: graphifyy 0.9.5. Context: living, CRM-like documentary corpus (people join/leave, RFPs open/close), 289 real queries logged over one month.

Problem

serve._query_graph_text ranks purely by term-overlap + IDF + confidence — no time factor at all. On a living corpus, stale facts weigh exactly as much as fresh ones: a query about a person returns their old assignment and their current one with equal standing, and nothing marks which edge is superseded.

reflect.py already has the half-life mechanics (30 days) but only applies it to Q&A reputation, not to graph facts.

Proposal

Three composable pieces:

  1. Optional temporal fields valid_from / valid_to / superseded_by on nodes/edges, populated by extraction when the source states them.
  2. --recency flag on query that weights results by captured_at / mtime of the source_file.
  3. Invalidation mechanism: when a new extraction states "X left Y", mark the existing employment edge as closed (don't delete it — history stays queryable).

Even just (2) would be a big win: today we have to re-check every time-sensitive answer against the raw source files.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Fields

    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions