Add semantic-drift signals and a deterministic drift-evaluator for confidence decay #89

Open
danielbdyer wants to merge 1 commit into main from codex/extend-run-receipts-for-semantic-consistency-signals

Conversation

@danielbdyer
Owner

Motivation

  • Surface runtime semantic-consistency signals (label/role mismatch, accessible-name semantics change, unexpected transition effects, assertion-target ambiguity) so the system can distinguish semantic drift from locator degradation.
  • Use deterministic decay rules to reduce derived overlay confidence when run evidence indicates real drift, and expose operator controls to tune decay behavior per artifact class.

Description

  • Extended StepExecutionReceipt to include semanticConsistency signals, which are populated during runtime step execution in lib/runtime/scenario.ts.
  • Added lib/application/drift-evaluator.ts, which ingests run records and produces deterministic decay evaluations (total decay, floor, suppression counts, applied-signal rationale).
  • Integrated decay into the confidence overlay projection in lib/application/confidence.ts, so the overlay score is reduced by the computed decay and the decay lineage is recorded in ArtifactConfidenceRecord.lineage.decay.
  • Added trust-policy config/schema surfaces (confidenceDecay) to tune per-artifact decay rates, minimumFloor, and suppressionWindowRuns, and updated the default .tesseract/policy/trust-policy.yaml.
  • Treated semantic drift as a first-class operator concern by adding a semantic-drift inbox kind, a semantic-drift hotspot kind, and scorecard fields (semanticDriftHotspotCount, runtimeFailureClasses) to surface actionable groupings and distinct failure classes.
  • Updated schemas, types, validators, and a few test fixtures to reflect the new fields and kinds.
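
For reviewers who want a concrete picture of the decay rules, here is a minimal sketch of how a deterministic evaluator with a floor and a suppression window could work. It is not the code in lib/application/drift-evaluator.ts. The signal names, the RunEvidence and DecayPolicy shapes, the ratePerSignal field, and the suppression semantics (a repeat of the same signal within suppressionWindowRuns of its last applied occurrence is counted as suppressed) are all assumptions made for illustration; only minimumFloor and suppressionWindowRuns are names taken from this PR.

```typescript
// Hypothetical signal names modeled on the four signals listed under Motivation.
type SemanticSignal =
  | "label-role-mismatch"
  | "accessible-name-change"
  | "unexpected-transition-effect"
  | "assertion-target-ambiguity";

// Assumed shape of one run's evidence for a single artifact.
interface RunEvidence {
  runIndex: number; // increasing run counter
  signals: SemanticSignal[]; // signals observed in that run
}

// Assumed policy entry; ratePerSignal is illustrative, the other two
// field names come from the PR description.
interface DecayPolicy {
  ratePerSignal: number;
  minimumFloor: number;
  suppressionWindowRuns: number;
}

interface DecayEvaluation {
  totalDecay: number;
  floor: number;
  suppressedCount: number;
  appliedSignals: { runIndex: number; signal: SemanticSignal }[];
}

function evaluateDecay(
  runs: readonly RunEvidence[],
  policy: DecayPolicy
): DecayEvaluation {
  // Sort a copy by run index so the result does not depend on input order.
  const ordered = runs.slice().sort((a, b) => a.runIndex - b.runIndex);
  const lastApplied = new Map<SemanticSignal, number>();
  const appliedSignals: DecayEvaluation["appliedSignals"] = [];
  let suppressedCount = 0;

  for (const run of ordered) {
    // Deduplicate and sort signals within a run for stable output.
    const signals = Array.from(new Set(run.signals)).sort();
    for (const signal of signals) {
      const previous = lastApplied.get(signal);
      const insideWindow =
        previous !== undefined &&
        run.runIndex - previous < policy.suppressionWindowRuns;
      if (insideWindow) {
        suppressedCount += 1; // repeat inside the window: no extra decay
        continue;
      }
      lastApplied.set(signal, run.runIndex);
      appliedSignals.push({ runIndex: run.runIndex, signal });
    }
  }

  return {
    totalDecay: appliedSignals.length * policy.ratePerSignal,
    floor: policy.minimumFloor,
    suppressedCount,
    appliedSignals,
  };
}

// Reduce a score by the computed decay without pushing it below the floor.
// A score that already sits below the floor is left unchanged, not raised.
function applyDecay(baseScore: number, evaluation: DecayEvaluation): number {
  const lowerBound = Math.min(baseScore, evaluation.floor);
  return Math.max(lowerBound, baseScore - evaluation.totalDecay);
}
```

Because the runs are sorted and the signals deduplicated before any decay is counted, the same set of run records yields the same evaluation regardless of the order in which they are ingested. The appliedSignals list plays the role of the applied-signal rationale that could be recorded as lineage.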

Testing

  • Ran the targeted test suite: npm test -- tests/domain-validation-lanes.spec.ts tests/hotspots.spec.ts tests/vscode-integration.laws.spec.ts. All tests passed (24 passed).
  • Ran npm run types and npm run build. npm run build failed due to pre-existing, unrelated TypeScript issues in lib/application/replay-interpretation.ts and lib/infrastructure/mcp/dashboard-mcp-server.ts that were not introduced by this change (noted during validation).

Codex Task
