fix: implement drift event emission and make evidence optional in record_decision by acailic · Pull Request #206 · acailic/agent_debugger

acailic · 2026-06-05T00:58:00Z

Summary

evidence now optional in RecordingMixin.record_decision (moved after chosen_action, default None) — all existing callers pass it explicitly as a keyword argument, so no callsites break
New EventType.DRIFT and DriftDetectedEvent added to the events module for representing replay divergence
TraceContext.record_decision override compares each replayed decision against the DriftDetector baseline and emits a DriftDetectedEvent when chosen_action (or confidence) differs from the original
Skipped test fixed (test_drift_detected_during_replay_emits_event): corrected mock URL key (events→traces), added a post-checkpoint timestamp to the mock event so the timestamp filter passes, and checked ctx._events rather than a disconnected capture_event callback

Closes #205

Test plan

python3 -m pytest tests/test_replay_depth_l3.py -q — previously skipped test now passes
python3 -m pytest -q — 2754 passed, 27 skipped (was 2753 passed, 28 skipped), zero regressions

…ord_decision - Add EventType.DRIFT and DriftDetectedEvent to the events module - Override record_decision in TraceContext to call DriftDetector.compare() and emit a DriftDetectedEvent when replay diverges from the original execution - Make the `evidence` parameter optional (default None) in RecordingMixin.record_decision, since all existing callers already pass it explicitly as a keyword arg - Fix skipped test: correct mock URL key (events→traces), add post-checkpoint timestamp to mock event so the timestamp filter passes, and check ctx._events instead of a disconnected capture_event callback Closes #205 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Copilot

Pull request overview

This PR adds first-class drift reporting during checkpoint restore/replay by introducing a new DRIFT event type and emitting a typed drift event when a replayed decision diverges from the recorded baseline. It also unblocks drift-related replay tests by making evidence optional in record_decision and updating the affected test harness.

Changes:

Make RecordingMixin.record_decision() accept optional evidence (defaulting to [] internally) to avoid breaking replay callsites that omit it.
Introduce EventType.DRIFT and DriftDetectedEvent, register the event type, and emit drift events from TraceContext.record_decision() during replay.
Fix the previously-skipped drift replay test by aligning mocked restore fetch shapes and asserting drift presence in the context’s stored events.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.

Show a summary per file

File	Description
`tests/test_replay_depth_l3.py`	Updates mocked restore trace fetch payload and drift assertion for replay drift emission test.
`agent_debugger_sdk/core/recorders.py`	Makes `evidence` optional in `record_decision` and normalizes `None` to `[]`.
`agent_debugger_sdk/core/events/drift.py`	Adds new typed `DriftDetectedEvent` dataclass.
`agent_debugger_sdk/core/events/base.py`	Adds `EventType.DRIFT`.
`agent_debugger_sdk/core/events/__init__.py`	Exports and registers `DriftDetectedEvent` in the event type registry.
`agent_debugger_sdk/core/context/trace_context.py`	Adds `record_decision` override to emit drift events during replay and initializes drift decision index.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

acailic · 2026-06-05T02:20:24Z

+        event_id = await super().record_decision(
+            reasoning=reasoning,
+            confidence=confidence,
+            chosen_action=chosen_action,
+            evidence=evidence,
+            **kwargs,
+        )
+        drift_detector = getattr(self, "_drift_detector", None)
+        if drift_detector is not None:
+            index = getattr(self, "_drift_decision_index", 0)
+            self._drift_decision_index = index + 1
+            new_event_dict = {
+                "event_type": "decision",
+                "data": {"chosen_action": chosen_action, "confidence": confidence},
+            }
+            drift = drift_detector.compare(new_event_dict, index)
+            if drift is not None:


Fixed in 6a095be. Two changes: (1) the drift detector baseline is now seeded with only decision events (filtered from post_events), so _drift_decision_index aligns correctly with original_events positions; (2) confidence is clamped to [0.0, 1.0] before building new_event_dict to match what RecordingMixin.record_decision() persists, eliminating false confidence drift for out-of-range inputs.

acailic · 2026-06-05T02:20:26Z

-                    # Drift event should have been emitted
-                    drift_events = [e for e in emitted_events if getattr(e, "event_type", None) == "drift"]
+                    # Drift event should have been emitted into the context's event store
+                    drift_events = [e for e in ctx._events if getattr(e, "event_type", None) == "drift"]


Fixed in 6a095be. Switched to await ctx.get_events() instead of accessing ctx._events directly, which returns a safe copy under the async lock.

…mping - Filter drift detector baseline to decision events only so _drift_decision_index aligns with the correct original event at each position (was indexing into the full post-checkpoint stream including tool calls) - Clamp confidence to [0.0, 1.0] before drift comparison to match what RecordingMixin.record_decision() persists, preventing false confidence drift for out-of-range inputs - Use ctx.get_events() in test instead of private ctx._events to avoid bypassing the async lock Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Copilot AI review requested due to automatic review settings June 5, 2026 00:58

Copilot started reviewing on behalf of acailic June 5, 2026 00:58 View session

Copilot AI reviewed Jun 5, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: implement drift event emission and make evidence optional in record_decision#206

fix: implement drift event emission and make evidence optional in record_decision#206
acailic wants to merge 2 commits into
mainfrom
fix/evidence-optional-record-decision

acailic commented Jun 5, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

acailic Jun 5, 2026

Uh oh!

acailic Jun 5, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

acailic commented Jun 5, 2026

Summary

Test plan

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

acailic Jun 5, 2026

Choose a reason for hiding this comment

Uh oh!

acailic Jun 5, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants