fix(subagents): honor approval-pause in every tool-watchdog mode #1473

Merged
Aaronontheweb merged 4 commits into dev from fix/watchdog-approval-suspend
Jun 24, 2026

@Aaronontheweb

Collaborator

What & why

Step 1 of #1472. A human-approval block sets ToolActivityUpdate.SuspendsInactivityWatchdog to pause the per-tool-call liveness clock. The tool-liveness change gated that suspend behind ResetMode != WallClock, so in WallClock mode (every opaque tool) the approval-pause was ignored and the wall-clock budget kept ticking through a human approval — killing a healthy tool mid-approval whenever the human took longer than the budget.

This violated the intended contract: time spent waiting on an approval (or on any explicit human-block) must never count against the watchdog budget, in any mode.

The fix

StreamingToolWatchdog.cs — honor SuspendsInactivityWatchdog in all modes. The change is surgical: ordinary per-item resets stay mode-gated in ApplyItemReset (WallClock still can't be kept alive by its own streamed output); only the explicit human-block signal is now universal.

- if (budget.ResetMode != ToolWatchdogResetMode.WallClock && activity.SuspendsInactivityWatchdog)
+ if (activity.SuspendsInactivityWatchdog)
      Volatile.Write(ref budgetTicks, TimeSpan.Zero.Ticks);
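
To make the split between the universal suspend and the mode-gated reset concrete, here is a minimal, self-contained sketch. It is not the real StreamingToolWatchdog: the class WatchdogSketch, its fields, and the IsExpired method are hypothetical, the clock is passed in explicitly for determinism, and the exact meanings given to Flat and FirstItemOnly are assumptions (the PR only states that WallClock is never reset by streamed items). Treating a zero budget as "suspended" is inferred from the Volatile.Write in the diff above, and resuming after approval is not modeled.

```csharp
using System;
using System.Threading;

// Enum member names come from the PR; everything else here is a hypothetical stand-in.
public enum ToolWatchdogResetMode { Flat, FirstItemOnly, WallClock }

public sealed record ToolActivityUpdate(bool SuspendsInactivityWatchdog);

public sealed class WatchdogSketch
{
    private readonly ToolWatchdogResetMode _mode;
    private long _budgetTicks;       // assumption: 0 means "suspended, never fires"
    private long _windowStartTicks;  // when the current budget window began
    private bool _sawFirstItem;

    public WatchdogSketch(ToolWatchdogResetMode mode, TimeSpan budget, TimeSpan now)
    {
        _mode = mode;
        _budgetTicks = budget.Ticks;
        _windowStartTicks = now.Ticks;
    }

    public void OnActivity(ToolActivityUpdate activity, TimeSpan now)
    {
        // The fix: the explicit human-block signal is honored in every mode.
        if (activity.SuspendsInactivityWatchdog)
        {
            Volatile.Write(ref _budgetTicks, TimeSpan.Zero.Ticks);
            return;
        }

        ApplyItemReset(now);
    }

    // Ordinary per-item resets stay mode-gated.
    private void ApplyItemReset(TimeSpan now)
    {
        bool restartWindow = _mode switch
        {
            ToolWatchdogResetMode.Flat => true,                    // assumed: every item resets
            ToolWatchdogResetMode.FirstItemOnly => !_sawFirstItem, // assumed: only the first item resets
            _ => false,                                            // WallClock: output never resets
        };
        _sawFirstItem = true;
        if (restartWindow)
        {
            Volatile.Write(ref _windowStartTicks, now.Ticks);
        }
    }

    public bool IsExpired(TimeSpan now)
    {
        long budget = Volatile.Read(ref _budgetTicks);
        return budget > 0 && now.Ticks - Volatile.Read(ref _windowStartTicks) >= budget;
    }
}
```

In this model, a WallClock watchdog with a 30-second budget that receives a suspending update at the 10-second mark is still not expired five minutes later. With the old mode check in front of the suspend branch, the same sequence would expire at 30 seconds, which is the failure described above.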

Test (the part that lets it regress)

The pre-existing suspend test only exercised Flat mode — which is exactly why the WallClock regression slipped through. Replaced with a mode-matrix [Theory] across Flat / FirstItemOnly / WallClock. Verified as a real guard:

  • old guard restored → wallClock FAILS, flat/firstItemOnly pass
  • fix applied → all three pass
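
The shape of such a mode matrix can be sketched as follows, assuming xUnit (the [Theory] attribute suggests it). This is not the test from the PR: SuspendGuard and both of its methods are hypothetical reductions of the guard condition to a pure function, so that the old and new behavior can be compared side by side.

```csharp
using Xunit;

// Enum member names come from the PR; SuspendGuard is a hypothetical reduction of the guard.
public enum ToolWatchdogResetMode { Flat, FirstItemOnly, WallClock }

public static class SuspendGuard
{
    // Guard before the fix: WallClock silently dropped the approval-pause.
    public static bool OldShouldSuspend(ToolWatchdogResetMode mode, bool suspends) =>
        mode != ToolWatchdogResetMode.WallClock && suspends;

    // Guard after the fix: the explicit human-block signal applies in every mode.
    public static bool NewShouldSuspend(ToolWatchdogResetMode mode, bool suspends) =>
        suspends;
}

public class SuspendGuardTests
{
    // One case per mode, so a mode-specific regression cannot hide behind Flat.
    [Theory]
    [InlineData(ToolWatchdogResetMode.Flat)]
    [InlineData(ToolWatchdogResetMode.FirstItemOnly)]
    [InlineData(ToolWatchdogResetMode.WallClock)]
    public void Approval_pause_is_honored_in_every_mode(ToolWatchdogResetMode mode)
    {
        Assert.True(SuspendGuard.NewShouldSuspend(mode, suspends: true));
    }

    [Theory]
    [InlineData(ToolWatchdogResetMode.Flat)]
    [InlineData(ToolWatchdogResetMode.FirstItemOnly)]
    [InlineData(ToolWatchdogResetMode.WallClock)]
    public void Ordinary_activity_never_suspends(ToolWatchdogResetMode mode)
    {
        Assert.False(SuspendGuard.NewShouldSuspend(mode, suspends: false));
    }

    // Documents why a Flat-only test could not catch the regression.
    [Fact]
    public void Old_guard_drops_the_pause_only_in_WallClock()
    {
        Assert.True(SuspendGuard.OldShouldSuspend(ToolWatchdogResetMode.Flat, true));
        Assert.True(SuspendGuard.OldShouldSuspend(ToolWatchdogResetMode.FirstItemOnly, true));
        Assert.False(SuspendGuard.OldShouldSuspend(ToolWatchdogResetMode.WallClock, true));
    }
}
```

Parameterizing over the mode rather than writing one test per mode keeps the assertion identical across cases, so any future mode-specific divergence shows up as a single failing row.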

Scope

Targeted fix only — deliberately small and shippable independent of the larger consolidation. The structural rework (collapse to one actor-owned watchdog, approval-as-state, kill the silent ?? Opaque, etc.) is tracked as the migration steps in #1472.

Relates to #1467. Part of #1472.

A human-approval block sets ToolActivityUpdate.SuspendsInactivityWatchdog to
pause the per-tool-call liveness clock. The tool-liveness change gated that
suspend behind ResetMode != WallClock, so in WallClock mode (every opaque tool)
the approval-pause was ignored and the wall-clock budget kept ticking through a
human approval — killing a healthy tool mid-approval whenever the human took
longer than the budget.

Honor SuspendsInactivityWatchdog in all modes. Ordinary per-item resets stay
mode-gated in ApplyItemReset (WallClock still cannot be kept alive by its own
output); only the explicit human-block signal is now universal.

The pre-existing suspend test only exercised Flat mode, which is why the
WallClock regression slipped through. Replaces it with a mode-matrix Theory
across Flat/FirstItemOnly/WallClock; the WallClock case fails on the old guard
and passes with the fix.

Step 1 of #1472. Relates to #1467.

@Aaronontheweb Aaronontheweb left a comment

Collaborator Author

LGTM

@Aaronontheweb added the subagents (spawn_agent, SubAgentActor, definition loader, discovery context layer, and related features) and reliability (retries, resilience, graceful degradation) labels on Jun 23, 2026
@Aaronontheweb marked this pull request as ready for review on June 23, 2026 23:53
@Aaronontheweb enabled auto-merge (squash) on June 23, 2026 23:53