fix(triage): live UI surfacing fixes and proposal-dispatch hardening #72

Merged
sourcehawk merged 17 commits into main from fix/live-surfacing-and-dispatch-hardening
Jun 3, 2026
Conversation

@sourcehawk
Owner

Description

A run of operator-reported bugs and reliability fixes for triage, found while exercising real investigations. The headline is the proposal flow: sub-agent-drafted wiki/playbook proposals now actually surface in the session UI, and the dispatch that drafts them is hardened so a failed draft can no longer be reported as a success. Alongside that it closes several "only updates after a refresh" gaps (codefix/auto-mode SSE, repo lists, wiki sidebar), stops the Teleport MCP from luring kubeconfig deployments into a login loop, and cleans up two latent DOM/test issues.

Changes

  • Proposal surfacing: sub-agent-drafted wiki/playbook proposals render their inline card (transcript hoist) and refresh the sidebar (a global wiki_proposal_created event); a validation-failure draft (proposal_id:"") no longer renders an empty card; a brand-new playbook proposal opens from the sidebar instead of 404-ing on getPlaybook.
  • Proposal-dispatch hardening: an explicit decline_proposal terminal + a prompt that teaches the draft→validate→submit procedure; runtime verification that a terminal tool actually fired (resume-force + a submitted/declined/none outcome) so a missing proposal can't read as success; the master is told the sub-agent is context-blind so it sends a complete brief.
  • Live updates: register dropped SSE kinds (codefix_pr_state, auto_mode_state, wiki_proposal_created) so those surfaces update live; repo lists refresh on add/remove across the sidebar and /repos.
  • Teleport: gate the teleport MCP and the agent's k8s-auth guidance on auth.kind so a kubeconfig deployment isn't steered into a tsh login loop; thread the profile proxy/connector into the subprocess.
  • Cleanups: the investigation-row no longer nests interactive controls inside a <button> (invalid HTML / hydration error); the WatchForm test flushes its mount fetch inside act.

Challenges

The hard part was the dispatch "confabulation window": the proposal flows are free-form sub-agents and the walker is a suggester (ADR-0004), so nothing can force the terminal proposal call. The fix is outcome verification at the dispatch boundary — detect whether a terminal tool actually fired and resume-force if not, surfacing a structured outcome — rather than trying to make the walker an enforcer.

Testing

  • Go: full -race suite + golangci-lint clean; new unit tests per backend change (teleport gating/threading, wiki global event, stream kinds, subagent terminal detection, dispatch verify/resume + outcome, the decline tool, the dispatch prompt).
  • Frontend: tsc clean, 225 vitest tests pass (hoist, repos-events, stream-kinds contract, notifier), and no act()/hydration warnings.
  • E2E (Go harness + Playwright): nested sub-agent proposals surface (backend invariants + rendered cards); a validation-failure draft renders no card; new-playbook and new-wiki-entry proposals open from the sidebar.
  • Worth poking at: the dispatch verify/resume loop and the proposal-card render guard.

sourcehawk and others added 16 commits June 3, 2026 21:59
…tor to the subprocess

On a kubeconfig deployment the teleport MCP was registered unconditionally, so the agent saw list_clusters/login in its catalog and looped on a Teleport login it cannot complete. Gate the MCP (and its allowlist glob) on auth.kind == "teleport", and make the k8s-auth guidance (session prompt + the no-active-context error + switch_context schema) deployment-aware so a kubeconfig deployment is steered at list_contexts/switch_context instead of teleport.

For an actual teleport deployment, the subprocess was built with an empty config, so the re-auth advice degraded to `tsh login --proxy= --auth=okta`. Thread the profile's auth.teleport proxy/connector through mcpconfig env -> serve flags -> teleport.New, and source the auth-required message from the provider's ReauthAdvice() so it always matches the real login command.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The single EventSource registers one listener per kind, and a kind absent from that list is delivered to no listener and silently dropped — its live updates only appear after a /transcript refetch. codefix_pr_state and auto_mode_state were both missing, so codefix PR-state transitions and auto-mode phase changes only updated on reload.

Extract the list to an exported STREAM_EVENT_KINDS (single source of truth, grouped by Go envKind*/globalKind*), add codefix_pr_state, auto_mode_state, and wiki_proposal_created. stream.test.ts pins that every kind with a live consumer is registered.
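
A minimal sketch of the single-source-of-truth pattern this commit describes, not the actual contents of frontend/lib/stream.tsx. Only the name STREAM_EVENT_KINDS and the three kind strings come from the commit message; `KindListenerTarget`, `registerStreamKinds`, and `unregisteredKinds` are hypothetical names, and the real list holds more kinds than shown.

```typescript
// Sketch only: the real STREAM_EVENT_KINDS lists more kinds than the three
// named in this commit message.
const STREAM_EVENT_KINDS = [
  "codefix_pr_state",
  "auto_mode_state",
  "wiki_proposal_created",
] as const;

type StreamEventKind = (typeof STREAM_EVENT_KINDS)[number];

// Hypothetical minimal view of an EventSource-like object.
interface KindListenerTarget {
  addEventListener(kind: string, handler: (data: string) => void): void;
}

// Register exactly one listener per kind. A kind missing from the list gets
// no listener, so its events are silently dropped.
function registerStreamKinds(
  target: KindListenerTarget,
  onEvent: (kind: StreamEventKind, data: string) => void,
): void {
  for (const kind of STREAM_EVENT_KINDS) {
    target.addEventListener(kind, (data) => onEvent(kind, data));
  }
}

// Contract check in the spirit of stream.test.ts: returns every consumed kind
// that has no registered listener.
function unregisteredKinds(consumed: readonly string[]): string[] {
  const registered: readonly string[] = STREAM_EVENT_KINDS;
  return consumed.filter((kind) => registered.indexOf(kind) === -1);
}
```

Deriving the `StreamEventKind` type from the array means a consumer cannot subscribe to a kind the list does not contain without a compile error, which is one way to keep the list and its consumers from drifting apart.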

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…l event

A proposal drafted through the guided walk_playbook flow runs in a sub-agent, so its propose_wiki_draft result arrives nested under the dispatch (parentToolId set). applyEvent routed nested events into the parent's children, and the inline WikiProposalCard only renders from a top-level item — so the card never showed. There was also no creation-time signal, so the sidebar didn't refresh either.

Frontend: applyEvent now hoists proposal-draft tool calls (wiki AND playbook — same nesting bug) out of sub-agent nesting so the inline card renders. Backend: handleToolEvent fires a global wiki_proposal_created event on a propose_wiki_draft end (nesting-independent), and WikiProposalNotifier bridges it to c1:wiki-proposals-changed so the sidebar pending list refetches live.
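
The hoist can be sketched roughly as follows. This is an illustration under assumptions, not the code in frontend/lib/events.ts: the `TranscriptItem` shape and the `applyToolItem` name are hypothetical, nesting is modelled one level deep, and treating an item with an unknown parent as top-level is an assumption of the sketch. The two tool names are the ones named in the commit message.

```typescript
// Hypothetical, simplified transcript item; the real types differ.
interface TranscriptItem {
  toolId: string;
  toolName: string;
  parentToolId?: string;
  children: TranscriptItem[];
}

// Tool names taken from the commit message.
const PROPOSAL_DRAFT_TOOLS: readonly string[] = [
  "propose_wiki_draft",
  "playbook_proposal_draft",
];

// Place an incoming tool item in the transcript. Nested items normally go
// under their parent, but proposal drafts are hoisted to the top level so the
// inline proposal card (which only renders from top-level items) can show.
function applyToolItem(
  transcript: readonly TranscriptItem[],
  item: TranscriptItem,
): TranscriptItem[] {
  const isProposalDraft = PROPOSAL_DRAFT_TOOLS.indexOf(item.toolName) !== -1;
  const parentId = item.parentToolId;
  const parentKnown =
    parentId !== undefined && transcript.some((t) => t.toolId === parentId);
  // Assumption: an item whose parent is unknown stays visible at top level.
  if (isProposalDraft || !parentKnown) {
    return [...transcript, item];
  }
  return transcript.map((t) =>
    t.toolId === parentId ? { ...t, children: [...t.children, item] } : t,
  );
}
```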

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The /repos page and the sidebar's repo list/manage modal each render an independent /api/repos list and only refetched on their own mutation, so adding a repo from the sidebar didn't show on the /repos page until reload (and vice-versa).

Add a tested repos-events helper (triagent:repos-changed, the documented DOM-event prefix); every repo-list surface subscribes and refetches, every mutation notifies. dispatchEvent is synchronous, so the acting surface refreshes through its own listener too — one mechanism, symmetric.
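
A rough sketch of what such a helper might look like. The event name is the one given above; the function names `onReposChanged` and `notifyReposChanged` are hypothetical, and the target is passed in explicitly here (in the browser it would presumably be `window`) so the sketch runs outside a DOM.

```typescript
// Event name from the commit message.
const REPOS_CHANGED = "triagent:repos-changed";

// Subscribe a repo-list surface. Returns an unsubscribe function suitable for
// an effect cleanup.
function onReposChanged(target: EventTarget, refetch: () => void): () => void {
  const handler = (): void => refetch();
  target.addEventListener(REPOS_CHANGED, handler);
  return () => target.removeEventListener(REPOS_CHANGED, handler);
}

// Notify every subscribed surface after a mutation. dispatchEvent invokes
// listeners synchronously, so the caller's own listener has already run by
// the time this function returns.
function notifyReposChanged(target: EventTarget): void {
  target.dispatchEvent(new Event(REPOS_CHANGED));
}
```

Because the acting surface refreshes through the same listener as every other surface, there is no separate "refetch after my own mutation" code path to keep in sync.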

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…+ prompt

The wiki_proposal/playbook_proposal flows are dispatch-mode: a free-form sub-agent runs the playbook as a prompt, and nothing forces the terminal *_proposal_draft call. When it diverges (writes a draft to a file, returns a prose summary) the master reads the summary as a successful submission — the 'confabulation window' — and no proposal is ever produced.

Add an explicit terminal: a decline_proposal tool (the verifiable 'below the bar, no proposal' signal, in both sub-agents' allowlists) and a mandatory '## Finishing' section in the dispatch prompt that names the flow's submit tool and forbids ending with a prose summary or file write. Covers both wiki and playbook. Bump the proposal-dispatch timeout 10 -> 15 min.

Lays the groundwork for the verify half (subagent terminal-tool detection + bounded resume-force), still to come.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Extends the claude-stub so a proposal action can carry an explicit toolId + parentToolId, modelling a proposal drafted INSIDE a walk_playbook sub-agent dispatch — the shape that regressed.

Adds a nested-proposals stub script and a backend-invariant test driving it end to end through the real launcher: the nested codefix proposal still persists and lists on /api/codefix-proposals, and the nested wiki proposal still fans a wiki_proposal_created global event (sidebar-refresh path) — both nesting-independent. The inline-card half (wiki + playbook hoist) is pinned next in the browser layer.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
A Playwright spec drives the nested-proposals investigation and asserts exactly two inline proposal cards — wiki and playbook — render in the session view despite being drafted inside a walk_playbook sub-agent dispatch, proving the transcript hoist surfaces them. (The nested codefix card is intentionally not hoisted; codefix surfaces on the repos panel, pinned in the backend test.)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The verify half of the confabulation fix: detect whether a dispatch sub-agent actually called a terminal tool, resume-and-force when it didn't, and surface a structured submitted/declined/none outcome so a missing proposal can't read as success. Companion to the explicit-terminal + prompt increment (e55ef9f).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The verify half of the confabulation fix (spec: docs/superpowers/specs/2026-06-03-proposal-dispatch-verification-design.md).

subagent: Options.RequiredTerminalTools + Result.TerminalToolsCalled. relaySubEvents correlates tool_use ids with their tool_result blocks and reports which required terminal tools the sub-agent called with a non-error result (an errored draft/decline does not count).
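
The correlation step can be illustrated as below. The real implementation is Go (pkg/mcp/subagent/subagent.go per the file summary); this is only a TypeScript sketch of the idea, and the `SubAgentBlock` shape and its field names are hypothetical simplifications of the sub-agent stream.

```typescript
// Hypothetical, simplified stream blocks.
type SubAgentBlock =
  | { type: "tool_use"; id: string; name: string }
  | { type: "tool_result"; toolUseId: string; isError: boolean };

// Report which required terminal tools were called AND returned a non-error
// result. An errored draft or decline does not count.
function terminalToolsCalled(
  blocks: readonly SubAgentBlock[],
  required: readonly string[],
): string[] {
  // tool_use id -> tool name, tracked for required tools only.
  const pending: { [id: string]: string } = Object.create(null);
  const called: string[] = [];
  for (const block of blocks) {
    if (block.type === "tool_use") {
      if (required.indexOf(block.name) !== -1) pending[block.id] = block.name;
      continue;
    }
    const name: string | undefined = pending[block.toolUseId];
    if (name !== undefined && !block.isError && called.indexOf(name) === -1) {
      called.push(name);
    }
  }
  return called;
}
```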

strategies: runDispatch passes the flow's terminal set (submit + decline), and when neither fired and the run didn't time out, resumes the same conversation with a forcing follow-up up to maxForceDispatchRetries times. It returns a classified ProposalOutcome (submitted | declined | none) on DispatchedResult; a 'none' outcome also prefixes a NO PROPOSAL WAS SUBMITTED line onto the summary so a missing proposal can't read as success. A timed-out run is surfaced, not retried.
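
A sketch of the verify/resume loop described above, in TypeScript for illustration only (the real code is Go in pkg/mcp/strategies/dispatch.go). The `DispatchRun` shape, the function names, the synchronous `runOnce` callback, and the rule that a submit takes precedence over a decline are all assumptions of the sketch.

```typescript
type ProposalOutcome = "submitted" | "declined" | "none";

// Hypothetical result of one sub-agent run (or resume).
interface DispatchRun {
  terminalToolsCalled: string[];
  timedOut: boolean;
}

// Assumption: if both terminals fired, the submit wins.
function classifyOutcome(
  called: readonly string[],
  submitTool: string,
  declineTool: string,
): ProposalOutcome {
  if (called.indexOf(submitTool) !== -1) return "submitted";
  if (called.indexOf(declineTool) !== -1) return "declined";
  return "none";
}

// Run once, then resume with a forcing follow-up while no terminal tool fired,
// up to maxForceRetries times. A timed-out run is surfaced, not retried.
function dispatchWithVerification(
  runOnce: (forceAttempt: number) => DispatchRun, // 0 = initial run
  submitTool: string,
  declineTool: string,
  maxForceRetries: number,
): { outcome: ProposalOutcome; timedOut: boolean; forced: number } {
  let run = runOnce(0);
  let outcome = classifyOutcome(run.terminalToolsCalled, submitTool, declineTool);
  let forced = 0;
  while (outcome === "none" && !run.timedOut && forced < maxForceRetries) {
    forced += 1;
    run = runOnce(forced);
    outcome = classifyOutcome(run.terminalToolsCalled, submitTool, declineTool);
  }
  return { outcome, timedOut: run.timedOut, forced };
}

// A "none" outcome is made impossible to misread as success.
function summaryFor(outcome: ProposalOutcome, summary: string): string {
  return outcome === "none" ? "NO PROPOSAL WAS SUBMITTED\n" + summary : summary;
}
```

Bounding the loop keeps a stubborn sub-agent from burning the dispatch budget, while the structured outcome gives the master something to branch on other than the free-form summary.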

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…result

A playbook_proposal_draft / propose_wiki_draft that fails validation returns proposal_id:"" (plus validation_errors). The card guard treated any string proposal_id as valid, so an empty id rendered an empty approve/decline card — a sub-agent that submitted invalid YAML five times before one validated showed six cards, five empty. Require a non-empty proposal_id; validation failures fall through to the raw tool card (visible as activity, not an empty card).
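
The tightened guard amounts to something like the following sketch. The `proposal_id` and `validation_errors` field names come from the commit message; the `DraftResult` type and the `hasRenderableProposal` name are hypothetical.

```typescript
// Hypothetical shape of a draft tool result.
interface DraftResult {
  proposal_id?: unknown;
  validation_errors?: string[];
}

// Render the approve/decline card only for a non-empty string id. Anything
// else (including the empty id a validation failure returns) falls through
// to the raw tool card.
function hasRenderableProposal(result: DraftResult): boolean {
  return typeof result.proposal_id === "string" && result.proposal_id.length > 0;
}
```

Applied to the session described above (five validation failures followed by one valid draft), a guard of this shape yields one card instead of six.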

Pins it in the nested-proposals E2E: the stub now also makes a validation-failed playbook draft, and the browser spec asserts the playbook card count stays exactly one.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The finishing instruction told the sub-agent WHICH tool to call but not HOW to produce a valid draft. A real session called playbook_proposal_draft with invalid YAML five times before one validated. Rewrite the '## Finishing' section as a numbered procedure (draft -> validate -> submit, or decline), and for the playbook flow add a validate-before-submit step naming validate_playbook — the submit tool rejects invalid input, so an unvalidated draft is wasted effort. ValidateTool is empty for the wiki flow (its submit tool is the only validator).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…lind

The dispatched wiki/playbook sub-agent runs in a separate session with no access to the investigation — its only context is the master's walk_playbook notes, which were thin (a real session passed ~328 chars), so it drafted near-blind. The notes never said the sub-agent can't see the investigation.

Rewrite the walk_playbook notes field description and the capture_offer wiki/playbook/all handoff terminals to make this explicit: the sub-agent has NO access to this investigation, so the brief must be complete and self-contained — symptom, findings, root cause, resolution, and the actual content that should end up in the artifact. Keeps the token-saving isolation; just makes the hand-off sufficient.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Clicking a pending proposal for a NEW playbook (one never promoted to a live playbook) deep-linked the editor to ?playbook=<id>&proposal=<pid>&tab=proposal, but PlaybookEditor unconditionally called getPlaybook(<id>) -> 404 -> "playbook not found". On a 404 with a deep-linked proposal it now falls back to seeding the editor from the proposal's draft (like __new), so the proposal opens and its approve/decline card renders.
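
The fallback decision can be sketched as a small pure function. This is an illustration only: the real PlaybookEditor fetches asynchronously, and the `EditorSeed` type, the `resolveEditorSeed` name, and the use of `null` to stand for "getPlaybook returned 404" or "no deep-linked proposal" are assumptions of the sketch.

```typescript
interface EditorSeed {
  source: "playbook" | "proposal-draft";
  content: string;
}

// `playbook` is null when getPlaybook answered 404; `proposalDraft` is null
// when the URL carries no deep-linked proposal.
function resolveEditorSeed(
  playbook: string | null,
  proposalDraft: string | null,
): EditorSeed | null {
  if (playbook !== null) return { source: "playbook", content: playbook };
  // New-playbook proposal: no live playbook yet, so seed from the draft.
  if (proposalDraft !== null) return { source: "proposal-draft", content: proposalDraft };
  return null; // genuinely missing: show "playbook not found"
}
```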

E2E (both, as requested): a new-playbook proposal is opened from the sidenav and its Approve action renders; the wiki twin navigates a new-entry proposal deep-link and confirms the proposed content is viewable (the wiki backend already returns an is_stub entry instead of 404, so it never had the bug — this protects that path).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…n-row button

The investigation row was a <button> containing the rename/delete buttons and the rename <input> — invalid HTML (a button may not contain interactive descendants), which React flags as a hydration error and leaves the nested controls with undefined activation semantics. Make the row a div role=button with tabIndex + a focus-guarded Enter/Space handler so keyboard activation is preserved and the nested controls are legal.
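
One plausible reading of the "focus-guarded" handler is sketched below: activate only when the key event originated on the row itself, so Enter or Space typed into the nested rename input or pressed on a nested button does not also open the row. The `RowKeyEvent` shape and the `shouldActivateRow` name are hypothetical, and the target/currentTarget comparison is an assumption about how the guard works.

```typescript
// Hypothetical minimal view of a keyboard event.
interface RowKeyEvent {
  key: string;
  target: unknown;
  currentTarget: unknown;
}

// Activate the row only for Enter or Space, and only when the row itself
// (not a nested control) is the event's origin.
function shouldActivateRow(ev: RowKeyEvent): boolean {
  if (ev.target !== ev.currentTarget) return false; // focus guard
  return ev.key === "Enter" || ev.key === " ";
}
```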

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
WatchForm's mount effect fetches connection status and calls setState with the result; the test never awaited that async update, so it landed outside act() and triggered a warning. Stub api.getConnections and flush the effect inside act at the end of each test.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings June 3, 2026 23:16
It was effectively a shipped plan (scratch), not a durable spec — the design now lives in the code and the PR. Per the repo convention, design that has shipped doesn't linger under specs/.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Contributor

Copilot AI left a comment

Pull request overview

This PR improves triage reliability and “live-ness” across the launcher by (1) ensuring sub-agent drafted wiki/playbook proposals surface correctly in the UI, (2) hardening dispatch-mode proposal flows so missing terminal tool calls can’t be misreported as success, and (3) gating/parameterizing Teleport behavior to avoid kubeconfig deployments getting steered into Teleport auth loops.

Changes:

  • Adds proposal-dispatch terminal verification (submit/decline detection + bounded resume/force) and surfaces a structured outcome (submitted|declined|none) to prevent “confabulated success.”
  • Fixes live UI surfacing: registers missing SSE kinds, hoists nested proposal tool calls to top-level transcript items, and publishes a global wiki_proposal_created event to refresh the sidebar.
  • Gates Teleport MCP + guidance on auth.kind, threads proxy/connector into the Teleport subprocess/provider, and updates k8s context guidance/messages accordingly.

Reviewed changes

Copilot reviewed 56 out of 63 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| system/capture_offer.yaml | Strengthens operator guidance to provide context-blind sub-agents complete briefs. |
| pkg/mcp/teleport/tools_login.go | Uses provider-sourced reauth advice for unauthenticated login attempts. |
| pkg/mcp/teleport/tools_clusters.go | Uses provider-sourced reauth advice; removes static auth-required message. |
| pkg/mcp/teleport/tools_clusters_test.go | Adds coverage ensuring reauth advice is sourced/configured correctly. |
| pkg/mcp/teleport/server.go | Threads Teleport proxy/connector and enforces provider supports ReauthAdvice(). |
| pkg/mcp/subagent/subagent.go | Tracks “required terminal tools” invoked successfully during sub-agent runs. |
| pkg/mcp/subagent/subagent_test.go | Tests correlation of required terminal tool_use/tool_result pairs. |
| pkg/mcp/strategies/tools_proposal.go | Adds decline_proposal explicit terminal tool. |
| pkg/mcp/strategies/tools_proposal_test.go | Tests decline_proposal validation and acknowledgement behavior. |
| pkg/mcp/strategies/specs.go | Registers decline_proposal in tool specs. |
| pkg/mcp/strategies/server.go | Adds dispatched proposal outcome to API response and prefixes “none” summary. |
| pkg/mcp/strategies/dispatch.go | Adds proposal tool/validator selection, terminal verification, resume/force loop. |
| pkg/mcp/strategies/dispatch_test.go | Tests dispatch forcing/resume behavior and proposal outcome classification. |
| pkg/mcp/strategies/dispatch_prompt.go | Teaches draft→validate→submit flow; mandates tool-terminal completion. |
| pkg/mcp/strategies/dispatch_prompt_test.go | Pins prompt contract for terminal + validation instructions. |
| pkg/mcp/k8s/tools_resources_test.go | Updates tests to prefer k8s-native guidance over Teleport-first guidance. |
| pkg/mcp/k8s/tools_contexts.go | Clarifies switch_context schema guidance; Teleport step conditional in prose. |
| pkg/mcp/k8s/server.go | Updates no-active-context message to steer kubeconfig vs Teleport appropriately. |
| internal/sessions/session.go | Adds auth.kind-aware k8s auth guidance; gates Teleport allowlisting. |
| internal/sessions/session_test.go | Adds coverage for Teleport gating and split guidance behavior. |
| internal/server/stream.go | Extends stream envelope to carry wikiProposalCreated payload. |
| internal/server/manager.go | Adds global wiki_proposal_created kind + publisher wiring. |
| internal/server/handlers.go | Publishes global wiki-proposal-created event on propose_wiki_draft end-event. |
| internal/server/handlers_wiki.go | Defines tool-name constant + extracts proposal_id from tool result JSON. |
| internal/server/handlers_tool_event_test.go | Tests global event publication for nested wiki proposal draft tool events. |
| internal/server/global_events.go | Adds global kind + payload type for wiki proposal creation fan-out. |
| internal/preflight/mcpconfig.go | Gates Teleport MCP registration on auth.kind; threads proxy/connector env. |
| internal/preflight/mcpconfig_test.go | Tests Teleport gating + env threading for proxy/connector. |
| frontend/lib/stream.tsx | Centralizes exhaustive SSE kind registration; adds missing kinds. |
| frontend/lib/stream.test.ts | Pins SSE kinds required for live consumers + enforces no duplicates. |
| frontend/lib/repos-events.ts | Adds cross-surface DOM event for linked repo list refresh. |
| frontend/lib/repos-events.test.ts | Tests repo-refresh event semantics and naming. |
| frontend/lib/events.ts | Hoists nested proposal-draft tool calls/results out of dispatch nesting. |
| frontend/lib/events.test.ts | Tests proposal hoisting behavior for nested propose_wiki_draft/playbook_proposal_draft. |
| frontend/lib/api.ts | Extends StreamEnvelope typing for wikiProposalCreated global payload. |
| frontend/components/WikiProposalNotifier.tsx | Bridges global SSE wiki_proposal_created → existing DOM refresh event. |
| frontend/components/WikiProposalNotifier.test.tsx | Tests notifier dispatches DOM event only on wiki_proposal_created. |
| frontend/components/WatchForm.test.tsx | Wraps mount-time async state updates in act() to avoid warnings. |
| frontend/components/Sidebar.tsx | Fixes invalid nested-interactive HTML by switching row container to role=button div. |
| frontend/components/SessionView.tsx | Prevents empty-card render on validation-failure drafts (proposal_id=""). |
| frontend/components/PlaybookEditor.tsx | Falls back to proposal draft when deep-linked playbook id 404s (new playbook proposals). |
| frontend/components/LinkedReposPanel.tsx | Refetches linked repos on global repo-change event; notifies on add/remove. |
| frontend/app/(main)/repos/client.tsx | Refetches repos list on cross-surface repo-change event; notifies on remove. |
| frontend/app/(main)/layout.tsx | Wires WikiProposalNotifier into the main provider stack. |
| e2e/proposal_view_test.go | Adds E2E tests for opening new wiki entry / new playbook proposals from sidebar. |
| e2e/proposal_surfacing_test.go | Adds E2E coverage for nested sub-agent proposal surfacing (backend + browser). |
| e2e/fixtures/stub-scripts/nested-proposals/main.jsonl | Adds a stub script modeling nested proposal tool events under a dispatch. |
| e2e/fixtures/playbooks/with-new-playbook-proposal/...yaml | Adds fixture proposal for “brand-new playbook” opening path. |
| e2e/cmd/claude-stub/script.go | Allows explicit toolId + parentToolId for proposal actions. |
| e2e/cmd/claude-stub/proposal.go | Threads parentToolId through telemetry tool-events for nested proposals. |
| e2e/cmd/claude-stub/proposal_test.go | Tests nested proposal tool-events preserve parentToolId/toolId. |
| e2e/browser/specs/wiki-proposal-view.spec.ts | Playwright spec for deep-linked new wiki proposal opening. |
| e2e/browser/specs/playbook-proposal-view.spec.ts | Playwright spec for new playbook proposal opening from sidenav. |
| e2e/browser/specs/nested-proposals.spec.ts | Playwright spec asserting nested proposal cards hoist & invalid drafts don’t render cards. |
| docs/superpowers/specs/2026-06-03-proposal-dispatch-verification-design.md | Design doc for dispatch terminal verification behavior and rationale. |
| cmd/triagent-mcp/serve.go | Adds Teleport proxy/connector flags/env plumbing; passes through to Teleport server. |
| cmd/triagent-mcp/serve_test.go | Tests env/flag precedence for Teleport proxy/connector resolution. |

@sourcehawk sourcehawk merged commit 87b605f into main Jun 3, 2026
5 checks passed
@sourcehawk sourcehawk deleted the fix/live-surfacing-and-dispatch-hardening branch June 3, 2026 23:22
2 participants