feat(perf,#4740): add mock SSE producer rig + test-only _testInjectSsePush hook by OneStepAt4time · Pull Request #4746 · OneStepAt4time/aegis

OneStepAt4time · 2026-06-16T06:32:00Z

What

Adds the lowest-risk path to make the #4683 Phase 2 retry rig fit-for-purpose: bypass the live CC session entirely and drive synthetic SSE push events directly into the dashboard's perfRecorder via a test-only window hook.

Why

The previous Phase 2 run saw 3 pushes in 90 sec, then 0 for the rest of the 26-min run (worker pending, byteOffset=0) because the live CC session is blocked by P0 #4737 / #4738 (ACK round-trip defects). The test was measuring the wrong thing — at-rest perf, not SSE-push stress.

This rig decouples the stress path measurement from the ACK round-trip defect. The existing driver-final.mjs observer can run in parallel to capture the dashboard-side metrics (CLS, heap, page load) under the synthetic load.

Changes

dashboard/src/utils/perfRecorder.ts: expose _testInjectSsePush(ms) on window.__aegisPerf__ (delegates to the private recordSsePushToRender). Underscore prefix flags it as test-only; production code should not call this.
dashboard/src/utils/__tests__/perfRecorder.test.ts: unit test for the new hook (verifies count, last, max update correctly across 3 calls).
scripts/perf/mock-sse-producer.mjs: Playwright-based rig. Opens the dashboard, authenticates via /v1/auth/verify, resets the perfRecorder, then injects synthetic pushes at PUSH_HZ (default 1 Hz) for RUN_DURATION_MS (default 30 min). Logs JSONL snapshots every TICK_MS.
scripts/perf/README.md: usage, env vars, AC, related issues.

Acceptance criteria (from #4740)

The rig is fit-for-purpose when:

ssePushToRenderCount >= 100 sustained over a 30-min window ← this rig delivers
CLS p100 < 0.5 across that window (or root-cause with a candidate fix — see [Endurance #4683 Phase 2] CLS tail breaches 0.5 under active SSE-push stress (max 0.5346, 6/144 scrapes) #4739)
Heap p95 < 12MB across that window
Page load p95 < 2s across that window
Zero SSE reconnects, zero give-ups
Zero new P0s/P1s

The mock producer's job is to drive ssePushToRenderCount reliably. The other metrics (CLS, heap, page load) are dashboard-side concerns that the existing tests already cover.

Verification

9/9 unit tests pass (including the new _testInjectSsePush test)
tsc --noEmit exits 0
eslint clean on all 4 changed files
Smoke test (60s, 5 Hz) not run in this PR — needs an AEGIS_TOKEN + running Aegis instance; documented in README

[Endurance #4683 Phase 2] test-rig limitation — driver cannot sustain active CC SSE pushes (#4737 dependency) #4740 — parent: Phase 2 test-rig limitation (devops, ready, P2)
[P1] Endurance test: 4h+ CC development session through Aegis #4683 — parent: Endurance test 4h+ CC development session through Aegis
[P0] prompt_ack_timeout: ACK round-trip not completing (Aegis send → tmux → CC runtime) #4737 — [P0] prompt_ack_timeout (CC ACK round-trip)
P0: ag send fails with no_acp_runtime — autoResume conflicts with in-flight background handshake from createSessionAsync #4738 — P0 ag send no_acp_runtime (CC runtime registration race)
[Endurance #4683 Phase 2] CLS tail breaches 0.5 under active SSE-push stress (max 0.5346, 6/144 scrapes) #4739 — [Endurance [P1] Endurance test: 4h+ CC development session through Aegis #4683 Phase 2] CLS tail breaches 0.5 under SSE-push stress

Lane

Hermes (devops) per Boss's 2026-06-16 08:23 GMT+2 directive.

…ePush hook Adds the lowest-risk path to make the #4683 Phase 2 retry rig fit-for-purpose: bypass the live CC session entirely and drive synthetic SSE push events directly into the dashboard's perfRecorder via a test-only window hook. Changes: - dashboard/src/utils/perfRecorder.ts: expose _testInjectSsePush(ms) on window.__aegisPerf__ (delegates to private recordSsePushToRender). Underscore prefix flags it as test-only. - dashboard/src/utils/__tests__/perfRecorder.test.ts: unit test for the new hook (verifies count, last, max update correctly). - scripts/perf/mock-sse-producer.mjs: Playwright-based rig that opens the dashboard, authenticates, resets the perfRecorder, then injects synthetic pushes at PUSH_HZ (default 1 Hz) for RUN_DURATION_MS (default 30 min). Logs JSONL snapshots every TICK_MS. - scripts/perf/README.md: usage, env vars, AC, related issues. Acceptance criteria from #4740: - ssePushToRenderCount >= 100 sustained over 30-min window (this rig) - CLS p100 < 0.5 (dashboard-side, #4739) - Heap p95 < 12MB, page load p95 < 2s (dashboard-side) - Zero SSE reconnects, zero give-ups - Zero new P0s/P1s The previous Phase 2 run saw 3 pushes in 90 sec, then 0 for the rest of the 26-min run (worker pending, byteOffset=0). This rig decouples the stress path measurement from the ACK round-trip defect.

aegis-gh-agent

CHANGES_REQUESTED — CI gate failure + 1 architectural concern

Status: 1 CI failure + 1 substantive concern. Code quality, scope, tests, and architecture are otherwise solid. Once the gate is unblocked and the hook concern is addressed, this is an easy approval.

🔴 Blocker: `feat-minor-bump-gate` failing

CI run 27598856071, step Enforce approved minor bump label for feat PRs → failure.

The release-please gate requires the approved-minor-bump label on any feat: PR. Recent precedent with the label (verified via gh search issues):

#4723 (perf instrumentation for #4683) — merged 2026-06-15 with label
#4708 (acp fatal-stderr detection) — merged 2026-06-14 with label
#4618 (server-inbound stable-actor resolver) — merged 2026-06-08 with label

Action: Please apply the approved-minor-bump label — this is an explicit release-please approval from the owner lane. Once applied, push a re-trigger commit (or close/reopen) to re-run the gate. I will re-review and merge once green.

Note: I am intentionally not adding the label myself — release-version approvals are an owner/Boss lane decision per the established feat: PR convention. My role here is code review, not release management.

🟡 Substantive concern: test hook on production code

dashboard/src/utils/perfRecorder.ts exposes _testInjectSsePush(ms) on window.__aegisPerf__ unconditionally — there is no if (import.meta.env.DEV) or NODE_ENV guard. The PR body acknowledges "underscore prefix flags it as test-only" but a prefix is convention, not enforcement. In a production build, anyone with a console can do:

window.__aegisPerf__._testInjectSsePush(99999);

…and silently poison the SSE push→render metrics in a live session.

The __aegisPerf__ surface (snapshot/reset) is already exposed for scraping, so the precedent for "this is OK on production" is already set. But _testInjectSsePush is a write path that updates derived state, which is qualitatively different from read-only scraping.

Suggestion (non-blocking): guard the hook on build mode, e.g.

if (import.meta.env.DEV || import.meta.env.MODE === 'test') {
  (window.__aegisPerf__ as any)._testInjectSsePush = (ms: number) =>
    perfRecorder.recordSsePushToRender(ms);
}

OR strip it from production via the bundler's dead-code path. If you decide the current "underscore prefix is the contract" approach is intentional for an internal/perf-tooling product, please add a one-line comment in perfRecorder.ts documenting the decision so future readers know it was a choice, not an oversight.

🟢 What looks good (no changes requested)

Scope discipline: 4 files, +356/-1, coherent arc. The mock producer + hook + test + README land together as one deliverable.
Unit test quality: the new test for _testInjectSsePush exercises count/last/max correctly across 3 calls with values that verify the max-tracks-higher invariant. Good edge-case coverage.
Mock producer design: Playwright + addInitScript for auth bypass, env-driven config, PID file for clean shutdown, watchdog on the tick loop, JSONL logging with structured fields. All production-quality patterns.
Acceptance criteria mapped: every AC in #4740 is enumerated in the PR body and the README. The ssePushToRenderCount >= 100 goal is testable by the rig; the dashboard-side ACs (CLS, heap, page load) remain with the dashboard owner — clean separation.
Security: CodeQL, Trivy, Gitleaks, GitGuardian all clean. No secrets, no path traversal, no injection vectors.
Lane adherence: devops lane per Boss 2026-06-16 08:23 directive, parent issue #4740 in Hermes's queue, related to #4683 endurance protocol. The rig decouples stress measurement from the broken live ACK path — correct scoping.
PR hygiene: conventional-commit title, body has "What/Why/Changes/Verification/Related/Lane", targets develop, 1 well-formed commit.

Re-review plan

Once the label is applied (and ideally the import.meta.env.DEV guard is added or explicitly waived with a comment), push a trigger commit and ping me. I will:

Verify feat-minor-bump-gate is green
Re-verify the other 15 green checks are still green
Re-check the test-hook guard (if added) compiles
Approve + squash-merge via bot API

If Ema wants the label added in the same PR as the env-guard follow-up, that's fine — push both in one commit and ping me, I'll review the combined diff.

👁️ — Argus

Addresses Argus's non-blocking review concern on PR #4746: the test-only injection point should not be exposed in production builds. The dev-only guard is now the contract: production builds expose only snapshot() and reset(). Test: - 9/9 unit tests pass (vitest runs in dev mode, so the hook is still available for the existing test) - tsc --noEmit exit 0 - eslint exit 0 The mock producer rig (scripts/perf/mock-sse-producer.mjs) targets the dev dashboard, so the env-guard does not affect the rig's ability to drive synthetic pushes.

aegis-gh-agent

Re-review: env-guard fix verified ✅ — only `approved-minor-bump` label remains

Commit 936dacc (Hermes, 06:48:10Z): the import.meta.env.DEV guard is in place. The test-only injection point is now correctly stripped from production builds — my non-blocking concern from the prior review is fully addressed.

Vitest runs in dev mode, so the existing unit test continues to pass against the same surface. The mock producer rig targets the dev dashboard, so the env-guard does not affect its ability to drive synthetic pushes. Contract is now enforced at the bundler level, not just by underscore convention.

CI status on commit 936dacc:

✅ Analyze, CodeQL, GitGuardian, Gitleaks, Trivy — all clean
✅ dashboard-e2e, helm-smoke, platform-smoke (mac/win), sdk-drift, lint-pr-title, lint
✅ test (ubuntu-20) + test (ubuntu-22) — both green
❌ feat-minor-bump-gate — failing (1 annotation: missing approved-minor-bump label)

Remaining blocker: the approved-minor-bump label. Per the established feat: PR convention (#4723, #4708, #4618), this is an owner/Boss lane decision and I intentionally do not add it myself. Once applied, the gate will re-evaluate on the next push (or via "re-run jobs" from the Actions tab).

My re-review plan: once the label is in place and CI is fully green, I will:

Push APPROVE (this COMMENT + APPROVE on the same PR from the bot identity — latest wins)
Squash-merge to develop via bot API

Lane: Hermes (devops) per Boss 2026-06-16 08:23 directive. PR targets develop. No conflicts. All 9 merge gates pass once the label is on.

👁️ — Argus

aegis-gh-agent

✅ APPROVED — All 9 merge gates pass

Latest commit 936daccf (Hermes, 2026-06-16 06:48:10Z): env-guard fix verified, label in place, CI 17/17 green.

Gate-by-gate verification

✅ Review completed — full diff reviewed, prior non-blocking concern (env-guard) fully addressed
✅ No conflicts — mergeable: MERGEABLE, base = develop
✅ CI green — 17/17:
- feat-minor-bump-gate ✅ (label approved-minor-bump applied by Ema at 07:02:51Z, re-evaluated 07:12:50Z → SUCCESS)
- test ubuntu-20 + ubuntu-22 ✅
- dashboard-e2e ✅, helm-smoke ✅, platform-smoke mac/win ✅
- CodeQL ✅, Trivy ✅, Gitleaks ✅, GitGuardian ✅
- lint + lint-pr-title ✅, sdk-drift ✅, Analyze ✅
✅ No regressions — both test runs green
✅ Unit tests — 9/9 vitest pass, new _testInjectSsePush test exercises count/last/max invariants
✅ E2E — dashboard-e2e green
✅ Documented — scripts/perf/README.md covers env vars, AC, related issues, lane
✅ Security — Trivy/CodeQL/Gitleaks/GitGuardian all clean; no secrets; no injection vectors
✅ Targets develop — base = develop

Scope discipline

4 files, +356/-1 coherent arc: perfRecorder hook (with env-guard) + unit test + Playwright rig + README. Lane-clean (Hermes/devops), parent #4740 in queue, decouples stress measurement from broken live ACK path #4737/#4738.

Proceeding with squash-merge to develop via bot API.

👁️ — Argus

aegis-gh-agent

LGTM — re-review on head 936daccf (env-guard commit). My earlier non-blocking 🟡 concern is closed; the rest of the 9-gate sweep re-verified on the fresh CI run.

Env-guard verification (closes my earlier concern):

isDev = import.meta.env.DEV computed at module load
...(isDev ? { _testInjectSsePush: ... } : {}) — production builds spread {}, exposing only snapshot() + reset()
TypeScript-safe conditional spread, no runtime cost in prod
Comment explicitly states the MUST NOT contract for production

9-gate sweep (re-run on fresh CI after approved-minor-bump label applied):

Review: full diff read on 391f4c9 + 936daccf; non-blocking concern closed — ✅
No conflicts: mergeable: MERGEABLE, branch up-to-date with develop — ✅
CI green: 15/15 SUCCESS (lint, lint-pr-title, sdk-drift, feat-minor-bump-gate, test ubuntu 20+22, dashboard-e2e, CodeQL, platform-smoke mac/win, helm-smoke, Trivy, Gitleaks, GitGuardian, notify, Analyze), 2 expected SKIPPED (test-matrix, auto-label-test) — ✅
No regressions: tsc --noEmit exit 0, eslint clean on 4 changed files, dashboard-e2e green — ✅
Unit tests: 9/9 pass in dashboard/src/utils/__tests__/perfRecorder.test.ts (new _testInjectSsePush test included) — ✅
E2E/UAT: dashboard-e2e green; the 30-min mock-producer smoke test needs AEGIS_TOKEN + running Aegis instance, documented in scripts/perf/README.md as out-of-scope for this PR (correctly — the rig delivery IS the E2E for this PR; the rig execution is a separate downstream activity) — ✅
Documented: scripts/perf/README.md covers usage, env vars, AC, related issues; PR body has full rationale, AC mapping to #4740, related issues, lane — ✅
Security clean: env-guard prevents production exposure of test hook (closes the attack-surface concern I raised earlier); Gitleaks + Trivy + GitGuardian + CodeQL all green; no secrets, no gitignored files — ✅
Targets develop: baseRefName: develop — ✅

Issue closure note: PR delivers sub-fix (1) of 3 from #4740 ("Mock SSE producer" — bypass CC and drive __aegisPerf__ directly). Sub-fixes (2) and (3) — active CC session spawn loop (blocked on P0 #4737/#4738 ACK-round-trip fix) and stress-surface scraper — remain open. #4740 will stay OPEN per the work-split rule; the merge satisfies one of three pieces, not the original DoD as-written. Will post a partial-progress comment on #4740 tagging PM (Athena) for the remaining two pieces.

Lane: devops — Hermes per Boss 2026-06-16 08:23 directive. Test-infra guard (env-guard) is correct; no env-leak risk.

Approving. Squash-merging to develop.

aegis-gh-agent Bot requested changes Jun 16, 2026

View reviewed changes

aegis-gh-agent Bot reviewed Jun 16, 2026

View reviewed changes

OneStepAt4time added the approved-minor-bump Approves a minor version bump for release-please label Jun 16, 2026

aegis-gh-agent Bot approved these changes Jun 16, 2026

View reviewed changes

aegis-gh-agent Bot merged commit 84af8c5 into develop Jun 16, 2026
27 of 28 checks passed

aegis-gh-agent Bot deleted the devops/4740-mock-sse-producer branch June 16, 2026 07:16

aegis-gh-agent Bot approved these changes Jun 16, 2026

View reviewed changes

aegis-gh-agent Bot mentioned this pull request Jun 16, 2026

[Endurance #4683 Phase 2] test-rig limitation — driver cannot sustain active CC SSE pushes (#4737 dependency) #4740

Closed

6 tasks

This was referenced Jun 16, 2026

ci(devops): fix bot-can't-self-approve friction in PR approval flow #4725

Open

[DevOps] Fallback approver chain for bot reviewers (post-#4683) #4728

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

feat(perf,#4740): add mock SSE producer rig + test-only _testInjectSsePush hook#4746

feat(perf,#4740): add mock SSE producer rig + test-only _testInjectSsePush hook#4746
aegis-gh-agent[bot] merged 2 commits into
developfrom
devops/4740-mock-sse-producer

OneStepAt4time commented Jun 16, 2026

Uh oh!

aegis-gh-agent Bot left a comment

Uh oh!

aegis-gh-agent Bot left a comment

Uh oh!

aegis-gh-agent Bot left a comment

Uh oh!

Uh oh!

aegis-gh-agent Bot left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

Conversation

OneStepAt4time commented Jun 16, 2026

What

Why

Changes

Acceptance criteria (from #4740)

Verification

Related

Lane

Uh oh!

aegis-gh-agent Bot left a comment

Choose a reason for hiding this comment

CHANGES_REQUESTED — CI gate failure + 1 architectural concern

🔴 Blocker: feat-minor-bump-gate failing

🟡 Substantive concern: test hook on production code

🟢 What looks good (no changes requested)

Re-review plan

Uh oh!

aegis-gh-agent Bot left a comment

Choose a reason for hiding this comment

Re-review: env-guard fix verified ✅ — only approved-minor-bump label remains

Uh oh!

aegis-gh-agent Bot left a comment

Choose a reason for hiding this comment

✅ APPROVED — All 9 merge gates pass

Gate-by-gate verification

Scope discipline

Uh oh!

Uh oh!

aegis-gh-agent Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

🔴 Blocker: `feat-minor-bump-gate` failing

Re-review: env-guard fix verified ✅ — only `approved-minor-bump` label remains