Agent-layer e2e tests should run inside a hardened kernel and assert via logs (not import agents into vitest under mock-endoify) #961

Description

@grypez

Summary

The agent-layer "e2e" tests in @ocap/kernel-test-local (sample-agent.e2e.test.ts, chat-agent.e2e.test.ts) are not actually end-to-end. They import the agent factories (makeJsonAgent / makeReplAgent / makeChatAgent) and the built-in capability modules directly into the vitest process and run them under the mock-endoify shim, that is, in an un-hardened realm where harden is a no-op. They should instead drive an agent running inside a properly hardened kernel process and assert on captured logs, mirroring the pattern already used by @ocap/kernel-test.

Why this matters (how it surfaced)

While reviewing the "capabilities as pattern-guarded exos" stack, yarn workspace @ocap/kernel-test-local test:e2e:local revealed that sample-agent.e2e.test.ts (the JSON + REPL agent suite, 9 cases) crashes at import:

Error: Check failed   (passStyleOf / assertPassable)
 ❯ describedMethod  packages/kernel-utils/src/described.ts
 ❯ makeEnd          packages/kernel-agents/src/capabilities/end.ts

Root cause: the built-in capabilities are now pattern-guarded discoverable exos, so they build @endo/patterns guards eagerly at import. Those builders (.returns() / .optional()) call assertPassable, which requires frozen patterns. mock-endoify's harden is a no-op that does not freeze, so construction throws. The code is correct under real lockdown: makeEnd() constructs cleanly under real endoify-node (verified). The test is simply running it in the wrong (un-hardened) environment.
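A minimal sketch of the failure mode, using hypothetical stand-ins rather than the real @endo/patterns or mock-endoify code: `assertFrozen` plays the role of the passability check, `makeGuard` plays the role of an eagerly constructed guard, and the two `harden` variants model the no-op shim versus a freezing implementation (shallow here for brevity).

```typescript
// Hypothetical stand-ins only: none of these are the real @endo or @ocap APIs.
type Harden = <T>(value: T) => T;

// Models the no-op harden described above: returns its argument untouched.
const noopHarden: Harden = (value) => value;

// Models a freezing harden (shallow here; a real harden freezes transitively).
const freezingHarden: Harden = (value) => {
  Object.freeze(value);
  return value;
};

// Stand-in for a passability check that insists on frozen input.
function assertFrozen(value: object): void {
  if (!Object.isFrozen(value)) {
    throw new Error('Check failed: value must be frozen');
  }
}

// Stand-in for a guard builder that validates eagerly, at construction time.
function makeGuard(harden: Harden): { returns: string } {
  const guard = harden({ returns: 'string' });
  assertFrozen(guard);
  return guard;
}

// Reports whether guard construction succeeds under the given harden.
function constructs(harden: Harden): boolean {
  try {
    makeGuard(harden);
    return true;
  } catch {
    return false;
  }
}

console.log(constructs(freezingHarden)); // true
console.log(constructs(noopHarden)); // false
```

Because the check runs at construction time, the no-op variant fails as soon as the module is evaluated, which matches the crash-at-import symptom.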

This is a test-environment gap, not a product bug, and it is local-only (these tests are not in CI; kernel-test-local has no test:e2e:ci script). But it means the "e2e" suite gives no real signal about how the agent layer behaves in the environment where it actually runs.

Why the obvious fixes don't work

  • Switch the e2e files to real endoify-node. Blocked by a pre-existing, orthogonal collision: importing makeReplAgent under real lockdown throws TypeError: Cannot redefine property: sliceToImmutable from kernel-agents-repl/.../compartment.ts's import 'ses'. The cause is that vitest's module runner re-evaluates the SES shim after hardenIntrinsics() has already frozen that intrinsic. No server.deps setting (external / inline / inline: true) or pool config works around it (all tried). This is exactly why these tests used mock-endoify in the first place.
  • Make mock-endoify's harden deep-freeze. Not viable: it breaks 7 of 16 consuming packages (service-matcher 22/25, sheaves 19/88, kernel-agents, kernel-agents-repl, kernel-platforms, kernel-browser-runtime, evm-wallet-experiment) with Cannot add property X, object is not extensible. Crucially, these are test-environment failures, not product mutations: vitest's own machinery (vi.fn / vi.spyOn / vi.mock, call tracking) and test-harness object assembly mutate objects that a real (freezing) harden would freeze but the no-op harden currently leaves mutable. The shipped product always runs under real lockdown (and is exercised that way by @ocap/kernel-test), so a genuine harden-then-mutate in product code would already fail there, and it does not. In other words, the no-op harden is load-bearing for the test environment, and making it freeze would mean reworking large amounts of test setup, not product code. Either way it's out of scope here.
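The second point can be illustrated with a small sketch. Everything here is hypothetical: `deepFreeze` stands in for a freezing harden (it only descends into plain objects), and `installSpy` stands in for the way test tooling replaces a method with a tracking wrapper. Neither is vitest or mock-endoify source.

```typescript
// Hypothetical illustration; not vitest or mock-endoify source.
type Service = { greet: () => string };

// Stand-in for a freezing harden: freezes an object and its object-valued children.
function deepFreeze<T>(value: T): T {
  if (typeof value === 'object' && value !== null && !Object.isFrozen(value)) {
    Object.freeze(value);
    for (const child of Object.values(value)) {
      deepFreeze(child);
    }
  }
  return value;
}

// Stand-in for spy installation: redefines a method with a call-tracking wrapper.
function installSpy(target: Service): string[] {
  const calls: string[] = [];
  const original = target.greet;
  Object.defineProperty(target, 'greet', {
    value: () => {
      calls.push('greet');
      return original();
    },
    configurable: true,
    writable: true,
  });
  return calls;
}

// Reports whether the test tooling could instrument the target.
function tryInstrument(target: Service): string {
  try {
    installSpy(target);
    return 'instrumented';
  } catch (error) {
    return error instanceof TypeError ? 'TypeError' : 'other error';
  }
}

const mutableService: Service = { greet: () => 'hi' };
const frozenService: Service = deepFreeze({ greet: () => 'hi' });

console.log(tryInstrument(mutableService)); // instrumented
console.log(tryInstrument(frozenService)); // TypeError
```

The product object is never mutated by its own code in either case; only the test-side instrumentation fails once the object is frozen.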

Proposed direction

Rebuild the agent-layer e2e tests to be genuinely end-to-end, following the @ocap/kernel-test pattern:

  • Launch a real, hardened kernel (makeKernel(db, true, logger) with vat workers), not the vitest realm.
  • Run the agent inside a vat (an agent-driver vat bundle, analogous to lms-sample-vat), with a language-model service registered on the kernel (makeKernelLanguageModelService + makeMockSample, or a real provider for :local runs).
  • Assert on captured log entries (makeTestLogger() / extractTestLogs) rather than on in-process return values.

This exercises the capability exos, the membrane, and the agent loop under real lockdown (where the harden/exo/@endo/patterns machinery actually operates) and removes the dependency on mock-endoify for the agent layer entirely. It also avoids the whole class of "vitest mutates a would-be-frozen object" problems, because the agent runs in a real hardened vat instead of being instrumented in the test realm.
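The log-assertion step might look roughly like the sketch below. It is self-contained and uses hypothetical stand-ins throughout: `makeCapturingLogger`, `extractLogs`, and `runAgentDriver` are illustrative names, not the makeTestLogger() / extractTestLogs harness or a real vat bundle, whose signatures are not reproduced here. The log messages are invented for the example.

```typescript
// Hypothetical stand-ins; not the @ocap/kernel-test harness.
type LogEntry = { source: string; message: string };

type CapturingLogger = {
  entries: LogEntry[];
  subLogger: (source: string) => { log: (message: string) => void };
};

// Collects every log call, tagged with the component that emitted it.
function makeCapturingLogger(): CapturingLogger {
  const entries: LogEntry[] = [];
  return {
    entries,
    subLogger: (source) => ({
      log: (message) => {
        entries.push({ source, message });
      },
    }),
  };
}

// Keeps only the messages emitted by one source (for example, one vat).
function extractLogs(entries: LogEntry[], source: string): string[] {
  return entries
    .filter((entry) => entry.source === source)
    .map((entry) => entry.message);
}

// Stand-in for an agent-driver vat: it reports only through its logger and
// returns nothing the test could inspect directly.
function runAgentDriver(log: (message: string) => void): void {
  log('agent: start');
  log('agent: invoke end');
  log('agent: done');
}

function runScenario(): string[] {
  const logger = makeCapturingLogger();
  logger.subLogger('kernel').log('kernel: launched');
  runAgentDriver(logger.subLogger('agent-vat').log);
  return extractLogs(logger.entries, 'agent-vat');
}

const agentLogs = runScenario();
console.log(agentLogs); // [ 'agent: start', 'agent: invoke end', 'agent: done' ]
```

The test never holds a reference to the agent or its capabilities, so nothing in the test realm needs to be mutable or instrumented.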

Scope notes

  • Likely lives in or alongside @ocap/kernel-test (which already has the hardened-kernel + log-assertion harness and lms-*-vat bundles) rather than kernel-test-local.
  • The REPL agent's compartment.ts import 'ses' behavior should be reviewed as part of this. Running it inside a hardened vat (rather than re-importing ses in the test realm) should sidestep the sliceToImmutable collision, but this needs to be confirmed.
  • Until this lands, the current sample-agent.e2e.test.ts is expected to fail to construct under test:e2e:local; chat-agent.e2e.test.ts happens to still pass.

Context

Surfaced during review of the capabilities-as-pattern-guarded-exos stack (#958, #959, #960).