test(e2e): spec-version lifecycle infrastructure for the 2026-07-28 release#2226
test(e2e): spec-version lifecycle infrastructure for the 2026-07-28 release#2226felixweinberger wants to merge 2 commits into
Conversation
|
@modelcontextprotocol/client
@modelcontextprotocol/codemod
@modelcontextprotocol/server
@modelcontextprotocol/server-legacy
@modelcontextprotocol/express
@modelcontextprotocol/fastify
@modelcontextprotocol/hono
@modelcontextprotocol/node
commit: |
|
Review follow-up: every |
There was a problem hiding this comment.
Additional findings (outside current diff — PR may have been updated during review):
@felixweinberger edit: this is a leftover from an old version of this diff - significantly simplified now.
Extended reasoning...
-
🔴
test/e2e/requirements.ts:1492-1500— The retirement annotation sweep misses several session-dependent requirements defined away from the main session blocks:flow:session:terminate-then-reconnect(line ~2397),hosting:session:lifecycle-callbacks(~2780), andtypescript:client-transport:http:session-id-property/session-id-option(~2697/2704) — all describe Mcp-Session-Id lifecycle behavior that the 2026-07-28 release retires, yet none carryremovedInSpecVersion: '2026-07-28'(flow:multi-client:stateful-isolationat ~2375 may also qualify). Either annotate them too or add a note recording why they survive; otherwise these cells will register against retired behavior when the 2026-07-28 axis is activatedWhat's missing. This PR's deliverable is the complete set of
removedInSpecVersion: '2026-07-28'annotations for behaviors the sessionless 2026-07-28 release retires. The diff annotates the entire session family it touches — everyhosting:session:*entry in the main block (create, delete, reuse, unknown-id, missing-id, id-charset, isolation, reinitialize, post-termination-404, delete-cancels-inflight, cors-expose) plus the client-side siblings (client-transport:http:session-stored,404-surfaces,session-404-reinitialize,terminate-405-ok). But four (possibly five) session-dependent requirements defined elsewhere inrequirements.tswere left un-annotated:\n\n-flow:session:terminate-then-reconnect(~line 2397) — "After terminating a session, a fresh connection obtains a new session id and operations succeed"; sourced from the same#session-managementspec section as the annotated entries.\n-hosting:session:lifecycle-callbacks(~line 2780) — "StreamableHTTPServerTransport invokes onsessioninitialized with the new session id after initialization and onsessionclosed when the client issues DELETE"; composed entirely of session creation + DELETE termination, both retired here, and it even shares the identical note text with the annotatedhosting:session:*siblings.\n-typescript:client-transport:http:session-id-property(~line 2697) — ".sessionId exposes the negotiated session id after initialization"; the same store-the-negotiated-Mcp-Session-Id mechanism as the annotatedclient-transport:http:session-stored. Against a sessionless server there is no negotiated session id to expose.\n-typescript:client-transport:http:session-id-option(~line 2704) — constructorsessionIdsent as theMcp-Session-Idheader. (More arguable, since it only constrains client-side header emission — but then it deserves a note recording why it survives.)\n- Possibly alsoflow:multi-client:stateful-isolation(~line 2375), which depends on each client receiving a distinct session.\n\nWhy it looks like a sweep miss rather than a deliberate exception. No principled rule visible in the diff explains the omissions:source: 'sdk'entries were not exempted (hosting:session:cors-expose,isolation,reinitialize,delete-cancels-inflight,client-transport:http:resume-stream-apiare all sdk-sourced and annotated), andtypescript:-prefixed entries were not exempted either (typescript:hosting:resume:bad-event-id,typescript:server:get-client-capabilities,typescript:mcpserver:tool:extraare annotated).flow:*entries weren't deferred wholesale (flow:resume:tool-call-resumption-tokenandflow:elicitation:url-at-session-initare annotated). What the missed entries do have in common is location: each sits hundreds of lines away from its family's main block (the flow section, the late typescript: client-transport section, and the tail of the file just before the Legacy SSE section), which is the classic shape of a partial sweep. Where the PR did deliberately keep an entry alive, it added an explanatory note (e.g. the mrtr:caps-gated-inputrequests rationale onelicitation:capability:mode-mismatch/roots:list:not-supported); these entries got neither an annotation nor a note.\n\nConcrete walk-through of the impact. Takeflow:session:terminate-then-reconnect:\n1. Today,ALL_SPEC_VERSIONS = ['2025-11-25'], so the annotation set is inert — nothing changes, exactly as the PR description states.\n2. When the follow-up PR adds'2026-07-28'toALL_SPEC_VERSIONS, cell registration filters each requirement byremovedInSpecVersion(a version registers only whilev < removedInSpecVersion).\n3.hosting:session:create,hosting:session:delete, andhosting:session:post-termination-404correctly drop their 2026-07-28 cells, because they are annotated here.\n4.flow:session:terminate-then-reconnect— which is literally a composition of those three retired behaviors (terminate via DELETE, then obtain a new session id) — is not annotated, so it registers a 2026-07-28 cell and the suite asks a sessionless server to hand out a session id. The same logic applies tolifecycle-callbacks(no session is ever initialized, soonsessioninitializednever fires) andsession-id-property(.sessionIdis never populated).\n\nWhy nothing else catches this.removedInSpecVersionis the only mechanism that stops a cell from registering for a new version; there is no cross-check that session-dependent requirements are mutually consistent. The gap would only surface as unexpected failures during the activation PR — and the entire point of doing this bookkeeping as a separate M0 step is to have that data correct in advance.\n\nFix. AddremovedInSpecVersion: '2026-07-28'to the four entries (and decide onflow:multi-client:stateful-isolation), or — if any of them are intentionally kept (e.g. the constructorsessionIdoption as a back-compat affordance for older servers, mirroring thehosting:http:batchprecedent) — extend theirnoteto record that rationale, the way the diff already does for the surviving elicitation/roots entries. Severity-wise this is inert today, but the completeness of this annotation set is the PR's stated purpose, so it's worth resolving before the stack on top of it builds on the data.
|
Re the review finding on missed session-dependent entries: confirmed — all 5 named entries were genuine misses, and a full completeness sweep of the manifest found 10 more (including This PR also picked up a structural change from maintainer review: every retired entry that has a 2026 successor now gets that successor created in this PR — 27 superseding requirement entries with Final shape: 88 retired entries (73 original + 15 from the sweep) · 27 superseding stubs · 20 survives-with-reason notes · matrix unchanged (1,231 cells, 0 unexpected failures). |
|
Per maintainer review: supersession links are now structural fields, not prose. |
|
Per review: the superseding-stub section comments are gone; each stub now carries |
5109d18 to
6784af5
Compare
|
Restructured per maintainer direction. This PR is now infrastructure-only: the version label + the supersession-link fields + their symmetry gate. The bulk annotation sweep (88 retirements, 27 superseding stubs) has been removed from this PR — that content moves into the implementation PRs, where each retirement/successor lands in the same diff as the code that makes it true. Rationale: reviewing a manifest change next to its code change preserves the association between behavior change and implementation; a big upfront sweep loses it. |
There was a problem hiding this comment.
LGTM — small, test-infrastructure-only change (version label, structural supersession fields, symmetry gate); the only finding is the doc nit on CLAUDE.md line 58, which can be folded in here or with the first implementation PR.
Extended reasoning...
Overview
The PR touches only test/e2e/types.ts and test/e2e/coverage.test.ts. It splits the spec-version axis into KNOWN_SPEC_VERSIONS (typing for lifecycle bounds, now including 2026-07-28) and ALL_SPEC_VERSIONS (the active matrix axis, unchanged at 2025-11-25, constrained with satisfies readonly SpecVersion[] so its literal type is preserved), changes Requirement.supersedes from a single string to a readonly array, adds supersededBy, and replaces the old referential-integrity gate with a stricter symmetry gate.
Verification
I confirmed requirements.ts contains no existing supersedes, addedInSpecVersion, or removedInSpecVersion entries, so the field-shape change breaks no current data and the new gate passes vacuously, exactly as the PR description states. helpers/verifies.ts filters cells on ALL_SPEC_VERSIONS, which is unchanged, so the matrix registers the same 1,231 cells. The gate logic itself is straightforward set/back-link checking with clear error messages and the consistency rules (supersededBy ⇒ removedInSpecVersion, supersedes ⇒ addedInSpecVersion) match the stated convention.
Security risks
None — no SDK or runtime code is touched; this is test-suite manifest infrastructure only.
Level of scrutiny
Low. Test-only, ~50 changed lines, self-contained, and inert until a later PR adds 2026-07-28 to ALL_SPEC_VERSIONS. The earlier review concern about missed retirement annotations no longer applies: the bulk annotation sweep was removed from this PR per maintainer direction, so there is nothing here for it to be incomplete against.
Other factors
The single remaining finding is a documentation nit (test/e2e/CLAUDE.md line 58 still describes the old single-string supersedes link), posted as an inline comment. It does not affect correctness and is naturally fixed either here or alongside the first implementation PR that exercises the new fields.
KNOWN_SPEC_VERSIONS carries version labels the manifest may reference in addedInSpecVersion/removedInSpecVersion bounds and knownFailure scoping. ALL_SPEC_VERSIONS (the active matrix axis driving cell registration) is unchanged and stays 2025-11-25 only.
When a spec release replaces a behavior, the retired entry records its successor via supersededBy and the new entry lists what it replaces via supersedes. A coverage gate enforces referential integrity and exact symmetry, plus: supersededBy requires removedInSpecVersion, and supersedes requires addedInSpecVersion. Entries populate these fields in the PRs that implement the corresponding spec changes.
87f95b3 to
3ecd8a8
Compare
There was a problem hiding this comment.
LGTM — small, self-contained test-infrastructure change (version label, supersession-link fields, and a symmetry gate); prior review feedback, including the CLAUDE.md convention text, is now reflected in the diff.
Extended reasoning...
Overview
This PR touches only the e2e test harness: test/e2e/types.ts (adds KNOWN_SPEC_VERSIONS with the 2026-07-28 label while keeping ALL_SPEC_VERSIONS — the active matrix axis — at 2025-11-25 only, and converts supersedes from a single string to a readonly string[] plus a new supersededBy back-link), test/e2e/coverage.test.ts (replaces the old referential-integrity check with a symmetry gate that also enforces supersededBy ⇒ removedInSpecVersion and supersedes ⇒ addedInSpecVersion), and test/e2e/CLAUDE.md (contributor-doc text describing the new bidirectional convention). No SDK source code, no published package, and no requirement entries are changed.
Security risks
None. The change is confined to test-suite manifest typing, a static coverage gate, and contributor documentation — there is no runtime, transport, auth, or protocol code involved.
Level of scrutiny
Low. The diff is ~50 lines of test infrastructure. I verified that no entry in requirements.ts currently uses supersedes or supersededBy, so the string→array type change cannot break any existing manifest entry and the new gate passes vacuously, exactly as the description states. helpers/verifies.ts registers cells from ALL_SPEC_VERSIONS only, which is unchanged, so the matrix shape (1,231 cells) is unaffected by adding the 2026-07-28 label to KNOWN_SPEC_VERSIONS. The gate logic itself is straightforward and its error messages are precise.
Other factors
The PR went through several restructures during review; the current head is the infrastructure-only shape requested by maintainers, with the bulk annotation sweep deliberately deferred to the implementation PRs. All previously raised review threads are addressed in the current diff: the missed-annotation concern no longer applies (no annotations ship here), and the CLAUDE.md line-58/59 update that was earlier missing from the pushed head is now present in the diff. The bug hunting system found no issues in the current revision. Stacked PRs (#2227, #2228) depend on this landing first, and the change is the minimal base they need.
Motivation and Context
The upcoming spec release (stateless/sessionless/MRTR — SEP-2575/2567/2322) retires some 2025 behaviors and replaces others. Recording those changes requires two pieces of manifest infrastructure:
KNOWN_SPEC_VERSIONS— version labels thataddedInSpecVersion/removedInSpecVersionbounds and knownFailure scoping may reference.ALL_SPEC_VERSIONS(the active matrix axis) is unchanged: only2025-11-25registers cells.supersededBy) and the new entry lists what it replaces (supersedes: readonly string[]). A coverage gate enforces referential integrity and exact symmetry, plus two consistency rules:supersededByrequiresremovedInSpecVersion, andsupersedesrequiresaddedInSpecVersion.The actual annotations land incrementally: each implementation PR marks the entries it retires and adds the successors it implements, in the same diff as the code — so every manifest change is reviewed alongside the behavior change that justifies it.
How Has This Been Tested?
Full e2e matrix unchanged: 1,231 cells = 1,139 passed + 92 expected-fail, 0 unexpected failures. The new symmetry gate passes (vacuously, until entries populate the fields). Typecheck and lint clean.
Breaking Changes
None. Test-infrastructure only; no SDK code touched.
Types of changes
Checklist