feat(relay): /v1/server WS upgrade with header gate and server-id claim (#16)#19
Conversation
…im (#16) Implements the binary-side WebSocket upgrade endpoint. Validates X-Pyrycode-Server, X-Pyrycode-Version, and User-Agent before invoking websocket.Accept; on success, wraps the conn in WSConn (with a crypto/rand-suffixed conn-id) and registers it via Registry.ClaimServer. Returns close code 4409 with reason "server-id already claimed" when the slot is already held. Holds the connection open via CloseRead until the peer disconnects; releases the slot in a defer. Logs structured fields only (server_id, binary_version, remote on claim; server_id and remote on conflict; server_id on release) per docs/threat-model.md log-hygiene rules. Closes #16
Code Review: #16Decision: PASS Architect's security-review section is present in Findings
SummaryA clean, spec-faithful implementation of the binary-side WebSocket upgrade. Header gate runs pre- |
) - Add features/server-endpoint.md covering the binary-side ingress: header gate pre-upgrade, ClaimServer integration, 4409 conflict path, CloseRead-held connection until #6's frame loop lands. - Add ADR-0005 capturing the rule that application close codes go through the underlying *websocket.Conn (stillborn-WSConn window), keeping WSConn.Close fixed to StatusNormalClosure. - Update INDEX with new feature and ADR. - Update PROJECT-MEMORY: mark /v1/server done; add patterns for pre-upgrade validation, close-code emission, defer-after-claim ordering, and the CloseRead bridge for long-lived WS handlers. - Add four lessons: control-frame requirement, close-code emission rule, defer ordering for ReleaseServer, and crypto/rand panic-as-fatal idiom.
What
Implements the binary-side WebSocket upgrade endpoint at
/v1/server:internal/relay/server_endpoint.goexportsServerHandler(reg, logger) http.Handler. Pre-upgrade gate onX-Pyrycode-Server,X-Pyrycode-Version, andUser-Agent(all must be non-empty); failures return400with no body and the registry untouched.websocket.Acceptupgrades withOriginPatterns: ["*"](the custom-header gate structurally excludes browser-driven CSWSH). The conn is wrapped viaNewWSConnwith aserver-<id>-<8 hex>conn-id sourced fromcrypto/rand.reg.ClaimServerenforces first-claim-wins; a losing claim closes the WS with code4409and reason"server-id already claimed". The conflict path closes the underlying*websocket.Conndirectly (WSConn.Close always sendsStatusNormalClosure); a comment in source names this principled exception.Conn.CloseReaduntil the peer disconnects, then releases the slot in adefer.cmd/pyrycode-relay/main.goadds onemux.Handle("/v1/server", …)line next to/healthz.Issue
Closes #16
Testing
internal/relay/server_endpoint_test.gocovers all five AC scenarios:TestServerEndpoint_ValidUpgrade_RegistersBinary— successful upgrade, conn-id shape (server-s1-+ 8 hex chars).TestServerEndpoint_HeaderGate_400— table-driven, missing/empty for each of the three required headers; usesreq.Header[k] = nilfor the User-Agent missing case to suppress Go's default UA injection so the test exercises the wire-level missing case.TestServerEndpoint_DuplicateClaim_4409— second dial reads back awebsocket.CloseErrorwithCode == 4409and the documented reason; first conn remains registered.TestServerEndpoint_PeerClose_ReleasesSlot— peer-side close releases the slot; re-claim succeeds.TestServerEndpoint_WrongMethod_NoPanic— GET without upgrade headers and POST both return 4xx with no panic and registry empty.Verified locally:
go test -race ./...✓go vet ./...✓go build ./cmd/pyrycode-relay✓Architecture compliance
Follows the spec at
docs/specs/architecture/16-server-endpoint.md:ServerHandler), file-localrandHex8andremoteHosthelpers.server_claimedincludesbinary_version;server_id_conflictdeliberately omits it to avoid creating a version-disclosure oracle.defer { ReleaseServer; wsconn.Close; log }is registered after the successfulClaimServerso the conflict path cannot trigger a release on a slot we don't hold.registry.go,ws_conn.go, or any other existing file beyond the single mux registration line.