feat(relay): refuse to boot on unexpected Linux capabilities (#79) #83

Merged
ilmoniemi merged 3 commits into main from feature/79
May 13, 2026

Conversation

@ilmoniemi
Contributor

What

Adds a Linux-only boot-time check that parses /proc/self/status's CapEff: hex mask and refuses to start the relay if any capability bit outside an explicit allowlist (CAP_NET_BIND_SERVICE only) is set. On non-Linux platforms the check is a no-op that emits a single startup-log line. Establishes the _linux.go / _other.go build-tag convention for the repo.
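
As a rough illustration of the parsing step described above, the sketch below scans status text for the CapEff: line and decodes its hex value. It is not the code from this PR: the errNoCapEff name and the use of bufio.Scanner are assumptions made for the example, and only the parse-error format string is taken from the review comment further down.

```go
package main

import (
	"bufio"
	"errors"
	"fmt"
	"strconv"
	"strings"
)

// errNoCapEff is a hypothetical error for status text that has no CapEff line.
var errNoCapEff = errors.New("relay: no CapEff line in /proc/self/status")

// parseCapEff finds the "CapEff:" line in the text of /proc/self/status
// and decodes its hexadecimal value into a 64-bit capability mask.
func parseCapEff(status string) (uint64, error) {
	sc := bufio.NewScanner(strings.NewReader(status))
	for sc.Scan() {
		line := sc.Text()
		if !strings.HasPrefix(line, "CapEff:") {
			continue
		}
		value := strings.TrimSpace(strings.TrimPrefix(line, "CapEff:"))
		mask, err := strconv.ParseUint(value, 16, 64)
		if err != nil {
			// Malformed input becomes a wrapped error, never a panic.
			return 0, fmt.Errorf("relay: parsing /proc/self/status CapEff %q: %w", value, err)
		}
		return mask, nil
	}
	return 0, errNoCapEff
}

func main() {
	// Bit 10 (CAP_NET_BIND_SERVICE) set, as the kernel would print it.
	status := "Name:\trelay\nCapInh:\t0000000000000000\nCapEff:\t0000000000000400\n"
	mask, err := parseCapEff(status)
	fmt.Printf("%#x %v\n", mask, err) // prints "0x400 <nil>"
}
```

Taking the status text as a string, instead of reading the file inside the parser, keeps the function testable on any GOOS.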

Issue

Closes #79. Split from #42. Sibling of #77.

Testing

  • internal/relay/caps_test.go — cross-platform, table-driven tests for parseCapEff (value matrix, malformed inputs) and checkCapEffMask (allowlist matrix, sentinel branchability, message content). Covers AC (a)–(d).
  • internal/relay/caps_linux_test.go — Linux-only seam test for checkCapabilitiesWithReader with injected readStatus closures (allowed mask → nil, disallowed → sentinel, missing CapEff → wrapped parse error, read error → propagated).
  • go test -race ./... clean, go vet ./... clean on both GOOS=darwin and GOOS=linux (cross-compile).
  • go test -c -o /dev/null ./internal/relay/ under GOOS=linux confirms the Linux test file compiles.

Architecture compliance

  • New files exactly as specced: caps.go (sentinel, allowlist, parser, mask checker), caps_linux.go (Linux entry + read seam), caps_other.go (non-Linux no-op, explicit //go:build !linux), plus the two test files.
  • Allowlist content: CAP_NET_BIND_SERVICE only (justified in code comment per Dockerfile + fly.toml — autocert binds :80/:443 from uid 65532).
  • Sentinel ErrUnexpectedCapability, branchable via errors.Is; wrapped with all offending bit names + the allowlist + operator-fix string. All unexpected bits reported in one error (not first-only).
  • Test seam is at the readStatus func() (string, error) boundary, so AC case (d) (malformed /proc/self/status content) exercises parseCapEff end-to-end.
  • main.go wiring is one block immediately after CheckInsecureListenInProduction, before startedAt. Exit code 2, same as other boot-time refusals. No production-mode gating.
  • Log-hygiene gate: GOOS is embedded in the non-Linux log msg rather than as a structured key, keeping the AC's literal message format without growing the allowedLogKeys allowlist for a startup-only diagnostic.
  • Security-review SHOULD FIX honored: the main-side log line emits err + fixed fix string only — no raw cap_eff hex value field.
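
The sentinel and allowlist bullets above can be sketched roughly as follows. This is a minimal approximation under stated assumptions, not the code in caps.go: the capability struct, the knownNames table, the isUnexpected helper, and the fix string are hypothetical, while ErrUnexpectedCapability, AllowedCapabilities, checkCapEffMask, and the exit code come from the description. Bit 10 for CAP_NET_BIND_SERVICE follows linux/capability.h.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// ErrUnexpectedCapability is the sentinel callers branch on with errors.Is.
var ErrUnexpectedCapability = errors.New("relay: unexpected Linux capability")

// capability pairs a bit position with its symbolic name (hypothetical type).
type capability struct {
	Bit  uint
	Name string
}

// AllowedCapabilities is a typed slice so error messages can name each entry.
var AllowedCapabilities = []capability{{Bit: 10, Name: "CAP_NET_BIND_SERVICE"}}

// knownNames is a small illustrative subset of capability names.
var knownNames = map[uint]string{0: "CAP_CHOWN", 13: "CAP_NET_RAW", 21: "CAP_SYS_ADMIN"}

// checkCapEffMask returns nil when every set bit is allowlisted; otherwise it
// wraps the sentinel with all offending bits, not just the first one found.
func checkCapEffMask(mask uint64) error {
	allowed := make([]string, 0, len(AllowedCapabilities))
	for _, c := range AllowedCapabilities {
		mask &^= uint64(1) << c.Bit // clear each allowlisted bit
		allowed = append(allowed, c.Name)
	}
	if mask == 0 {
		return nil
	}
	var offending []string
	for bit := uint(0); bit < 64; bit++ {
		if mask&(uint64(1)<<bit) == 0 {
			continue
		}
		name, ok := knownNames[bit]
		if !ok {
			name = "unknown"
		}
		offending = append(offending, fmt.Sprintf("%s (bit %d)", name, bit))
	}
	// The fix text is a placeholder, not the operator-fix string from the PR.
	return fmt.Errorf("%w: %s; allowed: %s; fix: drop the extra capabilities",
		ErrUnexpectedCapability, strings.Join(offending, ", "), strings.Join(allowed, ", "))
}

// isUnexpected shows the errors.Is branch a caller would take.
func isUnexpected(err error) bool { return errors.Is(err, ErrUnexpectedCapability) }

func main() {
	err := checkCapEffMask(1<<10 | 1<<13 | 1<<21)
	if isUnexpected(err) {
		// The real wiring in main.go would log this and exit with code 2.
		fmt.Println("refusing to boot:", err)
	}
}
```

Because the sentinel is wrapped with %w, callers can test for it with errors.Is while operators still see every offending bit in one message.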

🤖 Generated with Claude Code

ilmoniemi and others added 2 commits May 13, 2026 09:08
…st (#79)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…allowlist (#79)

Adds a Linux-only boot-time check that parses /proc/self/status's CapEff
hex mask and aborts if any bit outside an explicit allowlist
(CAP_NET_BIND_SERVICE only) is set. On non-Linux platforms the check is
a no-op that logs a single skip line at startup. Build-tag split via the
new _linux.go / _other.go convention.

Wired into cmd/pyrycode-relay/main.go after flag-parse, before any
listener starts; exit code 2 matches existing boot-time configuration
refusals. No production-mode gating — stray capabilities are never
legitimate.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@ilmoniemi
Contributor Author

Code Review: #79

Decision: PASS

Findings

No MUST FIX, no SHOULD FIX.

  • [NIT] internal/relay/caps.go:135 — the parse error includes the offending value (fmt.Errorf("relay: parsing /proc/self/status CapEff %q: %w", value, err)), which is slightly richer than what the spec's design block showed. This is an improvement, not a deviation worth changing.
  • [NIT] internal/relay/caps_linux_test.go:30 — the case name "reader error propagates wrapped" is mildly misleading: checkCapabilitiesWithReader returns the injected error as-is (the wrapping happens in readProcSelfStatus, which is replaced by the fake here). The test still asserts the right behaviour (wantErr=true, wantSentinel=false); only the name suggests an extra wrap step that isn't on the seam-test path.

Summary

Spec-compliant implementation of the Linux capability allowlist boot check:

  • ACs covered. (a) empty CapEff → nil, (b) only-allowlisted CapEff → nil, (c) bit outside allowlist → sentinel via errors.Is, (d) malformed /proc/self/status → wrapped error (not panic) — all exercised by TestParseCapEff_ValueMatrix and TestCheckCapEffMask_AllowlistMatrix in caps_test.go, plus the seam in caps_linux_test.go.
  • Build-tag split is correct. caps_linux.go (auto-applied suffix) + caps_other.go (explicit //go:build !linux) — mutually exclusive, no GOOS triggers both/neither. First file in the repo with a build tag, establishing the convention as the spec intends.
  • Wiring at cmd/pyrycode-relay/main.go:53-58 is adjacent to the just-merged CheckInsecureListenInProduction block, exits with code 2, no production-mode predicate — matches the AC and the spec verbatim.
  • Sentinel + allowlist mirror the ErrCacheDirInsecure / ErrInsecureListenInProduction shape. AllowedCapabilities as a typed slice (rather than a bitmask constant) is the right call for operator-facing error messages.
  • Test seam is injected at the reader boundary (not the mask boundary), which threads parseCapEff end-to-end and lets the malformed-input case ride the seam. Matches the spec's explicit decision.
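
To make the reader-boundary seam concrete, here is a compact sketch of how an injected readStatus closure can drive the whole path, including the malformed-input case. It assumes simplified stand-ins for the parser and mask check; fakeReader, errFakeRead, allowedMask, and isUnexpected are hypothetical names for this example only.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
	"strings"
)

var ErrUnexpectedCapability = errors.New("relay: unexpected Linux capability")

// errFakeRead stands in for a failure to read /proc/self/status.
var errFakeRead = errors.New("fake read failure")

const allowedMask uint64 = 1 << 10 // CAP_NET_BIND_SERVICE

// parseCapEff is a simplified stand-in for the parser in caps.go.
func parseCapEff(status string) (uint64, error) {
	for _, line := range strings.Split(status, "\n") {
		if strings.HasPrefix(line, "CapEff:") {
			v := strings.TrimSpace(strings.TrimPrefix(line, "CapEff:"))
			return strconv.ParseUint(v, 16, 64)
		}
	}
	return 0, errors.New("CapEff line not found")
}

// checkCapabilitiesWithReader takes the status source as a closure, so tests
// can replace the real file read while still exercising the parser.
func checkCapabilitiesWithReader(readStatus func() (string, error)) error {
	status, err := readStatus()
	if err != nil {
		return err // returned as-is; any wrapping belongs to the real reader
	}
	mask, err := parseCapEff(status)
	if err != nil {
		return fmt.Errorf("relay: parsing /proc/self/status: %w", err)
	}
	if mask&^allowedMask != 0 {
		return ErrUnexpectedCapability // offending-bit details omitted in this sketch
	}
	return nil
}

// fakeReader builds an injected reader that returns fixed content or an error.
func fakeReader(status string, err error) func() (string, error) {
	return func() (string, error) { return status, err }
}

// isUnexpected reports whether err matches the sentinel.
func isUnexpected(err error) bool { return errors.Is(err, ErrUnexpectedCapability) }

func main() {
	cases := map[string]func() (string, error){
		"allowed":    fakeReader("CapEff:\t0000000000000400\n", nil),
		"disallowed": fakeReader("CapEff:\t0000000000200400\n", nil),
		"no CapEff":  fakeReader("Name:\trelay\n", nil),
		"read error": fakeReader("", errFakeRead),
	}
	for name, read := range cases {
		fmt.Printf("%s: %v\n", name, checkCapabilitiesWithReader(read))
	}
}
```

Placing the seam at the reader instead of at the mask means a single fake covers read failures, parse failures, and allowlist failures.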

Security-sensitive checks (label present on #79)

  • Architect's ## Security review section in the spec is present with PASS verdict; obligation #1 met.
  • The one SHOULD FIX flagged inline by the architect ("do not extend the log line to add a cap_eff field with the raw hex value") is honored: the log line at cmd/pyrycode-relay/main.go:54-57 carries only err (which contains symbolic names + bit positions) and fix (a fixed string). No raw mask is logged separately.
  • No new subprocess calls, no crypto, no path concatenation, no TOCTOU. Single hard-coded read of /proc/self/status. No //#nosec annotations.
  • caps_other.go deviates from the spec's design block by embedding GOOS into the slog msg via fmt.Sprintf rather than as a structured "goos" field. The deviation is well-commented (avoids growing allowedLogKeys for a startup-only diagnostic) and produces the literal AC text "skipping linux-only capability check on ". Reasonable.

CI: go vet, go build, and go test -race ./... all green locally.

Adds feature doc for the boot-time CapEff check, ADR-0009 capturing
the new _<goos>.go / _other.go build-tag convention, and a per-ticket
codebase note.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@ilmoniemi ilmoniemi merged commit d2cda07 into main May 13, 2026
4 checks passed
@ilmoniemi ilmoniemi deleted the feature/79 branch May 13, 2026 06:20

Development

Successfully merging this pull request may close these issues.

relay: refuse to boot when Linux effective capabilities exceed allowlist

1 participant