From 0508be4f3075cc1b4ca65f1c104a6098d738284e Mon Sep 17 00:00:00 2001 From: Juhana Ilmoniemi Date: Mon, 11 May 2026 12:00:47 +0300 Subject: [PATCH 1/3] spec: log-key allowlist test (#36) --- .../architecture/36-log-key-allowlist-test.md | 152 ++++++++++++++++++ 1 file changed, 152 insertions(+) create mode 100644 docs/specs/architecture/36-log-key-allowlist-test.md diff --git a/docs/specs/architecture/36-log-key-allowlist-test.md b/docs/specs/architecture/36-log-key-allowlist-test.md new file mode 100644 index 0000000..89db873 --- /dev/null +++ b/docs/specs/architecture/36-log-key-allowlist-test.md @@ -0,0 +1,152 @@ +# Spec — Structured-log key allowlist (test-time gate) (#36) + +## Files to read first + +- `internal/relay/client_endpoint.go:55-83` — three log call sites (`phone_register_no_server`, `phone_registered`, `phone_unregistered`); shape of multi-line `key, value` pairs the AST walker must traverse. +- `internal/relay/server_endpoint.go:60-95` — three log call sites (`server_id_conflict`, `server_claimed`, `server_released`); note `server_released` is a single-line call (`"server_id", serverID` inline with msg) — both shapes must work. +- `internal/relay/forward.go:38-72, 110-160` — eight log call sites across two forwarder loops; widest fan-out of keys. Source of the `binary_conn_id` key (distinct from `conn_id`). +- `docs/threat-model.md:46-55` — § Log hygiene. `MUST NOT` / `MAY` lists; the new doc cross-reference attaches at the end of the bullet list. +- `docs/lessons.md` § "Credentials the relay does not validate are presence-checked then discarded — never logged…" — the layered-defence posture this ticket adds the fourth layer to. +- `internal/relay/registry.go` — read enough to confirm it has zero `logger.*` call sites (registry has no logger dep); the test will assert it's silent only by absence of matches, not by an explicit check. +- Go stdlib reminder: `go/parser.ParseFile(fset, path, nil, parser.SkipObjectResolution)` + `ast.Inspect` is the idiomatic walker; we do not need `go/types`. + +## Context + +`docs/threat-model.md` § Log hygiene lists fields the relay must never log (tokens, payload bodies, full headers/URLs) and fields it may log (server-id, conn-id, device-name, remote host, event types, close codes). Today the rule is enforced by code review and the structural absence of token-bearing code paths (see `docs/lessons.md` "Credentials the relay does not validate…"). One careless PR re-introduces a leak. This ticket adds a fourth defence layer: a Go test that fails when any non-test file in `internal/relay/` contains a `logger.{Info,Warn,Error,Debug}` call carrying a key not present in an in-repo allowlist. + +The allowlist becomes the canonical machine-checkable form of the threat-model rule. The doc continues to enumerate intent (what may/must-not appear); the allowlist enumerates the exact key names. Adding a logged key requires editing the allowlist in the same commit — that diff is what reviewers (human + future linters) gate on. + +## Design + +Three artefacts: + +### 1. `internal/relay/log_allowlist.go` (new, ~20 lines) + +```go +package relay + +// allowedLogKeys is the closed set of structured-log keys the relay is +// permitted to emit on a slog.Logger call. Adding a key here is the +// commit-time gate that pairs with the threat-model rule in +// docs/threat-model.md § Log hygiene. TestLogKeysAreAllowlisted reads +// from this map; never inline new keys at a call site without adding +// the key here in the same commit. +var allowedLogKeys = map[string]struct{}{ + "server_id": {}, + "conn_id": {}, + "binary_conn_id": {}, + "device_name": {}, + "remote": {}, + "binary_version": {}, + "err": {}, +} +``` + +- Unexported. Only the test in the same package reads it; no external consumer should depend on this set. +- Lives in a non-test `.go` file (not `_test.go`) so that the threat-model doc can point at a stable, non-fixture path and so `go vet` / IDEs surface it as production code that pairs with the doc. +- Comment line names the doc section and the test, so a developer adding a key sees the contract without leaving the file. + +The seven initial entries are the exhaustive set produced by walking the package on `main` as of this ticket — every existing key must already be in the allowlist for the test to pass on the same commit. The ticket's "test passes against current main" AC is enforced by this file's content, not by the test logic. + +### 2. `internal/relay/log_keys_test.go` (new, ~120 lines) + +Single test, no fixtures, no testdata. Pure stdlib (`go/parser`, `go/ast`, `go/token`, `strconv`, `path/filepath`). + +```go +func TestLogKeysAreAllowlisted(t *testing.T) { … } +``` + +Algorithm: + +1. **Discover files.** `filepath.Glob("*.go")` from the test's working directory (which is `internal/relay/` when the test runs). Exclude names ending in `_test.go`. Excluding the test file itself is essential — it contains string literals that look like log keys (e.g. assertion messages). +2. **Parse each file.** `parser.ParseFile(fset, name, nil, parser.SkipObjectResolution)`. A parse error fails the test with the file path. +3. **Walk for log calls.** `ast.Inspect` each file. Match `*ast.CallExpr` where: + - `Fun` is `*ast.SelectorExpr`, and + - `Sel.Name` is one of `{"Info", "Warn", "Error", "Debug"}`. + No type resolution. False positives (a non-logger type with a method named `Info`) are not present in this package and are acceptable noise if introduced later — the violation message will name the file:line so a reviewer can disambiguate. +4. **Extract keys.** For each match, `args := call.Args[1:]` (skip the msg arg). Iterate `args` two at a time. For each pair at even index `i`: + - `args[i]` is a `*ast.BasicLit` with `Kind == token.STRING` → `strconv.Unquote` to get the key. + - Otherwise → record a violation: `"%s:%d: non-literal log key argument (dynamic key defeats the allowlist; use a literal string key)"`. + - Odd-length argument list (a key with no value) → also a violation; same `non-literal-or-malformed` family. Stop walking that call's args. +5. **Compare to allowlist.** Each extracted literal key not in `allowedLogKeys` → violation: `"%s:%d: log key %q is not in allowedLogKeys (see internal/relay/log_allowlist.go and docs/threat-model.md § Log hygiene)"`. +6. **Report.** Accumulate all violations across all files (do not fail at first hit). At end of walk, if violations is non-empty, `t.Errorf` once per violation followed by a final `t.Fatalf("found %d log-hygiene violation(s)", n)`. Failure surface names file path, line, and offending key without `-v`. + +Out-of-scope methods, captured as a guard not as full coverage: + +- `Logger.With`, `Logger.LogAttrs`, `Logger.Log`, and any `slog.` (`slog.String`, `slog.Any`, `slog.Group`, …) are not in the matched set. None are used in the current codebase. +- The walker emits a separate "unsupported log shape" violation if it encounters a `SelectorExpr.Sel.Name` in `{"With", "LogAttrs", "Log"}` on any receiver inside a non-test file. Message: `"%s:%d: %s.%s used; this method is not gated by TestLogKeysAreAllowlisted. Either convert the call to positional Info/Warn/Error/Debug or extend the walker."` This is the "future contributor switches to `With` and bypasses the gate" hedge — a deterministic loud failure that forces the next contributor to either extend the test or convert the call. Costs ~6 lines; pays its own freight the first time anyone touches log structure. + +The walker is local to the test file (a small visitor func over `ast.Inspect` is enough; no need for a separate type or table). Each violation is a `string`; no need for a `Violation` struct. + +### 3. `docs/threat-model.md` § Log hygiene — one-line addition + +After the existing `MAY be logged` bullet, append a new paragraph: + +> **Enforcement:** The canonical set of permitted keys lives in `internal/relay/log_allowlist.go`. `TestLogKeysAreAllowlisted` in `internal/relay/log_keys_test.go` AST-walks every non-test `.go` file in the package on each `make test` run and fails if any `logger.{Info,Warn,Error,Debug}` call carries a key absent from that set or uses a dynamic (non-string-literal) key. Adding a logged key requires editing the allowlist file in the same commit. + +This converts the existing "Residual risk: future contributor adds a debug log statement…" sentence from a guarded risk into a partially-covered one. The "Future hardening" entry about a `slog` middleware remains valid — the test catches authoring-time drift; runtime redaction would catch operational drift (a different threat). Do **not** delete the residual-risk or future-hardening paragraphs; they cover the gaps the test does not (runtime, payload-bearing values misrouted to an allowlisted key like `err`). + +## Concurrency model + +None. The test is sequential, single-goroutine, no I/O outside `os.ReadFile` via `parser.ParseFile`. No timers, no contexts. + +## Error handling + +- Parse error on any walked file → `t.Fatalf("parse %s: %v", path, err)`. The test cannot proceed if it cannot read the package; this is a hard fail. +- Glob error → same. Glob does not return errors for empty matches; an empty match is itself a `t.Fatalf` (something is wrong if `internal/relay/` has zero non-test `.go` files when the test runs). +- Allowlist violations → accumulated, reported in full, single trailing `t.Fatalf` with count. Rationale: the developer who introduces a misnamed key in one file probably misnames it in three; reporting them all at once shortens the fix loop. +- Non-literal key violations → same accumulation path as allowlist violations; one combined trailing count is fine. Each line is self-explanatory. + +## Testing strategy + +The test is the test. Two meta-checks: + +1. **Self-check on current `main`.** Spec passes when the seven enumerated keys cover every existing call site. Developer verifies by running `go test ./internal/relay/ -run TestLogKeysAreAllowlisted` and observing pass before adding any other change. If pass fails, the allowlist content (not the test logic) is wrong — fix the allowlist, do not relax the matcher. +2. **Negative-path manual check.** Before committing, the developer temporarily edits one call site (e.g. add `"forbidden_key", "x"` to a `logger.Info`), re-runs the test, confirms failure message names the right file:line and key, reverts the edit. Document the manual check in the test's leading doc comment so a future contributor can repeat it; do not commit a permanent negative-path test that would itself be a "logs a forbidden key" pattern. + +No table-driven cases. No fixture files. The package's own source is the corpus. + +## Open questions + +1. **Glob vs. `filepath.WalkDir`.** Spec says `filepath.Glob("*.go")` because the package is flat (no subdirectories in `internal/relay/`). If a future refactor introduces a subpackage under `internal/relay/`, the test will silently miss it. Mitigation deferred to that future ticket; a one-line `WalkDir` swap is cheap. Not worth pre-emptive complexity now. +2. **`slog.SetDefault` / package-level `slog.Info`.** None used today. The walker matches `Sel.Name`, so `slog.Info("msg", "key", v)` (where `Fun` is `slog.Info`) is structurally identical to `logger.Info(...)` and *would* be caught — the receiver name is irrelevant to the AST match. No spec change needed; record here so the developer knows the matcher is broader than its docstring suggests and writes the docstring accurately. +3. **`docs/lessons.md` add-on.** Not in AC. The pattern this ticket establishes — "stochastic spec → deterministic test" — is worth a lesson entry, but per AC it is out of scope for this ticket. The dispatcher's documentation agent (post-merge) handles knowledge-base capture; the architect should not write into `lessons.md` from this spec. + +## Security review + +See § "Security review" below; appended after PASS. + +--- + +## Security review (security-sensitive ticket) + +### Threats considered + +1. **Bypass via dynamic key.** A future PR writes `logger.Info("msg", dynamicKey, value)` where `dynamicKey` is a variable carrying token bytes. The allowlist literal-key check would be irrelevant. + - **Mitigation in spec:** The walker rejects non-`*ast.BasicLit` key arguments with a distinct violation. The error message explicitly names the failure mode ("dynamic key defeats the allowlist"). This converts the attack from "silent bypass" to "build-time failure with a clear remediation." +2. **Bypass via unsupported method.** A future PR converts a call to `logger.With("token", t).Info("msg")` or `logger.LogAttrs(ctx, level, msg, slog.String("token", t))`. The visitor matches only `Info/Warn/Error/Debug`. + - **Mitigation in spec:** The walker also flags `With`/`LogAttrs`/`Log` selectors as "unsupported log shape" violations. The PR's CI fails; the developer must either convert to positional form (gated) or extend the walker in the same PR (which itself is a security-sensitive change that gets review). +3. **Bypass via attr constructors at the call site.** `logger.Info("msg", slog.String("token", t))` — `slog.String` returns an `slog.Attr`, which `slog.Logger.Info` accepts as a positional arg. The walker treats `slog.String(...)` (a `*ast.CallExpr`, not a `*ast.BasicLit`) as a non-literal key argument and rejects it. + - **Outcome:** caught by the dynamic-key rule, no extra spec needed. Worth calling out so the developer's docstring lists this case explicitly. +4. **Bypass via `fmt.Sprintf`-built key.** `logger.Info("msg", fmt.Sprintf("token_%s", id), val)` — non-`*ast.BasicLit`, caught by the dynamic-key rule. Same outcome as #3. +5. **Bypass via package-level `slog.Info`.** Walker matches by `Sel.Name`, not receiver type. `slog.Info("msg", "token", t)` is gated identically. Not a bypass — recorded in Open Questions #2 so the developer's docstring is accurate. +6. **Bypass via a *value* argument containing secrets under an allowlisted key.** `logger.Info("msg", "err", fmt.Errorf("bad token %s: …", token))`. The key is `err` (allowlisted); the value carries the secret. The spec does NOT mitigate this — the test only inspects keys, not values, and inspecting values would require dataflow analysis the AST walk cannot do. + - **Decision:** Out of scope; deliberate. The threat-model doc's `MUST NOT be logged` list addresses this at the human-review layer ("`x-pyrycode-token` values or any header that carries a token"). The "Future hardening" entry (runtime `slog` middleware that scans Attr values) remains the right place to land this — runtime is the only layer that can see formatted strings. The spec's threat-model edit must NOT claim the test covers value-side leaks. (Confirmed: the proposed enforcement paragraph says "carries a key absent from that set" — keys only, not values.) +7. **Allowlist tampering.** A malicious PR adds `"token"` to `allowedLogKeys` to unblock a key that should never be allowed. + - **Mitigation:** Code review. The allowlist file is one of the highest-leverage review targets in the repo; the doc paragraph explicitly names it as the canonical enforcement point so reviewers know to scrutinise diffs to it. Not a structural defence; that is acceptable for a contributor-trust threat. +8. **Test disabled / skipped.** A PR adds `t.Skip()` or builds without the `internal/relay` package's tests. + - **Mitigation:** `make test` runs the whole module; the test cannot be locally skipped without an explicit `t.Skip` that shows up in diff. Not a structural defence; same trust model as #7. +9. **Cross-package leakage.** A future log call site appears in `cmd/pyrycode-relay/main.go` or another package; the test walks only `internal/relay/`. + - **Decision:** Acknowledged in the ticket's "Out of Scope" block. `cmd/pyrycode-relay/main.go` has handler construction only, no `logger.{Info,Warn,Error,Debug}` calls today. If a log call lands there, the architect of that ticket extends the test (or carbons it). Not a current threat. +10. **Parse failure masks coverage.** A `.go` file with a syntax error → `t.Fatalf`. Could a tampering PR introduce an intentional parse error in a file with a forbidden log call? + - The Go compiler would refuse to build the package; `make build` fails before `make test` runs. Not a viable bypass. + +### Trust boundaries + +- **Input to the test:** the package's own source files. Not adversary-controlled at test time — the corpus is the committed code. +- **Input to the production code (the relay):** unchanged by this ticket. No new attack surface introduced. +- **Allowlist file:** trust-boundary equivalent to the threat-model doc itself — a privileged config that code review owns. + +### Verdict + +**PASS.** The spec mitigates the two viable bypass classes (dynamic key, alternate method) with deterministic structural checks. The remaining bypasses (value-side leaks, tampering, off-package logs) are explicitly acknowledged in scope, mitigated at the threat-model doc / code-review layer, and aligned with the "Future hardening: runtime redaction middleware" entry that this ticket does not displace. From 4f54286f122133f4c36a36f67c0943f14d531347 Mon Sep 17 00:00:00 2001 From: Juhana Ilmoniemi Date: Mon, 11 May 2026 12:05:07 +0300 Subject: [PATCH 2/3] relay: AST-walk test enforces log-key allowlist (#36) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Adds a fourth defence layer for docs/threat-model.md § Log hygiene beyond spec, code review, and structural absence: a Go test that fails if any logger.{Info,Warn,Error,Debug} call in internal/relay carries a key outside the closed set in log_allowlist.go. The walker also flags non-literal key arguments (dynamic keys defeat the allowlist) and Logger.With/LogAttrs/Log selectors (method shapes the walker does not parse — a future contributor must either convert the call or extend the walker). Narrowed to calls whose first arg is a string literal so unrelated same- named calls like http.Error(w, "", code) are excluded. Threat-model § Log hygiene gains a one-line enforcement cross-reference pointing at the allowlist as the canonical machine-checkable rule. Co-Authored-By: Claude Opus 4.7 --- docs/threat-model.md | 2 + internal/relay/log_allowlist.go | 17 +++ internal/relay/log_keys_test.go | 188 ++++++++++++++++++++++++++++++++ 3 files changed, 207 insertions(+) create mode 100644 internal/relay/log_allowlist.go create mode 100644 internal/relay/log_keys_test.go diff --git a/docs/threat-model.md b/docs/threat-model.md index 5e8c071..5374f0b 100644 --- a/docs/threat-model.md +++ b/docs/threat-model.md @@ -48,6 +48,8 @@ Each threat below records four fields: - **MUST NOT be logged:** payload bodies (the inner `json.RawMessage` of an envelope), `x-pyrycode-token` values or any header that carries a token, full request headers, full URL paths if they ever carry tokens. - **MAY be logged:** server-id, `conn-id`, device-name (advertised by the binary, not user-secret), remote host (IP), event types (`upgrade`, `forward`, `close`), and close codes (`4401`, `4404`, `4409`). +**Enforcement:** The canonical set of permitted keys lives in `internal/relay/log_allowlist.go`. `TestLogKeysAreAllowlisted` in `internal/relay/log_keys_test.go` AST-walks every non-test `.go` file in the package on each `make test` run and fails if any `logger.{Info,Warn,Error,Debug}` call carries a key absent from that set or uses a dynamic (non-string-literal) key. Adding a logged key requires editing the allowlist file in the same commit. + Log retention and rotation are operator-owned. The relay writes only to stderr; on a typical deployment the systemd journal or a `logrotate`-managed flat file owns rotation. **Residual risk:** A future contributor adds a debug log statement that prints a payload body or token. Code review and the acceptance criteria of any frame-forwarding ticket are the first line of defence; runtime redaction is not built. diff --git a/internal/relay/log_allowlist.go b/internal/relay/log_allowlist.go new file mode 100644 index 0000000..6ea9e3f --- /dev/null +++ b/internal/relay/log_allowlist.go @@ -0,0 +1,17 @@ +package relay + +// allowedLogKeys is the closed set of structured-log keys the relay is +// permitted to emit on a slog.Logger call. Adding a key here is the +// commit-time gate that pairs with the threat-model rule in +// docs/threat-model.md § Log hygiene. TestLogKeysAreAllowlisted reads +// from this map; never inline new keys at a call site without adding +// the key here in the same commit. +var allowedLogKeys = map[string]struct{}{ + "server_id": {}, + "conn_id": {}, + "binary_conn_id": {}, + "device_name": {}, + "remote": {}, + "binary_version": {}, + "err": {}, +} diff --git a/internal/relay/log_keys_test.go b/internal/relay/log_keys_test.go new file mode 100644 index 0000000..75b613b --- /dev/null +++ b/internal/relay/log_keys_test.go @@ -0,0 +1,188 @@ +package relay + +// TestLogKeysAreAllowlisted enforces docs/threat-model.md § Log hygiene at +// test time. It AST-walks every non-test .go file in this package, finds +// every selector call whose method name is Info/Warn/Error/Debug (the four +// slog.Logger level methods used in this package), extracts the string- +// literal key arguments, and fails if any key is absent from +// allowedLogKeys in log_allowlist.go. +// +// Matched shape: any *ast.CallExpr whose Fun is a *ast.SelectorExpr with +// Sel.Name in {Info, Warn, Error, Debug}. The walker matches by selector +// name only, not by receiver type — so a package-level slog.Info(...) is +// caught the same way as logger.Info(...). False positives from an +// unrelated type that happens to have an Info/Warn/Error/Debug method are +// not present in this package today; if one is introduced later, the +// violation message names file:line so a reviewer can disambiguate. +// +// Narrowing: a matched call's first positional argument must itself be a +// string literal (the slog msg convention used at every existing call +// site). This filters out unrelated same-named methods/funcs whose first +// arg is something else — notably http.Error(w, msg, code) where the +// first arg is an http.ResponseWriter. A future log call with a +// computed msg would slip past the matcher, but the threat model gates +// keys, not msgs; if a contributor needs computed msgs, the convention +// (and this filter) gets revisited in the same PR. +// +// Bypasses the walker explicitly rejects, with distinct error messages: +// - non-literal key argument (e.g. a variable, slog.String(...), +// fmt.Sprintf(...)) — dynamic keys defeat the allowlist. +// - odd-length key/value arg list — malformed; stops walking that call. +// - logger.With(...) / LogAttrs(...) / Log(...) — these methods accept +// keys in shapes this walker does not parse. A non-test file using +// them must either be converted to positional Info/Warn/Error/Debug +// or the walker must be extended in the same PR. +// +// Out of scope (deliberate, see spec): value-side leaks (an allowlisted +// key like "err" carrying a string that contains a secret). The test +// inspects keys only. Value-side redaction belongs in the future runtime +// slog middleware named in docs/threat-model.md § Future hardening. +// +// Negative-path manual check (do not commit a permanent negative test): +// to confirm the walker reports correctly, temporarily add e.g. +// `"forbidden_key", "x"` to a logger.Info call in this package and +// re-run `go test ./internal/relay/ -run TestLogKeysAreAllowlisted`. The +// failure should name the file, line, and offending key without -v. +// Revert before committing. + +import ( + "fmt" + "go/ast" + "go/parser" + "go/token" + "path/filepath" + "strconv" + "strings" + "testing" +) + +func TestLogKeysAreAllowlisted(t *testing.T) { + matches, err := filepath.Glob("*.go") + if err != nil { + t.Fatalf("glob *.go: %v", err) + } + var nonTest []string + for _, f := range matches { + if strings.HasSuffix(f, "_test.go") { + continue + } + nonTest = append(nonTest, f) + } + if len(nonTest) == 0 { + t.Fatalf("no non-test .go files found in cwd; test is in the wrong directory") + } + + levelMethods := map[string]struct{}{ + "Info": {}, + "Warn": {}, + "Error": {}, + "Debug": {}, + } + unsupportedMethods := map[string]struct{}{ + "With": {}, + "LogAttrs": {}, + "Log": {}, + } + + var violations []string + fset := token.NewFileSet() + + for _, path := range nonTest { + file, err := parser.ParseFile(fset, path, nil, parser.SkipObjectResolution) + if err != nil { + t.Fatalf("parse %s: %v", path, err) + } + + ast.Inspect(file, func(n ast.Node) bool { + call, ok := n.(*ast.CallExpr) + if !ok { + return true + } + sel, ok := call.Fun.(*ast.SelectorExpr) + if !ok { + return true + } + name := sel.Sel.Name + + if _, bad := unsupportedMethods[name]; bad { + pos := fset.Position(call.Lparen) + violations = append(violations, fmt.Sprintf( + "%s: %s used; this method is not gated by TestLogKeysAreAllowlisted. "+ + "Either convert the call to positional Info/Warn/Error/Debug or extend the walker.", + pos, selString(sel))) + return true + } + if _, ok := levelMethods[name]; !ok { + return true + } + + // args[0] is the msg; key/value pairs follow. Require msg + // to be a string literal — this is the slog convention at + // every existing call site, and the filter that + // distinguishes logger.Info(...) from same-named non-slog + // calls like http.Error(w, "", code). + if len(call.Args) < 1 { + return true + } + msgLit, ok := call.Args[0].(*ast.BasicLit) + if !ok || msgLit.Kind != token.STRING { + return true + } + args := call.Args[1:] + i := 0 + for i < len(args) { + keyArg := args[i] + lit, isLit := keyArg.(*ast.BasicLit) + if !isLit || lit.Kind != token.STRING { + kpos := fset.Position(keyArg.Pos()) + violations = append(violations, fmt.Sprintf( + "%s: non-literal log key argument in %s call "+ + "(dynamic key defeats the allowlist; use a literal string key)", + kpos, name)) + // Stop walking this call's args — we can't trust the pairing. + break + } + key, err := strconv.Unquote(lit.Value) + if err != nil { + kpos := fset.Position(keyArg.Pos()) + violations = append(violations, fmt.Sprintf( + "%s: unparseable string literal log key %s: %v", + kpos, lit.Value, err)) + break + } + if _, ok := allowedLogKeys[key]; !ok { + kpos := fset.Position(keyArg.Pos()) + violations = append(violations, fmt.Sprintf( + "%s: log key %q is not in allowedLogKeys "+ + "(see internal/relay/log_allowlist.go and docs/threat-model.md § Log hygiene)", + kpos, key)) + } + // Skip the value at i+1; an odd-length list is a malformed call. + if i+1 >= len(args) { + kpos := fset.Position(keyArg.Pos()) + violations = append(violations, fmt.Sprintf( + "%s: log key %q has no paired value (odd-length key/value list)", + kpos, key)) + break + } + i += 2 + } + return true + }) + } + + if len(violations) == 0 { + return + } + for _, v := range violations { + t.Error(v) + } + t.Fatalf("found %d log-hygiene violation(s)", len(violations)) +} + +func selString(sel *ast.SelectorExpr) string { + if id, ok := sel.X.(*ast.Ident); ok { + return id.Name + "." + sel.Sel.Name + } + return "." + sel.Sel.Name +} From 881e81668f768e76c7b00ea9eeb417336b6a02b4 Mon Sep 17 00:00:00 2001 From: Juhana Ilmoniemi Date: Mon, 11 May 2026 12:10:34 +0300 Subject: [PATCH 3/3] =?UTF-8?q?docs:=20log-key=20allowlist=20gate=20?= =?UTF-8?q?=E2=80=94=20codebase=20note=20+=20INDEX=20cross-cutting=20entry?= =?UTF-8?q?=20(#36)?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Per-ticket #36 codebase note describing the AST-walk test, the seven-key allowlist, the three violation classes (key not in allowlist, non-literal key, unsupported method), and the threats it gates vs. those it deliberately does not (value-side leaks, tampering, off-package logs). Lessons-learned section captures the "stochastic spec → deterministic test" pattern and the selector-name-AST-match-is-enough heuristic. INDEX.md gains a one-line Cross-cutting entry pointing at the allowlist file and test as the canonical machine-checkable rule. Co-Authored-By: Claude Opus 4.7 --- docs/knowledge/INDEX.md | 1 + docs/knowledge/codebase/36.md | 41 +++++++++++++++++++++++++++++++++++ 2 files changed, 42 insertions(+) create mode 100644 docs/knowledge/codebase/36.md diff --git a/docs/knowledge/INDEX.md b/docs/knowledge/INDEX.md index 14fe385..8b0e762 100644 --- a/docs/knowledge/INDEX.md +++ b/docs/knowledge/INDEX.md @@ -32,6 +32,7 @@ One-line pointers into the evergreen knowledge base. Newest entries at the top o ## Cross-cutting +- [Log-key allowlist (test-time gate)](../../internal/relay/log_allowlist.go) — closed `allowedLogKeys` set in `internal/relay/log_allowlist.go` paired with `TestLogKeysAreAllowlisted` in `internal/relay/log_keys_test.go`; AST-walks every non-test `.go` file in the package and fails if a `logger.{Info,Warn,Error,Debug}` call carries a key absent from the set, a non-string-literal key, or a `Logger.With/LogAttrs/Log` selector the walker can't parse. Fourth defence layer for threat-model § *Log hygiene* (alongside spec, code review, structural absence) (#36). - [Threat model](../threat-model.md) — operational threats to the relay-as-deployed-process: deploy, supply chain, DoS, log hygiene, cert handling, TLS config, error leakage. Complements (does not replace) the protocol spec's wire-protocol Security model. - [Project memory](../PROJECT-MEMORY.md) — what's built, patterns established, current state. - [Lessons](../lessons.md) — gotchas worth carrying forward. diff --git a/docs/knowledge/codebase/36.md b/docs/knowledge/codebase/36.md new file mode 100644 index 0000000..3a463c0 --- /dev/null +++ b/docs/knowledge/codebase/36.md @@ -0,0 +1,41 @@ +# Ticket #36 — AST-walk test enforces log-key allowlist + +Fourth defence layer for `docs/threat-model.md` § *Log hygiene*. Before this ticket the rule lived in three places that all rely on human attention: the threat-model doc, code review of each new log call, and the structural absence of token-bearing values in routing-path code (`docs/lessons.md` "Credentials the relay does not validate…"). One careless PR re-introduces a leak. This ticket adds a deterministic test-time gate: any `logger.{Info,Warn,Error,Debug}` call in `internal/relay/` whose key is absent from a closed in-repo allowlist fails `go test`, so the rule lands in CI rather than in production-log review. + +## Implementation + +- **`internal/relay/log_allowlist.go`** — unexported `var allowedLogKeys = map[string]struct{}{…}` with the seven keys that cover every existing call site: `server_id`, `conn_id`, `binary_conn_id`, `device_name`, `remote`, `binary_version`, `err`. Lives in a non-test `.go` file (not `_test.go`) so the threat-model doc can point at a stable production path. File comment names the test and the doc section so a developer adding a key sees the full contract without leaving the file. +- **`internal/relay/log_keys_test.go`** — single test, pure stdlib (`go/parser`, `go/ast`, `go/token`, `strconv`, `path/filepath`). `filepath.Glob("*.go")` from cwd, drop `_test.go` files, `parser.ParseFile(..., parser.SkipObjectResolution)` each, `ast.Inspect` for `*ast.CallExpr` whose `Fun` is a `*ast.SelectorExpr` with `Sel.Name ∈ {Info, Warn, Error, Debug}`. Skip-the-test-file is essential — the test itself contains string literals that look like log keys. +- **Selector-name match, not receiver type.** Walker matches by `Sel.Name` only, so `slog.Info(...)` is gated identically to `logger.Info(...)`. No `go/types` resolution; intentional — false positives from an unrelated type that happens to expose `Info/Warn/Error/Debug` are not present in the package today and would surface as a `file:line` violation if introduced later. +- **First-arg-must-be-string-literal narrowing.** A matched call's `Args[0]` must itself be `*ast.BasicLit` with `Kind == token.STRING`. This filters `http.Error(w, msg, code)` and anything else that shares a method name but starts with a non-string arg. The narrowing is documented in the test docstring; a future log call with a computed msg would slip past it, and the convention (and this filter) gets revisited in that PR. +- **Three distinct violation classes, all accumulated and reported together** (single trailing `t.Fatalf("found %d log-hygiene violation(s)", n)` so a developer who misnames a key in three places fixes them in one pass): + 1. *Key not in allowlist* — `"%s: log key %q is not in allowedLogKeys (see internal/relay/log_allowlist.go and docs/threat-model.md § Log hygiene)"`. Names file path, line, and key without `-v`. + 2. *Non-literal key argument* — `"%s: non-literal log key argument in %s call (dynamic key defeats the allowlist; use a literal string key)"`. Catches `slog.String(dynamicKey, v)`, `fmt.Sprintf("k_%s", id)`, bare variables. Stops walking that call's args (pairing can't be trusted past the first bad key). + 3. *Unsupported method shape* — `"%s: %s.%s used; this method is not gated by TestLogKeysAreAllowlisted. Either convert the call to positional Info/Warn/Error/Debug or extend the walker."` for any selector with `Sel.Name ∈ {With, LogAttrs, Log}`. The "future contributor switches to `Logger.With(...).Info(...)` and bypasses the gate" hedge — ~6 lines that pay their freight the first time anyone touches log structure. +- **`docs/threat-model.md` § Log hygiene** gains a one-line **Enforcement** paragraph pointing at the allowlist file and the test as the canonical machine-checkable rule. The existing *Residual risk* and *Future hardening* paragraphs stay — the test catches authoring-time drift; runtime redaction (still future) catches operational drift and value-side leaks. + +## Threats this gates, and what it deliberately does not + +Captured in the test docstring and the architect's security review (verdict PASS). Summary: + +- **Caught structurally** — dynamic keys (`slog.String(dynKey, v)`, `fmt.Sprintf`-built keys, bare variables); switches to `Logger.With/LogAttrs/Log` shapes the walker can't parse; package-level `slog.Info(...)` (same selector match). +- **Acknowledged out of scope, by design** — value-side leaks (an allowlisted key like `err` carrying a string with a token in it). The test inspects keys, not values; value-side redaction is the future `slog` middleware named in the threat-model's *Future hardening* line. The Enforcement paragraph deliberately says "carries a key absent from that set" — keys only. +- **Acknowledged trust-layer, not structural** — allowlist tampering (a PR adds `"token"` to `allowedLogKeys`) and `t.Skip()`-style test disablement. Both visible in diff; code review owns them, same trust model as every other in-repo policy file. +- **Acknowledged scope boundary** — the test walks `internal/relay/` only. `cmd/pyrycode-relay/main.go` has handler construction and no `logger.{Info,Warn,Error,Debug}` calls today; a future log call there extends the test (or carbons it). Recorded in the ticket's *Out of Scope* block. + +## Negative-path check (manual, not committed) + +Per the test docstring: temporarily add e.g. `"forbidden_key", "x"` to a `logger.Info` call, re-run `go test ./internal/relay/ -run TestLogKeysAreAllowlisted`, confirm the failure names the right file:line and key without `-v`, revert. Documented in the test file so a future contributor can repeat it. No permanent negative-path fixture — committing a file that contains a forbidden log key would itself be the pattern the test is supposed to ban. + +## Lessons learned + +- **"Stochastic spec → deterministic test" pairs naturally with the threat-model layer.** The pattern this ticket establishes: when a threat-model bullet enumerates a closed set ("must not log X, may log Y"), the closed set wants a machine-checkable home in the source tree, and the doc bullet becomes a pointer at the canonical rule. The doc keeps stating intent; the source file states the exact names; the test gates drift. This is the same shape as `internal/protocol/closecodes/codes.go` (the closed set of WS close codes) — when you find yourself writing "the list of X is Y, Z, W" in a doc, ask whether the list wants to live in a `.go` file the source can be tested against. (Belt-and-suspenders fabric rule from `CLAUDE.md`: the threat-model rule is the stochastic agent layer; this test is the deterministic backstop. Different fabric, by design.) +- **Selector-name AST match is sufficient where the package has a stable receiver vocabulary.** No `go/types` resolution needed for a single-package gate of this shape. The walker is ~100 lines of stdlib; the cost of type resolution (build the package, run the type checker, handle build-tag combinatorics) would have been the dominant complexity for no real precision gain. Reach for `go/types` when the same check has to span receivers from multiple packages or interact with generics — not for in-package selector-name gates. +- **Narrow the matcher *before* the test runs against the live corpus, not after.** The first-arg-must-be-string-literal filter was added during implementation when `http.Error(w, msg, code)` showed up as a false positive. The general rule: a structural walker that matches by selector name will pick up unrelated methods sharing the name; lean on argument-shape filters (here, "first arg is a string literal — i.e., the slog msg convention") to scope it down. Cheaper than learning the false-positive list by running the test and triaging output. +- **Accumulate-and-report-all beats fail-at-first.** Three violation classes, all surfaced in one run, single `t.Fatalf` with count. The "developer who misnames a key in one file probably misnames it in three" guess turned out right in pre-commit testing; the negative-path check (one synthetic forbidden key) genuinely is a single-violation scenario, but a real refactor introducing a new key name hits every call site that uses it. The fix-loop shortens by a factor equal to the call-site count. + +## Cross-links + +- [Spec: 36-log-key-allowlist-test](../../specs/architecture/36-log-key-allowlist-test.md) — architect's design and security review (verdict: PASS). +- [Threat model § Log hygiene](../../threat-model.md) — the doc rule this test enforces; the new **Enforcement** paragraph points back to the allowlist file. +- [`docs/lessons.md`](../../lessons.md) — pre-2026-05-11 historical entry on layered defence (spec, code review, structural absence); this ticket adds the fourth layer.