Skip to content

feat(relay): per-IP token-bucket rate-limit type with eviction (#50)#52

Merged
ilmoniemi merged 4 commits into
mainfrom
feature/50
May 11, 2026
Merged

feat(relay): per-IP token-bucket rate-limit type with eviction (#50)#52
ilmoniemi merged 4 commits into
mainfrom
feature/50

Conversation

@ilmoniemi
Copy link
Copy Markdown
Contributor

What

Introduces IPRateLimiter (internal/relay/ratelimit.go) — the pre-upgrade attempt-throttling primitive named in docs/threat-model.md § DoS resistance. Per-IP token bucket guarded by one sync.Mutex; integer-token refill (no float drift); single owned eviction goroutine started in NewIPRateLimiter, stopped via idempotent Close. No new dependencies (stdlib only).

This is the primitive only. IP-extraction and upgrade-handler wiring are sibling tickets — cmd/pyrycode-relay/main.go is intentionally untouched.

Issue

Closes #50. Split from #46; sibling of the IP-extraction helper ticket and of #47 (wiring).

Design highlight — two-field eviction invariant

Each bucket carries two timestamps:

  • refillLast — last credited refill, advances in integer-token chunks (preserves fractional elapsed time).
  • pollLast — last Allow call, advances to now on every call including denies.

Eviction predicate is now - pollLast >= burst * refillEvery. The invariant "eviction never drops a bucket below capacity" follows from pollLast being load-bearing on every call: an actively-hammered (denied) bucket stays resident, so an attacker cannot recycle a fresh burst by waiting out an eviction cycle. Single-timestamp designs structurally cannot satisfy both this and the fractional-refill-preservation property.

Testing

10 tests in internal/relay/ratelimit_test.go:

  • BurstThenDeny, RefillAfterAdvance, FractionalRefillPreserved, PerIPIsolation — refill math + isolation.
  • EvictionDropsIdleBucket — sweep drops a bucket past threshold.
  • EvictionPreservesActiveBucket — regression guard for the two-field invariant.
  • EvictionGoroutineRunsAndDrops — only test using real-clock sleep; end-to-end goroutine path.
  • CloseIsIdempotent, CloseStopsEvictionGoroutine — close lifecycle.
  • RaceFreedom — 32 goroutines × 200 ops against a 1ms-sweep goroutine; run with go test -race -count=20 -run TestIPRateLimiter_RaceFreedom ./internal/relay.

Fake clock is local to the test file; constructor helper registers t.Cleanup(l.Close). Most tests construct with evictionInterval=time.Hour (goroutine effectively never fires) and drive sweep() directly under a fake clock — fully deterministic, no sleep.

All green under go test -race ./... and go vet ./...; go build ./cmd/pyrycode-relay succeeds.

Architecture compliance

Follows docs/specs/architecture/50-ip-rate-limiter.md directly: file layout, public API, refill math, eviction rule, goroutine lifecycle, and test list match the spec. The spec's security review (verdict: PASS, no findings) shaped the two-field design + the documented-not-enforced parameter contracts.

Mirrors patterns from internal/relay/registry.go (passive in-memory store under one lock; race-test shape lifted from TestRegistry_RaceFreedom). No update to docs/threat-model.md or docs/PROJECT-MEMORY.md § What's built — both are #47's job once the primitive is actually wired into traffic.

Test plan

  • go test -race ./... green
  • go vet ./... clean
  • go build ./cmd/pyrycode-relay succeeds
  • go test -race -count=20 -run TestIPRateLimiter_RaceFreedom ./internal/relay green (20 stress iterations)

🤖 Generated with Claude Code

ilmoniemi and others added 3 commits May 11, 2026 10:25
Introduces IPRateLimiter — the pre-upgrade attempt-throttling primitive
named in docs/threat-model.md § DoS resistance. Ships the type only;
IP-extraction and upgrade-handler wiring are sibling tickets.

Per-IP token bucket guarded by one sync.Mutex; integer-token refill
math (no float drift); single owned eviction goroutine started in
NewIPRateLimiter, stopped via idempotent Close.

Two timestamps per bucket decouple two concerns: refillLast anchors
integer-token refill math; pollLast anchors the last Allow call and
updates on every call including denies. This keeps eviction from
dropping a bucket an attacker is actively hammering (which would
grant a free fresh burst on the next request) while still reclaiming
genuinely idle buckets under address-space scanning. Regression-
guarded by TestIPRateLimiter_EvictionPreservesActiveBucket.

No new dependencies; stdlib only. No changes to main — the wiring
ticket is what registers defer limiter.Close() at the composition root.

Tests cover burst-then-deny, refill-after-advance, fractional-refill
preservation, per-IP isolation, eviction of idle + preservation of
active buckets, goroutine end-to-end, Close idempotency, Close stops
the goroutine, and race-freedom (32 goroutines × 200 ops vs. a
1ms-sweep goroutine).

Stress: go test -race -count=20 -run TestIPRateLimiter_RaceFreedom ./internal/relay

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@ilmoniemi
Copy link
Copy Markdown
Contributor Author

Code Review: #50

Decision: PASS

Findings

  • [NIT] internal/relay/ratelimit.go:67,80now := l.now() is captured before l.mu.Lock(), so under heavy concurrent contention the order of time.Now() calls can differ from the order of critical-section entry. The unconditional b.pollLast = now (line 80) can therefore go backward between two close-in-time Allows. Benign — the worst case is slightly more eager eviction of an actively-polled bucket, which still leaves the bucket at capacity per the two-field invariant. If ever tightened, if now.After(b.pollLast) { b.pollLast = now } would suffice. Not worth changing for v1.

  • [NIT] internal/relay/ratelimit_test.go:114-138 (TestIPRateLimiter_EvictionPreservesActiveBucket) — the test name + the spec's "actively-throttled attacker hammering a depleted bucket" framing don't quite match what the test actually exercises: after Advance(burst*refillEvery + 1ms), the very next Allow("a") refills back to burst and returns true (not "still denied"). The test's own comment acknowledges this ("We don't assert on the Allow return value here…"). It does correctly validate the load-bearing property — that pollLast advances on every Allow — so the regression guard works. Consider tightening the test (or its comment) to make the actual property under test more legible: e.g. advance by less than burst*refillEvery so the bucket stays denied, or advance by more and assert pollLast (via lock-and-read) rather than relying on the present check. Doesn't block merge.

Security review (label: security-sensitive)

Spec's ## Security review section is present with verdict PASS and 12 enumerated adversarial-walk items (no findings). Verified the architect's findings against the implementation:

  • No logging in the limiter — confirmed (no log/slog import, no IP echoed in any code path).
  • IP string used as map key only — confirmed (no parsing, no net.IP allocation).
  • Integer-token math, no float drift — confirmed at ratelimit.go:71-77.
  • pollLast advances on every Allow (free-burst-recycling defence) — confirmed at line 80; regression-guarded by TestIPRateLimiter_EvictionPreservesActiveBucket.
  • Clock-backwards safe-by-default — confirmed (elapsed > 0 gate at line 69; sweep's >= predicate returns false on negative).
  • Close path: sync.Once + wg.Wait() arrangement matches the spec (line 86-89); Close() outside the once.Do so a second concurrent caller also blocks. TestIPRateLimiter_CloseIsIdempotent and TestIPRateLimiter_CloseStopsEvictionGoroutine cover both halves.
  • TestIPRateLimiter_CloseStopsEvictionGoroutine is a particularly nice piece of negative-evidence engineering: insert a bucket with pollLast = time.Unix(0,0) post-Close, sleep > several intervals, assert it's still there — proves the goroutine actually exited rather than just done being closed.

No new findings.

Other checks

  • go vet ./internal/relay/ — clean
  • go test -race -run TestIPRateLimiter ./internal/relay/ — pass
  • go test -race -count=5 -run TestIPRateLimiter_RaceFreedom ./internal/relay/ — pass
  • No new dependencies (go.mod untouched)
  • cmd/pyrycode-relay/main.go untouched per AC
  • docs/knowledge/codebase/50.md created per the per-ticket convention; INDEX.md correctly not modified (the codebase/ convention is "directory listing IS the index"; the spec's instruction to update INDEX.md contradicts that and the developer correctly followed the canonical convention)

Summary

Spec-conforming, well-tested primitive. The two-field eviction invariant is the only non-obvious design choice and is justified inline + regression-tested. Implementation matches the spec verbatim including parameter naming, comments, and test structure. Two NITs only, neither blocks merge.

PROJECT-MEMORY § Patterns established: type-owned background goroutine
(ticker + done + WaitGroup with wg.Wait outside closeOnce); two-timestamp
decoupling when one field would conflate semantic purposes; injectable
now func() time.Time as unexported field.

lessons.md: integer token-bucket refill anchor advances by token-sized
chunks (not to now) to preserve fractional time; wg.Wait outside
closeOnce.Do so concurrent Close callers also block; range+delete on a
map is spec-safe (no snapshot needed).

codebase/50.md was authored by the implementer; INDEX.md unchanged (no
new feature/decision/architecture doc — limiter is unwired until #47).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@ilmoniemi ilmoniemi merged commit 3b0b4bd into main May 11, 2026
2 checks passed
@ilmoniemi ilmoniemi deleted the feature/50 branch May 11, 2026 08:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

relay: per-IP token-bucket rate-limit type with eviction

1 participant