fix(limit): atomic redis commits and resolved-var validation#13467
Open
shreemaan-abhishek wants to merge 5 commits into
Open
fix(limit): atomic redis commits and resolved-var validation#13467shreemaan-abhishek wants to merge 5 commits into
shreemaan-abhishek wants to merge 5 commits into
Conversation
The redis policy committed the leaky-bucket state with separate
GET/GET/SET/SET commands. Concurrent requests could read the same stale
excess value, each conclude they are within limits, and all be admitted,
exceeding the configured rate.
The commit path now runs a single redis Lua script that reads, decays,
checks and persists the state atomically. The excess/last keys share a
{...} hash tag so they always map to the same Redis Cluster slot, which
EVAL with multiple keys requires.
A conn/burst value resolved from a request variable was only passed through tonumber(). Zero, negative, fractional, or out-of-safe-range values were accepted and fed into floor((conn - 1) / max), producing wrong delay calculations. Such values are now rejected with errors that report the offending input. Also log a warning when the redis-backed limit-conn rejects a connection, since connections outliving key_ttl expire from the tracking ZSET and can lead to undercounting.
lj-releng forbids reading globals inside functions; access floor via a module-level local instead.
…luster lua-resty-rediscluster rejects EVAL with more than one key, so the hash-tag two-key approach failed every commit with "Cannot execute eval with more than one keys for redis cluster". Store the leaky bucket state (excess, last) in a single Redis hash instead.
membphis
previously approved these changes
Jun 4, 2026
AlinsRan
reviewed
Jun 4, 2026
nic-6443
approved these changes
Jun 4, 2026
AlinsRan
approved these changes
Jun 4, 2026
membphis
reviewed
Jun 5, 2026
| return nil, "resolved value is not a number: " .. tostring(value) | ||
| return nil, "resolved value is not a number: " .. tostring(resolved) | ||
| end | ||
| if value <= 0 then |
Member
There was a problem hiding this comment.
Could we split validation between conn and burst here? burst = 0 is valid in the schema/docs (integer, minimum = 0), but this helper is also used for conf.burst and rule.burst, so a config like burst = "${http_burst ?? 0}" will pass schema and then fail requests with 500 once it resolves to 0 (or skip the matching rule in rules). conn should stay positive, but dynamic burst should allow 0 while still rejecting negative, fractional, and unsafe integer values.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Description
This PR fixes two issues in the redis-backed rate limiting plugins.
limit-req: non-atomic redis commit lets concurrent requests exceed the configured rate
The redis policy commit path in
limit-req/util.luaranGET excess/GET last/ limit check /SET excess/SET lastas separate commands. Under concurrent load, multiple requests can read the same staleexcess, each conclude they are within limits, and all be admitted before anySETlands, so clients can exceed the configured rate.The commit path now runs a single redis Lua script (
EVAL) that performs the read, leaky-bucket decay, limit check, and write atomically. Theexcess/lastkeys now share a{...}hash tag so they always map to the same Redis Cluster slot, whichEVALwith multiple keys requires (limit-req/util.luais shared by the redis-cluster policy). The dry-run (non-commit) path keeps two plainGETs since it writes nothing back. First-request admission (no prior state,excess = 0) is preserved.Note on upgrades: the redis key format changes from
limit_req:<key>excessto{limit_req:<key>}excess. Old keys are simply ignored and expire via their existing TTL (a few seconds), so the rate-limit state resets once when upgrading.limit-conn: variable-resolved conn/burst values are not validated
connandburstcan be resolved from request variables (e.g. a header). The resolved value was only passed throughtonumber(): values like0,-1,1.5, or values above 2^53-1 were accepted and fed intofloor((conn - 1) / max), producing wrong delay calculations. Resolved values are now rejected unless they are positive integers within the safe integer range, and error messages report the offending input.Also adds a warning log when the redis-backed limit-conn rejects a connection, noting that connections outliving
key_ttlexpire from the tracking ZSET and can lead to undercounting of long-lived connections (e.g. WebSocket, gRPC streaming).Which issue(s) this PR fixes:
N/A
Checklist