Skip to content

fix(limit): atomic redis commits and resolved-var validation#13467

Open
shreemaan-abhishek wants to merge 5 commits into
apache:masterfrom
shreemaan-abhishek:fix/limit-req-conn-redis
Open

fix(limit): atomic redis commits and resolved-var validation#13467
shreemaan-abhishek wants to merge 5 commits into
apache:masterfrom
shreemaan-abhishek:fix/limit-req-conn-redis

Conversation

@shreemaan-abhishek
Copy link
Copy Markdown
Contributor

Description

This PR fixes two issues in the redis-backed rate limiting plugins.

limit-req: non-atomic redis commit lets concurrent requests exceed the configured rate

The redis policy commit path in limit-req/util.lua ran GET excess / GET last / limit check / SET excess / SET last as separate commands. Under concurrent load, multiple requests can read the same stale excess, each conclude they are within limits, and all be admitted before any SET lands, so clients can exceed the configured rate.

The commit path now runs a single redis Lua script (EVAL) that performs the read, leaky-bucket decay, limit check, and write atomically. The excess/last keys now share a {...} hash tag so they always map to the same Redis Cluster slot, which EVAL with multiple keys requires (limit-req/util.lua is shared by the redis-cluster policy). The dry-run (non-commit) path keeps two plain GETs since it writes nothing back. First-request admission (no prior state, excess = 0) is preserved.

Note on upgrades: the redis key format changes from limit_req:<key>excess to {limit_req:<key>}excess. Old keys are simply ignored and expire via their existing TTL (a few seconds), so the rate-limit state resets once when upgrading.

limit-conn: variable-resolved conn/burst values are not validated

conn and burst can be resolved from request variables (e.g. a header). The resolved value was only passed through tonumber(): values like 0, -1, 1.5, or values above 2^53-1 were accepted and fed into floor((conn - 1) / max), producing wrong delay calculations. Resolved values are now rejected unless they are positive integers within the safe integer range, and error messages report the offending input.

Also adds a warning log when the redis-backed limit-conn rejects a connection, noting that connections outliving key_ttl expire from the tracking ZSET and can lead to undercounting of long-lived connections (e.g. WebSocket, gRPC streaming).

Which issue(s) this PR fixes:

N/A

Checklist

  • I have explained the need for this PR and the problem it solves
  • I have explained the changes or the new features added to this PR
  • I have added tests corresponding to this change
  • I have updated the documentation to reflect this change
  • I have verified that this change is backward compatible (If not, please discuss on the APISIX mailing list first)

The redis policy committed the leaky-bucket state with separate
GET/GET/SET/SET commands. Concurrent requests could read the same stale
excess value, each conclude they are within limits, and all be admitted,
exceeding the configured rate.

The commit path now runs a single redis Lua script that reads, decays,
checks and persists the state atomically. The excess/last keys share a
{...} hash tag so they always map to the same Redis Cluster slot, which
EVAL with multiple keys requires.
A conn/burst value resolved from a request variable was only passed
through tonumber(). Zero, negative, fractional, or out-of-safe-range
values were accepted and fed into floor((conn - 1) / max), producing
wrong delay calculations. Such values are now rejected with errors that
report the offending input.

Also log a warning when the redis-backed limit-conn rejects a
connection, since connections outliving key_ttl expire from the
tracking ZSET and can lead to undercounting.
@dosubot dosubot Bot added size:XL This PR changes 500-999 lines, ignoring generated files. bug Something isn't working labels Jun 3, 2026
lj-releng forbids reading globals inside functions; access floor via a
module-level local instead.
…luster

lua-resty-rediscluster rejects EVAL with more than one key, so the
hash-tag two-key approach failed every commit with "Cannot execute
eval with more than one keys for redis cluster". Store the leaky
bucket state (excess, last) in a single Redis hash instead.
membphis
membphis previously approved these changes Jun 4, 2026
Comment thread apisix/plugins/limit-conn/util.lua Outdated
Signed-off-by: Abhishek Choudhary <shreemaan.abhishek@gmail.com>
return nil, "resolved value is not a number: " .. tostring(value)
return nil, "resolved value is not a number: " .. tostring(resolved)
end
if value <= 0 then
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we split validation between conn and burst here? burst = 0 is valid in the schema/docs (integer, minimum = 0), but this helper is also used for conf.burst and rule.burst, so a config like burst = "${http_burst ?? 0}" will pass schema and then fail requests with 500 once it resolves to 0 (or skip the matching rule in rules). conn should stay positive, but dynamic burst should allow 0 while still rejecting negative, fractional, and unsafe integer values.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working size:XL This PR changes 500-999 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants