feat(redis_admin): live SCAN-based discovery of celery broker queues#914
Open
thomasrockhu-codecov wants to merge 1 commit into
Open
feat(redis_admin): live SCAN-based discovery of celery broker queues#914thomasrockhu-codecov wants to merge 1 commit into
thomasrockhu-codecov wants to merge 1 commit into
Conversation
The CeleryBrokerQueueAdmin summary changelist (and the
CeleryQueueFilter sidebar that backs the drill-down picker)
previously sourced its queue list exclusively from
_resolve_celery_queue_names() — a static enumeration of
BaseCeleryConfig.task_routes plus the celery / healthcheck defaults
plus the operator-configured setup.redis_admin.celery_queues list.
That works on worker pods, but on the API pod (where the admin runs)
the worker-side setup.tasks.*.queue env vars don't ship, so
BaseCeleryConfig.task_routes resolves every route to
task_default_queue ("celery") and the admin only sees the queues an
operator has explicitly declared. Per-tenant enterprise_*_dropbox /
_mixpanel / _squareup variants — generated dynamically by
shared.celery_router based on owner config — never make it onto
that static list, so they were silently invisible in admin even
when sitting on the broker with thousands of messages.
This change makes the celery_broker summary the union of:
* _discover_celery_queues_from_broker() — a live SCAN ... TYPE list
against the broker Redis. The broker is a dedicated Memorystore
in production (services.celery_broker), so every list-typed
top-level key on it is a celery queue. Server-side TYPE filter
(Redis 6.0+) avoids a TYPE round-trip per key. Kombu-internal
prefixes (_kombu.*) and the redis_admin LSET-tombstone prefix
are deny-listed so the summary stays scoped to real queues.
Bounded by MAX_SCAN_KEYS; on broker outage / older Redis builds
that reject the TYPE arg the helper logs and returns (), and
the summary falls back transparently to the static enumeration.
* _resolve_celery_queue_names() — preserved for the existing
contract: a known-but-currently-drained queue still renders a
depth=0 row instead of vanishing the moment kombu deletes its
empty list.
The summary changelist (CeleryBrokerQueueQuerySet._materialise_summary)
and the sidebar filter (CeleryQueueFilter.lookups) both render on
the same admin response, so the resolved tuple is cached on the
HttpRequest via _SUMMARY_QUEUE_NAMES_REQUEST_ATTR — one SCAN per
page render rather than two.
Tests cover:
* SCAN TYPE=list filtering (skips kombu HASH / ZSET / unrelated
string keys);
* prefix deny-list (kombu / tombstone internals);
* deterministic sorted ordering;
* graceful fallback when SCAN ... TYPE raises (older Redis builds);
* MAX_SCAN_KEYS truncation cap;
* summary-mode union of live discovery + static defaults;
* per-request cache shared between summary materialisation and
CeleryQueueFilter.lookups.
Operators no longer need to maintain
SETUP__REDIS_ADMIN__CELERY_QUEUES on the API pod for the admin to
see the broker's queue topology — though the knob is still
honoured, now scoped to "render this drained-but-known queue at
depth=0 even when the broker has never materialised it".
Contributor
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #914 +/- ##
=======================================
Coverage 91.89% 91.89%
=======================================
Files 1316 1316
Lines 50630 50666 +36
Branches 1625 1625
=======================================
+ Hits 46525 46561 +36
Misses 3799 3799
Partials 306 306
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
Codecov Report✅ All modified and coverable lines are covered by tests. 📢 Thoughts on this report? Let us know! |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
The
CeleryBrokerQueueAdminsummary changelist (and theCeleryQueueFiltersidebar that backs its drill-down picker)previously sourced its queue list exclusively from
_resolve_celery_queue_names()— a static enumeration ofBaseCeleryConfig.task_routesplus thecelery/healthcheckdefaults plus operator-configured
setup.redis_admin.celery_queues.That works on worker pods, but on the API pod (where the admin runs)
the worker-side
setup.tasks.*.queueenv vars don't ship, soBaseCeleryConfig.task_routesresolves every route totask_default_queue("celery") and the admin only sees the queuesan operator has explicitly declared. Per-tenant
enterprise_*_dropbox/_mixpanel/_squareupvariants —generated dynamically by
shared.celery_routerbased on ownerconfig — never made it onto that static list, so they were silently
invisible in admin even when sitting on the broker with thousands of
messages.
This change makes the celery_broker summary the union of:
_discover_celery_queues_from_broker()— a liveSCAN ... TYPE listagainst the broker Redis. The broker is a dedicated Memorystore in
production (
services.celery_broker), so every list-typedtop-level key on it is a celery queue. The server-side
TYPEfilter (Redis 6.0+) avoids a
TYPEround-trip per key. Kombuinternals (
_kombu.*) and theredis_adminLSET-tombstone prefixare deny-listed so the summary stays scoped to real queues.
Bounded by
MAX_SCAN_KEYS; on broker outage / older Redis buildsthat reject the
TYPEarg, the helper logs and returns()andthe summary falls back transparently to the static enumeration.
_resolve_celery_queue_names()— preserved for the existingcontract: a known-but-currently-drained queue still renders a
depth=0row instead of vanishing the moment kombu deletes itsempty list.
The summary changelist (
CeleryBrokerQueueQuerySet._materialise_summary)and the sidebar filter (
CeleryQueueFilter.lookups) both render onthe same admin response, so the resolved tuple is cached on the
HttpRequestvia_SUMMARY_QUEUE_NAMES_REQUEST_ATTR— one SCAN perpage render rather than two.
Operators no longer need to maintain
SETUP__REDIS_ADMIN__CELERY_QUEUESon the API pod for the admin tosee the broker's queue topology, though the knob is still honoured
(now scoped to "render this drained-but-known queue at
depth=0even when the broker has never materialised it").
Test plan
New unit + integration tests (all passing locally):
test_discover_celery_queues_from_broker_returns_only_list_keys—TYPE=listfilter skips kombu'sunackedHASH /unacked_indexZSET / unrelated string keystest_discover_celery_queues_from_broker_skips_kombu_internal_prefixes—_kombu.*+__redis_admin_celery_tombstone__:*deny-listtest_discover_celery_queues_from_broker_returns_sorted_tuple— deterministic ordering for stable changelist rendertest_discover_celery_queues_from_broker_handles_scan_errors_gracefully— falls back to()on broker outage / older Redis (noTYPEarg)test_discover_celery_queues_from_broker_respects_max_scan_keys—MAX_SCAN_KEYStruncationtest_queryset_summary_mode_unions_live_broker_scan_with_static_names— summary contains both live-discovered (uploads,notify,enterprise_*) and static (celery,healthcheck) queuestest_queryset_summary_mode_skips_kombu_internal_keys— internal keys filtered end-to-end through the changelisttest_queryset_summary_mode_caches_discovery_on_request— second materialisation against the samerequestreuses the first SCANtest_queryset_summary_mode_no_request_skips_cache— standalone callers without arequesteat the per-call SCANtest_celery_queue_filter_lookups_includes_live_broker_queues— sidebar dropdown surfaces both live + static queuestest_celery_queue_filter_lookups_reuses_request_cache— sidebar shares the per-request SCAN result with the changelistManual verification on staging recommended:
/admin/redis_admin/celerybrokerqueue/on a deploy with nosetup.redis_admin.celery_queuesset; confirm queues likeuploads,notify,sync, and anyenterprise_*variants show up with their liveLLENdepthsCeleryQueueFilterdropdown lists the same queue setMade with Cursor
Note
Medium Risk
Adds live
SCAN ... TYPE listcalls against the Celery broker Redis on admin page renders; although bounded/cached with fallbacks, it changes operational visibility and could affect performance or behavior on older/broken Redis setups.Overview
Celery broker queue discovery is now live. The
CeleryBrokerQueueAdminsummary andCeleryQueueFilterdropdown no longer rely solely on_resolve_celery_queue_names(); they union it with a new broker-side discovery helper thatSCANs forTYPE=listkeys (with deny-listed internal prefixes) so dynamically materialised queues (includingenterprise_*variants) appear automatically.Performance/safety guardrails were added. Discovery is capped by
REDIS_ADMIN_MAX_SCAN_KEYS, falls back to static enumeration on errors/unsupported Redis, and the resolved queue-name tuple is cached on theHttpRequestso the summary page and sidebar share a single broker scan per render. Tests and README were updated to cover and document the new behavior.Reviewed by Cursor Bugbot for commit 773adde. Bugbot is set up for automated code reviews on this repo. Configure here.