Skip to content

feat(redis_admin): live SCAN-based discovery of celery broker queues#914

Open
thomasrockhu-codecov wants to merge 1 commit into
mainfrom
redis-admin/live-broker-queue-discovery
Open

feat(redis_admin): live SCAN-based discovery of celery broker queues#914
thomasrockhu-codecov wants to merge 1 commit into
mainfrom
redis-admin/live-broker-queue-discovery

Conversation

@thomasrockhu-codecov
Copy link
Copy Markdown
Contributor

@thomasrockhu-codecov thomasrockhu-codecov commented May 5, 2026

Summary

The CeleryBrokerQueueAdmin summary changelist (and the
CeleryQueueFilter sidebar that backs its drill-down picker)
previously sourced its queue list exclusively from
_resolve_celery_queue_names() — a static enumeration of
BaseCeleryConfig.task_routes plus the celery / healthcheck
defaults plus operator-configured setup.redis_admin.celery_queues.

That works on worker pods, but on the API pod (where the admin runs)
the worker-side setup.tasks.*.queue env vars don't ship, so
BaseCeleryConfig.task_routes resolves every route to
task_default_queue ("celery") and the admin only sees the queues
an operator has explicitly declared. Per-tenant
enterprise_*_dropbox / _mixpanel / _squareup variants —
generated dynamically by shared.celery_router based on owner
config — never made it onto that static list, so they were silently
invisible in admin even when sitting on the broker with thousands of
messages.

This change makes the celery_broker summary the union of:

  • _discover_celery_queues_from_broker() — a live SCAN ... TYPE list
    against the broker Redis. The broker is a dedicated Memorystore in
    production (services.celery_broker), so every list-typed
    top-level key on it is a celery queue. The server-side TYPE
    filter (Redis 6.0+) avoids a TYPE round-trip per key. Kombu
    internals (_kombu.*) and the redis_admin LSET-tombstone prefix
    are deny-listed so the summary stays scoped to real queues.
    Bounded by MAX_SCAN_KEYS; on broker outage / older Redis builds
    that reject the TYPE arg, the helper logs and returns () and
    the summary falls back transparently to the static enumeration.
  • _resolve_celery_queue_names() — preserved for the existing
    contract: a known-but-currently-drained queue still renders a
    depth=0 row instead of vanishing the moment kombu deletes its
    empty list.

The summary changelist (CeleryBrokerQueueQuerySet._materialise_summary)
and the sidebar filter (CeleryQueueFilter.lookups) both render on
the same admin response, so the resolved tuple is cached on the
HttpRequest via _SUMMARY_QUEUE_NAMES_REQUEST_ATTR — one SCAN per
page render rather than two.

Operators no longer need to maintain
SETUP__REDIS_ADMIN__CELERY_QUEUES on the API pod for the admin to
see the broker's queue topology, though the knob is still honoured
(now scoped to "render this drained-but-known queue at depth=0
even when the broker has never materialised it").

Test plan

New unit + integration tests (all passing locally):

  • test_discover_celery_queues_from_broker_returns_only_list_keysTYPE=list filter skips kombu's unacked HASH / unacked_index ZSET / unrelated string keys
  • test_discover_celery_queues_from_broker_skips_kombu_internal_prefixes_kombu.* + __redis_admin_celery_tombstone__:* deny-list
  • test_discover_celery_queues_from_broker_returns_sorted_tuple — deterministic ordering for stable changelist render
  • test_discover_celery_queues_from_broker_handles_scan_errors_gracefully — falls back to () on broker outage / older Redis (no TYPE arg)
  • test_discover_celery_queues_from_broker_respects_max_scan_keysMAX_SCAN_KEYS truncation
  • test_queryset_summary_mode_unions_live_broker_scan_with_static_names — summary contains both live-discovered (uploads, notify, enterprise_*) and static (celery, healthcheck) queues
  • test_queryset_summary_mode_skips_kombu_internal_keys — internal keys filtered end-to-end through the changelist
  • test_queryset_summary_mode_caches_discovery_on_request — second materialisation against the same request reuses the first SCAN
  • test_queryset_summary_mode_no_request_skips_cache — standalone callers without a request eat the per-call SCAN
  • test_celery_queue_filter_lookups_includes_live_broker_queues — sidebar dropdown surfaces both live + static queues
  • test_celery_queue_filter_lookups_reuses_request_cache — sidebar shares the per-request SCAN result with the changelist
$ pytest -q redis_admin/tests/test_families.py redis_admin/tests/test_celery_broker_queue.py \
    -k "discover or queryset_summary or celery_queue_filter"
14 passed, 154 deselected in 0.38s

Manual verification on staging recommended:

  • Hit /admin/redis_admin/celerybrokerqueue/ on a deploy with no setup.redis_admin.celery_queues set; confirm queues like uploads, notify, sync, and any enterprise_* variants show up with their live LLEN depths
  • Hit the drill-down for one of those queues; confirm the sidebar CeleryQueueFilter dropdown lists the same queue set

Made with Cursor


Note

Medium Risk
Adds live SCAN ... TYPE list calls against the Celery broker Redis on admin page renders; although bounded/cached with fallbacks, it changes operational visibility and could affect performance or behavior on older/broken Redis setups.

Overview
Celery broker queue discovery is now live. The CeleryBrokerQueueAdmin summary and CeleryQueueFilter dropdown no longer rely solely on _resolve_celery_queue_names(); they union it with a new broker-side discovery helper that SCANs for TYPE=list keys (with deny-listed internal prefixes) so dynamically materialised queues (including enterprise_* variants) appear automatically.

Performance/safety guardrails were added. Discovery is capped by REDIS_ADMIN_MAX_SCAN_KEYS, falls back to static enumeration on errors/unsupported Redis, and the resolved queue-name tuple is cached on the HttpRequest so the summary page and sidebar share a single broker scan per render. Tests and README were updated to cover and document the new behavior.

Reviewed by Cursor Bugbot for commit 773adde. Bugbot is set up for automated code reviews on this repo. Configure here.

The CeleryBrokerQueueAdmin summary changelist (and the
CeleryQueueFilter sidebar that backs the drill-down picker)
previously sourced its queue list exclusively from
_resolve_celery_queue_names() — a static enumeration of
BaseCeleryConfig.task_routes plus the celery / healthcheck defaults
plus the operator-configured setup.redis_admin.celery_queues list.

That works on worker pods, but on the API pod (where the admin runs)
the worker-side setup.tasks.*.queue env vars don't ship, so
BaseCeleryConfig.task_routes resolves every route to
task_default_queue ("celery") and the admin only sees the queues an
operator has explicitly declared. Per-tenant enterprise_*_dropbox /
_mixpanel / _squareup variants — generated dynamically by
shared.celery_router based on owner config — never make it onto
that static list, so they were silently invisible in admin even
when sitting on the broker with thousands of messages.

This change makes the celery_broker summary the union of:

* _discover_celery_queues_from_broker() — a live SCAN ... TYPE list
  against the broker Redis. The broker is a dedicated Memorystore
  in production (services.celery_broker), so every list-typed
  top-level key on it is a celery queue. Server-side TYPE filter
  (Redis 6.0+) avoids a TYPE round-trip per key. Kombu-internal
  prefixes (_kombu.*) and the redis_admin LSET-tombstone prefix
  are deny-listed so the summary stays scoped to real queues.
  Bounded by MAX_SCAN_KEYS; on broker outage / older Redis builds
  that reject the TYPE arg the helper logs and returns (), and
  the summary falls back transparently to the static enumeration.
* _resolve_celery_queue_names() — preserved for the existing
  contract: a known-but-currently-drained queue still renders a
  depth=0 row instead of vanishing the moment kombu deletes its
  empty list.

The summary changelist (CeleryBrokerQueueQuerySet._materialise_summary)
and the sidebar filter (CeleryQueueFilter.lookups) both render on
the same admin response, so the resolved tuple is cached on the
HttpRequest via _SUMMARY_QUEUE_NAMES_REQUEST_ATTR — one SCAN per
page render rather than two.

Tests cover:
* SCAN TYPE=list filtering (skips kombu HASH / ZSET / unrelated
  string keys);
* prefix deny-list (kombu / tombstone internals);
* deterministic sorted ordering;
* graceful fallback when SCAN ... TYPE raises (older Redis builds);
* MAX_SCAN_KEYS truncation cap;
* summary-mode union of live discovery + static defaults;
* per-request cache shared between summary materialisation and
  CeleryQueueFilter.lookups.

Operators no longer need to maintain
SETUP__REDIS_ADMIN__CELERY_QUEUES on the API pod for the admin to
see the broker's queue topology — though the knob is still
honoured, now scoped to "render this drained-but-known queue at
depth=0 even when the broker has never materialised it".
@sentry
Copy link
Copy Markdown
Contributor

sentry Bot commented May 5, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 91.89%. Comparing base (c88232d) to head (773adde).
✅ All tests successful. No failed tests found.

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #914   +/-   ##
=======================================
  Coverage   91.89%   91.89%           
=======================================
  Files        1316     1316           
  Lines       50630    50666   +36     
  Branches     1625     1625           
=======================================
+ Hits        46525    46561   +36     
  Misses       3799     3799           
  Partials      306      306           
Flag Coverage Δ
apiunit 94.94% <100.00%> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@codecov-notifications
Copy link
Copy Markdown

codecov-notifications Bot commented May 5, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ All tests successful. No failed tests found.

📢 Thoughts on this report? Let us know!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant