feat: add rate-limited retry and observability for per-project GC partition setup #516

Open
zachsmith1 wants to merge 1 commit into main from feat/gcworkqueueandconfig

Conversation

@zachsmith1
Contributor

Per-project GC partitions could silently fail to initialize during bulk startup when client-side throttling caused discovery or syncMonitors calls to fail. Because there was no retry mechanism, affected projects were permanently left with incomplete monitors.

  • Refactor AddProject to register-after-verify, preventing zombie builders
  • Add rate-limited work queue to projectprovider with exponential backoff
    and configurable retry limits for project onboarding
  • Add resyncNeeded flag to force GC resync when new partitions are added
  • Add Prometheus metrics for partition health, project add outcomes,
    retry counts, queue depth, and add duration
  • Expose project-provider tuning via CLI flags (workers, max-retries,
    rate-limit, rate-burst)
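
For reviewers who want a concrete picture of the first two bullets, here is a minimal, stdlib-only sketch of register-after-verify combined with capped exponential backoff and a retry limit. It is not the code in this PR: `registry`, `addProject`, `backoffFor`, and `errThrottled` are hypothetical names, the `setup` callback is a stand-in for discovery plus syncMonitors, and the real change may well use a work queue library rather than an inline loop. Token-bucket rate limiting (the rate-limit and rate-burst flags) is left out.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errThrottled is a hypothetical transient error standing in for
// client-side throttling during discovery or syncMonitors.
var errThrottled = errors.New("client-side throttling")

// errGaveUp marks a project whose retries were exhausted.
var errGaveUp = errors.New("project add: retries exhausted")

// backoffFor returns the delay before retry number attempt (0-based):
// base * 2^attempt, capped at max.
func backoffFor(attempt int, base, max time.Duration) time.Duration {
	d := base
	for i := 0; i < attempt; i++ {
		d *= 2
		if d >= max {
			return max // stop early so d cannot overflow
		}
	}
	if d > max {
		return max
	}
	return d
}

// registry holds only projects whose setup has been verified.
type registry struct {
	projects map[string]struct{}
}

func newRegistry() *registry {
	return &registry{projects: make(map[string]struct{})}
}

func (r *registry) has(name string) bool {
	_, ok := r.projects[name]
	return ok
}

// addProject runs setup and registers the project only after setup
// succeeds, so a failed attempt never leaves a half-initialized entry.
// It returns the number of retries that were needed.
func (r *registry) addProject(
	name string,
	setup func(string) error,
	maxRetries int,
	base, max time.Duration,
) (int, error) {
	for attempt := 0; ; attempt++ {
		err := setup(name)
		if err == nil {
			r.projects[name] = struct{}{} // register after verify
			return attempt, nil
		}
		if attempt >= maxRetries {
			return attempt, fmt.Errorf("%w: %v", errGaveUp, err)
		}
		time.Sleep(backoffFor(attempt, base, max))
	}
}
```

With `maxRetries` set to 5, a `setup` callback that fails twice with `errThrottled` and then succeeds would register the project and report 2 retries. A callback that always fails returns an error wrapping `errGaveUp` and leaves the registry untouched, which is the property the "zombie builders" bullet is after.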

@joggrbot
Contributor

joggrbot bot commented Mar 6, 2026

📝 Documentation Analysis

All docs are up to date! 🎉


✅ Latest commit analyzed: b556ad5 | Powered by Joggr
