fix: add jitter to retry backoff delays #4

@0xneobyte

Problem

Retries use pure exponential backoff with fixed delays: 1s, 2s, 4s. When multiple clients hit an error at the same time (e.g. a brief backend outage), they all retry at exactly the same intervals. This creates a thundering herd: every client hits the server at once on each retry wave, which slows recovery.

Proposed Behaviour

Add random jitter to retry delays to spread retries across a time window.

Current:  delay = min(1000 * 2^attempt, 10000)
Proposed: delay = random(0, min(1000 * 2^attempt, 10000))

Delays are in milliseconds, and attempt starts at 0.

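Full jitter, where each delay is drawn uniformly from the whole backoff window, is a widely used strategy for preventing thundering herd. It also halves the average delay for each attempt, which keeps average latency reasonable.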
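The two formulas above can be sketched in Python as follows. This is a minimal illustration, not the code in client.py: the names BASE_DELAY_MS, MAX_DELAY_MS, backoff_ceiling_ms and full_jitter_delay_ms are hypothetical, and the optional rng parameter is an assumption added so that callers can pass a seeded generator.

```python
import random
from typing import Optional

# Constants taken from the formula in this issue; the names are hypothetical.
BASE_DELAY_MS = 1000
MAX_DELAY_MS = 10000  # the 10s cap


def backoff_ceiling_ms(attempt: int) -> int:
    """Current behaviour: the fixed exponential delay, capped at 10s."""
    return min(BASE_DELAY_MS * 2 ** attempt, MAX_DELAY_MS)


def full_jitter_delay_ms(attempt: int, rng: Optional[random.Random] = None) -> float:
    """Proposed behaviour: a uniform random delay in [0, ceiling]."""
    uniform = rng.uniform if rng is not None else random.uniform
    return uniform(0, backoff_ceiling_ms(attempt))
```

With these definitions the upper bound for attempts 0, 1 and 2 stays at 1000, 2000 and 4000 ms, so the worst case matches today's 1s, 2s, 4s schedule while the actual delay can fall anywhere below it.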

Files to Modify

File | Change
src/brainus_ai/client.py | Add jitter to the sleep duration in _make_request
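As a rough picture of where the jitter would sit in a retry loop, here is a self-contained sketch. It does not reproduce _make_request: call_with_retries, send, max_retries and the injected sleep and uniform parameters are all hypothetical, and it assumes a synchronous loop that sleeps via time.sleep (which takes seconds, so the millisecond delay is divided by 1000) and retries on ConnectionError.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retries(
    send: Callable[[], T],
    max_retries: int = 3,
    sleep: Callable[[float], None] = time.sleep,
    uniform: Callable[[float, float], float] = random.uniform,
) -> T:
    """Call send(), retrying on ConnectionError with full-jitter backoff."""
    for attempt in range(max_retries + 1):
        try:
            return send()
        except ConnectionError:
            if attempt == max_retries:
                raise  # retries exhausted: surface the last error
            ceiling_ms = min(1000 * 2 ** attempt, 10000)
            # Only this line changes: sleep for a random share of the window
            # instead of the whole window.
            sleep(uniform(0, ceiling_ms) / 1000)
    raise AssertionError("unreachable")
```

Because only the sleep duration changes, the number of attempts and the conditions that trigger a retry stay exactly as they were.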

Acceptance Criteria

  • Retry delays are randomised within the exponential backoff window
  • Maximum possible delay is unchanged (10s cap)
  • Minimum possible delay is 0
  • Retry count behaviour is unchanged
  • Tests updated to account for non-deterministic delay
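Two common ways to test a non-deterministic delay are to assert bounds over many samples and to pin the random source with a mock. The sketch below shows both; jittered_delay_ms is a hypothetical stand-in for whatever computes the delay in client.py, and the existing test suite may be organised differently.

```python
import random
from unittest import mock


def jittered_delay_ms(attempt: int) -> float:
    """Hypothetical stand-in for the delay computation under test."""
    return random.uniform(0, min(1000 * 2 ** attempt, 10000))


def test_delay_stays_within_window():
    # Bounds check: every sample lies between 0 and the capped window.
    for attempt in range(6):
        ceiling = min(1000 * 2 ** attempt, 10000)
        for _ in range(1000):
            assert 0 <= jittered_delay_ms(attempt) <= ceiling


def test_delay_uses_capped_window():
    # Pin the randomness so the result and the requested range are exact.
    with mock.patch("random.uniform", return_value=123.0) as fake_uniform:
        assert jittered_delay_ms(5) == 123.0
    fake_uniform.assert_called_once_with(0, 10000)
```

The bounds check covers the minimum and maximum criteria, while the mocked test confirms that the 10s cap is applied before the random draw.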
