Skip to content

chore(dynamic-sampling): rename dynamic sampling status enum#115360

Open
shellmayr wants to merge 2 commits into
masterfrom
shellmayr/chore/rename-ds-status-enum
Open

chore(dynamic-sampling): rename dynamic sampling status enum#115360
shellmayr wants to merge 2 commits into
masterfrom
shellmayr/chore/rename-ds-status-enum

Conversation

@shellmayr
Copy link
Copy Markdown
Member

This name makes more sense - it's the status of the dynamic sampling job, so let's not make this more complicated than necessary

@shellmayr shellmayr requested a review from a team as a code owner May 12, 2026 07:16
@shellmayr shellmayr changed the title Shellmayr/chore/rename ds status enum chore(dynamic-sampling): rename dynamic sampling status enum May 12, 2026
@github-actions github-actions Bot added the Scope: Backend Automatically applied to PRs that change backend components label May 12, 2026
Copy link
Copy Markdown
Contributor

@cursor cursor Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Fix All in Cursor

❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.

Reviewed by Cursor Bugbot for commit caabae7. Configure here.

except SnubaRPCError as exc:
sentry_sdk.capture_exception(exc)
emit_status(status_metric, DynamicSamplingStatus.SNUBA_TIMEOUT)
raise
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Misleading metric label for non-timeout Snuba errors

Medium Severity

The new except SnubaRPCError handler catches all subclasses of SnubaRPCError, including SnubaRPCRateLimitExceeded and SnubaRPCTooManySimultaneous, but labels all of them with DynamicSamplingStatus.SNUBA_TIMEOUT ("snuba_timeout"). Rate limit errors and concurrent query limit errors are not timeouts. This produces misleading metrics that could cause incorrect conclusions during incident investigation.

Additional Locations (1)
Fix in Cursor Fix in Web

Reviewed by Cursor Bugbot for commit caabae7. Configure here.

Comment on lines +117 to +120
except SnubaRPCError as exc:
sentry_sdk.capture_exception(exc)
emit_status(status_metric, DynamicSamplingStatus.SNUBA_TIMEOUT)
raise
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug: The except SnubaRPCError block is too broad and incorrectly classifies various Snuba errors, like rate limits, as SNUBA_TIMEOUT.
Severity: MEDIUM

Suggested Fix

Refine the exception handling to be more specific. Catch urllib3.exceptions.ReadTimeoutError or a more specific timeout-related exception first and handle it by emitting DynamicSamplingStatus.SNUBA_TIMEOUT. Then, either re-raise other SnubaRPCError types or handle them separately with different metrics to ensure accurate error reporting.

Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent. Verify if this is a real issue. If it is, propose a fix; if not, explain why it's
not valid.

Location: src/sentry/dynamic_sampling/per_org/tasks/telemetry.py#L117-L120

Potential issue: The `except SnubaRPCError` handler is intended to catch Snuba timeouts
but is overly broad. It will catch all `SnubaRPCError` subclasses, including
`SnubaRPCRateLimitExceeded` and `SnubaRPCTooManySimultaneous`. As a result, various
distinct Snuba issues such as rate limits, concurrent query limits, and other
server-side errors will be incorrectly reported with the
`DynamicSamplingStatus.SNUBA_TIMEOUT` status. This misclassification will lead to
inaccurate metrics, making it difficult to distinguish actual timeouts from other
critical Snuba failures in production monitoring.

Did we get this right? 👍 / 👎 to inform future reviews.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Scope: Backend Automatically applied to PRs that change backend components

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants