chore(dynamic-sampling): add status for snuba timeouts #115359
Conversation
shellmayr
commented
May 12, 2026
- Snuba timeouts are not generic failures; they should therefore not be marked as such in our observability
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Reviewed by Cursor Bugbot for commit ffa5096. Configure here.
    except SnubaRPCError as exc:
        sentry_sdk.capture_exception(exc)
        emit_status(status_metric, TelemetryStatus.SNUBA_TIMEOUT)
        raise
All SnubaRPC errors incorrectly classified as timeouts
Medium Severity
except SnubaRPCError catches all subclasses including SnubaRPCRateLimitExceeded and SnubaRPCTooManySimultaneous, but labels every one of them as SNUBA_TIMEOUT. SnubaRPCError is also raised for generic non-200 HTTP responses and protobuf decode failures. This defeats the stated purpose of distinguishing timeouts from generic failures in observability — non-timeout errors are now misclassified as timeouts rather than as failures.
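The misclassification can be sketched as follows. The exception classes here are illustrative stand-ins named after the types mentioned in the review, not the real sentry/snuba definitions:

```python
class SnubaRPCError(Exception):
    """Stand-in base class; per the review, also raised for non-200
    HTTP responses and protobuf decode failures."""

class SnubaRPCTimeout(SnubaRPCError):
    """Stand-in for the exception raised only on genuine timeouts."""

class SnubaRPCRateLimitExceeded(SnubaRPCError):
    """Stand-in for the rate-limit exception."""

def status_for(exc: Exception) -> str:
    # Mirrors the patched handler: ANY SnubaRPCError subclass is
    # labeled a timeout, regardless of its actual cause.
    if isinstance(exc, SnubaRPCError):
        return "snuba_timeout"
    return "failed"

# A rate-limit error is mislabeled as a timeout:
assert status_for(SnubaRPCRateLimitExceeded()) == "snuba_timeout"
```

Because `except` matches by subclass, the broad clause swallows every Snuba error before a more specific status can be assigned.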
    except SnubaRPCError as exc:
        sentry_sdk.capture_exception(exc)
        emit_status(status_metric, TelemetryStatus.SNUBA_TIMEOUT)
Bug: Catching the generic SnubaRPCError misclassifies various errors as timeouts and leads to inconsistent metric tagging (snuba_timeout vs. failed) for the same event.
Severity: MEDIUM
Suggested Fix
To accurately capture only timeouts, catch the more specific SnubaRPCTimeout exception instead of the general SnubaRPCError. If the goal is to handle other Snuba errors separately, add specific except blocks for them. To fix the inconsistent tagging, ensure the duration metric's status is set consistently with the status metric within the exception block.
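A minimal sketch of that fix, with stand-in exception classes and a hypothetical `emit_status`/metric name in place of the real Sentry helpers:

```python
class SnubaRPCError(Exception):
    """Stand-in base class for Snuba RPC failures."""

class SnubaRPCTimeout(SnubaRPCError):
    """Stand-in for the specific timeout exception."""

emitted = []  # records (metric, status) pairs in place of real metrics

def emit_status(metric, status):
    emitted.append((metric, status))

STATUS_METRIC = "dynamic_sampling.telemetry.status"  # hypothetical name

def guarded_query(run_query):
    try:
        return run_query()
    except SnubaRPCTimeout:
        # Only genuine timeouts get the timeout status.
        emit_status(STATUS_METRIC, "snuba_timeout")
        raise
    except SnubaRPCError:
        # Every other Snuba error is reported as a plain failure,
        # matching the status the duration metric reports on re-raise.
        emit_status(STATUS_METRIC, "failed")
        raise

def raiser(exc):
    def f():
        raise exc
    return f

for exc in (SnubaRPCTimeout(), SnubaRPCError()):
    try:
        guarded_query(raiser(exc))
    except SnubaRPCError:
        pass

assert [s for _, s in emitted] == ["snuba_timeout", "failed"]
```

Ordering matters: the `SnubaRPCTimeout` clause must come before the `SnubaRPCError` clause, since `except` matches the first clause whose class covers the raised exception.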
Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI agent. Verify if this is a real issue. If it is, propose a fix; if not, explain why it's not valid.
Location: src/sentry/dynamic_sampling/per_org/tasks/telemetry.py#L117-L119
Potential issue: The `except SnubaRPCError` block is too broad and will incorrectly
handle various Snuba errors as timeouts. `SnubaRPCError` is a base class for multiple
error types, including rate limits, concurrent query limits, and other HTTP errors, not
just timeouts. Consequently, any of these non-timeout errors will be caught and
mislabeled with the `SNUBA_TIMEOUT` status. Furthermore, because the exception is
re-raised, an outer exception handler tags the associated duration metric with
`status=failed`, creating an inconsistency where the status metric reports
`snuba_timeout` but the duration metric reports `failed` for the exact same event.
This might be valid. Maybe we could call this error status SNUBA_ERROR?
Yeah, it's a good point, but I really want these to be semantically meaningful errors. If that's not yet possible, I think we should add those statuses to the Sentry Snuba layer first.
Adding a change here; will revisit once it has shipped.

