chore(dynamic-sampling): add status for snuba timeouts #115359
base: master
@@ -14,6 +14,7 @@
     metrics_sample_rate,
 )
 from sentry.utils import metrics
+from sentry.utils.snuba_rpc import SnubaRPCError
 
 F = TypeVar("F", bound=Callable[..., object])
 
@@ -38,6 +39,7 @@ class TelemetryStatus(StrEnum):
     ORG_NOT_FOUND = "org_not_found"
     ROLLOUT_DISABLED = "rollout_disabled"
     ROLLOUT_EXCLUDED = "rollout_excluded"
+    SNUBA_TIMEOUT = "snuba_timeout"
 
 
 class DynamicSamplingException(Exception):
@@ -112,6 +114,10 @@ def wrapper(*args: object, **kwargs: object) -> object:
                 result = func(*args, **kwargs)
             except DynamicSamplingException as exc:
                 result = exc.status
+            except SnubaRPCError as exc:
+                sentry_sdk.capture_exception(exc)
+                emit_status(status_metric, TelemetryStatus.SNUBA_TIMEOUT)
+                raise
Contributor
All SnubaRPC errors incorrectly classified as timeouts
Medium Severity
Reviewed by Cursor Bugbot for commit ffa5096.
             except Exception as exc:
                 sentry_sdk.capture_exception(exc)
                 emit_status(status_metric, TelemetryStatus.FAILED)


Bug: Catching the generic `SnubaRPCError` misclassifies various errors as timeouts and leads to inconsistent metric tagging (`snuba_timeout` vs. `failed`) for the same event.
Severity: MEDIUM
Suggested Fix
To accurately capture only timeouts, catch the more specific `SnubaRPCTimeout` exception instead of the general `SnubaRPCError`. If the goal is to handle other Snuba errors separately, add specific `except` blocks for them. To fix the inconsistent tagging, ensure the duration metric's status is set consistently with the status metric within the exception block.
This might be valid. Maybe we can call this error status `SNUBA_ERROR`.
Yeah it's a good point but I really want these to be semantically meaningful errors - if that is not yet possible, I think we should add those statuses to the sentry snuba layer first
Adding a change here and will revisit once it has shipped.
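A minimal sketch of the distinction discussed in this thread, not the code in this PR. `SnubaRPCError` and `SnubaRPCTimeout` are local stand-ins (the review names `SnubaRPCTimeout`, but whether it exists in `sentry.utils.snuba_rpc` is not confirmed here), `SNUBA_ERROR` is the status name floated above, and `track_status` and the list-backed `emit_status` are hypothetical helpers. The `sentry_sdk.capture_exception` call is omitted, and re-raising in the generic branch is an assumption.

```python
import functools
from enum import Enum
from typing import Callable


class SnubaRPCError(Exception):
    """Local stand-in for the Snuba RPC base error (not the real class)."""


class SnubaRPCTimeout(SnubaRPCError):
    """Hypothetical timeout subclass, as named in the review comment."""


class TelemetryStatus(str, Enum):
    FAILED = "failed"
    SNUBA_TIMEOUT = "snuba_timeout"
    SNUBA_ERROR = "snuba_error"  # hypothetical status proposed in the thread


# Stand-in sink: records (metric, status) pairs instead of sending metrics.
emitted: list = []


def emit_status(metric: str, status: TelemetryStatus) -> None:
    emitted.append((metric, status.value))


def track_status(status_metric: str) -> Callable:
    """Hypothetical decorator that tags failures with a specific status."""

    def decorator(func: Callable) -> Callable:
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            try:
                return func(*args, **kwargs)
            except SnubaRPCTimeout:
                # Only genuine timeouts get the timeout status.
                emit_status(status_metric, TelemetryStatus.SNUBA_TIMEOUT)
                raise
            except SnubaRPCError:
                # Every other Snuba RPC failure gets a generic Snuba status.
                emit_status(status_metric, TelemetryStatus.SNUBA_ERROR)
                raise
            except Exception:
                # Anything unrelated to Snuba keeps the existing status.
                emit_status(status_metric, TelemetryStatus.FAILED)
                raise

        return wrapper

    return decorator
```

Python tries `except` clauses in order, so the subclass clause has to come before the base-class one; with the order reversed, every timeout would be reported through the generic Snuba branch.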