Conversation
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 3 potential issues.
Autofix Details
Bugbot Autofix prepared fixes for all 3 issues found in the latest run.
- ✅ Fixed: Cache hit counter incremented before JSON parse succeeds
- Moved the hit counter to after json.loads succeeds, so decode errors are no longer double-counted as both hits and decode_errors.
- ✅ Fixed: Background job counter double-counts processed total
- Removed the redundant result=processed counter since processed = succeeded + failed, so summing across all result attribute values now yields the correct total.
- ✅ Fixed: Error path uses different attribute key than success path
- Changed the error path to use http.route (derived from request.scope route with fallback) instead of http.target, matching the success path attribute key.
Or push these changes by commenting:
@cursor push 04c7e26e03
Preview (04c7e26e03)
diff --git a/src/primer/server/app.py b/src/primer/server/app.py
--- a/src/primer/server/app.py
+++ b/src/primer/server/app.py
@@ -102,11 +102,6 @@
if result["processed"] > 0:
record_counter(
"primer.background_jobs.processed",
- result["processed"],
- {"result": "processed"},
- )
- record_counter(
- "primer.background_jobs.processed",
result["succeeded"],
{"result": "succeeded"},
)
diff --git a/src/primer/server/services/analytics_cache_service.py b/src/primer/server/services/analytics_cache_service.py
--- a/src/primer/server/services/analytics_cache_service.py
+++ b/src/primer/server/services/analytics_cache_service.py
@@ -98,12 +98,7 @@
)
return None
try:
- record_counter(
- "primer.analytics_cache.requests",
- 1,
- {"namespace": namespace, "result": "hit"},
- )
- return json.loads(payload)
+ parsed = json.loads(payload)
except json.JSONDecodeError:
record_counter(
"primer.analytics_cache.requests",
@@ -111,6 +106,12 @@
{"namespace": namespace, "result": "decode_error"},
)
return None
+ record_counter(
+ "primer.analytics_cache.requests",
+ 1,
+ {"namespace": namespace, "result": "hit"},
+ )
+ return parsed
def set_cached_json(
diff --git a/src/primer/server/services/observability_service.py b/src/primer/server/services/observability_service.py
--- a/src/primer/server/services/observability_service.py
+++ b/src/primer/server/services/observability_service.py
@@ -122,12 +122,14 @@
except Exception as exc:
if span is not None:
span.set_attribute("error.type", exc.__class__.__name__)
+ route = request.scope.get("route")
+ route_path = getattr(route, "path", request.url.path)
record_counter(
"primer.http.requests",
1,
{
"http.method": request.method,
- "http.target": request.url.path,
+ "http.route": route_path,
"http.status_code": "500",
},
)You can send follow-ups to the cloud agent here.
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Autofix Details
Bugbot Autofix prepared a fix for the issue found in the latest run.
- ✅ Fixed: Route path always resolves to raw URL, not template
- Moved request.scope.get('route') after call_next() returns so the route template is available from the scope (populated during routing inside call_next), preventing unbounded cardinality on http.route metric labels.
Or push these changes by commenting:
@cursor push 2ab3c5eb7f
Preview (2ab3c5eb7f)
diff --git a/src/primer/server/services/observability_service.py b/src/primer/server/services/observability_service.py
--- a/src/primer/server/services/observability_service.py
+++ b/src/primer/server/services/observability_service.py
@@ -109,20 +109,20 @@
@app.middleware("http")
async def otel_request_middleware(request, call_next):
- route = request.scope.get("route")
- route_path = getattr(route, "path", request.url.path)
with start_span(
"http.request",
{
"http.method": request.method,
- "http.route": route_path,
},
) as span:
started = perf_counter()
try:
response = await call_next(request)
except Exception as exc:
+ route = request.scope.get("route")
+ route_path = getattr(route, "path", request.url.path)
if span is not None:
+ span.set_attribute("http.route", route_path)
span.set_attribute("error.type", exc.__class__.__name__)
record_counter(
"primer.http.requests",
@@ -134,6 +134,8 @@
},
)
raise
+ route = request.scope.get("route")
+ route_path = getattr(route, "path", request.url.path)
duration_ms = (perf_counter() - started) * 1000
attributes = {
"http.method": request.method,You can send follow-ups to the cloud agent here.
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Autofix Details
Bugbot Autofix prepared a fix for the issue found in the latest run.
- ✅ Fixed: Wall clock used for duration instead of monotonic timer
- Replaced _utcnow_naive() wall clock with time.perf_counter() monotonic timer for job execution duration measurement, consistent with the rest of the codebase.
Or push these changes by commenting:
@cursor push ffc711f342
Preview (ffc711f342)
diff --git a/src/primer/server/services/background_job_service.py b/src/primer/server/services/background_job_service.py
--- a/src/primer/server/services/background_job_service.py
+++ b/src/primer/server/services/background_job_service.py
@@ -1,5 +1,6 @@
from __future__ import annotations
+import time
from datetime import UTC, datetime, timedelta
from typing import TYPE_CHECKING, Any
@@ -172,7 +173,7 @@
processed += 1
job_id, job_type, payload, attempts, max_attempts = claimed
- job_started = _utcnow_naive()
+ job_started = time.perf_counter()
with start_span(
"background_job.execute",
{
@@ -209,7 +210,7 @@
finally:
record_histogram(
"primer.background_jobs.execution.duration_ms",
- (_utcnow_naive() - job_started).total_seconds() * 1000,
+ (time.perf_counter() - job_started) * 1000,
{"job_type": job_type},
)You can send follow-ups to the cloud agent here.
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 2 potential issues.
Bugbot Autofix prepared fixes for both issues found in the latest run.
- ✅ Fixed: HTTP error path missing duration histogram recording
- Added duration_ms calculation and record_histogram call to the exception handler so error requests now contribute to latency metrics alongside the counter.
- ✅ Fixed: Noisy "disabled" counters fire on every cache call
- Removed the per-request 'disabled' counter emissions from both get_cached_json and set_cached_json when Redis client is unavailable, since this is a static configuration state not an actionable per-request signal.
Or push these changes by commenting:
@cursor push da972ef0a7
Preview (da972ef0a7)
diff --git a/src/primer/server/services/analytics_cache_service.py b/src/primer/server/services/analytics_cache_service.py
--- a/src/primer/server/services/analytics_cache_service.py
+++ b/src/primer/server/services/analytics_cache_service.py
@@ -74,11 +74,6 @@
def get_cached_json(namespace: str, params: dict[str, Any]) -> Any | None:
client = _get_redis_client()
if client is None:
- record_counter(
- "primer.analytics_cache.requests",
- 1,
- {"namespace": namespace, "result": "disabled"},
- )
return None
try:
payload = client.get(_build_cache_key(namespace, params))
@@ -123,11 +118,6 @@
) -> None:
client = _get_redis_client()
if client is None:
- record_counter(
- "primer.analytics_cache.writes",
- 1,
- {"namespace": namespace, "result": "disabled"},
- )
return
try:
client.setex(
diff --git a/src/primer/server/services/observability_service.py b/src/primer/server/services/observability_service.py
--- a/src/primer/server/services/observability_service.py
+++ b/src/primer/server/services/observability_service.py
@@ -123,19 +123,19 @@
try:
response = await call_next(request)
except Exception as exc:
+ duration_ms = (perf_counter() - started) * 1000
route_path = _route_path(request)
+ attributes = {
+ "http.method": request.method,
+ "http.route": route_path,
+ "http.status_code": "500",
+ }
if span is not None:
span.set_attribute("error.type", exc.__class__.__name__)
- span.set_attribute("http.route", route_path)
- record_counter(
- "primer.http.requests",
- 1,
- {
- "http.method": request.method,
- "http.route": route_path,
- "http.status_code": "500",
- },
- )
+ for key, value in attributes.items():
+ span.set_attribute(key, value)
+ record_counter("primer.http.requests", 1, attributes)
+ record_histogram("primer.http.request.duration_ms", duration_ms, attributes)
raise
duration_ms = (perf_counter() - started) * 1000
route_path = _route_path(request)You can send follow-ups to the cloud agent here.
Reviewed by Cursor Bugbot for commit 487f855. Configure here.


Summary
Testing
Note
Medium Risk
Touches request handling middleware and background job execution paths to emit OpenTelemetry metrics/spans; misconfiguration or exporter issues could add overhead or noisy telemetry but is gated behind
otel_enabled.Overview
Adds an optional OpenTelemetry observability layer (new
observability_service) with request-level spans, duration histograms/counters, and log correlation via injectedotel_trace_id/otel_span_id, supporting OTLP HTTP export or console export.Instruments key backend hotspots: analytics cache now records hit/miss/error/decode/write counters, background job cycles and per-job executions are wrapped in spans and emit result/duration metrics, and heavy analytics aggregations (
compare,engineer_profile,project_workspace) are wrapped in top-level spans. Also updates config (otel_*settings), adds OpenTelemetry dependencies, and marks the roadmap item as shipped.Reviewed by Cursor Bugbot for commit 4a87c0e. Bugbot is set up for automated code reviews on this repo. Configure here.