adding error message to user message for traces#2410
adding error message to user message for traces#2410shagun-singh-inkeep wants to merge 3 commits intomainfrom
Conversation
|
The latest updates on your projects. Learn more about Vercel for GitHub.
2 Skipped Deployments
|
🦋 Changeset detectedLatest commit: 37e4bba The changes in this PR will be included in the next version bump. This PR includes changesets to release 9 packages
Not sure what this means? Click here to learn what changesets are. Click here if you're a maintainer who wants to add another changeset to this PR |
|
Surfaces OpenTelemetry error details in the trace timeline UI when message processing fails. Key changes:
|
There was a problem hiding this comment.
Two issues found: otelStatusCode is referenced in the UI but never populated for user message activities (dead code), and STATUS_MESSAGE vs OTEL_STATUS_DESCRIPTION naming is inconsistent with the existing agent_generation pattern.
| {a.hasError && a.otelStatusCode && ( | ||
| <Info label="Status code" value={a.otelStatusCode} /> | ||
| )} |
There was a problem hiding this comment.
a.otelStatusCode is never set for user message activities — the route handler (lines 1607–1608) only adds hasError and otelStatusDescription, and the SigNoz query payload for user messages doesn't fetch OTEL_STATUS_CODE. This block is dead code. Either remove it, or also extract and set otelStatusCode in the user message section of route.ts (and add SPAN_KEYS.OTEL_STATUS_CODE to the query's selectColumns).
| const spanEntryPoint = getString(span, SPAN_KEYS.INVOCATION_ENTRY_POINT, ''); | ||
| const triggerId = getString(span, SPAN_KEYS.TRIGGER_ID, ''); | ||
| const triggerInvocationId = getString(span, SPAN_KEYS.TRIGGER_INVOCATION_ID, ''); | ||
| const otelStatusDescription = getString(span, SPAN_KEYS.STATUS_MESSAGE, ''); |
There was a problem hiding this comment.
Naming inconsistency: this reads from SPAN_KEYS.STATUS_MESSAGE (status_message) but the variable is named otelStatusDescription. The agent_generation section (~line 1690) reads otelStatusDescription from SPAN_KEYS.OTEL_STATUS_DESCRIPTION (otel.status_description) instead, and uses STATUS_MESSAGE for statusMessage. Consider aligning with the existing pattern — either read from OTEL_STATUS_DESCRIPTION (or fall back to both like agent_generation does), or rename the variable to avoid confusion.
| @@ -0,0 +1,6 @@ | |||
| --- | |||
| "@inkeep/agents-manage-ui": patch | |||
| "@inkeep/agents-api": patch | |||
There was a problem hiding this comment.
The agents-api patch is only for the OTEL batch size default change (64 → 256), which is unrelated operational tuning. Consider splitting it into a separate changeset with its own message (e.g., "Increase OTEL batch export size default to 256") so the changelog is clear about what changed in each package.
There was a problem hiding this comment.
PR Review Summary
(3) Total Issues | Risk: Medium
🟠⚠️ Major (3) 🟠⚠️
Inline Comments:
- 🟠 Major:
route.ts:545MissingOTEL_STATUS_DESCRIPTIONin userMessages query — may cause error messages to not display - 🟠 Major:
route.ts:1577Extraction doesn't use fallback pattern like other activity types - 🟠 Major:
render-panel-content.tsx:286-288Dead code —otelStatusCodedisplay will never render for user messages
💭 Consider (2) 💭
💭 1) env.ts:118 + otel.ts:34 OTEL batch size increase (64→256)
Issue: The 4x increase in batch size means more spans are at risk if an export fails.
Why: Reasonable performance tuning, but worth documenting the rationale.
Fix: Consider adding a code comment explaining the change, or documenting in the changeset.
💭 2) route.ts + render-panel-content.tsx Test coverage gap
Issue: No unit tests exist for the trace timeline components or the signoz API route.
Why: This is pre-existing tech debt. The change itself is low-risk (simple string extraction and conditional rendering following existing patterns).
Fix: Consider creating a follow-up issue to add test coverage for the traces feature.
💡 APPROVE WITH SUGGESTIONS
Summary: The core idea of this PR is good — surfacing OTEL error descriptions in the trace timeline is valuable for debugging. However, there are three implementation issues that should be addressed:
-
Missing query field: The userMessages query needs to also fetch
OTEL_STATUS_DESCRIPTIONto match the pattern used by other activity types. Without this, errors may silently not display. -
Missing fallback in extraction: The extraction logic should use
STATUS_MESSAGE || OTEL_STATUS_DESCRIPTIONfallback pattern for consistency. -
Dead code: The
otelStatusCodedisplay block will never render for user messages since the data is never fetched or set. Either remove this dead code or complete the implementation.
The inline comments have 1-click suggestions for the first two issues. The third issue suggests removing the dead code block.
Discarded (4)
| Location | Issue | Reason Discarded |
|---|---|---|
env.ts / otel.ts |
Missing explicit maxQueueSize configuration |
Pre-existing issue not introduced by this PR |
render-panel-content.tsx |
Split-world pattern for error display conditions (hasError vs status === 'error') |
Pre-existing inconsistency, this PR follows the hasError pattern used by ai_generation/agent_generation |
.changeset/slimy-mirrors-sin.md |
Changeset message doesn't start with capital action verb | Too minor to flag |
| Route/component tests | Missing tests for trace components | Pre-existing tech debt; change is low-risk additive functionality |
Reviewers (5)
| Reviewer | Returned | Main Findings | Consider | While You're Here | Inline Comments | Pending Recs | Discarded |
|---|---|---|---|---|---|---|---|
pr-review-frontend |
1 | 0 | 0 | 0 | 1 | 0 | 0 |
pr-review-consistency |
4 | 0 | 0 | 0 | 2 | 0 | 2 |
pr-review-sre |
3 | 0 | 1 | 0 | 0 | 0 | 2 |
pr-review-tests |
2 | 0 | 1 | 0 | 0 | 0 | 1 |
pr-review-standards |
1 | 0 | 0 | 0 | 0 | 0 | 1 |
| Total | 11 | 0 | 2 | 0 | 3 | 0 | 6 |
| { | ||
| key: SPAN_KEYS.STATUS_MESSAGE, | ||
| ...QUERY_FIELD_CONFIGS.STRING_TAG, | ||
| }, |
There was a problem hiding this comment.
🟠 MAJOR: Missing OTEL_STATUS_DESCRIPTION in userMessages query
Issue: The userMessages query now includes STATUS_MESSAGE but does not include OTEL_STATUS_DESCRIPTION. Other queries in this file (toolCalls at lines 284-290, contextResolution at lines 335-341, agentGenerations at lines 443-449) fetch both keys to ensure error messages are captured regardless of which attribute the span uses.
Why: If a span only has otel.status_description set (the primary attribute set by setSpanWithError() in the tracer factory), the error message will not be displayed to users since it won't be fetched from SigNoz.
Fix:
| }, | |
| { | |
| key: SPAN_KEYS.STATUS_MESSAGE, | |
| ...QUERY_FIELD_CONFIGS.STRING_TAG, | |
| }, | |
| { | |
| key: SPAN_KEYS.OTEL_STATUS_DESCRIPTION, | |
| ...QUERY_FIELD_CONFIGS.STRING_TAG, | |
| }, |
Refs:
- toolCalls query pattern — fetches both STATUS_MESSAGE and OTEL_STATUS_DESCRIPTION
| const spanEntryPoint = getString(span, SPAN_KEYS.INVOCATION_ENTRY_POINT, ''); | ||
| const triggerId = getString(span, SPAN_KEYS.TRIGGER_ID, ''); | ||
| const triggerInvocationId = getString(span, SPAN_KEYS.TRIGGER_INVOCATION_ID, ''); | ||
| const otelStatusDescription = getString(span, SPAN_KEYS.STATUS_MESSAGE, ''); |
There was a problem hiding this comment.
🟠 MAJOR: Use fallback pattern for error message extraction
Issue: This line only extracts from STATUS_MESSAGE, but other activity types in this file use a fallback pattern that checks both STATUS_MESSAGE and OTEL_STATUS_DESCRIPTION. If the query is updated to fetch both keys (per the other comment), this extraction should also use the fallback pattern.
Why: The primary error message source from setSpanWithError() is otel.status_description. Without the fallback, error messages may not display even if the data is fetched.
Fix:
| const otelStatusDescription = getString(span, SPAN_KEYS.STATUS_MESSAGE, ''); | |
| const otelStatusDescription = getString(span, SPAN_KEYS.STATUS_MESSAGE, '') || getString(span, SPAN_KEYS.OTEL_STATUS_DESCRIPTION, ''); |
Refs:
- ai_assistant_message fallback pattern — uses
getString(span, SPAN_KEYS.STATUS_MESSAGE, '') || getString(span, SPAN_KEYS.OTEL_STATUS_DESCRIPTION, '')
| {a.hasError && a.otelStatusCode && ( | ||
| <Info label="Status code" value={a.otelStatusCode} /> | ||
| )} |
There was a problem hiding this comment.
🟠 MAJOR: Dead code — otelStatusCode will never be populated for user messages
Issue: This code block checks for a.otelStatusCode, but this field is never set for user message activities. Looking at the route handler (lines 1589-1614), user message activities only set hasError and otelStatusDescription — they do NOT set otelStatusCode. Additionally, the userMessages query does not include SPAN_KEYS.OTEL_STATUS_CODE in its select columns.
Why: This JSX will never render because a.otelStatusCode will always be undefined for user messages. This creates confusion for maintainers who might expect this UI element to work.
Fix: Either remove this dead code block, or complete the implementation by:
- Adding
SPAN_KEYS.OTEL_STATUS_CODEto the userMessages query selectColumns - Extracting and setting
otelStatusCodein the route handler's user message processing
| {a.hasError && a.otelStatusCode && ( | |
| <Info label="Status code" value={a.otelStatusCode} /> | |
| )} |
Refs:
- User message activity creation — does not set
otelStatusCode
There was a problem hiding this comment.
PR Review Summary
(2) Total Issues | Risk: Medium
Delta Review (1 commit since last review)
The commit claude addressed 1 of 3 issues from the prior review:
✅ Fixed: Added OTEL_STATUS_CODE to the userMessages query and extraction logic. The "dead code" issue for otelStatusCode display (render-panel-content.tsx:286-288) is now resolved — the UI block will now render when error spans include a status code.
🕐 Pending Recommendations (2)
Two prior issues remain unaddressed:
- 🟠
route.ts:545MissingOTEL_STATUS_DESCRIPTIONin userMessages query — error messages may silently not display if spans only haveotel.status_descriptionset - 🟠
route.ts:1581Missing fallback pattern — extraction should useSTATUS_MESSAGE || OTEL_STATUS_DESCRIPTIONlike aiAssistantMessages (lines 1628-1631)
Context: The aiAssistantMessages query fetches BOTH OTEL_STATUS_DESCRIPTION and STATUS_MESSAGE (lines 610-616), and its extraction uses the fallback pattern (lines 1628-1631). The userMessages query now fetches STATUS_MESSAGE and OTEL_STATUS_CODE, but still lacks OTEL_STATUS_DESCRIPTION.
💡 APPROVE WITH SUGGESTIONS
Summary: Good progress — the otelStatusCode dead code issue is now resolved. However, two prior issues about error message consistency remain: (1) add OTEL_STATUS_DESCRIPTION to the query, and (2) use the fallback pattern for extraction. These ensure error messages display regardless of which OTEL attribute the span uses. The inline comments from the prior review have 1-click suggestions ready to apply.
Reviewers (1)
| Reviewer | Returned | Main Findings | Consider | While You're Here | Inline Comments | Pending Recs | Discarded |
|---|---|---|---|---|---|---|---|
orchestrator (delta) |
3 | 0 | 0 | 0 | 0 | 2 | 1 |
| Total | 3 | 0 | 0 | 0 | 0 | 2 | 1 |
Note: Delta re-review — prior inline comments remain active and accurate.
|
This pull request has been automatically marked as stale because it has not had recent activity. If this PR is still relevant:
Thank you for your contributions! |
|
This pull request has been automatically closed due to inactivity. If you'd like to continue working on this, please:
Thank you for your understanding! |

No description provided.