Skip to content

adding error message to user message for traces#2410

Closed
shagun-singh-inkeep wants to merge 3 commits intomainfrom
TI-21
Closed

adding error message to user message for traces#2410
shagun-singh-inkeep wants to merge 3 commits intomainfrom
TI-21

Conversation

@shagun-singh-inkeep
Copy link
Copy Markdown
Collaborator

No description provided.

@shagun-singh-inkeep shagun-singh-inkeep marked this pull request as ready for review February 26, 2026 17:29
@vercel
Copy link
Copy Markdown

vercel Bot commented Feb 26, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
agents-manage-ui Ready Ready Preview, Comment Feb 26, 2026 5:41pm
2 Skipped Deployments
Project Deployment Actions Updated (UTC)
agents-api Ignored Ignored Preview Feb 26, 2026 5:41pm
agents-docs Skipped Skipped Feb 26, 2026 5:41pm

Request Review

@changeset-bot
Copy link
Copy Markdown

changeset-bot Bot commented Feb 26, 2026

🦋 Changeset detected

Latest commit: 37e4bba

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 9 packages
Name Type
@inkeep/agents-manage-ui Patch
@inkeep/agents-api Patch
@inkeep/agents-cli Patch
@inkeep/agents-core Patch
@inkeep/agents-mcp Patch
@inkeep/agents-sdk Patch
@inkeep/agents-work-apps Patch
@inkeep/ai-sdk-provider Patch
@inkeep/create-agents Patch

Not sure what this means? Click here to learn what changesets are.

Click here if you're a maintainer who wants to add another changeset to this PR

@pullfrog
Copy link
Copy Markdown
Contributor

pullfrog Bot commented Feb 26, 2026

Surfaces OpenTelemetry error details in the trace timeline UI when message processing fails.

Key changes:

  • Extracts otel.status_description from trace spans in agents-manage-ui/src/app/api/signoz/conversations/[conversationId]/route.ts
  • Displays error message and status code in red error bubble for failed user messages in agents-manage-ui/src/components/traces/timeline/render-panel-content.tsx
  • Increases OTEL batch export size from 64 to 256 spans in agents-api/src/env.ts and agents-manage-ui/src/otel.ts

Pullfrog  | View workflow run | Using Claude Code | Triggered by Pullfrogpullfrog.com𝕏

Copy link
Copy Markdown
Contributor

@pullfrog pullfrog Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two issues found: otelStatusCode is referenced in the UI but never populated for user message activities (dead code), and STATUS_MESSAGE vs OTEL_STATUS_DESCRIPTION naming is inconsistent with the existing agent_generation pattern.

Pullfrog  | Fix all ➔Fix 👍s ➔View workflow runpullfrog.com𝕏

Comment on lines +286 to +288
{a.hasError && a.otelStatusCode && (
<Info label="Status code" value={a.otelStatusCode} />
)}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a.otelStatusCode is never set for user message activities — the route handler (lines 1607–1608) only adds hasError and otelStatusDescription, and the SigNoz query payload for user messages doesn't fetch OTEL_STATUS_CODE. This block is dead code. Either remove it, or also extract and set otelStatusCode in the user message section of route.ts (and add SPAN_KEYS.OTEL_STATUS_CODE to the query's selectColumns).

const spanEntryPoint = getString(span, SPAN_KEYS.INVOCATION_ENTRY_POINT, '');
const triggerId = getString(span, SPAN_KEYS.TRIGGER_ID, '');
const triggerInvocationId = getString(span, SPAN_KEYS.TRIGGER_INVOCATION_ID, '');
const otelStatusDescription = getString(span, SPAN_KEYS.STATUS_MESSAGE, '');
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Naming inconsistency: this reads from SPAN_KEYS.STATUS_MESSAGE (status_message) but the variable is named otelStatusDescription. The agent_generation section (~line 1690) reads otelStatusDescription from SPAN_KEYS.OTEL_STATUS_DESCRIPTION (otel.status_description) instead, and uses STATUS_MESSAGE for statusMessage. Consider aligning with the existing pattern — either read from OTEL_STATUS_DESCRIPTION (or fall back to both like agent_generation does), or rename the variable to avoid confusion.

@@ -0,0 +1,6 @@
---
"@inkeep/agents-manage-ui": patch
"@inkeep/agents-api": patch
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The agents-api patch is only for the OTEL batch size default change (64 → 256), which is unrelated operational tuning. Consider splitting it into a separate changeset with its own message (e.g., "Increase OTEL batch export size default to 256") so the changelog is clear about what changed in each package.

Copy link
Copy Markdown
Contributor

@claude claude Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR Review Summary

(3) Total Issues | Risk: Medium

🟠⚠️ Major (3) 🟠⚠️

Inline Comments:

  • 🟠 Major: route.ts:545 Missing OTEL_STATUS_DESCRIPTION in userMessages query — may cause error messages to not display
  • 🟠 Major: route.ts:1577 Extraction doesn't use fallback pattern like other activity types
  • 🟠 Major: render-panel-content.tsx:286-288 Dead code — otelStatusCode display will never render for user messages

💭 Consider (2) 💭

💭 1) env.ts:118 + otel.ts:34 OTEL batch size increase (64→256)

Issue: The 4x increase in batch size means more spans are at risk if an export fails.
Why: Reasonable performance tuning, but worth documenting the rationale.
Fix: Consider adding a code comment explaining the change, or documenting in the changeset.

💭 2) route.ts + render-panel-content.tsx Test coverage gap

Issue: No unit tests exist for the trace timeline components or the signoz API route.
Why: This is pre-existing tech debt. The change itself is low-risk (simple string extraction and conditional rendering following existing patterns).
Fix: Consider creating a follow-up issue to add test coverage for the traces feature.


💡 APPROVE WITH SUGGESTIONS

Summary: The core idea of this PR is good — surfacing OTEL error descriptions in the trace timeline is valuable for debugging. However, there are three implementation issues that should be addressed:

  1. Missing query field: The userMessages query needs to also fetch OTEL_STATUS_DESCRIPTION to match the pattern used by other activity types. Without this, errors may silently not display.

  2. Missing fallback in extraction: The extraction logic should use STATUS_MESSAGE || OTEL_STATUS_DESCRIPTION fallback pattern for consistency.

  3. Dead code: The otelStatusCode display block will never render for user messages since the data is never fetched or set. Either remove this dead code or complete the implementation.

The inline comments have 1-click suggestions for the first two issues. The third issue suggests removing the dead code block.

Discarded (4)
Location Issue Reason Discarded
env.ts / otel.ts Missing explicit maxQueueSize configuration Pre-existing issue not introduced by this PR
render-panel-content.tsx Split-world pattern for error display conditions (hasError vs status === 'error') Pre-existing inconsistency, this PR follows the hasError pattern used by ai_generation/agent_generation
.changeset/slimy-mirrors-sin.md Changeset message doesn't start with capital action verb Too minor to flag
Route/component tests Missing tests for trace components Pre-existing tech debt; change is low-risk additive functionality
Reviewers (5)
Reviewer Returned Main Findings Consider While You're Here Inline Comments Pending Recs Discarded
pr-review-frontend 1 0 0 0 1 0 0
pr-review-consistency 4 0 0 0 2 0 2
pr-review-sre 3 0 1 0 0 0 2
pr-review-tests 2 0 1 0 0 0 1
pr-review-standards 1 0 0 0 0 0 1
Total 11 0 2 0 3 0 6

{
key: SPAN_KEYS.STATUS_MESSAGE,
...QUERY_FIELD_CONFIGS.STRING_TAG,
},
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟠 MAJOR: Missing OTEL_STATUS_DESCRIPTION in userMessages query

Issue: The userMessages query now includes STATUS_MESSAGE but does not include OTEL_STATUS_DESCRIPTION. Other queries in this file (toolCalls at lines 284-290, contextResolution at lines 335-341, agentGenerations at lines 443-449) fetch both keys to ensure error messages are captured regardless of which attribute the span uses.

Why: If a span only has otel.status_description set (the primary attribute set by setSpanWithError() in the tracer factory), the error message will not be displayed to users since it won't be fetched from SigNoz.

Fix:

Suggested change
},
{
key: SPAN_KEYS.STATUS_MESSAGE,
...QUERY_FIELD_CONFIGS.STRING_TAG,
},
{
key: SPAN_KEYS.OTEL_STATUS_DESCRIPTION,
...QUERY_FIELD_CONFIGS.STRING_TAG,
},

Refs:

const spanEntryPoint = getString(span, SPAN_KEYS.INVOCATION_ENTRY_POINT, '');
const triggerId = getString(span, SPAN_KEYS.TRIGGER_ID, '');
const triggerInvocationId = getString(span, SPAN_KEYS.TRIGGER_INVOCATION_ID, '');
const otelStatusDescription = getString(span, SPAN_KEYS.STATUS_MESSAGE, '');
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟠 MAJOR: Use fallback pattern for error message extraction

Issue: This line only extracts from STATUS_MESSAGE, but other activity types in this file use a fallback pattern that checks both STATUS_MESSAGE and OTEL_STATUS_DESCRIPTION. If the query is updated to fetch both keys (per the other comment), this extraction should also use the fallback pattern.

Why: The primary error message source from setSpanWithError() is otel.status_description. Without the fallback, error messages may not display even if the data is fetched.

Fix:

Suggested change
const otelStatusDescription = getString(span, SPAN_KEYS.STATUS_MESSAGE, '');
const otelStatusDescription = getString(span, SPAN_KEYS.STATUS_MESSAGE, '') || getString(span, SPAN_KEYS.OTEL_STATUS_DESCRIPTION, '');

Refs:

Comment on lines +286 to +288
{a.hasError && a.otelStatusCode && (
<Info label="Status code" value={a.otelStatusCode} />
)}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟠 MAJOR: Dead code — otelStatusCode will never be populated for user messages

Issue: This code block checks for a.otelStatusCode, but this field is never set for user message activities. Looking at the route handler (lines 1589-1614), user message activities only set hasError and otelStatusDescription — they do NOT set otelStatusCode. Additionally, the userMessages query does not include SPAN_KEYS.OTEL_STATUS_CODE in its select columns.

Why: This JSX will never render because a.otelStatusCode will always be undefined for user messages. This creates confusion for maintainers who might expect this UI element to work.

Fix: Either remove this dead code block, or complete the implementation by:

  1. Adding SPAN_KEYS.OTEL_STATUS_CODE to the userMessages query selectColumns
  2. Extracting and setting otelStatusCode in the route handler's user message processing
Suggested change
{a.hasError && a.otelStatusCode && (
<Info label="Status code" value={a.otelStatusCode} />
)}

Refs:

@github-actions github-actions Bot deleted a comment from claude Bot Feb 26, 2026
@vercel vercel Bot temporarily deployed to Preview – agents-docs February 26, 2026 17:39 Inactive
Copy link
Copy Markdown
Contributor

@claude claude Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR Review Summary

(2) Total Issues | Risk: Medium

Delta Review (1 commit since last review)

The commit claude addressed 1 of 3 issues from the prior review:

Fixed: Added OTEL_STATUS_CODE to the userMessages query and extraction logic. The "dead code" issue for otelStatusCode display (render-panel-content.tsx:286-288) is now resolved — the UI block will now render when error spans include a status code.

🕐 Pending Recommendations (2)

Two prior issues remain unaddressed:

  • 🟠 route.ts:545 Missing OTEL_STATUS_DESCRIPTION in userMessages query — error messages may silently not display if spans only have otel.status_description set
  • 🟠 route.ts:1581 Missing fallback pattern — extraction should use STATUS_MESSAGE || OTEL_STATUS_DESCRIPTION like aiAssistantMessages (lines 1628-1631)

Context: The aiAssistantMessages query fetches BOTH OTEL_STATUS_DESCRIPTION and STATUS_MESSAGE (lines 610-616), and its extraction uses the fallback pattern (lines 1628-1631). The userMessages query now fetches STATUS_MESSAGE and OTEL_STATUS_CODE, but still lacks OTEL_STATUS_DESCRIPTION.


💡 APPROVE WITH SUGGESTIONS

Summary: Good progress — the otelStatusCode dead code issue is now resolved. However, two prior issues about error message consistency remain: (1) add OTEL_STATUS_DESCRIPTION to the query, and (2) use the fallback pattern for extraction. These ensure error messages display regardless of which OTEL attribute the span uses. The inline comments from the prior review have 1-click suggestions ready to apply.

Reviewers (1)
Reviewer Returned Main Findings Consider While You're Here Inline Comments Pending Recs Discarded
orchestrator (delta) 3 0 0 0 0 2 1
Total 3 0 0 0 0 2 1

Note: Delta re-review — prior inline comments remain active and accurate.

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Mar 9, 2026

This pull request has been automatically marked as stale because it has not had recent activity.
It will be closed in 7 days if no further activity occurs.

If this PR is still relevant:

  • Rebase it on the latest main branch
  • Add a comment explaining its current status
  • Request a review if it's ready

Thank you for your contributions!

@github-actions github-actions Bot added the stale label Mar 9, 2026
@github-actions
Copy link
Copy Markdown
Contributor

This pull request has been automatically closed due to inactivity.

If you'd like to continue working on this, please:

  1. Create a new branch from the latest main
  2. Cherry-pick your commits or rebase your changes
  3. Open a new pull request

Thank you for your understanding!

@github-actions github-actions Bot closed this Mar 19, 2026
@github-actions github-actions Bot deleted the TI-21 branch March 19, 2026 00:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant