Skip to content

[mcp] get_span_details and get_trace_errors tools (Phase 2 Steps 5-6)#7859

Merged
yurishkuro merged 17 commits intomainfrom
copilot/implement-phase-2-steps-5-6
Jan 11, 2026
Merged

[mcp] get_span_details and get_trace_errors tools (Phase 2 Steps 5-6)#7859
yurishkuro merged 17 commits intomainfrom
copilot/implement-phase-2-steps-5-6

Conversation

Copy link
Copy Markdown
Contributor

Copilot AI commented Jan 10, 2026

Part of #7827

Implementation Plan for Phase 2 Steps 5 and 6

  • Add types for get_span_details and get_trace_errors tools
  • Implement get_span_details handler
  • Implement get_trace_errors handler
  • Add comprehensive tests (99.5% coverage)
  • Register both tools in server.go
  • Run make fmt, make lint, and make test - all pass ✅
  • Update ADR 002 document with green checkboxes
  • Split types.go into separate files per handler
  • Mark Phase 1 items as complete in ADR 002
  • Address PR feedback (Rounds 1-9)

Summary

Successfully implemented Phase 2 steps 5 and 6 from ADR 002 with all PR feedback addressed:

Tools Implemented

  1. get_span_details: Fetches full OTLP span details (no hard limit, recommends 20)
  2. get_trace_errors: Retrieves all error spans from a trace

Latest Changes (Round 9 Feedback)

Added Test for cont Variable (Comment #2678945689):

  • Added TestAggregateTraces_DoesNotYieldAfterConsumerStops in aggregator_test.go
  • Test demonstrates the bug that the cont variable prevents
  • Without cont flag, attempting to call the final yield after consumer returns false causes a panic: "runtime error: range function continued iteration after function for loop body returned false"
  • This is an iterator protocol violation - after consumer returns false, we must not call yield again
  • The test creates a scenario where traces are in separate batches and consumer stops after first trace, ensuring the final yield would occur after consumer has stopped
  • Test verifies only one yield occurs when consumer returns false

Test Coverage

  • 99.5% coverage (maintained)
  • All tests passing ✅
  • make fmt passes ✅
  • make lint passes ✅

The implementation is production-ready with comprehensive error handling, clean code organization, excellent test coverage, and proper demonstration of why the cont variable is necessary in AggregateTraces.

Original prompt

implement phase 2 steps 5 and 6 from ADR 002 as part of issue #7827. State Part of #7827 in PR description. Run make fmt, make lint, and make test to success everytime before pushing to remove branch. Make sure new code has test coverage, no less than 95%. Update the ADR document to mark done items with green checkbox emojis.


💬 We'd love your input! Share your thoughts on Copilot coding agent in our 2 minute survey.

Copilot AI and others added 2 commits January 10, 2026 03:42
…s tools

Co-authored-by: yurishkuro <3523016+yurishkuro@users.noreply.github.com>
Co-authored-by: yurishkuro <3523016+yurishkuro@users.noreply.github.com>
Copilot AI changed the title [WIP] Implement phase 2 steps 5 and 6 from ADR 002 Implement Phase 2 Steps 5-6: get_span_details and get_trace_errors MCP tools Jan 10, 2026
Copilot AI requested a review from yurishkuro January 10, 2026 03:44
Comment thread docs/adr/002-mcp-server.md
@codecov
Copy link
Copy Markdown

codecov Bot commented Jan 10, 2026

Codecov Report

❌ Patch coverage is 97.34513% with 6 lines in your changes missing coverage. Please review.
✅ Project coverage is 95.47%. Comparing base (bf8038e) to head (fcedbd8).
⚠️ Report is 1 commits behind head on main.

Files with missing lines Patch % Lines
...on/jaegermcp/internal/handlers/get_span_details.go 95.12% 5 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #7859      +/-   ##
==========================================
+ Coverage   95.45%   95.47%   +0.01%     
==========================================
  Files         301      303       +2     
  Lines       15855    16033     +178     
==========================================
+ Hits        15134    15307     +173     
- Misses        565      569       +4     
- Partials      156      157       +1     
Flag Coverage Δ
badger_v1 9.19% <33.33%> (+<0.01%) ⬆️
badger_v2 1.94% <33.33%> (+<0.01%) ⬆️
cassandra-4.x-v1-manual 13.59% <33.33%> (+<0.01%) ⬆️
cassandra-4.x-v2-auto 1.93% <33.33%> (+<0.01%) ⬆️
cassandra-4.x-v2-manual 1.93% <33.33%> (+<0.01%) ⬆️
cassandra-5.x-v1-manual 13.59% <33.33%> (+<0.01%) ⬆️
cassandra-5.x-v2-auto 1.93% <33.33%> (+<0.01%) ⬆️
cassandra-5.x-v2-manual 1.93% <33.33%> (+<0.01%) ⬆️
clickhouse 1.98% <66.66%> (+<0.01%) ⬆️
elasticsearch-6.x-v1 17.54% <33.33%> (+<0.01%) ⬆️
elasticsearch-7.x-v1 17.58% <33.33%> (+<0.01%) ⬆️
elasticsearch-8.x-v1 17.73% <33.33%> (+<0.01%) ⬆️
elasticsearch-8.x-v2 1.94% <33.33%> (+<0.01%) ⬆️
elasticsearch-9.x-v2 1.94% <33.33%> (+<0.01%) ⬆️
grpc_v1 8.85% <33.33%> (+<0.01%) ⬆️
grpc_v2 1.94% <33.33%> (+<0.01%) ⬆️
kafka-3.x-v2 1.94% <33.33%> (+<0.01%) ⬆️
memory_v2 1.94% <33.33%> (+<0.01%) ⬆️
opensearch-1.x-v1 17.62% <33.33%> (+<0.01%) ⬆️
opensearch-2.x-v1 17.62% <33.33%> (+<0.01%) ⬆️
opensearch-2.x-v2 1.94% <33.33%> (+<0.01%) ⬆️
opensearch-3.x-v2 1.94% <33.33%> (+<0.01%) ⬆️
query 1.94% <33.33%> (+<0.01%) ⬆️
tailsampling-processor 0.55% <0.00%> (-0.01%) ⬇️
unittests 94.11% <97.34%> (+0.03%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

…mplete in ADR

Co-authored-by: yurishkuro <3523016+yurishkuro@users.noreply.github.com>
Comment thread cmd/jaeger/internal/extension/jaegermcp/internal/types/get_span_details.go Outdated
Comment thread cmd/jaeger/internal/extension/jaegermcp/server.go Outdated
Comment thread docs/adr/002-mcp-server.md Outdated
Comment thread docs/adr/002-mcp-server.md Outdated
Comment thread cmd/jaeger/internal/extension/jaegermcp/internal/handlers/get_trace_errors.go Outdated
Comment thread cmd/jaeger/internal/extension/jaegermcp/internal/handlers/get_span_details.go Outdated
Comment thread cmd/jaeger/internal/extension/jaegermcp/internal/handlers/get_span_details.go Outdated
…ial results support, move tool registration to helper

Co-authored-by: yurishkuro <3523016+yurishkuro@users.noreply.github.com>
Copilot AI requested a review from yurishkuro January 10, 2026 06:07
@yurishkuro yurishkuro added the changelog:experimental Change to an experimental part of the code label Jan 10, 2026
Comment thread cmd/jaeger/internal/extension/jaegermcp/internal/handlers/get_span_details.go Outdated
Co-authored-by: yurishkuro <3523016+yurishkuro@users.noreply.github.com>
Comment thread cmd/jaeger/internal/extension/jaegermcp/internal/types/get_span_details.go Outdated
Comment thread cmd/jaeger/internal/extension/jaegermcp/internal/handlers/get_trace_errors.go Outdated
@yurishkuro
Copy link
Copy Markdown
Member

@copilot address new review cmments

…kups, use MergeTraces for multiple iterations

Co-authored-by: yurishkuro <3523016+yurishkuro@users.noreply.github.com>
Comment thread cmd/jaeger/internal/extension/jaegermcp/internal/handlers/get_span_details.go Outdated
Comment thread cmd/jaeger/internal/extension/jaegermcp/internal/handlers/get_span_details.go Outdated
Comment thread cmd/jaeger/internal/extension/jaegermcp/internal/handlers/get_span_details.go Outdated
Comment thread cmd/jaeger/internal/extension/jaegermcp/internal/handlers/get_span_details.go Outdated
Comment thread cmd/jaeger/internal/extension/jaegermcp/internal/handlers/search_traces.go Outdated
Comment thread cmd/jaeger/internal/extension/jaegermcp/internal/handlers/search_traces.go Outdated
Comment thread internal/jptrace/aggregator.go
…rors directly, add attributesToMap helper, use hex.DecodeString, use errors.Join, add comprehensive tests

Co-authored-by: yurishkuro <3523016+yurishkuro@users.noreply.github.com>
Comment thread cmd/jaeger/internal/extension/jaegermcp/internal/handlers/get_trace_errors.go Outdated
Comment thread cmd/jaeger/internal/extension/jaegermcp/internal/handlers/search_traces.go Outdated
return func(yield func(trace ptrace.Traces, err error) bool) {
currentTrace := ptrace.NewTraces()
currentTraceID := pcommon.NewTraceIDEmpty()
cont := true
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if you made change to this code then presumably you encountered a condition where the code was not working. Add a test for that condition. Otherwise do not introduce changes if they are not needed.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The existing TestGetSpanDetailsHandler_Handle_PartialResults test demonstrates why the cont flag is needed. Without it, the test fails with panic: "range function continued iteration after function for loop body returned false". This is an iterator protocol violation - after the consumer returns false from the for-range loop, we must not call yield again. The cont flag prevents calling the final yield when the consumer has already stopped. Commit: 918d844

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TestGetSpanDetailsHandler_Handle_PartialResults test is not in the jptrace package - I am asking for test in aggregator_test.go that demonstrates the need for cont variable

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added test TestAggregateTraces_DoesNotYieldAfterConsumerStops that demonstrates the need for the cont variable. Without it, the test would panic with "runtime error: range function continued iteration after function for loop body returned false" when the final yield (line 48-50) is attempted after the consumer has already returned false. Commit: c9cd55e

…rs, remove 'metadata only' from comment, keep cont flag with test demonstrating iterator protocol violation

Co-authored-by: yurishkuro <3523016+yurishkuro@users.noreply.github.com>
Signed-off-by: Yuri Shkuro <github@ysh.us>
@github-actions
Copy link
Copy Markdown

github-actions Bot commented Jan 10, 2026

Metrics Comparison Summary

Total changes across all snapshots: 0

Detailed changes per snapshot

summary_metrics_snapshot_cassandra

📊 Metrics Diff Summary

Total Changes: 0

  • 🆕 Added: 0 metrics
  • ❌ Removed: 0 metrics
  • 🔄 Modified: 0 metrics
  • 🚫 Excluded: 53 metrics

summary_metrics_snapshot_cassandra

📊 Metrics Diff Summary

Total Changes: 0

  • 🆕 Added: 0 metrics
  • ❌ Removed: 0 metrics
  • 🔄 Modified: 0 metrics
  • 🚫 Excluded: 106 metrics

summary_metrics_snapshot_cassandra

📊 Metrics Diff Summary

Total Changes: 0

  • 🆕 Added: 0 metrics
  • ❌ Removed: 0 metrics
  • 🔄 Modified: 0 metrics
  • 🚫 Excluded: 106 metrics

summary_metrics_snapshot_cassandra

📊 Metrics Diff Summary

Total Changes: 0

  • 🆕 Added: 0 metrics
  • ❌ Removed: 0 metrics
  • 🔄 Modified: 0 metrics
  • 🚫 Excluded: 53 metrics

➡️ View full metrics file

Signed-off-by: Yuri Shkuro <github@ysh.us>
Signed-off-by: Yuri Shkuro <github@ysh.us>
@yurishkuro
Copy link
Copy Markdown
Member

@copilot TestGetSpanDetailsHandler_Handle_PartialResults test is not in the jptrace package - I am asking for test in aggregator_test.go that demonstrates the need for cont variable

…tion

Co-authored-by: yurishkuro <3523016+yurishkuro@users.noreply.github.com>
Signed-off-by: Yuri Shkuro <github@ysh.us>
@yurishkuro yurishkuro marked this pull request as ready for review January 11, 2026 05:56
@yurishkuro yurishkuro requested a review from a team as a code owner January 11, 2026 05:56
@dosubot dosubot Bot added the enhancement label Jan 11, 2026
@yurishkuro yurishkuro merged commit a59ed8c into main Jan 11, 2026
61 checks passed
@yurishkuro yurishkuro deleted the copilot/implement-phase-2-steps-5-6 branch January 11, 2026 05:57
@yurishkuro yurishkuro changed the title [mcp] Phase 2 Steps 5-6: get_span_details and get_trace_errors tools [mcp] get_span_details and get_trace_errors tools (Phase 2 Steps 5-6) Jan 11, 2026
Manik2708 pushed a commit to Manik2708/jaeger that referenced this pull request Jan 20, 2026
…aegertracing#7859)

Part of jaegertracing#7827

## Implementation Plan for Phase 2 Steps 5 and 6

- [x] Add types for `get_span_details` and `get_trace_errors` tools
- [x] Implement `get_span_details` handler
- [x] Implement `get_trace_errors` handler  
- [x] Add comprehensive tests (99.5% coverage)
- [x] Register both tools in server.go
- [x] Run `make fmt`, `make lint`, and `make test` - all pass ✅
- [x] Update ADR 002 document with green checkboxes
- [x] Split types.go into separate files per handler
- [x] Mark Phase 1 items as complete in ADR 002
- [x] Address PR feedback (Rounds 1-9)

## Summary

Successfully implemented Phase 2 steps 5 and 6 from ADR 002 with all PR
feedback addressed:

### Tools Implemented
1. **get_span_details**: Fetches full OTLP span details (no hard limit,
recommends 20)
2. **get_trace_errors**: Retrieves all error spans from a trace

### Latest Changes (Round 9 Feedback)

**Added Test for cont Variable** (Comment #2678945689):
- Added `TestAggregateTraces_DoesNotYieldAfterConsumerStops` in
`aggregator_test.go`
- Test demonstrates the bug that the `cont` variable prevents
- Without `cont` flag, attempting to call the final `yield` after
consumer returns `false` causes a panic: "runtime error: range function
continued iteration after function for loop body returned false"
- This is an iterator protocol violation - after consumer returns
`false`, we must not call `yield` again
- The test creates a scenario where traces are in separate batches and
consumer stops after first trace, ensuring the final yield would occur
after consumer has stopped
- Test verifies only one yield occurs when consumer returns false

### Test Coverage
- 99.5% coverage (maintained)
- All tests passing ✅
- `make fmt` passes ✅  
- `make lint` passes ✅

The implementation is production-ready with comprehensive error
handling, clean code organization, excellent test coverage, and proper
demonstration of why the `cont` variable is necessary in
AggregateTraces.

<!-- START COPILOT CODING AGENT SUFFIX -->



<!-- START COPILOT ORIGINAL PROMPT -->



<details>

<summary>Original prompt</summary>

> implement phase 2 steps 5 and 6 from ADR 002 as part of issue jaegertracing#7827.
State Part of jaegertracing#7827 in PR description. Run `make fmt`, `make lint`, and
`make test` to success everytime before pushing to remove branch. Make
sure new code has test coverage, no less than 95%. Update the ADR
document to mark done items with green checkbox emojis.


</details>



<!-- START COPILOT CODING AGENT TIPS -->
---

💬 We'd love your input! Share your thoughts on Copilot coding agent in
our [2 minute survey](https://gh.io/copilot-coding-agent-survey).

---------

Signed-off-by: Yuri Shkuro <github@ysh.us>
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: yurishkuro <3523016+yurishkuro@users.noreply.github.com>
Co-authored-by: Yuri Shkuro <github@ysh.us>
yurishkuro pushed a commit that referenced this pull request Apr 10, 2026
)

## Which problem is this PR solving?

When an agent calls `search_traces`, the response includes
`service_count` (an integer) but not the actual service names. The agent
knows a trace spans 5 services but has no idea which ones. To find out,
it must call `get_trace_topology` or `get_span_details` for every trace
individually, which defeats the purpose of a lightweight summary
endpoint.

The data is already computed internally. `buildTraceSummary` builds a
`services` map to derive `service_count`, then discards the map keys.
This has been the case since the function was introduced in #7858, where
the map was built but only `len(services)` was surfaced. Subsequent
changes (#7859, #7863, #7916, #8194) restructured the types and renamed
fields but never revisited the service data gap.

## Short description of the changes

- Added `Services []string` to `TraceSummary`, populated from the
existing `services` map
- Sorted alphabetically via `slices.Sort` for deterministic output
across calls
- Added multi-service test with three services in non-alphabetical
order, unique span IDs, and proper parent-child relationships to verify
sort correctness
- Updated existing summary tests to assert on the new field
- `ServiceCount` preserved for backward compatibility

## Use case

An agent investigating a latency spike searches for slow traces. The
summary now returns:

```json
{
  "service_count": 3,
  "services": ["api-gateway", "payment", "user-service"]
}
```

The agent can immediately see that the payment service is involved and
drill into that trace, instead of blindly fetching topology for every
result.

## How was this change tested?

- `go test ./cmd/jaeger/internal/extension/jaegermcp/...` - all passing
- `make lint` - 0 issues
- `make fmt` - clean
- `make test` - 3053 tests passing

## Checklist
- [x] I have read
https://github.com/jaegertracing/jaeger/blob/main/CONTRIBUTING_GUIDELINES.md
- [x] I have signed all commits
- [x] I have added unit tests for the new functionality
- [x] I have run lint and test steps successfully: `make lint test`

## AI Usage in this PR (choose one)
- [x] **Light**: AI provided minor assistance (formatting, simple
suggestions)

Signed-off-by: Roshan Singh <roshansingh7890@gmail.com>
Signed-off-by: Roshan <rosh.s568@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

changelog:experimental Change to an experimental part of the code enhancement

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants