
feat(telemetry): close five OTel GenAI semantic convention emission gaps (#1035) #1036

Merged
planetf1 merged 10 commits into generative-computing:main from planetf1:cs/issue-1035
May 13, 2026

Conversation

@planetf1
Contributor

@planetf1 planetf1 commented May 7, 2026

Misc PR

Type of PR

  • Bug Fix
  • New Feature
  • Documentation
  • Other

Description

Surfaces data Mellea already holds at span-emission time. No new collection
points, no new span types, no backend changes, no performance-sensitive paths.

| Gap | Attribute(s) | Where |
| --- | --- | --- |
| 1. Provider identity | gen_ai.provider.name alongside legacy gen_ai.system | start_generate_span, instrument_generate_from_context/raw |
| 2. Conversation identity | gen_ai.conversation.id from existing session_id ContextVar | start_generate_span |
| 4. Error telemetry | error.type (Stable OTel) + ERROR status on stream failures | finalize_backend_span (error-path only) → core/base.py |

Deferred: Gap 3 — prompt template attributes. Withdrawn per review (@jakelorocco): wrong layer — belongs at the formatter render path inside BackendTracingPlugin. Follow-up: epic #444 Phase 2 (depends on #1045).

Deferred: model_options / request params. Withdrawn per review (@ajbozarth, @jakelorocco): no backend call site passes model_options today. Deferred to GENERATION_PRE_CALL plugin hook post-#1045.

Deferred: Gap 5 — content capture. gen_ai.input.messages, gen_ai.output.messages, gen_ai.system_instructions require backend changes; post-#1045 these belong in BackendTracingPlugin. The MELLEA_TRACE_CONTENT env-var flag and add_span_event() helper added here are the infrastructure gap 5 will need.

Note: MELLEA_TRACE_CONTENT will be renamed to MELLEA_TRACES_CONTENT under #1046. The OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT alias is unaffected.

Spec: OTel GenAI semconv v1.37.0: gen_ai.system deprecated in favour of gen_ai.provider.name; both emitted for one release cycle.
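The dual-emission migration described above can be sketched as follows. This is an illustrative helper, not Mellea's actual `get_provider_name()`; the backend-class-to-provider mapping here is an assumption, while the attribute names come from OTel GenAI semconv v1.37.0:

```python
# Hypothetical sketch: emit both the current provider attribute and the
# deprecated gen_ai.system alias for one release cycle, so dashboards
# keyed on either name keep working during the migration window.
_PROVIDER_NAMES = {
    "OllamaModelBackend": "ollama",   # mapping entries are illustrative
    "OpenAIBackend": "openai",
}

def provider_span_attributes(backend_class_name: str) -> dict[str, str]:
    provider = _PROVIDER_NAMES.get(backend_class_name, "unknown")
    return {
        "gen_ai.provider.name": provider,  # semconv v1.37.0 attribute
        "gen_ai.system": provider,         # deprecated alias, kept one release
    }
```

Once the deprecation window closes, dropping the alias is a one-line change confined to this helper.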

Files changed:

  • mellea/telemetry/backend_instrumentation.py — get_provider_name(); start_generate_span() extended with conversation id; finalize_backend_span() (error-path); record_token_usage() extended with cache/reasoning tokens.
  • mellea/core/base.py — stream error path uses finalize_backend_span(error=...).
  • mellea/telemetry/tracing.py — is_content_tracing_enabled(), add_span_event(), _TRACE_CONTENT_ENABLED flag.
  • test/telemetry/test_genai_semconv_emission.py — 9 pure-unit tests (no live backend or OTel SDK required).
  • docs/examples/telemetry/otel_genai_semconv_example.py — runnable against otelite.

Testing

  • Tests added to the respective file if code was changed
  • New code has 100% coverage if code was added
  • Ensure existing tests and GitHub automation pass (a maintainer will kick off the GitHub automation when the rest of the PR is populated)

Attribution

  • AI coding assistants used

planetf1 added 5 commits May 7, 2026 14:55
…lpers

Add MELLEA_TRACE_CONTENT env-var gate (also recognises the standard
OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT) and expose
add_span_event() as a safe no-op wrapper. Both exported from
mellea.telemetry and mellea.telemetry.tracing.

Assisted-by: Claude Code
Signed-off-by: Nigel Jones <jonesn@uk.ibm.com>
…ate attrs

Five OTel GenAI semconv gaps closed (issue generative-computing#1035):

1. gen_ai.provider.name emitted alongside legacy gen_ai.system (semconv
   v1.37.0 migration; keep both for dashboard back-compat).

2. gen_ai.conversation.id mapped from existing session_id ContextVar; the
   existing mellea.session_id attribute is preserved alongside it.

3. llm.prompt_template.template emitted unconditionally from Instruction
   and GenerativeStub; llm.prompt_template.variables gated behind
   MELLEA_TRACE_CONTENT (user data).

4. error.type (Stable OTel) set on the error path in the new
   finalize_backend_span() helper alongside set_span_error().
   finalize_backend_span() replaces the three-line record_token_usage +
   record_response_metadata + end_backend_span pattern in each backend.

5. gen_ai.input.messages, gen_ai.output.messages, gen_ai.system_instructions
   emitted as structured JSON (spec v1.37.0 schema) when MELLEA_TRACE_CONTENT
   is enabled. No deprecated per-role events (gen_ai.user.message etc.) are
   emitted. A gen_ai.client.inference.operation.details span event is added
   as a marker for log-oriented receivers.

Also adds gen_ai.request.temperature/top_p/top_k/frequency_penalty/
presence_penalty from model_options, and cache/reasoning token attrs in
record_token_usage().

Assisted-by: Claude Code
Signed-off-by: Nigel Jones <jonesn@uk.ibm.com>
…etry

Instruction: capture _template_description (raw string before Jinja
substitution) and _user_variables (copy) in __init__; expose via
prompt_template_metadata() returning (template, variables, version)|None.

GenerativeStub: capture f_kwargs on each call; expose via
prompt_template_metadata() using the function docstring as the template
and f_kwargs as the variables.

Neither change affects runtime behaviour — data is retained for
duck-typed use by start_generate_span().

Assisted-by: Claude Code
Signed-off-by: Nigel Jones <jonesn@uk.ibm.com>
Replace the duplicated record_token_usage + record_response_metadata +
end_backend_span pattern in each backend's post_processing() with a
single finalize_backend_span() call that also passes the conversation
and output text for content capture.

Pass model_options into start_generate_span() so request-parameter
attributes (temperature, top_p, etc.) are surfaced on the span.

The stream error path in core/base.py is also consolidated through
finalize_backend_span(error=...).

No behaviour change on the success path; error spans now carry
error.type and ERROR status instead of silently closing.

Assisted-by: Claude Code
Signed-off-by: Nigel Jones <jonesn@uk.ibm.com>
test/telemetry/test_genai_semconv_emission.py — 20 pure-unit tests
covering each of the five gaps (no live backend or OTel SDK required):
  - gen_ai.provider.name + gen_ai.system dual-emission
  - gen_ai.conversation.id from session_id ContextVar
  - llm.prompt_template.* from Instruction (always / gated)
  - error.type + ERROR status via finalize_backend_span
  - gen_ai.input/output.messages structured JSON (gated)
  - no deprecated per-role events emitted
  - finalize_backend_span robustness (None span, broken span)

docs/examples/telemetry/otel_genai_semconv_example.py — runnable
example for human verification against otelite, demonstrating all
five attributes and the error path.

Assisted-by: Claude Code
Signed-off-by: Nigel Jones <jonesn@uk.ibm.com>
@github-actions github-actions Bot added the enhancement New feature or request label May 7, 2026
…t capture (gap 5)

Remove finalize_backend_span success-path consolidation and all content
capture helpers (_emit_content_attributes, _conversation_to_parts).
Revert all five backend files to upstream/main — gap 5 requires touching
every backend and is better reviewed in isolation.

finalize_backend_span is kept as an error-path-only helper (sets
error.type + ERROR status, then closes the span) used by the stream
error path in ModelOutputThunk.__aiter__.

Full implementation including gap 5 is preserved on cs/issue-1035-full.

Assisted-by: Claude Code
Signed-off-by: Nigel Jones <jonesn@uk.ibm.com>
@ajbozarth
Contributor

Flagging this PR against the refreshed tracing epic #444 (rewritten today) so the two efforts stay aligned.

Compatible to merge as-is. None of the four gaps landed here conflict with Phase 1 of the refresh — you're closing real semconv gaps regardless of whether emission happens in backend_instrumentation.py or in the Phase 1 tracing plugins. The attributes this PR adds (gen_ai.provider.name, gen_ai.conversation.id, llm.prompt_template.*, gen_ai.request.{temperature, top_p, ...}, error.type, cache/reasoning tokens) will be preserved by the Phase 1 plugin migration — see #1045 acceptance criteria, which explicitly require no regression on anything this PR adds.

Two coordination items for after this lands:

  1. Gap 5 (content capture) should rebase onto Phase 1, not land before it. Your deferral note says gap 5 needs to touch post_processing() in every backend — that's exactly the coupling Phase 1 eliminates. After refactor: move tracing onto plugin/hook pattern #1045, conversation/output data is available to plugins via GenerationPostCallPayload and mot.generation without any backend changes. Keeping cs/issue-1035-full on the shelf until Phase 1 merges saves 5 backend files of churn.

  2. MELLEA_TRACE_CONTENT will get renamed under refactor: rename tracing env vars to plural and align with OTel semconv #1046 alongside MELLEA_TRACE_APPLICATION / MELLEA_TRACE_BACKEND → plural MELLEA_TRACES_*. Rationale: OTel's signal-specific endpoint env var is already plural (OTEL_EXPORTER_OTLP_TRACES_ENDPOINT), and MELLEA_METRICS_* is plural, so we're aligning with the OTel standard partner var. One-release deprecation warning on the old MELLEA_TRACE_* names. The OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT alias you added is the OTel standard and stays unchanged.

Nothing here blocks merging. Flagging proactively so the order of operations is intentional.

@planetf1 planetf1 marked this pull request as ready for review May 11, 2026 13:41
@planetf1 planetf1 requested review from a team, jakelorocco and nrfulton as code owners May 11, 2026 13:41
@planetf1
Contributor Author

planetf1 commented May 11, 2026

Thanks @ajbozarth — hadn't noticed the #1045 angle for gap 5, good to know before I went further down that path. Updated the description with both points.

Contributor

@jakelorocco jakelorocco left a comment


I think logging these attributes in our telemetry traces makes sense. I have a few concerns about how it's being done here.

Comment thread docs/examples/telemetry/otel_genai_semconv_example.py Outdated
Comment thread docs/examples/telemetry/README.md Outdated
Comment thread mellea/stdlib/components/instruction.py Outdated
Comment thread mellea/telemetry/backend_instrumentation.py Outdated
Comment thread mellea/telemetry/backend_instrumentation.py Outdated
Contributor

@ajbozarth ajbozarth left a comment


A few items from Claude, only one big one.

Comment thread mellea/telemetry/backend_instrumentation.py Outdated
Comment thread docs/examples/telemetry/otel_genai_semconv_example.py Outdated
Comment thread mellea/stdlib/components/genstub.py Outdated
Comment thread mellea/stdlib/components/instruction.py Outdated
Drop gap 3 (prompt-template capture) and the model_options/_REQUEST_PARAM_MAP
plumbing in response to review feedback from @jakelorocco and @ajbozarth.

Jake's objection to gap 3 is correct: stashing template state on Instruction
and GenerativeStub is the wrong layer — it only covers two component types,
captures pre-substitution values, and puts telemetry concerns inside domain
objects. The right implementation is at the formatter render path, which
covers all component types. That work belongs after generative-computing#1045 lands.

The model_options/_REQUEST_PARAM_MAP block was dead code: no backend call
site passes model_options, and even if wired the values would be
pre-substitution. Per Nathan's review, the right call is to drop rather than
carry forward a no-op. Request-param emission also belongs in the post-generative-computing#1045
plugin layer where the wire-format dict is visible.

What remains in this PR:
  gap 1 — gen_ai.provider.name alongside legacy gen_ai.system
  gap 2 — gen_ai.conversation.id from session_id ContextVar
  gap 4 — error.type + ERROR status via finalize_backend_span
  cache/reasoning token fields in record_token_usage
  MELLEA_TRACE_CONTENT flag + add_span_event (infrastructure for future gap 5)

Also fix OTel_SERVICE_NAME typo in the example (case-sensitive on Linux) and
rewrite the example docstring and README entry to be PR-independent.

Assisted-by: Claude Code
Signed-off-by: Nigel Jones <jonesn@uk.ibm.com>
Remove the `action.prompt_template_metadata = None` assignment left over
from the gap-3 prompt-template work that was withdrawn from this PR.
The attribute is never read by `start_generate_span` in the trimmed
implementation, making the line misleading.

Add three unit tests for `add_span_event` (event forwarded to span,
None-span no-op, empty-attributes default) patching `_OTEL_AVAILABLE`
since the test environment has no OTel SDK installed.

Assisted-by: Claude Code
Signed-off-by: Nigel Jones <jonesn@uk.ibm.com>
@planetf1 planetf1 requested review from ajbozarth and jakelorocco May 13, 2026 07:49
@planetf1
Contributor Author

All CI checks are now passing (a re-run cleared a Docker image pull flake on the sandbox test — pre-existing and unrelated to this PR's changes).

The review comments from @ajbozarth and @jakelorocco have been addressed in the commits already on this branch: gaps 3 and model_options withdrawn per your feedback; gap 5 deferred with the infrastructure in place. Happy to clarify anything further. Requesting re-review when you have a moment — thanks!

- Guard end_backend_span in its own try/except so SDK errors on the
  streaming error path cannot mask the original backend exception
- Wire get_provider_name into all three span-creation functions so
  internal code uses it (was set but calling get_system_name directly,
  contradicting the docstring guidance)
- Fix span_attrs: dict → dict[str, Any] per project typing conventions
- Replace _backend.model_id private-attr mutation in example with
  start_session(model_id=...) public API
- Add qualitative marker to example so it does not run in the fast loop
- Add test_finalize_never_raises_if_end_span_raises to cover the
  now-guarded end_backend_span code path

Assisted-by: Claude Code
@planetf1
Contributor Author

Pushed a follow-up commit (5557734) addressing the self-review findings from the panel:

  • finalize_backend_span: end_backend_span is now wrapped in its own try/except so an SDK error on the streaming error path cannot mask the original backend exception
  • get_provider_name: wired into all three span-creation functions (was computed but not called; gen_ai.provider.name was being set via system_name instead)
  • Type annotation: span_attrs: dict → dict[str, Any]
  • Example: replaced m2._backend.model_id = ... private-attr mutation with start_session(model_id=...) public API; added qualitative marker so it does not run in the default fast loop
  • New test: test_finalize_never_raises_if_end_span_raises covers the now-guarded end_backend_span path

13 tests passing. Re-requesting review from @ajbozarth, @jakelorocco, @nrfulton.

Contributor

@ajbozarth ajbozarth left a comment


Looks good, a few non-blocking nits from Claude:

Comment thread mellea/telemetry/backend_instrumentation.py Outdated
Comment thread mellea/telemetry/backend_instrumentation.py
Comment thread docs/examples/telemetry/otel_genai_semconv_example.py Outdated
…x example error path

instrument_generate_from_context was imported but never called by any backend
(all backends use start_generate_span); remove function, __all__ entry, stale
imports in ollama.py and openai.py, and the corresponding test.

Example error path now uses an unreachable base_url (localhost:19999) instead
of a bogus model name, which could cause Ollama to attempt a pull rather than
fail deterministically.

Assisted-by: Claude Code
@planetf1 planetf1 enabled auto-merge May 13, 2026 17:24
@planetf1 planetf1 added this pull request to the merge queue May 13, 2026
Merged via the queue into generative-computing:main with commit 79769a3 May 13, 2026
8 checks passed
@planetf1 planetf1 deleted the cs/issue-1035 branch May 13, 2026 18:48

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

OTel omissions/gaps: chat content events, gen_ai.conversation.id, prompt templates, error status

3 participants