Python: MAF observability stream wrapper can fail streamed agent execution with ContextVar reset error #6866

@websterian

Summary

Enabling Microsoft Agent Framework observability for Python changes the behavior of streamed agent execution by wrapping agent.run(..., stream=True) with telemetry pull contexts and cleanup/finalization hooks. In an Azure Functions HTTP/SSE streaming scenario, this wrapper can cause an otherwise successful streamed agent response to fail at stream completion with a ContextVar reset error.

This is concerning because observability should not be able to interrupt normal agent execution. If telemetry capture or cleanup fails, the agent stream should still complete normally and telemetry failure should be isolated.

Environment

  • Python: 3.13
  • Microsoft Agent Framework:
    • agent-framework-core==1.10.0
    • agent-framework-openai==1.10.0
  • OpenTelemetry:
    • azure-monitor-opentelemetry==1.8.8
    • opentelemetry-api==1.40.0
    • opentelemetry-sdk==1.40.0
  • Host: Azure Functions Python worker, local func start
  • Execution style: HTTP endpoint returning SSE
  • Agent execution path: agent.run(..., stream=True)

Configuration

The issue reproduces when MAF instrumentation is enabled:

ENABLE_INSTRUMENTATION=true
ENABLE_SENSITIVE_DATA=false

The same streamed agent path completes successfully when instrumentation is disabled:

ENABLE_INSTRUMENTATION=false

Minimal code shape

The application creates a MAF agent and streams updates to an HTTP/SSE response:

stream = agent.run(full_messages, stream=True)

async for update in stream:
    # translate AgentResponseUpdate into SSE events
    ...

With instrumentation enabled, the returned stream appears to be wrapped by MAF observability. The stream produces text deltas successfully, but fails during completion/finalization.

Observed behavior

The model response streams successfully. The application receives and emits text events such as:

start-step
text-start
text-delta
...
text-end

Then, when the stream is exhausted, MAF observability cleanup runs and raises:

ValueError: <Token var=<ContextVar name='inner_response_telemetry_captured_fields' default=None ...>> was created in a different Context

Full relevant traceback:

File ...site-packages\agent_framework\_types.py, line 3149, in __anext__
    update = await self._iterator.__anext__()
StopAsyncIteration

During handling of the above exception, another exception occurred:

File ...\runners\agent_runner.py, line 605, in run_prompt_agent_stream
    async for update in stream:

File ...site-packages\agent_framework\_types.py, line 3152, in __anext__
    await self._run_cleanup_hooks()

File ...site-packages\agent_framework\_types.py, line 3357, in _run_cleanup_hooks
    await result

File ...site-packages\agent_framework\observability.py, line 1878, in _finalize_stream
    INNER_RESPONSE_TELEMETRY_CAPTURED_FIELDS.reset(inner_response_telemetry_captured_fields_token)

ValueError: <Token var=<ContextVar name='inner_response_telemetry_captured_fields' default=None ...>> was created in a different Context

The HTTP request itself may still return 200 because the app catches stream exceptions and closes the SSE response, but the domain agent run is marked failed. The stream does not emit the normal completion lifecycle events.
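
The ValueError above is standard contextvars behavior: a token can only be reset in the Context where it was created. The following is a minimal sketch of that mechanism using plain asyncio, not MAF code. The variable name captured_fields and the helper functions are hypothetical stand-ins, and this does not claim to show where the framework actually crosses contexts.

```python
import asyncio
import contextvars

# Hypothetical stand-in for the framework's ContextVar; not the MAF internals.
captured_fields = contextvars.ContextVar("captured_fields", default=None)


async def set_in_child_task():
    # A task runs in a copy of the caller's Context, so the token
    # returned here belongs to that copy.
    return captured_fields.set({"model": "example"})


async def reset_in_parent():
    token = await asyncio.create_task(set_in_child_task())
    try:
        # This runs in the parent's Context, not the task's copy.
        captured_fields.reset(token)
    except ValueError as exc:
        return str(exc)
    return None


def demo():
    return asyncio.run(reset_in_parent())


if __name__ == "__main__":
    print(demo())
```

Any host that drives set and reset from different tasks or copied contexts can hit the same condition, which would be consistent with the failure appearing only at stream completion.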

Expected behavior

Enabling observability should not alter the success/failure semantics of streamed agent execution.

Expected behavior would be one of:

  1. The telemetry wrapper handles its ContextVar cleanup correctly across async contexts.
  2. If telemetry finalization fails, the failure is captured/logged internally and does not propagate into the user's async for update in stream.
  3. The stream still completes normally and returns/raises only errors from the actual agent/model/tool execution, not telemetry cleanup.

Actual behavior

With ENABLE_INSTRUMENTATION=true, stream cleanup/finalization raises from MAF observability internals. That exception bubbles out of the streamed execution loop and causes the application to treat the agent run as failed.

With ENABLE_INSTRUMENTATION=false, the same E2E scenario emits the expected completion sequence:

text-end
finish-step
data-workflow-end status=completed
finish
[DONE]

Why this is high impact

Observability should be non-invasive. In this case, enabling observability fundamentally changes the stream implementation by wrapping the response stream and adding cleanup hooks. A telemetry cleanup failure can therefore fail regular agent execution even after the model response has completed successfully.

For production systems, this makes it unsafe to enable MAF observability on streamed agents because telemetry can become part of the critical execution path.

Related observation

The 1.10.0 release notes mention fixes around telemetry context handling, including background agent telemetry context errors and streaming span context preservation. This issue appears to be in the same family, but affects streamed agent consumption in an Azure Functions HTTP/SSE async context.

Repro notes

We verified locally:

  • Same test with ENABLE_INSTRUMENTATION=true: stream emits text, then fails during MAF _finalize_stream with ContextVar reset error.
  • Same test with ENABLE_INSTRUMENTATION=false: stream completes normally with final lifecycle events and the run is marked completed.

Suggested fix direction

The telemetry cleanup/finalization hook should not propagate cleanup errors into the user stream. More importantly, each ContextVar token should be reset in the same Context in which it was set (for example, the same asyncio task), or the cleanup logic should guard against cross-context reset attempts.
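
One possible shape for such a guard is sketched below. The helper name reset_quietly and the logger name are assumptions for illustration and are not part of agent_framework. It relies only on the documented contextvars behavior that reset raises ValueError for a token from another Context and RuntimeError for a token that was already used.

```python
import contextvars
import logging

logger = logging.getLogger("telemetry_cleanup_sketch")


def reset_quietly(var, token):
    """Try to reset var with token; return False instead of raising."""
    try:
        var.reset(token)
        return True
    except (ValueError, RuntimeError) as exc:
        # ValueError: token created in a different Context (or by another var).
        # RuntimeError: token has already been used once.
        logger.warning("Skipping ContextVar reset: %s", exc)
        return False
```

A guard like this only suppresses the symptom. The value set in the other Context is not restored by it, so setting and resetting in the same Context remains the more complete fix.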

At minimum, observability failures should be isolated from agent execution so enabling telemetry cannot turn a successful streamed run into a failed run.
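
As a rough illustration of that isolation, the sketch below wraps an async iterator and runs cleanup hooks when the stream is exhausted, logging hook failures instead of raising them. The class IsolatedCleanupStream, the fake stream, and the failing hook are all hypothetical. The structure loosely mirrors the __anext__ and _run_cleanup_hooks frames in the traceback but is not the framework's implementation.

```python
import asyncio
import logging

logger = logging.getLogger("stream_cleanup_sketch")


class IsolatedCleanupStream:
    """Hypothetical wrapper: cleanup hook failures are logged, not raised."""

    def __init__(self, iterator, cleanup_hooks):
        self._iterator = iterator
        self._cleanup_hooks = list(cleanup_hooks)
        self._cleaned_up = False

    def __aiter__(self):
        return self

    async def __anext__(self):
        try:
            return await self._iterator.__anext__()
        except StopAsyncIteration:
            # Normal exhaustion: run cleanup, then end iteration as usual.
            await self._run_cleanup_hooks()
            raise

    async def _run_cleanup_hooks(self):
        if self._cleaned_up:
            return
        self._cleaned_up = True
        for hook in self._cleanup_hooks:
            try:
                await hook()
            except Exception:
                # Telemetry failure stays out of the consumer's loop.
                logger.exception("Cleanup hook failed; stream result unaffected")


async def fake_updates():
    # Stand-in for streamed agent updates.
    for text in ("Hel", "lo"):
        yield text


async def failing_hook():
    raise ValueError("token was created in a different Context")


async def consume():
    stream = IsolatedCleanupStream(fake_updates(), [failing_hook])
    return [update async for update in stream]


if __name__ == "__main__":
    print(asyncio.run(consume()))
```

With this shape the consumer receives every update and the loop ends normally, while the hook failure is still visible in the logs.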
