Summary
Enabling Microsoft Agent Framework observability for Python changes the behavior of streamed agent execution by wrapping agent.run(..., stream=True) with telemetry pull contexts and cleanup/finalization hooks. In an Azure Functions HTTP/SSE streaming scenario, this wrapper can cause an otherwise successful streamed agent response to fail at stream completion with a ContextVar reset error.
This is concerning because observability should not be able to interrupt normal agent execution. If telemetry capture or cleanup fails, the agent stream should still complete normally and telemetry failure should be isolated.
Environment
- Python: 3.13
- Microsoft Agent Framework:
  - agent-framework-core==1.10.0
  - agent-framework-openai==1.10.0
- OpenTelemetry:
  - azure-monitor-opentelemetry==1.8.8
  - opentelemetry-api==1.40.0
  - opentelemetry-sdk==1.40.0
- Host: Azure Functions Python worker, run locally with func start
- Execution style: HTTP endpoint returning SSE
- Agent execution path: agent.run(..., stream=True)
Configuration
The issue reproduces when MAF instrumentation is enabled:
ENABLE_INSTRUMENTATION=true
ENABLE_SENSITIVE_DATA=false
The same streamed agent path completes successfully when instrumentation is disabled:
ENABLE_INSTRUMENTATION=false
Minimal code shape
The application creates a MAF agent and streams updates to an HTTP/SSE response:
stream = agent.run(full_messages, stream=True)
async for update in stream:
    # translate AgentResponseUpdate into SSE events
    ...
With instrumentation enabled, the returned stream appears to be wrapped by MAF observability. The stream produces text deltas successfully, but fails during completion/finalization.
Observed behavior
The model response streams successfully. The application receives and emits text events such as:
start-step
text-start
text-delta
...
text-end
Then, when the stream is exhausted, MAF observability cleanup runs and raises:
ValueError: <Token var=<ContextVar name='inner_response_telemetry_captured_fields' default=None ...>> was created in a different Context
Full relevant traceback:
  File ...site-packages\agent_framework\_types.py, line 3149, in __anext__
    update = await self._iterator.__anext__()
StopAsyncIteration

During handling of the above exception, another exception occurred:

  File ...\runners\agent_runner.py, line 605, in run_prompt_agent_stream
    async for update in stream:
  File ...site-packages\agent_framework\_types.py, line 3152, in __anext__
    await self._run_cleanup_hooks()
  File ...site-packages\agent_framework\_types.py, line 3357, in _run_cleanup_hooks
    await result
  File ...site-packages\agent_framework\observability.py, line 1878, in _finalize_stream
    INNER_RESPONSE_TELEMETRY_CAPTURED_FIELDS.reset(inner_response_telemetry_captured_fields_token)
ValueError: <Token var=<ContextVar name='inner_response_telemetry_captured_fields' default=None ...>> was created in a different Context
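For reference, the same class of error can be reproduced with the standard library alone. The sketch below is not MAF code: the variable name captured_fields and the helper set_in_task are hypothetical, and it only assumes (without confirming) that the wrapper ends up setting the token in one asyncio context and resetting it in another. It shows that a token created inside a Task, which runs in a copy of the caller's Context, cannot be reset from the caller.

```python
import asyncio
import contextvars

# Hypothetical stand-in for the framework's ContextVar; not the real MAF variable.
captured_fields = contextvars.ContextVar("captured_fields", default=None)


async def set_in_task():
    # A Task runs in a copy of the Context that created it,
    # so the returned token is bound to that copied Context.
    return captured_fields.set({"model": "example"})


async def main():
    token = await asyncio.create_task(set_in_task())
    try:
        # Reset is attempted from the outer Context, not the Task's copy.
        captured_fields.reset(token)
    except ValueError as exc:
        return str(exc)
    return None


message = asyncio.run(main())
print(message)
```

The printed message ends with "was created in a different Context", matching the tail of the traceback above.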
The HTTP request itself may still return 200 because the app catches stream exceptions and closes the SSE response, but the domain agent run is marked failed. The stream does not emit the normal completion lifecycle events.
Expected behavior
Enabling observability should not alter the success/failure semantics of streamed agent execution.
Expected behavior would be one of:
- The telemetry wrapper handles its ContextVar cleanup correctly across async contexts.
- If telemetry finalization fails, the failure is captured/logged internally and does not propagate into the user's async for update in stream loop.
- The stream still completes normally and returns/raises only errors from the actual agent/model/tool execution, not telemetry cleanup.
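The isolation described in the list above could look roughly like the following. This is a minimal sketch under stated assumptions, not the framework's implementation: IsolatedCleanupStream, fake_updates, and failing_hook are hypothetical names invented for illustration. The wrapper runs cleanup hooks once the inner iterator is exhausted, records any hook failure, and still ends the iteration normally.

```python
import asyncio
import logging

logger = logging.getLogger("telemetry_isolation_sketch")


class IsolatedCleanupStream:
    """Hypothetical wrapper: yields updates, then runs cleanup hooks without letting them raise."""

    def __init__(self, inner, cleanup_hooks):
        self._inner = inner.__aiter__()
        self._cleanup_hooks = list(cleanup_hooks)
        self._finalized = False
        self.cleanup_errors = []

    def __aiter__(self):
        return self

    async def __anext__(self):
        try:
            return await self._inner.__anext__()
        except StopAsyncIteration:
            # Exhaustion is the normal end of the stream: clean up, then re-raise it.
            await self._run_cleanup_hooks()
            raise

    async def _run_cleanup_hooks(self):
        if self._finalized:
            return
        self._finalized = True
        for hook in self._cleanup_hooks:
            try:
                await hook()
            except Exception as exc:
                # Telemetry failures are logged and recorded, never propagated to the consumer.
                logger.warning("telemetry cleanup failed: %r", exc)
                self.cleanup_errors.append(exc)


async def fake_updates():
    # Stand-in for streamed agent updates.
    for text in ("Hel", "lo"):
        yield text


async def failing_hook():
    # Simulates a cleanup hook that fails the way the reported one does.
    raise ValueError("token was created in a different Context")


async def consume():
    stream = IsolatedCleanupStream(fake_updates(), [failing_hook])
    received = [update async for update in stream]
    return received, stream.cleanup_errors


received, errors = asyncio.run(consume())
print(received, errors)
```

With this shape the consumer's loop finishes after the last update, and the hook failure is only visible through the log and the recorded cleanup_errors list.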
Actual behavior
With ENABLE_INSTRUMENTATION=true, stream cleanup/finalization raises from MAF observability internals. That exception bubbles out of the streamed execution loop and causes the application to treat the agent run as failed.
With ENABLE_INSTRUMENTATION=false, the same E2E scenario emits the expected completion sequence:
text-end
finish-step
data-workflow-end status=completed
finish
[DONE]
Why this is high impact
Observability should be non-invasive. In this case, enabling observability fundamentally changes the stream implementation by wrapping the response stream and adding cleanup hooks. A telemetry cleanup failure can therefore fail regular agent execution even after the model response has completed successfully.
For production systems, this makes it unsafe to enable MAF observability on streamed agents because telemetry can become part of the critical execution path.
Related observation
The 1.10.0 release notes mention fixes around telemetry context handling, including background agent telemetry context errors and streaming span context preservation. This issue appears to be in the same family, but affects streamed agent consumption in an Azure Functions HTTP/SSE async context.
Repro notes
We verified locally:
- Same test with ENABLE_INSTRUMENTATION=true: the stream emits text, then fails during MAF _finalize_stream with the ContextVar reset error.
- Same test with ENABLE_INSTRUMENTATION=false: the stream completes normally with final lifecycle events and the run is marked completed.
Suggested fix direction
The telemetry cleanup/finalization hook should not propagate cleanup errors into the user stream. More importantly, ContextVar tokens should be set and reset in the exact same async context, or the cleanup logic should guard against cross-context reset attempts.
At minimum, observability failures should be isolated from agent execution so enabling telemetry cannot turn a successful streamed run into a failed run.
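One possible form of the guard against cross-context reset attempts is sketched below using only the standard library. The helper reset_if_same_context and the variable captured_fields are hypothetical and do not reflect MAF internals. It relies on documented ContextVar.reset behavior: ValueError when the token belongs to a different Context or variable, and RuntimeError when the token has already been used.

```python
import contextvars
import logging

logger = logging.getLogger("contextvar_guard_sketch")

# Hypothetical stand-in for the framework's ContextVar.
captured_fields = contextvars.ContextVar("captured_fields", default=None)


def reset_if_same_context(var, token):
    """Reset var with token; return False instead of raising when the token cannot be used here."""
    try:
        var.reset(token)
        return True
    except (ValueError, RuntimeError) as exc:
        # ValueError: token created in another Context (or by another variable).
        # RuntimeError: token was already used once.
        logger.debug("skipping ContextVar reset: %r", exc)
        return False


# Token created inside a copied Context, so it is foreign to the current one.
foreign_token = contextvars.copy_context().run(captured_fields.set, {"a": 1})
# Token created in the current Context.
local_token = captured_fields.set({"b": 2})

foreign_result = reset_if_same_context(captured_fields, foreign_token)
local_result = reset_if_same_context(captured_fields, local_token)
value_after = captured_fields.get()
print(foreign_result, local_result, value_after)
```

A guard like this only prevents the exception from escaping. It does not restore the value in the Context where the token was created, so setting and resetting in the same context remains the more complete fix.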