Python: refactor FoundryHostedAgentHistoryProvider onto Foundry SDK by eavanvalkenburg · Pull Request #5637 · microsoft/agent-framework

eavanvalkenburg · 2026-05-05T08:50:09Z

Motivation and Context

This PR is the first piece of the Python hosting Channels work that landed as ADR / spec in #5549 and is being merged into the feature/python-hosting integration branch. It refactors agent-framework-foundry-hosting so the rest of the hosting-Channels stack can build on top of a clean shared layer.

Description

Splits the previously monolithic _responses.py into focused modules:
- _history_provider.py — FoundryHostedAgentHistoryProvider exposed as the public history-provider entry point.
- _shared.py / _ids.py — internal building blocks reused across Responses and the upcoming hosting Channels.
Updates the public __init__.py exports to surface the new history-provider type and prepare for the hosting integration.
Adds focused unit tests (tests/test_history_provider.py, expanded tests/test_responses.py).
Workspace + lockfile updates so the package keeps resolving cleanly.

Stack

This is the first PR (PR-1 of 9) in the hosting-Channels stack, all targeting feature/python-hosting:

#	Branch	Status
PR-1	`refactor/foundry-hosted-agent-history-provider`	this PR
PR-2	`feat/hosting-core`	independent of PR-1
PR-3..7	per-channel packages	depend on PR-2
PR-8	`feat/hosting-samples`	depends on PR-1 + PR-2..7

PR-1 is independent of all the others — can merge in any order relative to PR-2.

Contribution Checklist

The code builds clean without any errors or warnings
The PR follows the Contribution Guidelines
All unit tests pass, and I have added new tests where possible
Is this a breaking change? No — internal refactor; the only public surface change is the new FoundryHostedAgentHistoryProvider re-export.

… azure.ai.agentserver SDK Rebuilds the Foundry hosted-agent history provider on top of ``azure.ai.agentserver``'s ``FoundryStorageProvider`` instead of the in-house ``_HttpStorageBackend``. Splits the monolithic ``_responses.py`` into focused modules: - ``_history_provider.py`` — new ``FoundryHostedAgentHistoryProvider`` that talks to the SDK's ``FoundryStorageProvider``, threads ``response_id`` / ``previous_response_id`` through ``ContextVar``s via ``bind_request_context``, and lifts host-bound isolation keys (``x-agent-{user,chat}-isolation-key``) from the optional ``agent_framework_hosting`` package into a provider-local ``IsolationContext`` so the storage layer carries the correct partition keys without channels having to know about them. - ``_shared.py`` — extracts all SDK ``Item`` / ``OutputItem`` ↔ framework ``Message`` conversion helpers into one place so both ``_responses.py`` and the new history provider can share them. Restores ``_convert_file_data`` for inline ``input_file`` payloads, and the hosted-MCP routing for ``custom_tool_call_output`` items whose ``call_id`` carries the ``mcp_*`` prefix. - ``_ids.py`` — shared id helpers. - ``_responses.py`` — shrinks ~700 lines, re-exports converters for back-compat with existing tests. - ``tests/test_history_provider.py`` — exercises the new provider against a fake SDK backend; the host-isolation test is gated on the optional ``agent_framework_hosting`` import. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

github-actions

Automated Code Review

Reviewers: 4 | Confidence: 82%

✓ Correctness

The new _shared.py module extracts and consolidates conversion helpers from _responses.py. The code is structurally sound and consistent with the existing implementation. The _collect_unknown_keys / _inject_extras pair is symmetric, ensuring correct round-trip semantics. The test coverage is thorough, including end-to-end round-trip tests through InMemoryResponseProvider. No blocking correctness issues were found.

✓ Security Reliability

No actionable issues found in this dimension.

✓ Test Coverage

No actionable issues found in this dimension.

✗ Design Approach

The consolidation into _shared.py is directionally good, but two design issues in the new persistence path would corrupt history semantics. First, mcp_approval_response is reconstructed with the response record id instead of the approval request id, which breaks the framework’s own approval-response contract and can cause replayed history to target the wrong request. Second, the text-only fallback now concatenates adjacent text fragments without separators, so persisted messages can come back with different user-visible text than the original Message carried.

Flagged Issues

MCP approval responses are loaded with Content.id = mcp_resp.id, but the framework and OpenAI adapter both treat Content.id as the approval_request_id (_types.py:1289-1292, _chat_client.py:1645-1649). Replaying history can therefore approve the wrong MCP request.

Automated review by eavanvalkenburg's agents

Adds an optional `local_storage_root: str | Path | None` parameter to `FoundryHostedAgentHistoryProvider`. When set and the provider is running outside a Foundry Hosted Agent container, conversations are persisted to JSONL files via `agent_framework.FileHistoryProvider` laid out as: {root}/{user_key or '~none'}/{chat_key or '~none'}/{session_id}.jsonl Hosted mode (FOUNDRY_HOSTING_ENVIRONMENT set) ignores the option with a one-time INFO log so Foundry storage always wins on the platform. The in-memory fallback is unchanged when the option is omitted. Path safety: isolation segments are validated against the same character allowlist FileHistoryProvider uses for session-id stems and base64-url-encoded with a reserved "~iso-" prefix when unsafe. "~none" sentinel for missing keys can never collide with a real isolation key (real keys starting with "~" are encoded). The resolved target dir is also re-checked to be inside the configured root. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

- _shared.py:_capture_raw narrows `except Exception` to `except TypeError` and emits a WARNING with traceback so the lossy fallback to a synthesized round-trip is observable. Mirrors the reviewer suggestion. - _history_provider.py:save_messages narrows `except Exception` to `except FoundryStorageError` so only storage-validation failures (4xx/5xx, opaque server errors) are swallowed. Network / TLS / auth / payload-builder bugs propagate so the caller can retry / alert. Adds an instance-level `failed_writes` counter operators can poll for silent-drop visibility. - _history_provider.py id-stamping loop: drops the `contextlib.suppress(AttributeError, TypeError)` around `item.id = new_id` so SDK contract changes surface in the test suite instead of silently corrupting the chain (the storage backend rejects the entire `create_response` with HTTP 500 when synthetic prefix-based ids leak through). `import contextlib` removed. - tests: * Unit-cover `foundry_response_id` / `foundry_response_id_factory` / `foundry_item_id` so SDK `IdGenerator` contract changes are caught locally. * Cover the `save_messages` wire payload: required-by-storage fields (`background`, `parallel_tool_calls`, `instructions`, `agent_reference`), env-var-driven stamping (`FOUNDRY_AGENT_NAME` / `FOUNDRY_AGENT_VERSION` / `FOUNDRY_AGENT_SESSION_ID` / `MODEL_DEPLOYMENT_NAME` with `AZURE_AI_MODEL_DEPLOYMENT_NAME` fallback), and the rule that `model` / `agent_session_id` / `agent_reference.version` are omitted (not stamped to `None`) when their env vars are unset. * Cover the `FOUNDRY_AGENT_SESSION_ID` last-resort chain anchor on both the get and save paths, including the prefix gate that blocks non-`caresp_*`/`resp_*` values from reaching storage, and the precedence rule that a host binding wins over the env. * Replace the old `test_save_messages_swallows_backend_errors` with two tests asserting the new contract: storage errors are swallowed and bump `failed_writes`; everything else propagates and leaves the counter at zero. 141 unit tests pass; mypy + pyright + ruff clean. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

TaoChenOSU · 2026-05-06T21:10:54Z

+        """Return ``True`` when running inside a Foundry Hosted Agent container.
+
+        Detection uses the ``FOUNDRY_HOSTING_ENVIRONMENT`` environment
+        variable, the same signal :class:`ResponsesAgentServerHost` uses to
+        switch between hosted and local storage backends.
+        """
+        return bool(os.environ.get(_ENV_FOUNDRY_HOSTING_ENVIRONMENT))


This will break if the platform decides to change the env variable. We should use AgentConfig.is_hosted()

TaoChenOSU · 2026-05-06T21:15:12Z

+            :class:`FileHistoryProvider` and ``session_id`` is used as
+            the literal file stem.
+        """
+        isolation = kwargs.get("isolation") or _host_isolation() or get_current_isolation()


I am not sure if we should manage the context isolation in our layer. For hosting scenario, isolation should be managed by the infrastructure. For local scenario, why would it require isolation?

TaoChenOSU · 2026-05-06T21:16:43Z

+            # The Foundry Hosted Agent runtime stamps the previous turn's
+            # response id into ``FOUNDRY_AGENT_SESSION_ID`` for the
+            # following turn's container, so we can walk back from it
+            # directly without keeping any cross-request state ourselves.
+            env_session = os.environ.get("FOUNDRY_AGENT_SESSION_ID") or None


The session ID is bound to the container, not the conversation. I don't think the session ID will give you any history. Has this been tested in a hosted environment?

TaoChenOSU · 2026-05-06T21:21:39Z

+        backend = self._resolve_backend()
+
+        try:
+            item_ids = await backend.get_history_item_ids(


The ResponseContext provides a get_history method that handles retrieving the items and isolation. Could we reuse that?

I'd like to actually explore the idea of encapsulating the foundry history provider into our own provider abstraction.

moonbox3 added the python label May 5, 2026

github-actions Bot reviewed May 5, 2026

View reviewed changes

moonbox3 reviewed May 6, 2026

View reviewed changes

TaoChenOSU approved these changes May 6, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Python: refactor FoundryHostedAgentHistoryProvider onto Foundry SDK#5637

Python: refactor FoundryHostedAgentHistoryProvider onto Foundry SDK#5637
eavanvalkenburg wants to merge 3 commits intomicrosoft:feature/python-hostingfrom
eavanvalkenburg:refactor/foundry-hosted-agent-history-provider

eavanvalkenburg commented May 5, 2026

Uh oh!

github-actions Bot left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

TaoChenOSU May 6, 2026

Uh oh!

TaoChenOSU May 6, 2026

Uh oh!

TaoChenOSU May 6, 2026 •

edited

Loading

Uh oh!

TaoChenOSU May 6, 2026

Uh oh!

TaoChenOSU May 6, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

eavanvalkenburg commented May 5, 2026

Motivation and Context

Description

Stack

Contribution Checklist

Uh oh!

github-actions Bot left a comment

Choose a reason for hiding this comment

Automated Code Review

✓ Correctness

✓ Security Reliability

✓ Test Coverage

✗ Design Approach

Flagged Issues

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

TaoChenOSU May 6, 2026

Choose a reason for hiding this comment

Uh oh!

TaoChenOSU May 6, 2026

Choose a reason for hiding this comment

Uh oh!

TaoChenOSU May 6, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

TaoChenOSU May 6, 2026

Choose a reason for hiding this comment

Uh oh!

TaoChenOSU May 6, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

TaoChenOSU May 6, 2026 •

edited

Loading