# Port from Chat Completions API to Responses API #292

Status: Open
## Purpose

Migrate the entire application from the OpenAI Chat Completions API to the newer Responses API. This modernizes the codebase to use `client.responses.create()` instead of `client.chat.completions.create()`, aligning with OpenAI's recommended API surface going forward.

## Key changes
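At the call-site level, the core change looks roughly like this. The sketch below is illustrative, not the repo's actual code; the parameter renames (`messages` → `input`, `max_tokens` → `max_output_tokens`) follow the public OpenAI Python SDK, but each call site needs per-case review:

```python
def to_responses_kwargs(chat_kwargs: dict) -> dict:
    """Map Chat Completions arguments to their Responses API equivalents.

    Illustrative helper only: covers the common renames and drops `seed`,
    which this PR removes from the request overrides.
    """
    mapping = {"messages": "input", "max_tokens": "max_output_tokens"}
    out = {}
    for key, value in chat_kwargs.items():
        if key == "seed":
            continue  # `seed` is removed in this PR
        out[mapping.get(key, key)] = value
    return out


# Old style:
#   await client.chat.completions.create(model=..., messages=..., max_tokens=...)
# New style:
#   await client.responses.create(model=..., input=..., max_output_tokens=...)
new_kwargs = to_responses_kwargs(
    {
        "model": "gpt-5.4",
        "messages": [{"role": "user", "content": "hi"}],
        "max_tokens": 100,
        "seed": 42,
    }
)
```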
### Backend core (`src/backend/fastapi_app/`)

- `openai_clients.py`: Replace `AsyncAzureOpenAI` with `AsyncOpenAI` using `base_url="{endpoint}/openai/v1/"` for Azure OpenAI. Remove the `AZURE_OPENAI_VERSION` parameter (no longer needed with the `/v1/` endpoint). Remove the GitHub Models host option.
- `dependencies.py`: Simplify client creation to use a unified `AsyncOpenAI` client for both Azure and OpenAI.com, with token-based auth or API key.
- `rag_simple.py` / `rag_advanced.py`: Replace `chat.completions.create()` calls with `responses.create()`. Update the message format from `ChatCompletionMessageParam` to `ResponseInputItemParam`. Update streaming to use Responses API event types.
- `query_rewriter.py`: Port query rewriting and search argument extraction to use `responses.create()` with the new tool call format (`ResponseFunctionToolCall`).
- `embeddings.py`: Switch the Azure embedding client from `AsyncAzureOpenAI` to `AsyncOpenAI` with the `/v1/` base URL.
- `api_models.py`: Remove the `seed` field from `ChatRequestOverrides`.
- `prompts/query_fewshots.json`: Update the few-shot examples to use the Responses API message format.
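The unified-client change hinges on Azure's `/openai/v1/` path, which lets the plain `AsyncOpenAI` client talk to an Azure resource without an `api-version` query parameter. A minimal sketch (the helper name is hypothetical; the repo may simply inline the string):

```python
def azure_v1_base_url(endpoint: str) -> str:
    """Build the Responses-capable base URL for an Azure OpenAI resource.

    With the /openai/v1/ path, AZURE_OPENAI_VERSION is no longer needed.
    Hypothetical helper for illustration.
    """
    return endpoint.rstrip("/") + "/openai/v1/"


# Usage with the unified client (sketch; auth may be a token provider or API key):
# from openai import AsyncOpenAI
# client = AsyncOpenAI(
#     base_url=azure_v1_base_url("https://my-resource.openai.azure.com"),
#     api_key=token_or_key,
# )
url = azure_v1_base_url("https://my-resource.openai.azure.com")
```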
### Infrastructure (`infra/`)

- Remove `AZURE_OPENAI_VERSION` from `main.bicep` and `main.parameters.json`.
- Update the default chat deployment to `gpt-5.4` with deployment version `2026-03-05`.
### Evals

- `generate_ground_truth.py`: Port to the Responses API with updated tool definitions.
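The tool-definition shape also changes between the two APIs: Chat Completions nests the function under a `"function"` key, while the Responses API flattens it to the top level. A hedged converter sketch (field names follow the public SDK docs; the PR's actual tool definitions may differ):

```python
def chat_tool_to_responses_tool(tool: dict) -> dict:
    """Flatten a Chat Completions function tool into the Responses API shape.

    Chat Completions: {"type": "function", "function": {"name": ..., "parameters": ...}}
    Responses API:    {"type": "function", "name": ..., "parameters": ...}
    """
    fn = tool["function"]
    return {
        "type": "function",
        "name": fn["name"],
        "description": fn.get("description", ""),
        "parameters": fn["parameters"],
    }


# Hypothetical example tool, not one from the repo:
search_tool = chat_tool_to_responses_tool(
    {
        "type": "function",
        "function": {
            "name": "search_database",
            "description": "Search the product database",
            "parameters": {
                "type": "object",
                "properties": {"query": {"type": "string"}},
            },
        },
    }
)
```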
### Tests

- Update `conftest.py` to return Responses API objects (`Response`, `ResponseFunctionToolCall`, etc.) instead of Chat Completions objects.
- Update `test_openai_clients.py` to reflect the unified client approach.
- Update the snapshots in `tests/snapshots/` for the new response format.
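For the fixtures, the mocks now need to look like Responses API objects rather than Chat Completions ones. The real `conftest.py` presumably constructs the SDK's own `Response` / `ResponseFunctionToolCall` types; the dataclasses below are hypothetical stand-ins that mimic only the shape tests typically touch:

```python
from dataclasses import dataclass, field


@dataclass
class FakeOutputText:
    text: str
    type: str = "output_text"


@dataclass
class FakeMessage:
    content: list
    role: str = "assistant"
    type: str = "message"


@dataclass
class FakeResponse:
    """Hypothetical stand-in for the SDK's Response object."""
    output: list = field(default_factory=list)

    @property
    def output_text(self) -> str:
        # Mirrors the SDK convenience property: concatenate all text parts
        # from message items in the output list.
        return "".join(
            part.text
            for item in self.output if item.type == "message"
            for part in item.content if part.type == "output_text"
        )


resp = FakeResponse(
    output=[FakeMessage(content=[FakeOutputText(text="Hello from the mock")])]
)
```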
### Config/DevContainer

- Remove `AZURE_OPENAI_VERSION` from `.env.sample`, `azure.yaml`, and the CI workflow.
- Update the `openai` package version constraint.

## Does this introduce a breaking change?
When developers merge from main and run the server, `azd up`, or `azd deploy`, will this produce an error?
If you're not sure, try it out on an old environment.
Yes, there are several breaking changes:

- The `AZURE_OPENAI_VERSION` environment variable is removed — existing `.env` files with this variable will still work (it's just ignored), but the Bicep parameter is gone.
- The `seed` field is removed from the chat request overrides API.
- The GitHub Models host option (`OPENAI_CHAT_HOST=github`) is removed.
- The default chat model is now `gpt-5.4` — existing environments may need to update their model deployment.

## Type of change
## Code quality checklist

See CONTRIBUTING.md for more details.

- I ran the tests (`python -m pytest`).
- I ran `python -m pytest --cov` to verify 100% coverage of added lines.
- I ran `python -m mypy` to check for type errors.
- I ran `ruff` manually on my code.