# Port from Chat Completions API to Responses API #292

Status: Open
## Purpose

Migrate the entire application from the OpenAI Chat Completions API to the newer Responses API. This modernizes the codebase to use `client.responses.create()` instead of `client.chat.completions.create()`, aligning with OpenAI's recommended API surface going forward.

## Key changes
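At the call-site level, the core change looks roughly like this. The sketch below is illustrative, not the repo's actual code; the parameter renames (`messages` → `input`, `max_tokens` → `max_output_tokens`) follow the public OpenAI Python SDK, but each call site needs per-case review:

```python
def to_responses_kwargs(chat_kwargs: dict) -> dict:
    """Map Chat Completions arguments to their Responses API equivalents.

    Illustrative helper only: covers the common renames and drops `seed`,
    which this PR removes from the request overrides.
    """
    mapping = {"messages": "input", "max_tokens": "max_output_tokens"}
    out = {}
    for key, value in chat_kwargs.items():
        if key == "seed":
            continue  # `seed` is removed in this PR
        out[mapping.get(key, key)] = value
    return out


# Old style:
#   await client.chat.completions.create(model=..., messages=..., max_tokens=...)
# New style:
#   await client.responses.create(model=..., input=..., max_output_tokens=...)
new_kwargs = to_responses_kwargs(
    {
        "model": "gpt-5.4",
        "messages": [{"role": "user", "content": "hi"}],
        "max_tokens": 100,
        "seed": 42,
    }
)
```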
### Backend core (`src/backend/fastapi_app/`)

- `openai_clients.py`: Replace `AsyncAzureOpenAI` with `AsyncOpenAI` using `base_url="{endpoint}/openai/v1/"` for Azure OpenAI. Remove the `AZURE_OPENAI_VERSION` parameter (no longer needed with the `/v1/` endpoint). Remove the GitHub Models host option.
- `dependencies.py`: Simplify client creation to use a unified `AsyncOpenAI` client for both Azure and OpenAI.com, with token-based auth or API key.
- `rag_simple.py` / `rag_advanced.py`: Replace `chat.completions.create()` calls with `responses.create()`. Update the message format from `ChatCompletionMessageParam` to `ResponseInputItemParam`. Update streaming to use Responses API event types.
- `query_rewriter.py`: Port query rewriting and search argument extraction to use `responses.create()` with the new tool call format (`ResponseFunctionToolCall`).
- `embeddings.py`: Switch the Azure embedding client from `AsyncAzureOpenAI` to `AsyncOpenAI` with the `/v1/` base URL.
- `api_models.py`: Remove the `seed` field from `ChatRequestOverrides`.
- `prompts/query_fewshots.json`: Update the few-shot examples to use the Responses API message format.
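The unified-client change hinges on Azure's `/openai/v1/` path, which lets the plain `AsyncOpenAI` client talk to an Azure resource without an `api-version` query parameter. A minimal sketch (the helper name is hypothetical; the repo may simply inline the string):

```python
def azure_v1_base_url(endpoint: str) -> str:
    """Build the Responses-capable base URL for an Azure OpenAI resource.

    With the /openai/v1/ path, AZURE_OPENAI_VERSION is no longer needed.
    Hypothetical helper for illustration.
    """
    return endpoint.rstrip("/") + "/openai/v1/"


# Usage with the unified client (sketch; auth may be a token provider or API key):
# from openai import AsyncOpenAI
# client = AsyncOpenAI(
#     base_url=azure_v1_base_url("https://my-resource.openai.azure.com"),
#     api_key=token_or_key,
# )
url = azure_v1_base_url("https://my-resource.openai.azure.com")
```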
### Infrastructure (`infra/`)

- Remove `AZURE_OPENAI_VERSION` from `main.bicep` and `main.parameters.json`.
- Update the default chat deployment to `gpt-5.4` with deployment version `2026-03-05`.
### Evals

- `generate_ground_truth.py`: Port to the Responses API with updated tool definitions.
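The tool-definition shape also changes between the two APIs: Chat Completions nests the function under a `"function"` key, while the Responses API flattens it to the top level. A hedged converter sketch (field names follow the public SDK docs; the PR's actual tool definitions may differ):

```python
def chat_tool_to_responses_tool(tool: dict) -> dict:
    """Flatten a Chat Completions function tool into the Responses API shape.

    Chat Completions: {"type": "function", "function": {"name": ..., "parameters": ...}}
    Responses API:    {"type": "function", "name": ..., "parameters": ...}
    """
    fn = tool["function"]
    return {
        "type": "function",
        "name": fn["name"],
        "description": fn.get("description", ""),
        "parameters": fn["parameters"],
    }


# Hypothetical example tool, not one from the repo:
search_tool = chat_tool_to_responses_tool(
    {
        "type": "function",
        "function": {
            "name": "search_database",
            "description": "Search the product database",
            "parameters": {
                "type": "object",
                "properties": {"query": {"type": "string"}},
            },
        },
    }
)
```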
### Tests

- Update `conftest.py` to return Responses API objects (`Response`, `ResponseFunctionToolCall`, etc.) instead of Chat Completions objects.
- Update `test_openai_clients.py` to reflect the unified client approach.
- Update the snapshots in `tests/snapshots/` for the new response format.
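For the fixtures, the mocks now need to look like Responses API objects rather than Chat Completions ones. The real `conftest.py` presumably constructs the SDK's own `Response` / `ResponseFunctionToolCall` types; the dataclasses below are hypothetical stand-ins that mimic only the shape tests typically touch:

```python
from dataclasses import dataclass, field


@dataclass
class FakeOutputText:
    text: str
    type: str = "output_text"


@dataclass
class FakeMessage:
    content: list
    role: str = "assistant"
    type: str = "message"


@dataclass
class FakeResponse:
    """Hypothetical stand-in for the SDK's Response object."""
    output: list = field(default_factory=list)

    @property
    def output_text(self) -> str:
        # Mirrors the SDK convenience property: concatenate all text parts
        # from message items in the output list.
        return "".join(
            part.text
            for item in self.output if item.type == "message"
            for part in item.content if part.type == "output_text"
        )


resp = FakeResponse(
    output=[FakeMessage(content=[FakeOutputText(text="Hello from the mock")])]
)
```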
### Config/DevContainer

- Remove `AZURE_OPENAI_VERSION` from `.env.sample`, `azure.yaml`, and the CI workflow.
- Update the `openai` package version constraint.

## Does this introduce a breaking change?
When developers merge from main and run the server, `azd up`, or `azd deploy`, will this produce an error?
If you're not sure, try it out on an old environment.
Yes, there are several breaking changes:

- The `AZURE_OPENAI_VERSION` environment variable is removed — existing `.env` files with this variable will still work (it's just ignored), but the Bicep parameter is gone.
- The `seed` field is removed from the chat request overrides API.
- The GitHub Models host option (`OPENAI_CHAT_HOST=github`) is removed.
- The default chat model is now `gpt-5.4` — existing environments may need to update their model deployment.

## Type of change
## Code quality checklist

See CONTRIBUTING.md for more details.

- I ran the tests (`python -m pytest`).
- I ran `python -m pytest --cov` to verify 100% coverage of added lines.
- I ran `python -m mypy` to check for type errors.
- I ran `ruff` manually on my code.