
.Net: fix(connectors): Support request-level ModelId overrides for Google, Vertex AI, and OpenAI#13999

Open
Yusuftmle wants to merge 12 commits into
microsoft:mainfrom
Yusuftmle:fix/google-connector-model-id-override

Conversation


@Yusuftmle Yusuftmle commented May 12, 2026

Summary

This PR resolves the issue where the Google AI, Vertex AI, and OpenAI connectors ignored request-level ModelId overrides (such as those supplied via PromptExecutionSettings or EmbeddingGenerationOptions) and always fell back to the model ID passed to the service constructor.

Resolves #13287

Note

Also includes changes from #13011 (OpenAIResponseAgent exception handling).

Motivation

In modern multi-agent or dynamically-routed applications, developers often initialize a single LLM or Embedding service instance (e.g., with a default fast model) and want to dynamically override the model to a larger or specialized version for specific requests using the request-level execution options.
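As a rough sketch of that usage pattern (model names and the API key are placeholders; the exact builder extension and settings type follow the Google connector's public API as described in this PR):

```csharp
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.ChatCompletion;
using Microsoft.SemanticKernel.Connectors.Google;

// One service instance, registered with a fast default model.
var kernel = Kernel.CreateBuilder()
    .AddGoogleAIGeminiChatCompletion(modelId: "gemini-1.5-flash", apiKey: "<api-key>")
    .Build();

var chat = kernel.GetRequiredService<IChatCompletionService>();
var history = new ChatHistory();
history.AddUserMessage("Summarize this contract clause in one paragraph.");

// Request-level override: route just this call to a larger model.
var settings = new GeminiPromptExecutionSettings { ModelId = "gemini-1.5-pro" };
var reply = await chat.GetChatMessageContentAsync(history, settings, kernel);
```

Before this PR, the call above would silently hit `gemini-1.5-flash` regardless of the override.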

Previously:

  • Google AI and Vertex AI connectors pre-computed endpoint URIs synchronously inside their client/service constructors, ignoring any custom ModelId supplied on the request options.
  • The OpenAIChatCompletionService was hardcoded to forward only this._client.ModelId to its core client, completely bypassing executionSettings?.ModelId.

With this PR, all of these connectors honor request-level model overrides, with a fast path that avoids rebuilding endpoints when the default model is requested.


Changes Made

Microsoft.SemanticKernel.Connectors.Google

  1. GeminiPromptExecutionSettings.cs:

    • Updated FromExecutionSettings to ensure that settings.ModelId is explicitly copied from the original executionSettings.ModelId during option translation/deserialization.
  2. GoogleAIEmbeddingClient.cs & VertexAIEmbeddingClient.cs:

    • Stored version and location parameters (_apiVersion, _location, _projectId) in private fields.
    • Introduced a fast path in GetEmbeddingEndpoint(string modelId) that reuses the constructor-built _embeddingEndpoint when the requested model matches the default.
    • Refactored GetEmbeddingRequest to accept the resolved, validated modelId directly, avoiding redundant whitespace checks.
    • Updated GenerateEmbeddingsAsync to resolve the endpoint per request, so the API call targets the overridden model.
  3. GeminiChatCompletionClient.cs:

    • Saved API version and project details from Google AI and Vertex AI constructors in dedicated private fields.
    • Implemented GetEndpoints(string modelId) to dynamically construct both generation and streaming endpoints for the requested model override.
    • Updated chat generation (GenerateChatMessageAsync) and streaming (StreamGenerateChatMessageAsync) routines to resolve the runtime modelId and use the dynamic endpoints (with clean variable discards for unused endpoints).
    • Propagated the resolved modelId into all response and metadata factories so that the returned chat content has the correct executing model name.
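The endpoint fast path described above can be sketched roughly as follows (field and method names mirror the PR description, but the exact URI format and signatures are assumptions, not the literal implementation):

```csharp
// Sketch of GetEmbeddingEndpoint with the default-model fast path.
private Uri GetEmbeddingEndpoint(string modelId)
{
    // Fast path: the request targets the default model, so reuse the
    // endpoint that was precomputed in the constructor.
    if (string.Equals(modelId, this._embeddingModelId, StringComparison.Ordinal))
    {
        return this._embeddingEndpoint;
    }

    // Slow path: rebuild the endpoint for the overridden model from the
    // stored version field. (URI shape is illustrative only.)
    return new Uri(
        $"https://generativelanguage.googleapis.com/{this._apiVersion}" +
        $"/models/{modelId}:batchEmbedContents");
}
```

The design keeps the common case (no override) allocation-free while still supporting arbitrary per-request models.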

Microsoft.SemanticKernel.Connectors.OpenAI

  1. OpenAIChatCompletionService.cs:
    • Updated all completion and text generation methods (GetChatMessageContentsAsync, GetStreamingChatMessageContentsAsync, GetTextContentsAsync, and GetStreamingTextContentsAsync) to forward the overridden model, guarded by a string.IsNullOrWhiteSpace check so that null or blank model identifiers are never forwarded.
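The forwarding guard described above amounts to the following pattern (a sketch based on the PR summary, not the literal diff; `executionSettings` and `this._client` follow the names used elsewhere in this description):

```csharp
// Prefer the request-level override, but never forward a null/blank id.
string targetModelId = !string.IsNullOrWhiteSpace(executionSettings?.ModelId)
    ? executionSettings!.ModelId!
    : this._client.ModelId;

return this._client.GetChatMessageContentsAsync(
    targetModelId, chatHistory, executionSettings, kernel, cancellationToken);
```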

Test Coverage

  1. Google AI & Vertex AI:

    • Added unit test GetChatMessageContentsAsyncUsesModelIdFromExecutionSettingsAsync inside GoogleAIGeminiChatCompletionServiceTests.cs to verify that overriding ModelId in GeminiPromptExecutionSettings correctly updates the target endpoint URI and returned model properties.
    • All 435 tests in the Google suite pass successfully.
  2. OpenAI:

    • Added unit test GetChatMessageContentsAsyncUsesModelIdFromExecutionSettingsAsync inside OpenAIChatCompletionServiceTests.cs to verify that overriding ModelId in OpenAIPromptExecutionSettings successfully routes the OpenAI payload with the overridden model identifier.
    • All 132 tests in the OpenAI suite pass successfully.
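The shape of those override tests is roughly the following (illustrative only: `RecordingHttpMessageHandler` is a hypothetical stand-in for the test doubles the suites actually use, and constructor parameters may differ):

```csharp
[Fact]
public async Task GetChatMessageContentsAsyncUsesModelIdFromExecutionSettingsAsync()
{
    // Capture outgoing requests so the target URI can be asserted on.
    using var handler = new RecordingHttpMessageHandler(); // hypothetical test double
    using var httpClient = new HttpClient(handler);
    var service = new GoogleAIGeminiChatCompletionService(
        "gemini-1.5-flash", "fake-key", httpClient: httpClient);

    var settings = new GeminiPromptExecutionSettings { ModelId = "gemini-1.5-pro" };
    await service.GetChatMessageContentsAsync(new ChatHistory("hi"), settings);

    // The overridden model, not the constructor default, must appear in the URI.
    Assert.Contains("gemini-1.5-pro", handler.LastRequestUri!.ToString());
}
```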

Copilot AI review requested due to automatic review settings May 12, 2026 06:39
@Yusuftmle Yusuftmle requested a review from a team as a code owner May 12, 2026 06:39
@moonbox3 moonbox3 added .NET Issue or Pull requests regarding .NET code kernel Issues or pull requests impacting the core kernel labels May 12, 2026
@github-actions github-actions Bot changed the title fix(connectors): Support request-level ModelId overrides for Google, Vertex AI, and OpenAI .Net: fix(connectors): Support request-level ModelId overrides for Google, Vertex AI, and OpenAI May 12, 2026
Contributor

Copilot AI left a comment


Pull request overview

This PR updates the .NET Google (Gemini / Vertex AI) and OpenAI connectors to honor request-level ModelId overrides (e.g., via PromptExecutionSettings / EmbeddingGenerationOptions) instead of always using the constructor-provided default model.

Changes:

  • OpenAI chat completion service now forwards executionSettings.ModelId to the underlying client when provided.
  • Google AI / Vertex AI embedding + Gemini chat clients now resolve endpoint URIs per request using the overridden model id.
  • Adds unit tests validating ModelId override behavior; additionally introduces new OpenAI Response Agent exception-wrapping behavior + tests (not mentioned in the PR description).

Reviewed changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 7 comments.

| File | Description |
| --- | --- |
| dotnet/src/Connectors/Connectors.OpenAI/Services/OpenAIChatCompletionService.cs | Prefer request-level ModelId when calling ClientCore. |
| dotnet/src/Connectors/Connectors.OpenAI.UnitTests/Services/OpenAIChatCompletionServiceTests.cs | Adds test asserting OpenAI payload + result use overridden model id. |
| dotnet/src/Connectors/Connectors.Google/GeminiPromptExecutionSettings.cs | Ensures ModelId is preserved when translating from generic execution settings. |
| dotnet/src/Connectors/Connectors.Google/Core/VertexAI/VertexAIEmbeddingClient.cs | Builds Vertex embedding endpoint per request to support model override. |
| dotnet/src/Connectors/Connectors.Google/Core/GoogleAI/GoogleAIEmbeddingClient.cs | Builds Google embedding endpoint per request to support model override. |
| dotnet/src/Connectors/Connectors.Google/Core/Gemini/Clients/GeminiChatCompletionClient.cs | Builds Gemini generation/streaming endpoints per request and propagates overridden model id into responses. |
| dotnet/src/Connectors/Connectors.Google.UnitTests/Services/GoogleAIGeminiChatCompletionServiceTests.cs | Adds test verifying overridden model id is used in request URI and returned content. |
| dotnet/src/Agents/UnitTests/OpenAI/OpenAIResponseAgentExceptionTests.cs | New tests for exception wrapping behavior in OpenAIResponseAgent. |
| dotnet/src/Agents/OpenAI/OpenAIResponseAgent.cs | Adds provider exception wrapping for streaming and non-streaming invocation paths. |


Comment on lines +100 to +105 (OpenAIChatCompletionService.cs):

```csharp
    Kernel? kernel = null,
    CancellationToken cancellationToken = default)
        // Before: always forwarded the constructor-provided model id.
        => this._client.GetChatMessageContentsAsync(this._client.ModelId, chatHistory, executionSettings, kernel, cancellationToken);
        // After: prefer the request-level override when present.
        => this._client.GetChatMessageContentsAsync(executionSettings?.ModelId ?? this._client.ModelId, chatHistory, executionSettings, kernel, cancellationToken);
```

Related fragment (GeminiChatCompletionClient.cs, runtime model and endpoint resolution):

```csharp
    ? state.ExecutionSettings.ModelId
    : this._modelId;

var (generationEndpoint, streamingEndpoint) = this.GetEndpoints(modelId);
```

Comment on lines 19 to 23 (GoogleAIEmbeddingClient.cs):

```csharp
private readonly string _embeddingModelId;
private readonly GoogleAIVersion _apiVersion;
private readonly Uri _embeddingEndpoint;
private readonly int? _dimensions;
```

Comment on lines 74 to +80 (GoogleAIEmbeddingClient.cs):

```csharp
Verify.NotNullOrEmpty(data);

string modelId = !string.IsNullOrWhiteSpace(options?.ModelId) ? options.ModelId : this._embeddingModelId;
var geminiRequest = this.GetEmbeddingRequest(data, options);

// Before: always used the constructor-built endpoint.
// using var httpRequestMessage = await this.CreateHttpRequestAsync(geminiRequest, this._embeddingEndpoint).ConfigureAwait(false);
var endpoint = this.GetEmbeddingEndpoint(modelId);
using var httpRequestMessage = await this.CreateHttpRequestAsync(geminiRequest, endpoint).ConfigureAwait(false);
```

Comment on lines 19 to 25 (VertexAIEmbeddingClient.cs):

```csharp
private readonly string _embeddingModelId;
private readonly VertexAIVersion _apiVersion;
private readonly string _location;
private readonly string _projectId;
private readonly Uri _embeddingEndpoint;
private readonly int? _dimensions;
```

Comment on lines +155 to +158 (OpenAIResponseAgent.cs):

```csharp
await foreach (var result in mappedResults.ConfigureAwait(false))
{
    await NotifyMessagesAsync().ConfigureAwait(false);
    yield return new(result, agentThread);
```

Comment on lines +62 to +78 (OpenAIResponseAgent.cs):

```csharp
try
{
    agentThread = await this.EnsureThreadExistsWithMessagesAsync(messages, thread, cancellationToken).ConfigureAwait(false);
    extensionsContextOptions = await this.FinalizeInvokeOptionsAsync(messages, options, agentThread, cancellationToken).ConfigureAwait(false);

    ChatHistory chatHistory = [.. messages];
    invokeResults = ResponseThreadActions.InvokeAsync(
        this,
        chatHistory,
        agentThread,
        extensionsContextOptions,
        cancellationToken);
}
catch (Exception ex) when (ex is not OperationCanceledException)
{
    throw new KernelException($"OpenAI provider error for agent '{this.Name}': {ex.Message}", ex);
}
```
Contributor

@github-actions github-actions Bot left a comment


Automated Code Review

Reviewers: 4 | Confidence: 92% | Result: All clear

Reviewed: Correctness, Security, Reliability, Test Coverage, Design Approach


Automated review by Yusuftmle's agents


Labels

kernel — Issues or pull requests impacting the core kernel; .NET — Issue or Pull requests regarding .NET code


Development

Successfully merging this pull request may close these issues.

.Net: Bug: Google AI client ignores modelId passed to PromptExecutionSettings

3 participants