feat(openai): add LLM.with_aws_bedrock for OpenAI models on Amazon Bedrock #5978
Open
piyush-gambhir wants to merge 4 commits into
…drock

Mirror the Azure OpenAI integration with a `with_aws_bedrock` factory that builds an `openai.AsyncBedrockOpenAI` client. It resolves the regional Bedrock endpoint from `aws_region` (or `AWS_REGION`/`AWS_DEFAULT_REGION`) and authenticates with a Bedrock bearer token (`AWS_BEARER_TOKEN_BEDROCK`) or a refreshable `bedrock_token_provider` for short-lived credentials.

- bump openai dependency to >=2.40 (first release with Bedrock support)
- add BedrockChatModels literal (gpt-oss-20b / gpt-oss-120b)
- add AsyncBedrockTokenProvider type alias
- document Amazon Bedrock in the plugin README and module docstring
- add hermetic unit tests for the new factory
… fix gpt-oss mantle model IDs

GPT-5.5 and GPT-5.4 are Responses-API-only on Bedrock, so add `with_aws_bedrock` to the Responses LLM. It builds an `openai.AsyncBedrockOpenAI` client and disables the WebSocket transport, since Bedrock only exposes the HTTP Responses path.

Also fix the Chat Completions model IDs: the `bedrock-mantle` endpoint uses `openai.gpt-oss-120b` / `openai.gpt-oss-20b` (no `-1:0` suffix; that form is only for the bedrock-runtime InvokeModel/Converse APIs).

- add BedrockResponsesModels literal (gpt-5.5, gpt-5.4, gpt-oss-120b/20b)
- correct BedrockChatModels and the chat default model
- extend the unit tests with the Responses variant
The openai SDK's AsyncBedrockOpenAI always derives the `/openai/v1` mantle path,
which only serves the gpt-5.x models. The gpt-oss open-weight models live on `/v1`
and are rejected on `/openai/v1` ("model isn't supported on this route"). Resolve
the correct path from the model id so gpt-oss works without a manual base_url.
Verified end-to-end against Bedrock (us-east-2): gpt-5.5 / gpt-5.4 (Responses) and
gpt-oss-120b (Chat Completions) all stream completions.
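
The path selection described in this commit can be sketched roughly as follows. This is an illustration only, not the plugin's code: the helper names `resolve_bedrock_path` and `with_model_path` are hypothetical, and the rule "ids starting with `gpt-oss` go to `/v1`, everything else to `/openai/v1`" is an assumption drawn from the two model families named above.

```python
from urllib.parse import urlsplit, urlunsplit

GPT_OSS_PATH = "/v1"          # route serving the gpt-oss open-weight models
DEFAULT_PATH = "/openai/v1"   # route the openai SDK derives (gpt-5.x models)


def resolve_bedrock_path(model: str) -> str:
    """Return the URL path assumed to serve `model` (hypothetical helper)."""
    # Bedrock ids carry an "openai." prefix, e.g. "openai.gpt-oss-120b".
    bare = model.removeprefix("openai.")
    return GPT_OSS_PATH if bare.startswith("gpt-oss") else DEFAULT_PATH


def with_model_path(endpoint: str, model: str) -> str:
    """Swap the path of `endpoint` for the one matching `model`."""
    parts = urlsplit(endpoint)
    return urlunsplit(
        (parts.scheme, parts.netloc, resolve_bedrock_path(model), "", "")
    )
```

Deriving the path from the model id keeps an explicit `base_url` optional: callers only need one when the default resolution is wrong for their setup.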
… ids

`_supports_reasoning_effort` and the effort-level check matched bare model ids (e.g. `gpt-5.4`), so the `openai.`-prefixed Bedrock ids (`openai.gpt-5.4`) silently skipped the `effort="none"` default that direct OpenAI usage applies. Strip the `openai.` prefix so the same model behaves consistently across both constructors.

Verified end-to-end on Bedrock: `openai.gpt-5.4` now sends reasoning effort="none" and streams a completion.
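
A minimal sketch of the prefix normalization this commit describes, under stated assumptions: the function names and the `gpt-5` prefix list are hypothetical stand-ins, and the plugin's real `_supports_reasoning_effort` may match a different set of models.

```python
from typing import Optional

# Assumption: reasoning-capable models are recognised by a bare-id prefix.
_REASONING_PREFIXES = ("gpt-5",)


def bare_model_id(model: str) -> str:
    """Drop the Bedrock "openai." prefix so ids compare like direct OpenAI ids."""
    return model.removeprefix("openai.")


def supports_reasoning_effort(model: str) -> bool:
    """Check the normalized id, so both id forms give the same answer."""
    return bare_model_id(model).startswith(_REASONING_PREFIXES)


def default_reasoning_effort(model: str) -> Optional[str]:
    """Return the "none" default for reasoning models, otherwise nothing."""
    return "none" if supports_reasoning_effort(model) else None
```

Normalizing once, before any capability check, means later checks do not each need to know about the Bedrock prefix.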
051a5c3 to f71747d
Summary
Adds support for OpenAI models served through Amazon Bedrock, mirroring the existing `LLM.with_azure(...)` integration. Bedrock's OpenAI-compatible `bedrock-mantle` endpoint serves two families with different API surfaces and URL paths, so this adds a `with_aws_bedrock(...)` factory to both the Chat Completions and Responses LLMs:

| Models | Path | Factory |
|---|---|---|
| `gpt-oss-120b`, `gpt-oss-20b` | `/v1` | `openai.LLM.with_aws_bedrock(...)` |
| `gpt-5.5`, `gpt-5.4` | `/openai/v1` | `openai.responses.LLM.with_aws_bedrock(...)` |

Both build an `openai.AsyncBedrockOpenAI` client and resolve the regional endpoint automatically. The openai SDK only ever derives the `/openai/v1` path (which serves the gpt-5.x models); the gpt-oss models live on `/v1`, so the correct path is resolved from the model id. For the Responses variant the WebSocket transport is disabled automatically, since Bedrock only exposes the HTTP Responses path.

Usage
`api_key`, `aws_region`, and `base_url` are inferred from `AWS_BEARER_TOKEN_BEDROCK`, `AWS_REGION`/`AWS_DEFAULT_REGION`, and `AWS_BEDROCK_BASE_URL` respectively when not passed explicitly. `api_key` and `bedrock_token_provider` are mutually exclusive.

Changes
- `LLM.with_aws_bedrock(...)` on the Chat Completions LLM (gpt-oss)
- `responses.LLM.with_aws_bedrock(...)` on the Responses LLM (`gpt-5.5`, `gpt-5.4`, gpt-oss)
- Path resolution from the model id (`/v1` for gpt-oss, `/openai/v1` for gpt-5.x)
- `BedrockChatModels` / `BedrockResponsesModels` literals
- `AsyncBedrockTokenProvider` type alias for refreshable credentials
- Bump `openai` to `>=2.40`, the first release shipping `AsyncBedrockOpenAI`
- Hermetic unit tests (`tests/test_openai_bedrock.py`)

Testing
- `ruff format` / `ruff check`: clean
- `mypy` (strict) on `livekit.plugins.openai`: clean
- `pytest tests/test_openai_bedrock.py --unit`: 5 passed
- End-to-end against Bedrock (us-east-2): `gpt-5.5` and `gpt-5.4` (Responses) and `gpt-oss-120b` (Chat Completions) all stream completions through the wrappers.

The Chat Completions variant uses `_strict_tool_schema=False`, consistent with the other open-weight (gpt-oss) OpenAI-compatible providers in this plugin (Cerebras, SambaNova). Model availability, paths, and regions were confirmed against the AWS model cards for gpt-oss-120b, GPT-5.5, and GPT-5.4.
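
For reviewers, the argument inference described under Usage can be sketched as below. This is a hedged illustration, not the code in this PR: `resolve_bedrock_settings` and the `TokenProvider` alias are hypothetical, the provider is assumed to be an async callable returning a token string, and checking `AWS_REGION` before `AWS_DEFAULT_REGION` is an assumed precedence.

```python
import os
from typing import Any, Awaitable, Callable, Dict, Mapping, Optional

# Assumed shape of a refreshable credential source (hypothetical alias).
TokenProvider = Callable[[], Awaitable[str]]


def resolve_bedrock_settings(
    *,
    api_key: Optional[str] = None,
    aws_region: Optional[str] = None,
    base_url: Optional[str] = None,
    bedrock_token_provider: Optional[TokenProvider] = None,
    env: Optional[Mapping[str, str]] = None,
) -> Dict[str, Any]:
    """Fill unset arguments from the environment (illustrative only)."""
    env = os.environ if env is None else env

    # A static key and a refreshable provider cannot both be supplied.
    if api_key is not None and bedrock_token_provider is not None:
        raise ValueError("api_key and bedrock_token_provider are mutually exclusive")

    # Only fall back to the env token when no provider was given.
    if bedrock_token_provider is None:
        api_key = api_key or env.get("AWS_BEARER_TOKEN_BEDROCK")

    # Assumed precedence: AWS_REGION first, then AWS_DEFAULT_REGION.
    aws_region = aws_region or env.get("AWS_REGION") or env.get("AWS_DEFAULT_REGION")
    base_url = base_url or env.get("AWS_BEDROCK_BASE_URL")

    return {
        "api_key": api_key,
        "aws_region": aws_region,
        "base_url": base_url,
        "bedrock_token_provider": bedrock_token_provider,
    }
```

Passing `env` explicitly keeps a helper like this testable without mutating the process environment, which is in the same spirit as the hermetic unit tests mentioned above.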