feat(openai): add LLM.with_aws_bedrock for OpenAI models on Amazon Bedrock#5978

Open
piyush-gambhir wants to merge 4 commits into
livekit:mainfrom
piyush-gambhir:feat/openai-aws-bedrock

Conversation

@piyush-gambhir piyush-gambhir commented Jun 5, 2026

Summary

Adds support for OpenAI models served through Amazon Bedrock, mirroring the existing LLM.with_azure(...) integration. Bedrock's OpenAI-compatible bedrock-mantle endpoint serves two families with different API surfaces and URL paths, so this adds a with_aws_bedrock(...) factory to both the Chat Completions and Responses LLMs:

| Models | API on Bedrock | Mantle path | Factory |
|---|---|---|---|
| gpt-oss-120b, gpt-oss-20b | Chat Completions (+ Responses) | /v1 | openai.LLM.with_aws_bedrock(...) |
| gpt-5.5, gpt-5.4 | Responses only | /openai/v1 | openai.responses.LLM.with_aws_bedrock(...) |

Both build an openai.AsyncBedrockOpenAI client and resolve the regional endpoint automatically. The openai SDK only ever derives the /openai/v1 path (which serves the gpt-5.x models); the gpt-oss models live on /v1, so the correct path is resolved from the model id. For the Responses variant the WebSocket transport is disabled automatically, since Bedrock only exposes the HTTP Responses path.
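
For reviewers, the model-aware path resolution can be pictured roughly as below. This is a minimal sketch, not the code in this PR: the function name `resolve_mantle_path` and the prefix-based matching are assumptions made for illustration; only the two paths and the model families come from the description above.

```python
def resolve_mantle_path(model: str) -> str:
    """Return the bedrock-mantle URL path assumed to serve `model`.

    Hypothetical helper: the real plugin may match model ids differently.
    """
    # Bedrock ids carry an "openai." prefix, e.g. "openai.gpt-oss-120b".
    bare = model.removeprefix("openai.")
    if bare.startswith("gpt-oss"):
        # Open-weight gpt-oss models are served on /v1.
        return "/v1"
    # gpt-5.x models use the path the openai SDK derives by default.
    return "/openai/v1"


print(resolve_mantle_path("openai.gpt-oss-120b"))
print(resolve_mantle_path("openai.gpt-5.5"))
```

Defaulting to /openai/v1 for anything that is not gpt-oss keeps the SDK's own behaviour as the fallback, which is one reasonable design choice among several.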

Usage

from livekit.plugins import openai

# Chat Completions — gpt-oss
llm = openai.LLM.with_aws_bedrock(model="openai.gpt-oss-120b", aws_region="us-east-2")

# Responses API — gpt-5.5 / gpt-5.4
llm = openai.responses.LLM.with_aws_bedrock(model="openai.gpt-5.5", aws_region="us-east-2")

# Refreshable credentials (e.g. aws-bedrock-token-generator)
from aws_bedrock_token_generator import provide_token
llm = openai.responses.LLM.with_aws_bedrock(
    model="openai.gpt-5.5", bedrock_token_provider=provide_token, aws_region="us-east-2",
)

api_key, aws_region, and base_url are inferred from AWS_BEARER_TOKEN_BEDROCK, AWS_REGION/AWS_DEFAULT_REGION, and AWS_BEDROCK_BASE_URL respectively when not passed explicitly. api_key and bedrock_token_provider are mutually exclusive.
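
The inference rules above could be sketched as follows. This is an illustration under stated assumptions rather than the plugin's implementation: `resolve_bedrock_settings` is a hypothetical name, the precedence of AWS_REGION over AWS_DEFAULT_REGION is assumed, and so is skipping the environment token when a provider is supplied.

```python
import os
from typing import Any, Callable, Mapping, Optional


def resolve_bedrock_settings(
    api_key: Optional[str] = None,
    aws_region: Optional[str] = None,
    base_url: Optional[str] = None,
    bedrock_token_provider: Optional[Callable[[], Any]] = None,
    env: Optional[Mapping[str, str]] = None,
) -> dict:
    """Fill unset arguments from the environment (hypothetical helper)."""
    env = os.environ if env is None else env

    # The PR states these two options are mutually exclusive.
    if api_key is not None and bedrock_token_provider is not None:
        raise ValueError("api_key and bedrock_token_provider are mutually exclusive")

    # Assumption: the env token is only consulted when no provider is given.
    if api_key is None and bedrock_token_provider is None:
        api_key = env.get("AWS_BEARER_TOKEN_BEDROCK")

    # Assumption: AWS_REGION takes precedence over AWS_DEFAULT_REGION.
    aws_region = aws_region or env.get("AWS_REGION") or env.get("AWS_DEFAULT_REGION")
    base_url = base_url or env.get("AWS_BEDROCK_BASE_URL")

    return {
        "api_key": api_key,
        "aws_region": aws_region,
        "base_url": base_url,
        "bedrock_token_provider": bedrock_token_provider,
    }
```

Passing `env` explicitly, as in `resolve_bedrock_settings(env={"AWS_DEFAULT_REGION": "us-east-2"})`, keeps the sketch testable without touching the real process environment.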

Model IDs: the mantle (OpenAI-compatible) endpoint uses IDs without the -1:0 suffix (e.g. openai.gpt-oss-120b). The -1:0 form is only for the bedrock-runtime InvokeModel/Converse APIs.
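
If a model id was copied from a bedrock-runtime example, the difference comes down to the trailing suffix. The helper below is a hypothetical convenience, not part of this PR, and only shows the string relationship between the two forms described above.

```python
# Suffix used by the bedrock-runtime InvokeModel/Converse model ids.
RUNTIME_SUFFIX = "-1:0"


def to_mantle_model_id(model_id: str) -> str:
    """Drop the bedrock-runtime suffix, if present (hypothetical helper)."""
    return model_id.removesuffix(RUNTIME_SUFFIX)


print(to_mantle_model_id("openai.gpt-oss-120b-1:0"))
```

Ids that are already in mantle form pass through unchanged.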

Changes

  • LLM.with_aws_bedrock(...) on the Chat Completions LLM (gpt-oss)
  • responses.LLM.with_aws_bedrock(...) on the Responses LLM (gpt-5.5, gpt-5.4, gpt-oss)
  • model-aware mantle path resolution (/v1 for gpt-oss, /openai/v1 for gpt-5.x)
  • BedrockChatModels / BedrockResponsesModels literals
  • AsyncBedrockTokenProvider type alias for refreshable credentials
  • Bump openai to >=2.40 — the first release shipping AsyncBedrockOpenAI
  • README + module docstring mention Amazon Bedrock
  • Hermetic unit tests (tests/test_openai_bedrock.py)

Testing

  • ruff format / ruff check — clean
  • mypy (strict) on livekit.plugins.openai — clean
  • pytest tests/test_openai_bedrock.py --unit — 5 passed
  • Verified end-to-end against live Bedrock (us-east-2): gpt-5.5 and gpt-5.4 (Responses) and gpt-oss-120b (Chat Completions) all stream completions through the wrappers.

The Chat Completions variant uses _strict_tool_schema=False, consistent with the other open-weight (gpt-oss) OpenAI-compatible providers in this plugin (Cerebras, SambaNova). Model availability/paths/regions confirmed against the AWS model cards for gpt-oss-120b, GPT-5.5, and GPT-5.4.

@devin-ai-integration devin-ai-integration Bot left a comment

✅ Devin Review: No Issues Found

Devin Review analyzed this PR and found no potential bugs to report.

View in Devin Review to see 4 additional findings.

devin-ai-integration[bot]

This comment was marked as resolved.

…drock

Mirror the Azure OpenAI integration with a `with_aws_bedrock` factory that
builds an `openai.AsyncBedrockOpenAI` client. It resolves the regional Bedrock
endpoint from `aws_region` (or AWS_REGION/AWS_DEFAULT_REGION) and authenticates
with a Bedrock bearer token (AWS_BEARER_TOKEN_BEDROCK) or a refreshable
`bedrock_token_provider` for short-lived credentials.

- bump openai dependency to >=2.40 (first release with Bedrock support)
- add BedrockChatModels literal (gpt-oss-20b / gpt-oss-120b)
- add AsyncBedrockTokenProvider type alias
- document Amazon Bedrock in the plugin README and module docstring
- add hermetic unit tests for the new factory
… fix gpt-oss mantle model IDs

GPT-5.5 and GPT-5.4 are Responses-API-only on Bedrock, so add `with_aws_bedrock`
to the Responses LLM. It builds an `openai.AsyncBedrockOpenAI` client and disables
the WebSocket transport, since Bedrock only exposes the HTTP Responses path.

Also fix the Chat Completions model IDs: the `bedrock-mantle` endpoint uses
`openai.gpt-oss-120b` / `openai.gpt-oss-20b` (no `-1:0` suffix — that form is only
for the bedrock-runtime InvokeModel/Converse APIs).

- add BedrockResponsesModels literal (gpt-5.5, gpt-5.4, gpt-oss-120b/20b)
- correct BedrockChatModels and the chat default model
- extend the unit tests with the Responses variant

The openai SDK's AsyncBedrockOpenAI always derives the `/openai/v1` mantle path,
which only serves the gpt-5.x models. The gpt-oss open-weight models live on `/v1`
and are rejected on `/openai/v1` ("model isn't supported on this route"). Resolve
the correct path from the model id so gpt-oss works without a manual base_url.

Verified end-to-end against Bedrock (us-east-2): gpt-5.5 / gpt-5.4 (Responses) and
gpt-oss-120b (Chat Completions) all stream completions.
… ids

`_supports_reasoning_effort` and the effort-level check matched bare model ids
(e.g. `gpt-5.4`), so the `openai.`-prefixed Bedrock ids (`openai.gpt-5.4`) silently
skipped the `effort="none"` default that direct OpenAI usage applies. Strip the
`openai.` prefix so the same model behaves consistently across both constructors.

Verified end-to-end on Bedrock: `openai.gpt-5.4` now sends reasoning effort="none"
and streams a completion.
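
A rough sketch of the prefix fix described in this commit, assuming the check is a simple match on the bare model id. The names `bare_model_id` and `supports_reasoning_effort`, and the `gpt-5` prefix test, are illustrative stand-ins; the plugin's actual `_supports_reasoning_effort` logic is not shown in this PR description.

```python
def bare_model_id(model: str) -> str:
    """Strip the Bedrock "openai." prefix so ids match direct OpenAI ids."""
    return model.removeprefix("openai.")


def supports_reasoning_effort(model: str) -> bool:
    """Illustrative check only: treat gpt-5 family ids as supporting effort."""
    # Normalizing first makes "openai.gpt-5.4" and "gpt-5.4" behave the same.
    return bare_model_id(model).startswith("gpt-5")


print(supports_reasoning_effort("gpt-5.4"), supports_reasoning_effort("openai.gpt-5.4"))
```

Without the normalization step, the prefixed id would fail the match and skip the default effort, which is the inconsistency the commit addresses.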
@piyush-gambhir piyush-gambhir force-pushed the feat/openai-aws-bedrock branch from 051a5c3 to f71747d Compare June 6, 2026 05:39