Skip to content

aws_bedrock_embeddings: support Cohere input_type and v4 response#4473

Open
squiidz wants to merge 3 commits into
mainfrom
aws_bedrock_embeddings_missing_input_type_field
Open

aws_bedrock_embeddings: support Cohere input_type and v4 response#4473
squiidz wants to merge 3 commits into
mainfrom
aws_bedrock_embeddings_missing_input_type_field

Conversation

@squiidz

@squiidz squiidz commented Jun 1, 2026

Copy link
Copy Markdown
Contributor

Cohere embedding models on Bedrock require an input_type field
(search_document, search_query, etc.) to produce correct asymmetric
embeddings, and Cohere Embed v4 changed the response schema from
{"embeddings": [[...]]} to {"embeddings": {"float": [[...]]}}.

Dispatch the request body shape and response parsing on the model ID,
using a substring match on "cohere" so regional inference profiles
like us.cohere.embed-v4:0 are detected. Both v3 and v4 response
shapes are now accepted. Titan model code paths are unchanged

@claude

claude Bot commented Jun 1, 2026

Copy link
Copy Markdown

Commits
LGTM

Review
Single-commit PR adding Cohere input_type support and Cohere v4 response parsing to aws_bedrock_embeddings. Code paths are dispatched on a cohere substring match so regional inference profiles are handled; Titan paths are unchanged.

  1. Test naming convention violation in internal/impl/aws/bedrock/processor_embeddings_test.go#L90-L134: six TestParseEmbeddingsResponse_* functions use underscore separators. The project test pattern explicitly requires camelCase ("Test function names use camelCase, not underscores").

@redpanda-data redpanda-data deleted a comment from claude Bot Jun 1, 2026
  Cohere embedding models on Bedrock require an `input_type` field
  (`search_document`, `search_query`, etc.) to produce correct asymmetric
  embeddings, and Cohere Embed v4 changed the response schema from
  `{"embeddings": [[...]]}` to `{"embeddings": {"float": [[...]]}}`.

  Dispatch the request body shape and response parsing on the model ID,
  using a substring match on "cohere" so regional inference profiles
  like `us.cohere.embed-v4:0` are detected. Both v3 and v4 response
  shapes are now accepted. Titan model code paths are unchanged
@squiidz squiidz force-pushed the aws_bedrock_embeddings_missing_input_type_field branch from ec687d9 to c7e85df Compare June 1, 2026 13:32
@claude

claude Bot commented Jun 1, 2026

Copy link
Copy Markdown

Commits
LGTM

Review
The PR adds Cohere model support (input_type field and v4 response schema) to the aws_bedrock_embeddings processor, including a substring-based model detection that handles regional inference profiles, and unit tests covering both v3/v4 response shapes and error cases. Configuration, error wrapping, license headers, and registration patterns all conform to the project conventions.

LGTM

squiidz added 2 commits June 12, 2026 08:51
…e time

Bedrock rejects Cohere embed requests without an input_type, so fail
once at config parse with a clear message instead of per message at
runtime. Add a changelog entry for the Cohere request-shape fix.
assert.ErrorContains(t, err, "expected a single embeddings response")
}

func TestNewProcessor_CohereRequiresInputType(t *testing.T) {

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test function names should use camelCase without underscores per the project's test conventions: write TestNewProcessorCohereRequiresInputType instead of TestNewProcessor_CohereRequiresInputType (the convention is "Write TestMyProcessorBadArgs, not TestMyProcessor_BadArgs").

@claude

claude Bot commented Jun 15, 2026

Copy link
Copy Markdown

Commits

  1. Commit aws_bedrock_embeddings: validate input_type for Cohere models at parse time — the message body states "Add a changelog entry for the Cohere request-shape fix", but no changelog file appears in this PR. Either include the changelog change or drop that claim from the message.

Otherwise commit granularity and message format are good.

Review

The Cohere request/response dispatch is clean and well-tested (request shape, v3/v4 response parsing, parse-time validation). One minor convention issue:

  1. processor_embeddings_test.go#L142TestNewProcessor_CohereRequiresInputType uses an underscore; project test conventions require camelCase test names without underscores.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant