feat: add Forge embedders integration#3385

Open
JCorners68 wants to merge 2 commits into
deepset-ai:mainfrom
JCorners68:add-forge-embedders
@JCorners68

What this adds

A new forge embedders integration (forge-haystack) under integrations/forge/, providing:

  • ForgeTextEmbedder(OpenAITextEmbedder)
  • ForgeDocumentEmbedder(OpenAIDocumentEmbedder)

Forge serves an OpenAI-compatible embeddings API, so both components subclass Haystack's built-in OpenAI embedders and default to the Forge endpoint:

  • api_base_url="https://api.voxell.ai/v1"
  • api_key=Secret.from_env_var("FORGE_API_KEY")
  • model="forge-pro" (other accepted strings: forge-turbo, forge-ultra-4k, plus OpenAI-compatible aliases text-embedding-3-small, text-embedding-3-large, text-embedding-ada-002)

A dimensions parameter is exposed (passed through to the underlying OpenAI embedder) since Forge models support Matryoshka representation learning. to_dict()/from_dict() are implemented via default_to_dict/default_from_dict, and a SUPPORTED_MODELS ClassVar lists the accepted model strings.
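As a rough illustration of why a dimensions parameter makes sense for Matryoshka-trained models: a shorter embedding is typically obtained by keeping the leading coordinates of the full vector and re-normalizing. The sketch below shows that client-side equivalent. Whether Forge does exactly this server-side is an assumption, and `truncate_embedding` is a hypothetical helper that is not part of this PR.

```python
import math
from typing import List


def truncate_embedding(vector: List[float], dimensions: int) -> List[float]:
    """Keep the first `dimensions` coordinates and L2-normalize the result.

    Hypothetical helper for illustration only; not part of forge-haystack.
    """
    if not 0 < dimensions <= len(vector):
        raise ValueError("dimensions must be between 1 and len(vector)")
    head = vector[:dimensions]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        # An all-zero prefix cannot be normalized; return it unchanged.
        return head
    return [x / norm for x in head]


# Toy 4-dimensional vector standing in for a real embedding.
full = [0.6, 0.0, 0.8, 0.0]
short = truncate_embedding(full, 2)
```

In the integration itself the value would simply be passed as `dimensions=...` at construction time and forwarded to the API.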

This mirrors the existing OpenAI-compatible embedder integrations — the merged Perplexity integration (#3262) and integrations/mistral/ — using the same thin-subclass pattern. No new SDK dependency is added; haystack-ai already ships the OpenAI client.
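The thin-subclass pattern can be sketched without Haystack installed. Everything below is a simplified stand-in: `OpenAIStyleEmbedder` and `ForgeTextEmbedderSketch` are hypothetical names, the API key is modeled as a plain environment-variable name rather than a Haystack `Secret`, and the serialization mimics the general shape of `default_to_dict` output rather than reproducing it. Only the default values and model strings are taken from the description above.

```python
from typing import Any, ClassVar, Dict, List, Optional


class OpenAIStyleEmbedder:
    """Hypothetical stand-in for an OpenAI-compatible base embedder."""

    def __init__(
        self,
        api_base_url: str,
        api_key_env_var: str,
        model: str,
        dimensions: Optional[int] = None,
    ) -> None:
        self.api_base_url = api_base_url
        self.api_key_env_var = api_key_env_var
        self.model = model
        self.dimensions = dimensions


class ForgeTextEmbedderSketch(OpenAIStyleEmbedder):
    """Subclass that only changes defaults and adds (de)serialization."""

    # Model strings listed in the PR description.
    SUPPORTED_MODELS: ClassVar[List[str]] = [
        "forge-pro",
        "forge-turbo",
        "forge-ultra-4k",
        "text-embedding-3-small",
        "text-embedding-3-large",
        "text-embedding-ada-002",
    ]

    def __init__(
        self,
        api_base_url: str = "https://api.voxell.ai/v1",
        api_key_env_var: str = "FORGE_API_KEY",
        model: str = "forge-pro",
        dimensions: Optional[int] = None,
    ) -> None:
        super().__init__(api_base_url, api_key_env_var, model, dimensions)

    def to_dict(self) -> Dict[str, Any]:
        # Record the class path plus the init parameters needed to rebuild it.
        return {
            "type": f"{type(self).__module__}.{type(self).__qualname__}",
            "init_parameters": {
                "api_base_url": self.api_base_url,
                "api_key_env_var": self.api_key_env_var,
                "model": self.model,
                "dimensions": self.dimensions,
            },
        }

    @classmethod
    def from_dict(cls, data: Dict[str, Any]) -> "ForgeTextEmbedderSketch":
        return cls(**data["init_parameters"])


embedder = ForgeTextEmbedderSketch(dimensions=256)
restored = ForgeTextEmbedderSketch.from_dict(embedder.to_dict())
```

The sketch does not validate `model` against `SUPPORTED_MODELS`, since the description only says the ClassVar lists the accepted strings.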

How it was tested

From integrations/forge/:

  • hatch run fmt-check — passes
  • hatch run test:types — mypy clean (3 source files)
  • hatch run test:unit-cov-retry — 16 unit tests pass, 100% coverage of both modules (integration tests requiring a real FORGE_API_KEY skip, as expected)

Unit tests mirror the Perplexity/Mistral embedder tests: they assert init defaults/metadata and to_dict/from_dict round-trips, and do not make live API calls.

The scaffold was generated with scripts/create_new_integration.py --name forge --type embedders, which also added the GitHub workflow, labeler entry, coverage-comment workflow entry, and the root README row.


I authored 100% of this contribution and have the right to submit it.

Adds a Forge embedders integration that mirrors the Perplexity and Mistral
OpenAI-compatible embedder integrations. ForgeTextEmbedder and
ForgeDocumentEmbedder subclass Haystack's built-in OpenAI embedders and
default to the Forge OpenAI-compatible API (https://api.voxell.ai/v1,
FORGE_API_KEY, model forge-pro).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@JCorners68 JCorners68 requested a review from a team as a code owner June 2, 2026 18:36
@JCorners68 JCorners68 requested review from julian-risch and removed request for a team June 2, 2026 18:36
@CLAassistant

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.

@github-actions github-actions bot added the topic:CI and type:documentation labels Jun 2, 2026
The from_dict overrides on ForgeTextEmbedder and ForgeDocumentEmbedder
called default_from_dict directly, leaving the serialized api_key as a
plain dict. On older Haystack versions (the tested minimum is 2.22.0,
exercised by the lowest-direct-dependencies CI job), default_from_dict
does not auto-deserialize secrets, so the parent OpenAI embedder's
__init__ called .resolve_value() on a dict and raised AttributeError.

Deserialize the api_key Secret in place before default_from_dict, matching
the standard Haystack integration pattern. Fixes the 4 failing from_dict /
round-trip unit tests across all supported Haystack versions.
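The failure mode and the fix can be reproduced in miniature. The classes below are hypothetical stand-ins, not Haystack code: `EnvVarSecretSketch` loosely models an environment-variable secret and `EmbedderSketch` models a parent `__init__` that resolves the key eagerly. In a real Haystack integration the deserialization step is usually done with `deserialize_secrets_inplace` from `haystack.utils`; the sketch copies the parameters instead of mutating them in place.

```python
import os
from typing import Any, Dict, Optional


class EnvVarSecretSketch:
    """Hypothetical stand-in for an environment-variable secret."""

    def __init__(self, env_var: str) -> None:
        self.env_var = env_var

    def resolve_value(self) -> Optional[str]:
        return os.environ.get(self.env_var)

    def to_dict(self) -> Dict[str, Any]:
        return {"type": "env_var", "env_vars": [self.env_var]}

    @classmethod
    def from_dict(cls, data: Dict[str, Any]) -> "EnvVarSecretSketch":
        return cls(data["env_vars"][0])


class EmbedderSketch:
    """Stand-in for a parent embedder whose __init__ resolves the key."""

    def __init__(self, api_key: EnvVarSecretSketch) -> None:
        self.api_key = api_key
        # Raises AttributeError if api_key is still a plain dict.
        self.resolved_key = api_key.resolve_value()


def naive_from_dict(data: Dict[str, Any]) -> EmbedderSketch:
    # Buggy variant: passes the serialized secret straight through.
    return EmbedderSketch(**data["init_parameters"])


def fixed_from_dict(data: Dict[str, Any]) -> EmbedderSketch:
    # Fixed variant: rebuild the secret before constructing the component.
    params = dict(data["init_parameters"])
    if isinstance(params.get("api_key"), dict):
        params["api_key"] = EnvVarSecretSketch.from_dict(params["api_key"])
    return EmbedderSketch(**params)


serialized = {
    "init_parameters": {"api_key": EnvVarSecretSketch("FORGE_API_KEY").to_dict()}
}

try:
    naive_from_dict(serialized)
    naive_failed = False
except AttributeError:
    naive_failed = True

restored = fixed_from_dict(serialized)
```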