Embedding-resolution helper on the hybrid-search aggregator (sync + async)
Parent: 46729
Depends on: 46730, 46731
Goal
Add a single helper that takes the embeddingParameterMap from the plan, calls the customer-provided generator once with the full batch, and stores the augmented parameter list on the aggregator as a new list. The caller's parameter list is copied, never mutated in place.
Scope
Add (sync) on _HybridSearchContextAggregator:
    def _resolve_embeddings(self):
        embedding_map = self._partitioned_query_ex_info.get_embedding_parameter_map()
        if not embedding_map:
            return  # no-op: the plan needs no embeddings
        generator = self._options.get("embeddingGenerator")
        if generator is None:
            raise ValueError(
                "Query requires embedding generation but no embedding_generator "
                "was passed to query_items."
            )
        # Sort by parameter name so the same map always yields the same order.
        keys, texts = zip(*sorted(embedding_map.items()))
        # One generator call for the whole batch.
        vectors = generator.generate_embeddings(list(texts))
        if len(vectors) != len(texts):
            raise ValueError(
                f"embedding_generator returned {len(vectors)} vectors for {len(texts)} texts"
            )
        for i, v in enumerate(vectors):
            if v is None:
                raise ValueError(f"embedding_generator returned a null vector at index {i}")
        extra = [{"name": k, "value": list(v)} for k, v in zip(keys, vectors)]
        # Copy before appending so the caller's list is not mutated in place.
        base = list(self._parameters or [])
        self._parameters = base + extra
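The same logic can be exercised outside the aggregator. The following is a minimal standalone sketch, assuming the embedding map is a plain dict of parameter name to text. `resolve_embedding_parameters` and `FakeGenerator` are hypothetical names used only for illustration and are not part of the SDK. Unlike the method above, this version returns the new list instead of assigning it to self._parameters.

```python
from typing import Any, Dict, List, Optional


class FakeGenerator:
    """Hypothetical stand-in for a customer-provided embedding generator."""

    def __init__(self) -> None:
        self.calls = 0

    def generate_embeddings(self, texts: List[str]) -> List[List[float]]:
        self.calls += 1
        # Toy embedding: a single float derived from the text length.
        return [[float(len(t))] for t in texts]


def resolve_embedding_parameters(
    embedding_map: Dict[str, str],
    generator: Optional[Any],
    parameters: Optional[List[Dict[str, Any]]] = None,
) -> List[Dict[str, Any]]:
    """Return a new parameter list; the input list is never mutated."""
    base = list(parameters or [])
    if not embedding_map:
        return base  # no-op: nothing to resolve
    if generator is None:
        raise ValueError("Query requires embedding generation but no generator was supplied.")
    # Sort by parameter name so the output order is deterministic.
    keys, texts = zip(*sorted(embedding_map.items()))
    vectors = generator.generate_embeddings(list(texts))  # single batched call
    if len(vectors) != len(texts):
        raise ValueError(f"generator returned {len(vectors)} vectors for {len(texts)} texts")
    for i, v in enumerate(vectors):
        if v is None:
            raise ValueError(f"generator returned a null vector at index {i}")
    return base + [{"name": k, "value": list(v)} for k, v in zip(keys, vectors)]
```

Called with the map {"@b": "hello", "@a": "hi"}, the sketch appends "@a" before "@b" regardless of the map's insertion order, and the generator is invoked exactly once.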
Mirror in aio/hybrid_search_aggregator.py:
    async def _resolve_embeddings_async(self):
        ...
        vectors = await generator.generate_embeddings_async(list(texts))
        ...
Add a type guard at the async entry point: if a sync EmbeddingGenerator is passed (no generate_embeddings_async), raise TypeError early with an actionable message.
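One way the type guard might look is sketched below. It assumes the only signal needed is whether the generator exposes a callable generate_embeddings_async attribute. `ensure_async_generator`, `SyncOnlyGenerator`, and `AsyncCapableGenerator` are hypothetical names for illustration, and the error wording is a suggestion rather than the final message.

```python
import asyncio
from typing import Any, List


class SyncOnlyGenerator:
    """Hypothetical generator that only implements the sync method."""

    def generate_embeddings(self, texts: List[str]) -> List[List[float]]:
        return [[0.0] for _ in texts]


class AsyncCapableGenerator:
    """Hypothetical generator that implements the async method."""

    async def generate_embeddings_async(self, texts: List[str]) -> List[List[float]]:
        return [[float(len(t))] for t in texts]


def ensure_async_generator(generator: Any) -> Any:
    """Fail fast if the generator cannot be awaited by the async aggregator."""
    method = getattr(generator, "generate_embeddings_async", None)
    if not callable(method):
        raise TypeError(
            f"{type(generator).__name__} does not implement "
            "generate_embeddings_async; pass an async-capable embedding "
            "generator to the async client, or use the sync client instead."
        )
    return generator


async def embed(generator: Any, texts: List[str]) -> List[List[float]]:
    # Guard first, so a sync generator fails before any awaiting happens.
    checked = ensure_async_generator(generator)
    return await checked.generate_embeddings_async(list(texts))
```

Running the guard at the entry point, before any plan processing, keeps the failure close to the call that supplied the wrong generator type.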
Non-goals
- Do NOT yet call this from _run_hybrid_search; that lives in 46733.
- Do NOT add the diagnostics span here; that lives in 46734 (but the call-site for the span will be inside this helper).
Files touched
sdk/cosmos/azure-cosmos/azure/cosmos/_execution_context/hybrid_search_aggregator.py
sdk/cosmos/azure-cosmos/azure/cosmos/_execution_context/aio/hybrid_search_aggregator.py
Acceptance
- Helper is non-mutating for the no-op case (no map → no parameter changes).
- Deterministic ordering (same map → same parameter order).
- Raises ValueError with clear messages on cardinality mismatch / null entry / missing generator.
- Async path raises TypeError if a sync generator is passed.
- 100% unit-test coverage of the helper (covered by 46735).