Add internal text embedding system #3113
Co-authored-by: JerryNixon <1749983+JerryNixon@users.noreply.github.com>
Pull request overview
Adds a new internal embeddings/vectorization subsystem to DAB, including runtime configuration (runtime.embeddings), an HTTP-backed embedding service with caching/telemetry, an optional REST endpoint, and health-check reporting integration.
Changes:
- Introduces EmbeddingsOptions config models + JSON schema + custom converter and runtime validation.
- Adds IEmbeddingService/EmbeddingService (OpenAI/Azure OpenAI) with FusionCache L1 caching and OpenTelemetry instrumentation.
- Integrates embeddings status into health reporting and adds a new /embed-style endpoint controller plus CLI configuration flags.
Reviewed changes
Copilot reviewed 24 out of 24 changed files in this pull request and generated 15 comments.
| File | Description |
|---|---|
| src/Service/Startup.cs | Registers embeddings services/options and wires embeddings into service startup logging. |
| src/Service/HealthCheck/Model/ConfigurationDetails.cs | Extends health check configuration details to include embeddings flags. |
| src/Service/HealthCheck/HealthCheckHelper.cs | Adds embeddings health check execution + reporting. |
| src/Service/Controllers/EmbeddingController.cs | New REST endpoint controller for embedding generation. |
| src/Service.Tests/UnitTests/EmbeddingsOptionsTests.cs | Unit tests for embeddings config deserialization/serialization and env var replacement. |
| src/Service.Tests/UnitTests/EmbeddingServiceTests.cs | Unit tests for EmbeddingService option behaviors and disabled-path failures. |
| src/Service.Tests/UnitTests/ConfigValidationUnitTests.cs | Adds validation test coverage for embeddings config constraints. |
| src/Core/Services/Embeddings/IEmbeddingService.cs | Defines embeddings service contract + result types. |
| src/Core/Services/Embeddings/EmbeddingTelemetryHelper.cs | Adds embeddings-specific metrics and tracing helpers. |
| src/Core/Services/Embeddings/EmbeddingService.cs | Implements embedding calls, caching, and telemetry hooks. |
| src/Core/Configurations/RuntimeConfigValidator.cs | Adds validation for runtime.embeddings settings (URLs, required fields, endpoint conflicts, etc.). |
| src/Config/RuntimeConfigLoader.cs | Registers the embeddings JSON converter in runtime config serialization options. |
| src/Config/ObjectModel/RuntimeOptions.cs | Adds Embeddings to runtime options and exposes IsEmbeddingsConfigured. |
| src/Config/ObjectModel/Embeddings/EmbeddingsOptions.cs | New embeddings config model with defaults and “effective” helper properties. |
| src/Config/ObjectModel/Embeddings/EmbeddingsHealthCheckConfig.cs | New embeddings health-check config model. |
| src/Config/ObjectModel/Embeddings/EmbeddingsEndpointOptions.cs | New embeddings endpoint config + role checks. |
| src/Config/ObjectModel/Embeddings/EmbeddingProviderType.cs | Defines supported providers (azure-openai, openai). |
| src/Config/HotReloadEventHandler.cs | Registers a new embeddings-related hot-reload event slot. |
| src/Config/HealthCheck/HealthCheckConstants.cs | Adds an “embedding” health-check tag constant. |
| src/Config/DabConfigEvents.cs | Adds an embeddings service config-changed event constant. |
| src/Config/Converters/EmbeddingsOptionsConverterFactory.cs | Custom JSON converter for embeddings config. |
| src/Cli/ConfigGenerator.cs | Adds config generation/update flow for embeddings options. |
| src/Cli/Commands/ConfigureOptions.cs | Adds CLI flags for runtime.embeddings.* configuration. |
| schemas/dab.draft.schema.json | Adds JSON schema definition for runtime.embeddings. |

    // Return embedding as comma-separated float values (plain text)
    string embeddingText = string.Join(",", result.Embedding);
    return Content(embeddingText, MediaTypeNames.Text.Plain);
Copilot
AI
Feb 7, 2026
string.Join(",", result.Embedding) will format floats using the current culture, which can emit commas as decimal separators in some locales and make the comma-separated output ambiguous/unparseable.
Use invariant-culture formatting for each float (and ideally a stable numeric format) before joining, so the output is consistent regardless of server locale.
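A minimal sketch of the invariant-culture formatting suggested here. `EmbeddingFormatter` and `ToCsv` are hypothetical names used only for illustration and are not part of the PR; the `"R"` (round-trip) format specifier is one reasonable choice of stable numeric format, not necessarily the one the author would pick.

```csharp
using System.Globalization;
using System.Linq;

// Hypothetical helper for illustration; not part of the PR.
public static class EmbeddingFormatter
{
    // Formats each component with the invariant culture so the decimal
    // separator is always '.', regardless of the server's current culture.
    public static string ToCsv(float[] embedding) =>
        string.Join(",", embedding.Select(v => v.ToString("R", CultureInfo.InvariantCulture)));
}
```

With this helper, `EmbeddingFormatter.ToCsv(new[] { 0.5f, 1.25f })` produces `0.5,1.25` even when the thread culture uses a comma as its decimal separator.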
    string model = _options.EffectiveModel
        ?? throw new InvalidOperationException("Model/deployment name is required for Azure OpenAI.");

    return $"{baseUrl}/openai/deployments/{model}/embeddings?api-version={_options.EffectiveApiVersion}";
Copilot
AI
Feb 7, 2026
Azure OpenAI request URLs embed the deployment/model name directly in the path. If it contains reserved characters, the URL will be invalid.
Escape/encode the deployment/model as a URI path segment when building the URL.
Suggested change:
    - return $"{baseUrl}/openai/deployments/{model}/embeddings?api-version={_options.EffectiveApiVersion}";
    + string encodedModel = global::System.Uri.EscapeDataString(model);
    + return $"{baseUrl}/openai/deployments/{encodedModel}/embeddings?api-version={_options.EffectiveApiVersion}";
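To make the effect of the escaping concrete, here is a small standalone sketch. `AzureOpenAiUrl.Build` is a hypothetical helper, and the host name and API version in the usage note are placeholders, not values from the PR.

```csharp
using System;

// Hypothetical helper for illustration; not part of the PR.
public static class AzureOpenAiUrl
{
    // Escapes the deployment/model name so reserved characters such as
    // spaces or '/' cannot change the structure of the request path.
    public static string Build(string baseUrl, string model, string apiVersion)
    {
        string encodedModel = Uri.EscapeDataString(model);
        return $"{baseUrl}/openai/deployments/{encodedModel}/embeddings?api-version={apiVersion}";
    }
}
```

For example, `AzureOpenAiUrl.Build("https://example.openai.azure.com", "my model/v1", "2024-02-01")` yields a path containing `my%20model%2Fv1`, so the slash in the name is no longer read as a path separator.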
    switch (propertyName)
    {
        case "enabled":
            enabled = reader.GetBoolean();
            break;
Copilot
AI
Feb 7, 2026
The converter reads optional fields with reader.GetBoolean() / GetInt32() directly. If a config specifies these as JSON null, deserialization will throw instead of treating them as unspecified.
Handle JsonTokenType.Null explicitly (leave the nullable as null) or use JsonSerializer.Deserialize<bool?> / Deserialize<int?> for these fields.
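A sketch of the explicit null handling described above, using `Utf8JsonReader` directly. `NullableReaders`, `ReadNullableBoolean`, and `ParseEnabled` are hypothetical names; the real converter reads more properties and may be structured differently.

```csharp
using System.Text;
using System.Text.Json;

// Hypothetical helpers for illustration; not part of the PR.
public static class NullableReaders
{
    // Returns null for a JSON null token instead of letting GetBoolean() throw.
    public static bool? ReadNullableBoolean(ref Utf8JsonReader reader)
    {
        if (reader.TokenType == JsonTokenType.Null)
        {
            return null;
        }

        return reader.GetBoolean();
    }

    // Tiny driver: extracts the top-level "enabled" property from a JSON object.
    public static bool? ParseEnabled(string json)
    {
        var reader = new Utf8JsonReader(Encoding.UTF8.GetBytes(json));
        bool? enabled = null;
        while (reader.Read())
        {
            if (reader.TokenType == JsonTokenType.PropertyName && reader.ValueTextEquals("enabled"))
            {
                reader.Read(); // advance to the property value
                enabled = ReadNullableBoolean(ref reader);
            }
        }

        return enabled;
    }
}
```

With this shape, `{"enabled": null}` is treated the same as an omitted property, and the default can be applied later. The same pattern would apply to `GetInt32()` for the integer fields.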

    stopwatch.Stop();
    activity?.SetEmbeddingActivitySuccess(stopwatch.Elapsed.TotalMilliseconds, embedding.Length);
    EmbeddingTelemetryHelper.TrackTotalDuration(_providerName, stopwatch.Elapsed, fromCache: false);
Copilot
AI
Feb 7, 2026
TryEmbedAsync records TrackTotalDuration(..., fromCache: false) even when EmbedAsync returns a cached value (and the single-text path doesn’t track cache hit/miss). This makes cache/latency metrics inaccurate.
Track whether the result came from cache and record from_cache (and cache hit/miss) correctly.
    EmbeddingTelemetryHelper.TrackTotalDuration(_providerName, stopwatch.Elapsed, fromCache: false);
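One way to make the cache origin available to the telemetry call is to have the lookup return it alongside the value. The sketch below uses a plain `ConcurrentDictionary` as a stand-in for FusionCache, and `TrackingCache` is a hypothetical type; it only illustrates the shape of the idea, not how the PR should integrate with FusionCache.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

// Hypothetical stand-in for the real cache; not part of the PR.
public sealed class TrackingCache
{
    private readonly ConcurrentDictionary<string, float[]> _cache = new();

    // Returns the embedding together with a flag saying whether it was
    // served from the cache, so the caller can record from_cache accurately.
    public async Task<(float[] Embedding, bool FromCache)> GetOrCreateAsync(
        string key, Func<Task<float[]>> factory)
    {
        if (_cache.TryGetValue(key, out var cached))
        {
            return (cached, true);
        }

        // Concurrent callers may both miss and both invoke the factory;
        // that is acceptable for a sketch but not stampede-safe.
        float[] created = await factory();
        _cache[key] = created;
        return (created, false);
    }
}
```

The caller could then pass the returned flag to `TrackTotalDuration(..., fromCache: ...)` instead of hard-coding `false`.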
    services.AddSingleton<HealthCheckHelper>(sp =>
    {
        ILogger<HealthCheckHelper> logger = sp.GetRequiredService<ILogger<HealthCheckHelper>>();
        HttpUtilities httpUtility = sp.GetRequiredService<HttpUtilities>();
        IEmbeddingService? embeddingService = sp.GetService<IEmbeddingService>();
Copilot
AI
Feb 7, 2026
HealthCheckHelper is registered via a factory that resolves IEmbeddingService immediately (sp.GetService<IEmbeddingService>()). This can instantiate EmbeddingService even when embeddings are configured-but-disabled, and can cause failures if required strings are empty.
Prefer letting DI inject the optional parameter (no factory), and/or only register IEmbeddingService when embeddings are enabled and valid.
    // Sort by index to ensure correct order and extract embeddings
    List<EmbeddingData> sortedData = embeddingResponse.Data.OrderBy(d => d.Index).ToList();
Copilot
AI
Feb 7, 2026
The embedding API response isn’t validated against the request size/indices. Sorting by Index and returning Data.Select(...) can silently misalign results if the provider returns fewer items, duplicate indices, or out-of-range indices.
Validate that the response contains exactly one embedding per input and indices are unique/in-range before building the result array.
Suggested change:
    - // Sort by index to ensure correct order and extract embeddings
    - List<EmbeddingData> sortedData = embeddingResponse.Data.OrderBy(d => d.Index).ToList();
    + var data = embeddingResponse.Data;
    + int expectedCount = texts.Length;
    + // Validate that we received exactly one embedding per input text.
    + if (data.Count != expectedCount)
    + {
    +     _logger.LogError(
    +         "Embedding provider returned {ActualCount} embeddings for {ExpectedCount} input text(s).",
    +         data.Count,
    +         expectedCount);
    +     throw new InvalidOperationException(
    +         $"Embedding provider returned {data.Count} embeddings for {expectedCount} input text(s).");
    + }
    + // Validate indices are within range and unique.
    + int minIndex = data.Min(d => d.Index);
    + int maxIndex = data.Max(d => d.Index);
    + if (minIndex < 0 || maxIndex >= expectedCount)
    + {
    +     _logger.LogError(
    +         "Embedding provider returned out-of-range indices. MinIndex: {MinIndex}, MaxIndex: {MaxIndex}, ExpectedCount: {ExpectedCount}.",
    +         minIndex,
    +         maxIndex,
    +         expectedCount);
    +     throw new InvalidOperationException(
    +         $"Embedding provider returned out-of-range indices. MinIndex: {minIndex}, MaxIndex: {maxIndex}, ExpectedCount: {expectedCount}.");
    + }
    + int distinctIndexCount = data.Select(d => d.Index).Distinct().Count();
    + if (distinctIndexCount != expectedCount)
    + {
    +     _logger.LogError(
    +         "Embedding provider returned duplicate or missing indices. DistinctIndexCount: {DistinctIndexCount}, ExpectedCount: {ExpectedCount}.",
    +         distinctIndexCount,
    +         expectedCount);
    +     throw new InvalidOperationException(
    +         $"Embedding provider returned duplicate or missing indices. DistinctIndexCount: {distinctIndexCount}, ExpectedCount: {expectedCount}.");
    + }
    + // Sort by index to ensure correct order and extract embeddings
    + List<EmbeddingData> sortedData = data.OrderBy(d => d.Index).ToList();
    services.AddHttpClient<IEmbeddingService, EmbeddingService>();
    services.AddSingleton(embeddingsOptions);

    string providerName = embeddingsOptions.Provider.ToString().ToLowerInvariant();

    if (embeddingsOptions.Enabled)
    {
Copilot
AI
Feb 7, 2026
The embeddings typed client is registered whenever embeddings are configured, even if Enabled is false. Because EmbeddingService validates BaseUrl/ApiKey in its constructor, configs with empty strings can still crash at resolution time despite being disabled.
Consider only registering IEmbeddingService when embeddingsOptions.Enabled is true (or adjust the service to not validate/initialize when disabled).
Suggested change:
    - services.AddHttpClient<IEmbeddingService, EmbeddingService>();
    - services.AddSingleton(embeddingsOptions);
    - string providerName = embeddingsOptions.Provider.ToString().ToLowerInvariant();
    - if (embeddingsOptions.Enabled)
    - {
    + services.AddSingleton(embeddingsOptions);
    + string providerName = embeddingsOptions.Provider.ToString().ToLowerInvariant();
    + if (embeddingsOptions.Enabled)
    + {
    +     services.AddHttpClient<IEmbeddingService, EmbeddingService>();
    // Embeddings: Provider, Endpoint, ApiKey, Model, ApiVersion, Dimensions, TimeoutMs, Enabled
    if (options.RuntimeEmbeddingsProvider is not null ||
        options.RuntimeEmbeddingsBaseUrl is not null ||
        options.RuntimeEmbeddingsApiKey is not null ||
        options.RuntimeEmbeddingsModel is not null ||
Copilot
AI
Feb 7, 2026
ConfigGenerator only triggers embeddings updates for provider/base-url/api-key/model/api-version/dimensions/timeout/enabled. The CLI also defines runtime.embeddings.endpoint.* and runtime.embeddings.health.*, but those flags are ignored here.
Include the endpoint/health CLI flags in this condition so dab configure actually updates them.
    // Create the embeddings options
    updatedEmbeddingsOptions = new EmbeddingsOptions(
        Provider: (EmbeddingProviderType)provider,
        BaseUrl: baseUrl,
        ApiKey: apiKey,
Copilot
AI
Feb 7, 2026
TryUpdateConfiguredEmbeddingsValues constructs a new EmbeddingsOptions without applying endpoint/health settings, even though the CLI exposes runtime.embeddings.endpoint.* and runtime.embeddings.health.* options.
Create/update EmbeddingsEndpointOptions and EmbeddingsHealthCheckConfig from the CLI flags and pass them into the EmbeddingsOptions constructor.
    // Dependencies
    private ILogger<HealthCheckHelper> _logger;
    private HttpUtilities _httpUtility;
    private IEmbeddingService? _embeddingService;
Copilot
AI
Feb 7, 2026
Field '_embeddingService' can be 'readonly'.
Suggested change:
    - private IEmbeddingService? _embeddingService;
    + private readonly IEmbeddingService? _embeddingService;
Why make this change?
Adds an internal DAB system for text embedding/vectorization to support future parameter substitution and Redis semantic search features.
What is this change?
Configuration (runtime.embeddings)
- enabled: Master toggle (default: true)
- provider: azure-openai | openai
- base-url, api-key, model: Provider connection (supports @env())
- api-version, dimensions, timeout-ms: Optional tuning
- endpoint.enabled/path/roles: Optional REST endpoint at configured path (default: /embed)
- health.enabled/threshold-ms/test-text/expected-dimensions: Health check config
Core Components
- IEmbeddingService with TryEmbedAsync() pattern - returns result objects instead of throwing exceptions
- EmbeddingService - HTTP client with FusionCache L1 (24h TTL, SHA256 hash keys with provider/model included)
- EmbeddingsOptionsConverterFactory - Custom JSON deserializer with env var replacement
- EmbeddingTelemetryHelper - OpenTelemetry metrics/spans for latency, cache hits, dimensions
- EmbeddingController - REST endpoint for /embed with role-based authorization
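The "result objects, no exceptions" pattern listed for TryEmbedAsync() can be sketched roughly as follows. `EmbeddingResult` and `TryPattern` are hypothetical; the PR defines its own result types in IEmbeddingService.cs, and their actual shape may differ.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical result type for illustration; the PR's real types may differ.
public sealed record EmbeddingResult(bool Success, float[]? Embedding, string? ErrorMessage)
{
    public static EmbeddingResult Ok(float[] embedding) => new(true, embedding, null);
    public static EmbeddingResult Fail(string message) => new(false, null, message);
}

public static class TryPattern
{
    // Wraps an embedding call so that failures come back as a result
    // object rather than propagating as exceptions.
    public static async Task<EmbeddingResult> TryEmbedAsync(Func<Task<float[]>> embed)
    {
        try
        {
            float[] embedding = await embed();
            return embedding.Length == 0
                ? EmbeddingResult.Fail("Provider returned an empty embedding.")
                : EmbeddingResult.Ok(embedding);
        }
        catch (Exception ex)
        {
            return EmbeddingResult.Fail(ex.Message);
        }
    }
}
```

Callers then branch on `Success` instead of wrapping each call in try/catch, which keeps optional features such as health checks from failing the surrounding request.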
Health Check Implementation ✅
- HealthCheckHelper.UpdateEmbeddingsHealthCheckResultsAsync() - Executes test embedding with threshold validation
- Validates response time against health.threshold-ms
- Validates dimensions against health.expected-dimensions if specified
- Reports Healthy/Unhealthy status in comprehensive health check report
- ConfigurationDetails includes Embeddings and EmbeddingsEndpoint status
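The health rules above reduce to a small decision function. The sketch below is an assumption about how they combine: the names are hypothetical, and whether the threshold comparison is strict or inclusive is not stated in the description.

```csharp
// Hypothetical types for illustration; not part of the PR.
public enum EmbeddingHealth { Healthy, Unhealthy }

public static class EmbeddingHealthEvaluator
{
    // Healthy only if the test embedding succeeded, finished within the
    // threshold, and (when configured) has the expected dimension count.
    public static EmbeddingHealth Evaluate(
        bool success, double elapsedMs, int dimensions, int thresholdMs, int? expectedDimensions)
    {
        if (!success)
        {
            return EmbeddingHealth.Unhealthy;
        }

        // Assumption: exceeding the threshold (strictly greater) is unhealthy.
        if (elapsedMs > thresholdMs)
        {
            return EmbeddingHealth.Unhealthy;
        }

        if (expectedDimensions is int expected && dimensions != expected)
        {
            return EmbeddingHealth.Unhealthy;
        }

        return EmbeddingHealth.Healthy;
    }
}
```

For instance, a 120 ms response with 1536 dimensions against a 5000 ms threshold and no expected-dimensions setting evaluates to Healthy, while the same response with expected-dimensions set to 768 evaluates to Unhealthy.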
REST Endpoint Implementation ✅
- EmbeddingController serves POST requests at configured path (default: /embed)
- Accepts plain text input and returns comma-separated float values (plain text)
- Role-based authorization using X-MS-API-ROLE header
- In development mode, defaults to anonymous access
- In production mode, requires explicit role configuration
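A rough sketch of the role rules described above. `EmbedEndpointAuthorization` is hypothetical, and two details are assumptions not confirmed by the description: that a missing X-MS-API-ROLE header is treated as the "anonymous" role, and that role names are compared case-insensitively.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper for illustration; not part of the PR.
public static class EmbedEndpointAuthorization
{
    public static bool IsAllowed(
        string? roleHeader, IReadOnlyCollection<string>? configuredRoles, bool isDevelopment)
    {
        IEnumerable<string> effectiveRoles;
        if (configuredRoles != null && configuredRoles.Count > 0)
        {
            effectiveRoles = configuredRoles;          // explicit configuration wins
        }
        else if (isDevelopment)
        {
            effectiveRoles = new[] { "anonymous" };    // development default
        }
        else
        {
            effectiveRoles = Array.Empty<string>();    // production: nothing allowed
        }

        // Assumption: no header means the caller acts as "anonymous".
        string role = string.IsNullOrWhiteSpace(roleHeader) ? "anonymous" : roleHeader;
        return effectiveRoles.Contains(role, StringComparer.OrdinalIgnoreCase);
    }
}
```

Under these assumptions, a production deployment with no roles configured rejects every request, which matches the "requires explicit role configuration" rule.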
Validation & Safety
- Constructor validates that a model/deployment name is set when the provider is Azure OpenAI
- Constructor validates required fields (BaseUrl, ApiKey)
- Cache keys include provider and model to prevent collisions
- Validates non-empty embedding arrays from API responses
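To illustrate why including provider and model in the cache key prevents collisions, here is a minimal sketch. `EmbeddingCacheKey` is hypothetical, and the exact key layout (what is hashed, separators, prefixes) is an assumption; the description only states that keys are SHA256 hashes with provider/model included.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper for illustration; the PR's key layout may differ.
public static class EmbeddingCacheKey
{
    // Hashes provider, model, and text together so the same text embedded
    // by different models never shares a cache entry.
    public static string Build(string provider, string model, string text)
    {
        byte[] input = Encoding.UTF8.GetBytes($"{provider}\n{model}\n{text}");
        return Convert.ToHexString(SHA256.HashData(input));
    }
}
```

The key is always 64 hex characters regardless of input length, and changing only the model produces a different key for the same text.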
Telemetry Integration
- TryEmbedAsync and TryEmbedBatchAsync instrumented with activity spans and metrics
- Cache hit/miss tracking in batch operations
- API call duration and error tracking
Integration Points
- Health check report includes embeddings status in comprehensive checks
- Hot reload via EMBEDDINGS_CONFIG_CHANGED event
- Startup logging when embeddings configured
- CLI: dab configure --runtime.embeddings.*
Code Organization
- Azure.DataApiBuilder.Config.ObjectModel.Embeddings - Config models
- Azure.DataApiBuilder.Core.Services.Embeddings - Service, telemetry, interface
- Azure.DataApiBuilder.Service.Controllers - EmbeddingController
How was this tested?
Sample Request(s)
{
  "runtime": {
    "embeddings": {
      "enabled": true,
      "provider": "azure-openai",
      "base-url": "@env('EMBEDDINGS_ENDPOINT')",
      "api-key": "@env('EMBEDDINGS_API_KEY')",
      "model": "text-embedding-ada-002",
      "endpoint": {
        "enabled": true,
        "path": "/embed",
        "roles": ["authenticated"]
      },
      "health": {
        "enabled": true,
        "threshold-ms": 5000,
        "test-text": "health check"
      }
    }
  }
}
Embed Endpoint:
curl -X POST http://localhost:5000/embed \
  -H "Content-Type: text/plain" \
  -H "X-MS-API-ROLE: authenticated" \
  -d "Hello, world!"
Response:
0.123456,0.234567,0.345678,...
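A client consuming this plain-text response needs to parse it with the invariant culture for the same reason the server should format with it. The sketch below assumes the response body is exactly a comma-separated list of floats; `EmbeddingResponseParser` is a hypothetical name.

```csharp
using System;
using System.Globalization;
using System.Linq;

// Hypothetical client-side helper for illustration; not part of the PR.
public static class EmbeddingResponseParser
{
    // Splits on commas and parses each component with '.' as the decimal
    // separator, independent of the client's current culture.
    public static float[] Parse(string body) =>
        body.Split(',', StringSplitOptions.RemoveEmptyEntries)
            .Select(part => float.Parse(part.Trim(), NumberStyles.Float, CultureInfo.InvariantCulture))
            .ToArray();
}
```

Calling `EmbeddingResponseParser.Parse("0.5,0.25,-1")` returns a three-element array; a malformed component causes `float.Parse` to throw a `FormatException`, which a real client would likely want to handle.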