
[Enh]: Internal subsystem for embedding text. #3103

@JerryNixon

Description

@robertopc1

What?

An internal DAB subsystem that fetches vector embeddings for text from a configured provider.

Why?

To support future scenarios such as caching and parameter substitution.

Configuration (runtime.embeddings)

{
  "runtime": {
    "embeddings": {
      "enabled": true,
      "provider": "azure-openai",
      "base-url": "@env('EMBEDDINGS_ENDPOINT')",
      "api-key": "@env('EMBEDDINGS_API_KEY')",
      "model": "text-embedding-ada-002",
      "api-version": "2024-02-01",
      "dimensions": 1536,
      "timeout-ms": 30000,
      "endpoint": {
        "enabled": true,
        "path": "/embed",
        "roles": ["authenticated"]
      },
      "health": {
        "enabled": true,
        "threshold-ms": 1000
      }
    }
  }
}
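
The base-url and api-key values above use DAB's @env() indirection, so secrets stay in environment variables rather than in the config file. Below is a minimal sketch in Python (illustrative only) of how that placeholder resolution works conceptually; DAB performs this substitution itself when it loads the configuration.

import os
import re

# Matches DAB-style @env('NAME') placeholders.
ENV_PATTERN = re.compile(r"@env\('([^']+)'\)")

def resolve_env(value: str) -> str:
    """Replace each @env('NAME') placeholder with the NAME environment variable."""
    def lookup(match: re.Match) -> str:
        name = match.group(1)
        resolved = os.environ.get(name)
        if resolved is None:
            raise KeyError(f"Environment variable {name} is not set")
        return resolved
    return ENV_PATTERN.sub(lookup, value)

# Example: with EMBEDDINGS_ENDPOINT exported, "@env('EMBEDDINGS_ENDPOINT')"
# resolves to something like "https://my-resource.openai.azure.com".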

Parameters

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| enabled | bool | No | true | Master toggle for embedding subsystem |
| provider | string | Yes | - | azure-openai or openai |
| base-url | string | Yes | - | Provider base URL |
| api-key | string | Yes | - | Authentication key (can use @env() indirection) |
| model | string | Yes | OpenAI: text-embedding-3-small | Deployment name (Azure) or model name (OpenAI) |
| api-version | string | Azure only | 2024-02-01 | Azure API version (prevents deprecation issues) |
| dimensions | int | No | Model default | Output vector size (used for Redis schema alignment) |
| timeout-ms | int | No | 30000 | Request timeout in milliseconds |
| endpoint.enabled | bool | No | false | Whether to expose the /embed REST API |
| endpoint.path | string | No | /embed | Embed endpoint path (customizable) |
| endpoint.roles | string[] | No | ["authenticated"] | List of roles allowed to access the embed endpoint |
| health.enabled | bool | No | true | Enable embedding health check |
| health.threshold-ms | int | No | 5000 | Acceptable embedding latency threshold in ms for health check |

Command line

  • Easily configure via dab configure --runtime.embeddings.*
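
    For example, the proposed pattern would allow calls such as dab configure --runtime.embeddings.enabled true or dab configure --runtime.embeddings.provider azure-openai (these flag names only illustrate the proposed --runtime.embeddings.* surface; they are not an existing CLI feature).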

REST Endpoint

  • Optionally exposes /embed for embedding on demand (secured by role); an example request follows below
  • When host.mode is development, anonymous access is automatically allowed
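
A minimal sketch of calling the proposed endpoint from Python, assuming DAB is listening locally on port 5000 and that /embed accepts a JSON body with an input field and returns the embedding; the request/response shape is not defined in this issue and is an assumption here.

import requests

# Hypothetical request to the proposed /embed endpoint.
# Only the path, role-based security, and on-demand embedding are described above;
# the payload and response shape are assumptions.
resp = requests.post(
    "http://localhost:5000/embed",
    headers={"Authorization": "Bearer <token-for-authenticated-role>"},
    json={"input": "hello world"},
    timeout=30,
)
resp.raise_for_status()
embedding = resp.json()  # assumed to contain the embedding vector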

Telemetry & Caching

  • OpenTelemetry spans/metrics for latency, cache hits/misses, and error rate
  • L1 cache (SHA-256-based keys, 24h TTL; cache keys incorporate provider & model; a key sketch follows below)
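
A minimal sketch of how such a cache key could be composed, assuming the key is a SHA-256 hash over provider, model, and the input text; the exact key layout is internal to DAB and not specified in this issue.

import hashlib

def embedding_cache_key(provider: str, model: str, text: str) -> str:
    """Illustrative cache key: SHA-256 over provider, model, and the input text."""
    payload = f"{provider}|{model}|{text}".encode("utf-8")
    return "emb:" + hashlib.sha256(payload).hexdigest()

# Keys differ when the provider or model changes, so cached vectors
# produced by one model are never reused for another.
key = embedding_cache_key("azure-openai", "text-embedding-ada-002", "hello world")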

Health, Hot Reload, Startup

  • Health check covers the embedding provider & endpoint (a latency-threshold sketch follows below)
  • Listens for the hot-reload event and reloads the embeddings configuration
  • Startup logs embedding configuration info
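
A minimal sketch of the latency-threshold part of the health check, assuming it times a small probe embedding request and compares the elapsed time against health.threshold-ms; the actual check is internal to DAB, so this is illustrative only.

import time

def embedding_health(probe_embed, threshold_ms: int = 5000) -> dict:
    """Time a probe embedding call and compare it to the configured threshold.

    probe_embed is any callable that requests a single embedding from the provider.
    """
    start = time.monotonic()
    try:
        probe_embed("health check probe")
    except Exception as exc:
        return {"healthy": False, "error": str(exc)}
    elapsed_ms = (time.monotonic() - start) * 1000
    return {"healthy": elapsed_ms <= threshold_ms, "latency-ms": round(elapsed_ms)}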

REST API Reference

Azure OpenAI:

  • URL: POST {base-url}/openai/deployments/{model}/embeddings?api-version={api-version}
  • Headers: api-key: {api-key}
  • Body: { "input": string | string[], "dimensions": 1536 }

OpenAI:

  • URL: POST {base-url}/v1/embeddings
  • Headers: Authorization: Bearer {api-key}
  • Body: { "model": "text-embedding-3-small", "input": string | string[], "dimensions": 1536 }
