19 changes: 19 additions & 0 deletions sdk/cs/README.md
@@ -7,6 +7,7 @@ The Foundry Local C# SDK provides a .NET interface for running AI models locally
- **Model catalog** — browse and search all available models; filter by cached or loaded state
- **Lifecycle management** — download, load, unload, and remove models programmatically
- **Chat completions** — synchronous and `IAsyncEnumerable` streaming via OpenAI-compatible types
- **Embeddings** — generate text embeddings via OpenAI-compatible API
- **Audio transcription** — transcribe audio files with streaming support
- **Download progress** — wire up an `Action<float>` callback for real-time download percentage
- **Model variants** — select specific hardware/quantization variants per model alias
@@ -246,6 +247,24 @@ chatClient.Settings.TopP = 0.9f;
chatClient.Settings.FrequencyPenalty = 0.5f;
```

### Embeddings

```csharp
var embeddingClient = await model.GetEmbeddingClientAsync();

// Generate an embedding
var response = await embeddingClient.GenerateEmbeddingAsync("The quick brown fox jumps over the lazy dog");
var embedding = response.Data[0].Embedding; // List<double>
Console.WriteLine($"Dimensions: {embedding.Count}");
```
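
Embedding vectors are usually compared rather than inspected directly. The following is a minimal sketch of cosine similarity over two `List<double>` vectors like the one returned above. `EmbeddingMath` and the sample vectors are hypothetical stand-ins, not part of the SDK; in practice the inputs would come from `response.Data[0].Embedding`.

```csharp
using System;
using System.Collections.Generic;

// Stand-in vectors (hypothetical); real ones would come from response.Data[0].Embedding.
var first = new List<double> { 3.0, 4.0 };
var same = new List<double> { 3.0, 4.0 };
var orthogonal = new List<double> { 4.0, -3.0 };

Console.WriteLine($"Same direction: {EmbeddingMath.CosineSimilarity(first, same)}");
Console.WriteLine($"Orthogonal:     {EmbeddingMath.CosineSimilarity(first, orthogonal)}");

// Hypothetical helper, not part of the Foundry Local SDK.
static class EmbeddingMath
{
    // Cosine similarity: dot(a, b) / (|a| * |b|), in the range [-1, 1].
    public static double CosineSimilarity(IReadOnlyList<double> a, IReadOnlyList<double> b)
    {
        if (a.Count != b.Count)
        {
            throw new ArgumentException("Vectors must have the same number of dimensions.");
        }

        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.Count; i++)
        {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }

        if (normA == 0 || normB == 0)
        {
            throw new ArgumentException("Cosine similarity is undefined for a zero vector.");
        }

        return dot / (Math.Sqrt(normA) * Math.Sqrt(normB));
    }
}
```

Values close to 1 indicate vectors pointing in nearly the same direction, and values near 0 indicate unrelated directions. Both inputs should come from the same model with the same `Dimensions` setting, otherwise the lengths will not match.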

#### Embedding Settings

```csharp
embeddingClient.Settings.Dimensions = 512; // optional: reduce dimensionality
embeddingClient.Settings.EncodingFormat = "float"; // "float" or "base64"
```
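
Judging by the request type added in this change, these settings are sent as the `dimensions` and `encoding_format` fields of an OpenAI-style embeddings request. The sketch below shows that mapping with a stand-in record. `EmbeddingRequestSketch` and the model id are hypothetical; the SDK's own request type is internal and is serialized through a source-generated context, so details such as null handling may differ.

```csharp
using System;
using System.Text.Json;
using System.Text.Json.Serialization;

var request = new EmbeddingRequestSketch
{
    Model = "my-embedding-model", // hypothetical model id
    Input = "The quick brown fox jumps over the lazy dog",
    Dimensions = 512,
    EncodingFormat = "float",
};

// Reflection-based serialization for illustration only.
string json = JsonSerializer.Serialize(request);
Console.WriteLine(json);

// Hypothetical stand-in mirroring the JSON property names used by the request type.
record EmbeddingRequestSketch
{
    [JsonPropertyName("input")]
    public string? Input { get; init; }

    [JsonPropertyName("model")]
    public string? Model { get; init; }

    [JsonPropertyName("dimensions")]
    public int? Dimensions { get; init; }

    [JsonPropertyName("encoding_format")]
    public string? EncodingFormat { get; init; }
}
```

Running this prints a single JSON object containing the `input`, `model`, `dimensions`, and `encoding_format` fields.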

### Audio Transcription

```csharp
2 changes: 2 additions & 0 deletions sdk/cs/docs/api/index.md
@@ -30,6 +30,8 @@

[OpenAIChatClient](./microsoft.ai.foundry.local.openaichatclient.md)

[OpenAIEmbeddingClient](./microsoft.ai.foundry.local.openaiembeddingclient.md)

[Parameter](./microsoft.ai.foundry.local.parameter.md)

[PromptTemplate](./microsoft.ai.foundry.local.prompttemplate.md)
18 changes: 18 additions & 0 deletions sdk/cs/docs/api/microsoft.ai.foundry.local.imodel.md
@@ -208,6 +208,24 @@ Optional cancellation token.
[Task&lt;OpenAIAudioClient&gt;](https://docs.microsoft.com/en-us/dotnet/api/system.threading.tasks.task-1)<br>
OpenAI.AudioClient

### **GetEmbeddingClientAsync(Nullable&lt;CancellationToken&gt;)**

Get an OpenAI API-based EmbeddingClient

```csharp
Task<OpenAIEmbeddingClient> GetEmbeddingClientAsync(Nullable<CancellationToken> ct)
```

#### Parameters

`ct` [Nullable&lt;CancellationToken&gt;](https://docs.microsoft.com/en-us/dotnet/api/system.nullable-1)<br>
Optional cancellation token.

#### Returns

[Task&lt;OpenAIEmbeddingClient&gt;](https://docs.microsoft.com/en-us/dotnet/api/system.threading.tasks.task-1)<br>
OpenAI.EmbeddingClient

### **SelectVariant(IModel)**

Select a model variant from [IModel.Variants](./microsoft.ai.foundry.local.imodel.md#variants) to use for [IModel](./microsoft.ai.foundry.local.imodel.md) operations.
14 changes: 14 additions & 0 deletions sdk/cs/docs/api/microsoft.ai.foundry.local.model.md
@@ -176,6 +176,20 @@ public Task<OpenAIAudioClient> GetAudioClientAsync(Nullable<CancellationToken> c

[Task&lt;OpenAIAudioClient&gt;](https://docs.microsoft.com/en-us/dotnet/api/system.threading.tasks.task-1)<br>

### **GetEmbeddingClientAsync(Nullable&lt;CancellationToken&gt;)**

```csharp
public Task<OpenAIEmbeddingClient> GetEmbeddingClientAsync(Nullable<CancellationToken> ct)
```

#### Parameters

`ct` [Nullable&lt;CancellationToken&gt;](https://docs.microsoft.com/en-us/dotnet/api/system.nullable-1)<br>

#### Returns

[Task&lt;OpenAIEmbeddingClient&gt;](https://docs.microsoft.com/en-us/dotnet/api/system.threading.tasks.task-1)<br>

### **UnloadAsync(Nullable&lt;CancellationToken&gt;)**

```csharp
14 changes: 14 additions & 0 deletions sdk/cs/docs/api/microsoft.ai.foundry.local.modelvariant.md
@@ -181,3 +181,17 @@ public Task<OpenAIAudioClient> GetAudioClientAsync(Nullable<CancellationToken> c
#### Returns

[Task&lt;OpenAIAudioClient&gt;](https://docs.microsoft.com/en-us/dotnet/api/system.threading.tasks.task-1)<br>

### **GetEmbeddingClientAsync(Nullable&lt;CancellationToken&gt;)**

```csharp
public Task<OpenAIEmbeddingClient> GetEmbeddingClientAsync(Nullable<CancellationToken> ct)
```

#### Parameters

`ct` [Nullable&lt;CancellationToken&gt;](https://docs.microsoft.com/en-us/dotnet/api/system.nullable-1)<br>

#### Returns

[Task&lt;OpenAIEmbeddingClient&gt;](https://docs.microsoft.com/en-us/dotnet/api/system.threading.tasks.task-1)<br>
@@ -0,0 +1,50 @@
# OpenAIEmbeddingClient

Namespace: Microsoft.AI.Foundry.Local

Embedding Client that uses the OpenAI API.
Implemented using Betalgo.Ranul.OpenAI SDK types.

```csharp
public class OpenAIEmbeddingClient
```

Inheritance [Object](https://docs.microsoft.com/en-us/dotnet/api/system.object) → [OpenAIEmbeddingClient](./microsoft.ai.foundry.local.openaiembeddingclient.md)<br>
Attributes [NullableContextAttribute](https://docs.microsoft.com/en-us/dotnet/api/system.runtime.compilerservices.nullablecontextattribute), [NullableAttribute](https://docs.microsoft.com/en-us/dotnet/api/system.runtime.compilerservices.nullableattribute)

## Properties

### **Settings**

Settings to use for embedding requests using this client.

```csharp
public EmbeddingSettings Settings { get; }
```

#### Property Value

EmbeddingSettings<br>

## Methods

### **GenerateEmbeddingAsync(String, Nullable&lt;CancellationToken&gt;)**

Generate embeddings for the given input text.

```csharp
public Task<EmbeddingCreateResponse> GenerateEmbeddingAsync(string input, Nullable<CancellationToken> ct)
```

#### Parameters

`input` [String](https://docs.microsoft.com/en-us/dotnet/api/system.string)<br>
The text to generate embeddings for.

`ct` [Nullable&lt;CancellationToken&gt;](https://docs.microsoft.com/en-us/dotnet/api/system.nullable-1)<br>
Optional cancellation token.

#### Returns

[Task&lt;EmbeddingCreateResponse&gt;](https://docs.microsoft.com/en-us/dotnet/api/system.threading.tasks.task-1)<br>
Embedding response containing the embedding vector.
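
The `ct` parameter is a nullable `CancellationToken`, so callers can omit it. The sketch below illustrates that optional-token pattern with a hypothetical `SlowOperationAsync` method standing in for an SDK call. It is not the SDK's implementation, and whether the underlying embeddings command observes cancellation promptly is not confirmed here.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Request cancellation automatically after 50 ms.
using var cts = new CancellationTokenSource(TimeSpan.FromMilliseconds(50));

try
{
    // A CancellationToken converts implicitly to CancellationToken?.
    await Sketch.SlowOperationAsync(cts.Token);
}
catch (OperationCanceledException)
{
    Console.WriteLine("Cancelled by timeout.");
}

// Omitting the token falls back to CancellationToken.None.
Console.WriteLine(await Sketch.SlowOperationAsync(delayMs: 10));

// Hypothetical stand-in for an SDK call that accepts an optional token.
static class Sketch
{
    public static async Task<string> SlowOperationAsync(CancellationToken? ct = null, int delayMs = 5000)
    {
        // Same fallback shape as the client: use the caller's token if one was supplied.
        await Task.Delay(delayMs, ct ?? CancellationToken.None).ConfigureAwait(false);
        return "done";
    }
}
```

With the embedding client, the same token could be passed as the second argument, for example `GenerateEmbeddingAsync(text, cts.Token)`.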
2 changes: 2 additions & 0 deletions sdk/cs/src/Detail/JsonSerializationContext.cs
@@ -23,6 +23,8 @@ namespace Microsoft.AI.Foundry.Local.Detail;
[JsonSerializable(typeof(ChatCompletionCreateResponse))]
[JsonSerializable(typeof(AudioCreateTranscriptionRequest))]
[JsonSerializable(typeof(AudioCreateTranscriptionResponse))]
[JsonSerializable(typeof(EmbeddingCreateRequestExtended))]
[JsonSerializable(typeof(EmbeddingCreateResponse))]
[JsonSerializable(typeof(string[]))] // list loaded or cached models
[JsonSerializable(typeof(EpInfo[]))]
[JsonSerializable(typeof(EpDownloadResult))]
5 changes: 5 additions & 0 deletions sdk/cs/src/Detail/Model.cs
@@ -99,6 +99,11 @@ public async Task<OpenAIAudioClient> GetAudioClientAsync(CancellationToken? ct =
return await SelectedVariant.GetAudioClientAsync(ct).ConfigureAwait(false);
}

public async Task<OpenAIEmbeddingClient> GetEmbeddingClientAsync(CancellationToken? ct = null)
{
return await SelectedVariant.GetEmbeddingClientAsync(ct).ConfigureAwait(false);
}

public async Task UnloadAsync(CancellationToken? ct = null)
{
await SelectedVariant.UnloadAsync(ct).ConfigureAwait(false);
17 changes: 17 additions & 0 deletions sdk/cs/src/Detail/ModelVariant.cs
@@ -102,6 +102,13 @@ public async Task<OpenAIAudioClient> GetAudioClientAsync(CancellationToken? ct =
.ConfigureAwait(false);
}

public async Task<OpenAIEmbeddingClient> GetEmbeddingClientAsync(CancellationToken? ct = null)
{
return await Utils.CallWithExceptionHandling(() => GetEmbeddingClientImplAsync(ct),
"Error getting embedding client for model", _logger)
.ConfigureAwait(false);
}

private async Task<bool> IsLoadedImplAsync(CancellationToken? ct = null)
{
var loadedModels = await _modelLoadManager.ListLoadedModelsAsync(ct).ConfigureAwait(false);
@@ -193,6 +200,16 @@ private async Task<OpenAIAudioClient> GetAudioClientImplAsync(CancellationToken?
return new OpenAIAudioClient(Id);
}

private async Task<OpenAIEmbeddingClient> GetEmbeddingClientImplAsync(CancellationToken? ct = null)
{
if (!await IsLoadedAsync(ct))
{
throw new FoundryLocalException($"Model {Id} is not loaded. Call LoadAsync first.");
}

return new OpenAIEmbeddingClient(Id);
}

public void SelectVariant(IModel variant)
{
throw new FoundryLocalException(
7 changes: 7 additions & 0 deletions sdk/cs/src/IModel.cs
@@ -70,6 +70,13 @@ Task DownloadAsync(Action<float>? downloadProgress = null,
/// <returns>OpenAI.AudioClient</returns>
Task<OpenAIAudioClient> GetAudioClientAsync(CancellationToken? ct = null);

/// <summary>
/// Get an OpenAI API-based EmbeddingClient
/// </summary>
/// <param name="ct">Optional cancellation token.</param>
/// <returns>OpenAI.EmbeddingClient</returns>
Task<OpenAIEmbeddingClient> GetEmbeddingClientAsync(CancellationToken? ct = null);

/// <summary>
/// Variants of the model that are available. Variants of the model are optimized for different devices.
/// </summary>
81 changes: 81 additions & 0 deletions sdk/cs/src/OpenAI/EmbeddingClient.cs
@@ -0,0 +1,81 @@
// --------------------------------------------------------------------------------------------------------------------
// <copyright company="Microsoft">
// Copyright (c) Microsoft. All rights reserved.
// </copyright>
// --------------------------------------------------------------------------------------------------------------------

namespace Microsoft.AI.Foundry.Local;

using Betalgo.Ranul.OpenAI.ObjectModels.ResponseModels;

using Microsoft.AI.Foundry.Local.Detail;
using Microsoft.AI.Foundry.Local.OpenAI;
using Microsoft.Extensions.Logging;

/// <summary>
/// Embedding Client that uses the OpenAI API.
/// Implemented using Betalgo.Ranul.OpenAI SDK types.
/// </summary>
public class OpenAIEmbeddingClient
{
private readonly string _modelId;

private readonly ICoreInterop _coreInterop = FoundryLocalManager.Instance.CoreInterop;
private readonly ILogger _logger = FoundryLocalManager.Instance.Logger;

internal OpenAIEmbeddingClient(string modelId)
{
_modelId = modelId;
}

/// <summary>
/// Settings that are supported by Foundry Local for embeddings.
/// </summary>
public record EmbeddingSettings
{
/// <summary>
/// The number of dimensions the resulting output embeddings should have.
/// Only supported by some models.
/// </summary>
public int? Dimensions { get; set; }

/// <summary>
/// The format to return the embeddings in. Can be either "float" or "base64".
/// </summary>
public string? EncodingFormat { get; set; }
}

/// <summary>
/// Settings to use for embedding requests using this client.
/// </summary>
public EmbeddingSettings Settings { get; } = new();

/// <summary>
/// Generate embeddings for the given input text.
/// </summary>
/// <param name="input">The text to generate embeddings for.</param>
/// <param name="ct">Optional cancellation token.</param>
/// <returns>Embedding response containing the embedding vector.</returns>
public async Task<EmbeddingCreateResponse> GenerateEmbeddingAsync(string input,
CancellationToken? ct = null)
{
return await Utils.CallWithExceptionHandling(
() => GenerateEmbeddingImplAsync(input, ct),
"Error during embedding generation.", _logger).ConfigureAwait(false);
}

private async Task<EmbeddingCreateResponse> GenerateEmbeddingImplAsync(string input,
CancellationToken? ct)
{
var embeddingRequest = EmbeddingCreateRequestExtended.FromUserInput(_modelId, input, Settings);
var embeddingRequestJson = embeddingRequest.ToJson();

var request = new CoreInteropRequest { Params = new() { { "OpenAICreateRequest", embeddingRequestJson } } };
var response = await _coreInterop.ExecuteCommandAsync("embeddings", request,
ct ?? CancellationToken.None).ConfigureAwait(false);

var embeddingResponse = response.ToEmbeddingResponse(_logger);

return embeddingResponse;
}
}
81 changes: 81 additions & 0 deletions sdk/cs/src/OpenAI/EmbeddingRequestResponseTypes.cs
@@ -0,0 +1,81 @@
// --------------------------------------------------------------------------------------------------------------------
// <copyright company="Microsoft">
// Copyright (c) Microsoft. All rights reserved.
// </copyright>
// --------------------------------------------------------------------------------------------------------------------

namespace Microsoft.AI.Foundry.Local.OpenAI;

using System.Text.Json;
using System.Text.Json.Serialization;

using Betalgo.Ranul.OpenAI.ObjectModels.ResponseModels;

using Microsoft.AI.Foundry.Local.Detail;
using Microsoft.Extensions.Logging;

// https://platform.openai.com/docs/api-reference/embeddings/create
internal record EmbeddingCreateRequestExtended
{
[JsonPropertyName("input")]
public string? Input { get; set; }

[JsonPropertyName("model")]
public string? Model { get; set; }

[JsonPropertyName("dimensions")]
public int? Dimensions { get; set; }

[JsonPropertyName("encoding_format")]
public string? EncodingFormat { get; set; }

internal static EmbeddingCreateRequestExtended FromUserInput(string modelId,
string input,
OpenAIEmbeddingClient.EmbeddingSettings settings)
{
return new EmbeddingCreateRequestExtended
{
Model = modelId,
Input = input,
Dimensions = settings.Dimensions,
EncodingFormat = settings.EncodingFormat
};
}
}

internal static class EmbeddingRequestResponseExtensions
{
internal static string ToJson(this EmbeddingCreateRequestExtended request)
{
return JsonSerializer.Serialize(request, JsonSerializationContext.Default.EmbeddingCreateRequestExtended);
}

internal static EmbeddingCreateResponse ToEmbeddingResponse(this ICoreInterop.Response response, ILogger logger)
{
if (response.Error != null)
{
logger.LogError("Error from embeddings: {Error}", response.Error);
throw new FoundryLocalException($"Error from embeddings command: {response.Error}");
}

if (string.IsNullOrWhiteSpace(response.Data))
{
logger.LogError("Embeddings command returned no data");
throw new FoundryLocalException("Embeddings command returned null or empty response data");
}

return response.Data.ToEmbeddingResponse(logger);
}

internal static EmbeddingCreateResponse ToEmbeddingResponse(this string responseData, ILogger logger)
{
var output = JsonSerializer.Deserialize(responseData, JsonSerializationContext.Default.EmbeddingCreateResponse);
if (output == null)
{
logger.LogError("Failed to deserialize EmbeddingCreateResponse (length={Length})", responseData.Length);
throw new JsonException("Failed to deserialize EmbeddingCreateResponse");
}

return output;
}
}