
Cherry-pick round 1#578

Closed
prathikr wants to merge 21 commits into releases/rel-1.0.0 from prathikrao/cherry-pick-rel-1.0.0-rd1

@prathikr prathikr commented Apr 2, 2026

No description provided.

prathikr and others added 16 commits April 2, 2026 10:03
Fixes the following errors I encountered when migrating our
packaging/publishing pipelines to onnxruntime-release-pipelines:

```
Starting: Secure Supply Chain Analysis (auto-injected by policy)
==============================================================================
Task         : Secure Supply Chain Analysis
Description  : A task to scan for vulnerabilities in your software supply chain. Formerly "NuGet Security Analysis".
Version      : 0.2.216
Author       : Microsoft Corporation
Help         : See https://aka.ms/sscatask for more information.
==============================================================================
Telemetry ID: 29518951-f4fb-4d5c-a56e-110cbb97c51b
For more information please visit: https://aka.ms/sscatask
Scanning repository contents at source path: E:\_work\1\s
> Starting Multifeed Nuget Security Analysis:
##[warning]samples/cs/GettingStarted/nuget.config - Multiple feeds declared. (https://aka.ms/cfs/nuget)
##[warning]sdk/cs/NuGet.config - Multiple feeds declared. (https://aka.ms/cfs/nuget)
> Starting Multifeed Corext Analysis:
> Starting Multifeed Python Security Analysis:
> Starting CFS NuGet Analysis:
##[warning]samples/cs/GettingStarted/nuget.config - CFS0013: Package source has value that is not an Azure Artifacts feed. (https://aka.ms/cfs/nuget)
##[warning]sdk/cs/NuGet.config - CFS0013: Package source has value that is not an Azure Artifacts feed. (https://aka.ms/cfs/nuget)
##[warning]sdk_legacy/cs/samples/TestApp/TestApp.csproj - CFS0011: Missing in scope NuGet.config file(s). (https://aka.ms/cfs/nuget)
##[warning]sdk_legacy/cs/src/Microsoft.AI.Foundry.Local.csproj - CFS0011: Missing in scope NuGet.config file(s). (https://aka.ms/cfs/nuget)
##[warning]sdk_legacy/cs/test/FoundryLocal.Tests/FoundryLocal.Tests.csproj - CFS0011: Missing in scope NuGet.config file(s). (https://aka.ms/cfs/nuget)
> Starting CFS NPM Analysis:
##[warning]www/.npmrc - CFS0002: Missing default registry. (https://aka.ms/cfs/npm)
##[warning]samples/js/chat-and-audio-foundry-local/package.json - CFS0001: Missing sibling .npmrc file. (https://aka.ms/cfs/npm)
##[warning]samples/js/copilot-sdk-foundry-local/package.json - CFS0001: Missing sibling .npmrc file. (https://aka.ms/cfs/npm)
##[warning]samples/js/electron-chat-application/package.json - CFS0001: Missing sibling .npmrc file. (https://aka.ms/cfs/npm)
##[warning]samples/js/tool-calling-foundry-local/package.json - CFS0001: Missing sibling .npmrc file. (https://aka.ms/cfs/npm)
##[warning]sdk/js/package.json - CFS0001: Missing sibling .npmrc file. (https://aka.ms/cfs/npm)
##[warning]sdk_legacy/js/package.json - CFS0001: Missing sibling .npmrc file. (https://aka.ms/cfs/npm)
> Starting CFS Maven Analysis:
> Starting CFS Cargo Analysis:
##[warning]samples/rust/Cargo.toml - CFS0041: Missing associated .cargo/config.toml file. (https://aka.ms/cfs/cargo)
##[warning]samples/rust/audio-transcription-example/Cargo.toml - CFS0041: Missing associated .cargo/config.toml file. (https://aka.ms/cfs/cargo)
##[warning]samples/rust/foundry-local-webserver/Cargo.toml - CFS0041: Missing associated .cargo/config.toml file. (https://aka.ms/cfs/cargo)
##[warning]samples/rust/native-chat-completions/Cargo.toml - CFS0041: Missing associated .cargo/config.toml file. (https://aka.ms/cfs/cargo)
##[warning]samples/rust/tool-calling-foundry-local/Cargo.toml - CFS0041: Missing associated .cargo/config.toml file. (https://aka.ms/cfs/cargo)
##[warning]sdk/rust/Cargo.toml - CFS0041: Missing associated .cargo/config.toml file. (https://aka.ms/cfs/cargo)
##[warning]sdk_legacy/rust/Cargo.toml - CFS0041: Missing associated .cargo/config.toml file. (https://aka.ms/cfs/cargo)
> Starting CFS CoreXT Analysis:
> Starting CFS CDPx Analysis:
> Starting DockerFile Analysis:
> Starting Kubernetes Deployment File Analysis:
> Starting Helm Charts Analysis:
> Starting Pipeline Configuration Security Analysis:
Azure Artifacts Configuration Analysis found 19 package configuration files in the repository which do not comply with Microsoft package feed security policies. The specific problems and links to their mitigations are listed above. If you need further assistance, please visit https://aka.ms/cfs/detectors .
##[error]NuGet Security Analysis found 2 NuGet package configuration files in the repository which do not comply with Microsoft package feed security policies. The specific problems are listed above. Please visit https://aka.ms/cfs/nuget for more details.
```

---------

Co-authored-by: Prathik Rao <prathikrao@microsoft.com>
- [x] Convert JS SDK streaming APIs from callbacks to async iterables
- [x] Add `return()` hook to async iterators to prevent unbounded
buffering on early break
- [x] Add guards in streaming callbacks to skip work after error or
cancellation
- [x] Fix test assertions to assert synchronous throws directly
- [x] Replace O(n) `chunks.shift()` with O(1) head-index dequeue with
compaction
- [x] Guard against concurrent `next()` calls with `nextInFlight` flag
- [x] Add comment explaining native stream cancellation limitation in
`return()`
- [x] Fix docs example for `completeStreamingChat(messages, tools)`
overload to pass `tools`
- [x] Regenerate TypeDoc API docs
- [x] Type-check, code review, and security scan
- [x] Add comments explaining why local variable captures are needed
(closures lose `this`)
- [x] Add comments clarifying promise-resolve wake-up pattern in
`.then()` handler
- [x] Add structural comments explaining the AsyncIterable/AsyncIterator
factory pattern
- [x] Apply same readability improvements to chatClient.ts
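Several items in the checklist above (the `return()` hook, the head-index dequeue with compaction, the `nextInFlight` guard, and the promise-resolve wake-up) can be illustrated together. The following is a minimal sketch, not the SDK's code: `toAsyncIterable`, the `startStream(onChunk, onDone, onError)` shape, and the compaction threshold are all assumptions made for the example.

```js
// Hypothetical adapter from a callback-based producer to an AsyncIterable.
// `startStream(onChunk, onDone, onError)` is an assumed shape, not the SDK's API.
function toAsyncIterable(startStream) {
  return {
    [Symbol.asyncIterator]() {
      // State lives in local variables so the callbacks do not depend on `this`.
      const chunks = [];        // buffered items
      let head = 0;             // index of the next item to hand out
      let done = false;
      let error = null;
      let cancelled = false;
      let wake = null;          // resolver of the promise a pending next() awaits
      let nextInFlight = false;

      const notify = () => {
        if (wake) { const w = wake; wake = null; w(); }
      };

      startStream(
        (chunk) => { if (cancelled || done || error) return; chunks.push(chunk); notify(); },
        () => { done = true; notify(); },
        (err) => { if (cancelled) return; error = err; notify(); }
      );

      return {
        async next() {
          if (nextInFlight) throw new Error("next() called while a previous call is pending");
          nextInFlight = true;
          try {
            while (true) {
              if (head < chunks.length) {
                const value = chunks[head++];   // O(1) dequeue, no shift()
                // Compact once the consumed prefix is large and dominates the array.
                if (head > 1024 && head * 2 > chunks.length) { chunks.splice(0, head); head = 0; }
                return { value, done: false };
              }
              if (error) throw error;           // buffered chunks are delivered first
              if (done || cancelled) return { value: undefined, done: true };
              await new Promise((resolve) => { wake = resolve; });
            }
          } finally {
            nextInFlight = false;
          }
        },
        // Runs on early `break`; drops the buffer so nothing accumulates afterwards.
        async return() {
          cancelled = true;
          chunks.length = 0;
          head = 0;
          notify();
          return { value: undefined, done: true };
        },
      };
    },
  };
}
```

A consumer would use `for await (const chunk of toAsyncIterable(start)) { ... }`; leaving the loop with `break` invokes `return()`, after which the guard in the chunk callback discards any further data. This sketch does not cancel the underlying producer, which matches the limitation the checklist mentions for the native stream.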

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: baijumeswani <12852605+baijumeswani@users.noreply.github.com>
…ackages (#555)

`npm install --winml` is no longer needed: `npm install` with the separate
packages fetches the appropriate binaries.

---------

Co-authored-by: Prathik Rao <prathikrao@microsoft.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
mvp

---------

Co-authored-by: Prathik Rao <prathikrao@microsoft.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Replaces the hardcoded privacy statement URL in the footer with the
Microsoft short-link redirect.

## Changes
- **`www/src/lib/components/home/footer.svelte`**: Updated `href` from
`https://www.microsoft.com/en-us/privacy/privacystatement` →
`https://go.microsoft.com/fwlink/?LinkId=521839`

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: MaanavD <24942306+MaanavD@users.noreply.github.com>
SDK: add contextLength, inputModalities, outputModalities, capabilities
- C# ModelInfo: add ContextLength, InputModalities, OutputModalities,
Capabilities
- JS ModelInfo/IModel/Model/ModelVariant: add new fields and convenience
getters
- Rust ModelInfo: add new fields; Model: add accessor methods

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: maanavd <maanavdalal@gmail.com>
Part 1 of Rust changes (have part 2 but don't have time to test it now).

This is mostly improving perf by reducing cloning and fixing some bugs +
making code more readable (avoiding early returns).
Python SDK: add contextLength, inputModalities, outputModalities,
capabilities; also added tests for these fields

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Use IModel in the public API. Changes allow ICatalog and IModel to be
stubbed for testing as you no longer need a concrete Model or
ModelVariant class.
- Make Model and ModelVariant implementation details
- Add variant info and selection to IModel so it works with either Model
or ModelVariant
- Move GetLatestVersion to Catalog and take IModel as input
  - ModelVariant has insufficient info to implement this and intuitively
    the catalog should know this information.
- Update tests
  - fix usage of test config file for shared test data path

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: skottmckay <979079+skottmckay@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
## Foundry Local Packaging Pipeline

### Summary

This PR introduces the **Foundry Local Packaging Pipeline**, a unified
ADO pipeline that builds, signs, and tests Foundry Local Core (FLC) for
all platforms, packages it as NuGet and Python wheels, then builds,
signs, and tests the C#, JS, Python, and Rust SDKs — for both standard
and WinML variants.

**Pipeline stages:**
1. **Build FLC** — Native AOT binaries for win-x64, win-arm64,
linux-x64, osx-arm64
2. **Package FLC** — Multi-platform NuGet package + Python wheels from
the built binaries
3. **Build SDKs** — C#, JS, Python, Rust using the packaged FLC
4. **Test SDKs** — Validate each SDK against the pipeline-built FLC

**Produced artifacts:** `flc-nuget`, `flc-nuget-winml`, `flc-wheels`,
`flc-wheels-winml`, `cs-sdk`, `cs-sdk-winml`, `js-sdk`, `js-sdk-winml`,
`python-sdk`, `python-sdk-winml`, `rust-sdk`, `rust-sdk-winml`

**SDK Changes:**
1. Adds the ability for the Python SDK to skip installing native dependencies and
use pre-installed binaries like `foundry-local-core`, `onnxruntime`,
`onnxruntime-genai`
2. Adjusts APIs to leverage the new `download_and_register_eps` native interop
call for manually downloading and registering EPs
3. Adds a temporary nuget.config to the GitHub Actions C# pipeline to allow
ORT-Nightly to auto-fetch missing dependencies from upstream nuget.org

### Test coverage

All SDK tests currently run on **win-x64 only**. Additional platform
test jobs are blocked on infrastructure:

- **Windows ARM64** — waiting on a 1ES-hosted win-arm64 pool
- **macOS ARM64** — waiting on a 1ES-hosted macOS ARM64 pool
- **Linux x64** — waiting on the Linux onnxruntime dependency to be
stabilized

TODOs are tracked in the pipeline YAML for each.

### Build strategy

All FLC builds (including win-arm64 and osx-arm64) run on **x64
machines** because .NET Native AOT supports cross-compilation. The
win-arm64 build cross-compiles from x64 Windows — see [Cross-compilation
docs](https://learn.microsoft.com/en-us/dotnet/core/deploying/native-aot/cross-compile#windows).
Linux builds run on their own x64 hosted image.

### Origin

- **Foundry Local Core build steps** were lifted from
`neutron-server/.pipelines/FoundryLocalCore/`
- **SDK build/test steps** were lifted from `Foundry-Local/.github/`

---------

Co-authored-by: Prathik Rao <prathikrao@microsoft.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
To ensure that our docs on MS Learn have accurate code samples, we will
update the docs so they consume the code from this repo. In this repo,
we will run a test to ensure that the samples work. If a sample breaks,
it should be fixed before a PR can be merged.

- Add named regions to 15 existing sample files (CS, JS, Python, Rust)
- Create 3 missing Python samples (audio-transcription, web-server,
langchain-integration)
- Create 16 tutorial sample projects (4 tutorials x 4 languages)
- Add samples-integration-test.yml CI workflow

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Fixes an issue with XML comments in the C# SDK:

```
Foundry-Local\sdk\cs\src\ICatalog.cs(62,73): error CS1573: Parameter 'ct' has no matching param tag in 
  the XML comment for 'ICatalog.GetLatestVersionAsync(IModel, CancellationToken?)' (but other parameters do)
```

---------

Co-authored-by: Prathik Rao <prathikrao@microsoft.com>
Makes execution provider (EP) management explicit across all SDKs and
adds real-time per-EP download progress
reporting. Previously, EP downloads happened implicitly during catalog
access with no granular progress visibility.
Now callers explicitly discover, download, and monitor EPs with typed
APIs and streaming progress callbacks.

  What's included

  Explicit EP discovery and download (all SDKs)

- DiscoverEps() / discoverEps() / discover_eps(): returns typed EpInfo
with name and registration status
- DownloadAndRegisterEpsAsync() / downloadAndRegisterEps() /
download_and_register_eps(): downloads and registers EPs, returns typed
EpDownloadResult
- Catalog access no longer blocks on EP downloads

  Per-EP progress callbacks (all SDKs)

- C#: DownloadAndRegisterEpsAsync(names, Action<string, double>
progressCallback, ct) — uses
ExecuteCommandWithCallbackAsync; parses wire format with
CultureInfo.InvariantCulture for locale safety
- JS: downloadAndRegisterEpsWithProgress(names?, progressCallback?) —
uses executeCommandStreaming
- Python: download_and_register_eps(names, progress_callback) — uses
execute_command_with_callback
- Rust: download_and_register_eps_with_progress(names, FnMut(&str, f64)),
which parses the "name|percent" wire format inside the SDK
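The "name|percent" wire format mentioned above lends itself to a small parsing sketch. The helper names below (`parseEpProgress`, `makeProgressHandler`) are hypothetical, and splitting on the last `|` is an assumption about how names and percentages are delimited; the real SDKs may parse differently.

```js
// Parses one progress message of the assumed form "name|percent".
// Returns null for malformed input instead of throwing.
function parseEpProgress(message) {
  const sep = message.lastIndexOf("|");
  if (sep <= 0) return null;                 // no separator, or empty name
  const name = message.slice(0, sep);
  const raw = message.slice(sep + 1).trim();
  // Number() always expects "." as the decimal separator, regardless of locale.
  const percent = Number(raw);
  if (raw === "" || !Number.isFinite(percent)) return null;
  return { name, percent };
}

// Wraps a typed (name, percent) callback so it can be fed raw wire messages.
function makeProgressHandler(progressCallback) {
  return (message) => {
    const parsed = parseEpProgress(message);
    if (parsed) progressCallback(parsed.name, parsed.percent);
  };
}
```

Parsing with a locale-independent routine is the JavaScript counterpart of the `CultureInfo.InvariantCulture` choice noted for C#: a value such as `42.5` should read the same on every machine.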

  Live Audio Transcription (C#)

- New LiveAudioTranscriptionSession with real-time streaming over
WebSocket
- Supports start/stop/send audio chunks with configurable output types
- Unit tests with mocked CoreInterop

  Other improvements

- Typed EpInfo / EpDownloadResult in dedicated type files across all
SDKs
- EP unit tests for JS and Python
- Removed implicit 6-hour catalog TTL caching (delegated to native core)
- New CoreInterop methods for callback-based command execution (C#)
- AOT-compatible JSON serialization context for EP types (C#)

  Testing
- New unit tests for EP discovery/download in JS and Python

  Breaking changes

- Catalog no longer implicitly triggers EP downloads — callers must
explicitly call DownloadAndRegisterEpsAsync /
downloadAndRegisterEps / download_and_register_eps before accessing
hardware-accelerated models.
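For callers affected by this breaking change, the explicit flow might look roughly like the sketch below. It runs against a stub rather than the real SDK: the method names `discoverEps` and `downloadAndRegisterEpsWithProgress` come from the description above, while `ensureEpsRegistered`, the `isRegistered` field name, and the stub's behavior are assumptions for illustration.

```js
// Stub standing in for the SDK manager. The EpInfo field names (`name`,
// `isRegistered`) are assumed; check the real typings before relying on them.
const stubManager = {
  eps: [
    { name: "ep-a", isRegistered: true },
    { name: "ep-b", isRegistered: false },
  ],
  discoverEps() {
    return this.eps.map((ep) => ({ ...ep }));
  },
  async downloadAndRegisterEpsWithProgress(names, progressCallback) {
    for (const ep of this.eps) {
      if (names && !names.includes(ep.name)) continue;
      if (progressCallback) progressCallback(ep.name, 100);
      ep.isRegistered = true;
    }
  },
};

// Hypothetical helper: discover EPs, then download only the unregistered ones.
// `await` on discoverEps() works whether the real method is sync or async.
async function ensureEpsRegistered(manager, onProgress) {
  const eps = await manager.discoverEps();
  const missing = eps.filter((ep) => !ep.isRegistered).map((ep) => ep.name);
  if (missing.length > 0) {
    await manager.downloadAndRegisterEpsWithProgress(missing, onProgress);
  }
  return missing;
}
```

Calling something like `await ensureEpsRegistered(manager, (name, pct) => console.log(name, pct))` once at startup, before touching hardware-accelerated models, restores the behavior that catalog access used to provide implicitly.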

---------

Co-authored-by: Baiju Meswani <bmeswani@microsoft.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
…C# docs (#565)

Mirrors the C# changes from #556 across all language bindings: public
APIs use the `IModel` interface instead of concrete
`Model`/`ModelVariant` types, `GetLatestVersion` moves from `Model` to
`Catalog`, and `ModelVariant` becomes an implementation detail.

- Added `info`, `variants`, `selected_variant`/`selectedVariant`,
`select_variant`/`selectVariant` to the abstract interface
- `ModelVariant` implements these as self-referential
(`variants=[self]`, `selected_variant=self`, `select_variant` throws)

```python
model = catalog.get_model("qwen2.5-0.5b")
for v in model.variants:          # List[IModel], not List[ModelVariant]
    print(v.info.name, v.id)
model.select_variant(v)           # takes IModel, not ModelVariant

latest = catalog.get_latest_version(model)  # moved from Model to Catalog
```
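The self-referential `ModelVariant` behavior described above can also be sketched on the JavaScript side. This is an illustration under assumptions, not the SDK source: the constructor arguments, the shape of `info`, and the error messages are invented for the example, and only the member names (`info`, `variants`, `selectedVariant`, `selectVariant`) come from the description.

```js
// A single variant: its only variant is itself, and selection is not supported.
class ModelVariant {
  constructor(info) { this._info = info; }
  get id() { return this._info.id; }
  get info() { return this._info; }
  get variants() { return [this]; }
  get selectedVariant() { return this; }
  selectVariant(_variant) {
    throw new Error("selectVariant is not supported on a single variant");
  }
}

// A model groups variants and delegates to whichever one is selected.
class Model {
  constructor(variants) {
    if (variants.length === 0) throw new Error("a model needs at least one variant");
    this._variants = variants;
    this._selected = variants[0];
  }
  get id() { return this._selected.id; }
  get info() { return this._selected.info; }
  get variants() { return this._variants; }
  get selectedVariant() { return this._selected; }
  selectVariant(variant) {
    const match = this._variants.find((v) => v.id === variant.id);
    if (!match) throw new Error(`variant '${variant.id}' does not belong to this model`);
    this._selected = match;
  }
}

// Works with either class because both expose the same members.
function describe(model) {
  return `${model.info.name} (${model.variants.length} variant(s), selected ${model.selectedVariant.id})`;
}
```

Because callers only touch the shared members, a test can pass a plain object with the same properties in place of either class, which is the stubbing benefit the C# change describes.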

- `list_models()` → `List[IModel]` (was `List[Model]`)
- `get_model()` → `Optional[IModel]` (was `Optional[Model]`)
- `get_model_variant()` → `Optional[IModel]` (was
`Optional[ModelVariant]`)
- `get_cached_models()` / `get_loaded_models()` → `List[IModel]` (was
`List[ModelVariant]`)

Moved from `Model` to `Catalog` since `ModelVariant` lacks sufficient
context to implement it. Takes any `IModel` and resolves the latest
version by name matching against the variant list.

- Added `Model::info()` (delegates to selected variant)
- Added `Catalog::get_latest_version(&self, model: &Arc<Model>) ->
Result<Arc<ModelVariant>>`

- README, API docs (`ICatalog`, `IModel`, `Model`, `ModelVariant`)
updated to reflect `IModel` return types
- `ModelVariant` docs marked as internal
- Samples updated to avoid direct `ModelVariant` type references
- `GetLatestVersionAsync` added to `ICatalog` docs

---------

Co-authored-by: Baiju Meswani <bmeswani@microsoft.com>
Co-authored-by: Nenad Banfic <46795300+nenad1002@users.noreply.github.com>
#486)

Adds real-time audio streaming support to the Foundry Local JS SDK,
enabling live microphone-to-text transcription via ONNX Runtime GenAI
ASR.

The existing AudioClient only supports file-based transcription. This PR
introduces `LiveAudioTranscriptionSession` that accepts continuous PCM
audio chunks (e.g., from a microphone) and returns partial/final
transcription results as an async iterable.

- `src/openai/liveAudioTranscriptionClient.ts` — Streaming session with
`start()`, `append()`, `getTranscriptionStream()`, `stop()`, `dispose()`
- `src/openai/liveAudioTranscriptionTypes.ts` —
`LiveAudioTranscriptionResponse` and `CoreErrorResponse` interfaces,
`tryParseCoreError()` helper
- `src/detail/coreInterop.ts` — Added `executeCommandWithBinary()`
method and `StreamingRequestBuffer` struct for binary PCM data transport
- app.js — E2E example with microphone capture (naudiodon2) and
synthetic audio fallback
- `test/openai/liveAudioTranscription.test.ts` — Unit tests for
types/settings and E2E test with synthetic PCM audio

- `src/imodel.ts` — Added `createLiveTranscriptionSession()` to
interface
- `src/model.ts` — Delegates to
`selectedVariant.createLiveTranscriptionSession()`
- `src/modelVariant.ts` — Implementation (creates new
`LiveAudioTranscriptionSession(modelId, coreInterop)`)
- `src/index.ts` — Exports `LiveAudioTranscriptionSession`,
`LiveAudioTranscriptionOptions`, `LiveAudioTranscriptionResponse`,
`TranscriptionContentPart`

```js
const session = model.createLiveTranscriptionSession();

session.settings.sampleRate = 16000;
session.settings.channels = 1;
session.settings.language = "en";

await session.start();

// Push audio from microphone callback
await session.append(pcmBytes);

// Read results as async iterable
for await (const result of session.getTranscriptionStream()) {
    console.log(result.content[0].text);
}

await session.stop();
```

- **Internal async push queue** — Bounded `AsyncQueue<T>` serializes
audio pushes from any context (safe for mic callbacks) and provides
backpressure via FIFO resolver queue. Mirrors C#'s `Channel<T>` pattern.
- **Binary data transport** — `executeCommandWithBinary()` sends PCM
bytes alongside JSON params via `StreamingRequestBuffer`, with
transcription results parsed from push responses.
- **Settings freeze** — Audio format settings are snapshot-copied and
`Object.freeze()`d at `start()`, immutable during the session
- **Buffer copy** — `append()` copies the input `Uint8Array` before
queueing, safe when caller reuses buffers
- **Drain-on-stop** — `stop()` completes the push queue, waits for the
push loop to drain, parses final transcription from stop response, then
completes the output stream
- **Error propagation** — `start()` failures are propagated to
`outputQueue` so `getTranscriptionStream()` consumers see the error;
`tryParseCoreError()` handles both raw JSON and CoreInterop-prefixed
error messages
- **Dispose safety** — `dispose()` wraps `stop()` in try/catch, never
throws
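The bounded push queue with FIFO backpressure can be sketched as follows. This is a simplified stand-in written for illustration, not the SDK's `AsyncQueue<T>`: the method names (`push`, `pop`, `complete`) and the choice to reject blocked writers on completion are assumptions.

```js
// Minimal bounded async queue; capacity must be at least 1.
class AsyncQueue {
  constructor(capacity) {
    this.capacity = capacity;
    this.items = [];
    this.waitingWriters = []; // FIFO of { item, resolve, reject } for blocked push() calls
    this.waitingReaders = []; // FIFO of resolvers for blocked pop() calls
    this.completed = false;
  }

  // Resolves once the item has been accepted; stays pending while the queue is full.
  async push(item) {
    if (this.completed) throw new Error("queue is completed");
    const reader = this.waitingReaders.shift();
    if (reader) { reader({ value: item, done: false }); return; }
    if (this.items.length < this.capacity) { this.items.push(item); return; }
    return new Promise((resolve, reject) => {
      this.waitingWriters.push({ item, resolve, reject });
    });
  }

  // Resolves with { value, done }; done is true once the queue is completed and drained.
  async pop() {
    if (this.items.length > 0) {
      const value = this.items.shift(); // bounded by capacity, so shift() stays cheap
      const writer = this.waitingWriters.shift();
      if (writer) {
        // Hand the freed slot to the longest-waiting writer, preserving FIFO order.
        this.items.push(writer.item);
        writer.resolve();
      }
      return { value, done: false };
    }
    if (this.completed) return { value: undefined, done: true };
    return new Promise((resolve) => this.waitingReaders.push(resolve));
  }

  // Stops accepting pushes; buffered items can still be popped.
  complete() {
    this.completed = true;
    for (const reader of this.waitingReaders.splice(0)) reader({ value: undefined, done: true });
    for (const writer of this.waitingWriters.splice(0)) writer.reject(new Error("queue is completed"));
  }
}
```

A producer that awaits `push()` slows down naturally when the consumer falls behind, which is the backpressure property described above. Handing the freed slot directly to the waiting writer inside `pop()` keeps ordering strict even if another `push()` arrives before the woken writer resumes.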

This PR adds the JS SDK surface. The 3 native commands
(`audio_stream_start`, `audio_stream_push`, `audio_stream_stop`) are
routed through `execute_command` and the new
`execute_command_with_binary` exports. The code compiles with zero
TypeScript errors without the native library.

- ✅ TypeScript compilation — 0 errors across all source files
- ✅ Unit tests for `parseTranscriptionResult()`, `tryParseCoreError()`,
`LiveAudioTranscriptionOptions`
- ✅ E2E test with synthetic PCM audio (skips gracefully if native core
unavailable)

This implementation mirrors the C# `LiveAudioTranscriptionSession` with
identical logic:
- Same session lifecycle: `start` → `append` → `getTranscriptionStream` → `stop`
- Same push loop with error handling and binary data transport
- Same settings freeze and buffer copy semantics
- Same drain-before-stop ordering with final result parsing
- Same E2E test pattern (synthetic 440Hz sine wave, 100ms chunks,
ConversationItem-shaped response validation)
- Same renamed types: `LiveAudioTranscription*` (matching C# rename)
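The synthetic test input mentioned above (a 440 Hz sine wave split into 100 ms chunks) could be generated along these lines. The sketch assumes 16-bit little-endian mono PCM at 16 kHz; the sample rate and channel count match the usage example earlier, but the bit depth is an assumption, and both function names are hypothetical.

```js
// Generates `durationMs` of a sine tone as 16-bit little-endian mono PCM bytes.
function sinePcm16({ frequencyHz = 440, sampleRate = 16000, durationMs = 1000, amplitude = 0.5 } = {}) {
  const sampleCount = Math.round((sampleRate * durationMs) / 1000);
  const bytes = new Uint8Array(sampleCount * 2);
  const view = new DataView(bytes.buffer);
  for (let i = 0; i < sampleCount; i++) {
    const sample = Math.sin((2 * Math.PI * frequencyHz * i) / sampleRate);
    view.setInt16(i * 2, Math.round(sample * amplitude * 32767), true); // true = little-endian
  }
  return bytes;
}

// Splits PCM bytes into fixed-duration chunks; the last chunk may be shorter.
function* chunkPcm(bytes, { sampleRate = 16000, chunkMs = 100, bytesPerSample = 2 } = {}) {
  const chunkBytes = Math.round((sampleRate * chunkMs) / 1000) * bytesPerSample;
  for (let offset = 0; offset < bytes.length; offset += chunkBytes) {
    // slice() copies, so each chunk owns its data.
    yield bytes.slice(offset, offset + chunkBytes);
  }
}
```

Each yielded chunk is a `Uint8Array` that could be passed to `session.append()` in a loop such as `for (const chunk of chunkPcm(sinePcm16())) await session.append(chunk);`. With the defaults, one second of audio yields ten chunks of 3200 bytes each.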

---

Changes from the original:
| Old (incorrect) | New (matches code) |
|---|---|
| `LiveAudioTranscriptionClient` | `LiveAudioTranscriptionSession` |
| `LiveAudioTranscriptionSettings` | `LiveAudioTranscriptionOptions` |
| `LiveAudioTranscriptionResult` | `LiveAudioTranscriptionResponse` |
| `createLiveTranscriptionClient()` | `createLiveTranscriptionSession()` |

---------

Co-authored-by: ruiren_microsoft <ruiren@microsoft.com>
Co-authored-by: Copilot <198982749+Copilot@users.noreply.github.com>
Co-authored-by: Kunal Vaishnavi <kvaishnavi@microsoft.com>
vercel bot commented Apr 2, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
|---|---|---|---|
| foundry-local | Ready | Preview, Comment | Apr 3, 2026 2:42am |

Improve EP progress download printing in samples

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
7 participants