From f1c6c07e3ef295cb8379d9a550abdb8899a97863 Mon Sep 17 00:00:00 2001
From: Simon Kurtz
Date: Mon, 1 Jun 2026 23:23:16 -0400
Subject: [PATCH 01/31] Initial check-in

---
 .github/copilot-instructions.md | 1 +
 .github/skills/apim-policies/SKILL.md | 23 +
 .github/skills/sample-creator/SKILL.md | 2 +
 AGENTS.md | 1 +
 README.md | 27 +-
 assets/APIM-Samples-Slide-Deck.html | 26 +-
 .../Infrastructure-Sample-Compatibility.svg | 79 ++-
 docs/index.html | 13 +-
 infrastructure/simple-apim/README.md | 7 +-
 infrastructure/simple-apim/main.bicep | 5 +-
 samples/inference-failover/README.md | 88 +++
 .../backend-distribution.kql | 14 +
 samples/inference-failover/create.ipynb | 667 ++++++++++++++++++
 .../inference-failover/failover-outcomes.kql | 12 +
 .../inference-api-policy.xml | 45 ++
 .../inference-failover.workbook.json | 292 ++++++++
 samples/inference-failover/main.bicep | 426 +++++++++++
 .../inference-failover/token-throughput.kql | 18 +
 .../inference-failover/update-workbook.ps1 | 57 ++
 .../verify-llm-ingestion.kql | 19 +
 ...pool-load-balancing-with-retry-tracked.xml | 10 +-
 tests/Test-Matrix.md | 30 +-
 tests/python/test_inference_failover.py | 159 +++++
 23 files changed, 1941 insertions(+), 80 deletions(-)
 create mode 100644 samples/inference-failover/README.md
 create mode 100644 samples/inference-failover/backend-distribution.kql
 create mode 100644 samples/inference-failover/create.ipynb
 create mode 100644 samples/inference-failover/failover-outcomes.kql
 create mode 100644 samples/inference-failover/inference-api-policy.xml
 create mode 100644 samples/inference-failover/inference-failover.workbook.json
 create mode 100644 samples/inference-failover/main.bicep
 create mode 100644 samples/inference-failover/token-throughput.kql
 create mode 100644 samples/inference-failover/update-workbook.ps1
 create mode 100644 samples/inference-failover/verify-llm-ingestion.kql
 create mode 100644 tests/python/test_inference_failover.py

diff --git a/.github/copilot-instructions.md b/.github/copilot-instructions.md
index 77226654..3a5e2639 100644
--- a/.github/copilot-instructions.md
+++ b/.github/copilot-instructions.md
@@ -601,4 +601,5 @@ Samples that require administrative or operational endpoints (cache loading, con
 ### API Management Policy XML Instructions
 
 - Policies should use camelCase for all variable names.
+- The `<backend>` section may contain only one direct child policy. When retrying a backend request, make `<retry>` the sole direct child of `<backend>` and place `<forward-request>` plus any per-attempt policies inside it. Move terminal fallback handling to `` or `` as appropriate. Do not add sibling policies such as `` or `` alongside the backend `<retry>`; APIM rejects the policy during deployment.
 - Policy expressions (`@(...)` and `@{...}`) may **only** reference .NET types and members on APIM's [allow-list](https://learn.microsoft.com/azure/api-management/api-management-policy-expressions#CLRTypes). Using anything outside the list (e.g. `System.Globalization.*`, `DateTime.TryParse`, `DateTime.ToUniversalTime`, `System.Text.Json`) causes a deploy-time `ValidationError: One or more fields contain incorrect values` with no further detail. Verify each type/member against the allow-list before writing the expression. See `.github/skills/apim-policies/SKILL.md` for common pitfalls and allowed replacements.
diff --git a/.github/skills/apim-policies/SKILL.md b/.github/skills/apim-policies/SKILL.md
index ad4de66e..43f259ad 100644
--- a/.github/skills/apim-policies/SKILL.md
+++ b/.github/skills/apim-policies/SKILL.md
@@ -34,6 +34,29 @@ Every APIM policy document follows this structure:
 The `<base />` element inherits policies from parent scopes (Global → Product → API → Operation).
 
+### Backend Section Cardinality (CRITICAL)
+
+The `<backend>` section may contain only one direct child policy. APIM rejects a policy during deployment when `<backend>` contains sibling policies such as `` followed by ``, or `` followed by ``.
+
+When retrying a backend request, make `<retry>` the sole direct child of `<backend>` and place `<forward-request>` plus any policies that must execute on every attempt inside `<retry>`. Move terminal fallback handling to `` or `` as appropriate.
+
+```xml
+
+
+
+
+
+```
+
+When a custom backend policy is not needed, keep `<base />` as the only direct child:
+
+```xml
+<backend>
+    <base />
+</backend>
+```
+
 ## Policy Categories Quick Reference
 
 | Category | Common Policies | Section |
diff --git a/.github/skills/sample-creator/SKILL.md b/.github/skills/sample-creator/SKILL.md
index a683f7df..4a165928 100644
--- a/.github/skills/sample-creator/SKILL.md
+++ b/.github/skills/sample-creator/SKILL.md
@@ -314,6 +314,8 @@ For samples with custom policies, create XML files following the APIM policy str
 ```
 
+The `<backend>` section may contain only one direct child policy. Keep `<base />` as the only child when inheriting backend behavior. When retrying, replace `<base />` with a single `<retry>` child, nest `<forward-request>` and any per-attempt policies inside `<retry>`, and move terminal fallback handling to `` or `` as appropriate.
+
 Load policies in the notebook:
 
 ```python
diff --git a/AGENTS.md b/AGENTS.md
index bc240c17..f0993c92 100644
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -37,6 +37,7 @@ All repository contributions should treat accessibility as a first-class quality
 │ ├── costing/ # APIM costing and showback
 │ ├── egress-control/ # Egress control via NVA routing
 │ ├── general/ # Basic policy demonstrations
+│ ├── inference-failover/ # AOAI model failover with LLM telemetry
 │ ├── load-balancing/ # Backend pool load balancing
 │ ├── oauth-3rd-party/ # OAuth 3rd-party (Spotify example)
 │ └── secure-blob-access/ # Valet key pattern for blob storage
diff --git a/README.md b/README.md
index 85706a68..9152e9d6 100644
--- a/README.md
+++ b/README.md
@@ -62,18 +62,20 @@ It's quick and easy to get started!
-| Sample Name | Description | Supported Infrastructure(s) |
-|:------------------------------------------------------------|:--------------------------------------------------------------------------------------------------------------------|:------------------------------|
-| [AuthX][sample-authx] | Authentication and role-based authorization in a mock HR API. | All infrastructures |
-| [AuthX Pro][sample-authx-pro] | Authentication and role-based authorization in a mock product with multiple APIs and policy fragments. | All infrastructures |
-| [Azure Maps][sample-azure-maps] | Proxying calls to Azure Maps with APIM policies. | All infrastructures |
-| [Costing][sample-costing] | Track and allocate API costs per business unit using APIM subscriptions, Entra ID application tracking, and AI Gateway token/PTU tracking across **both** Azure OpenAI Chat Completions and Responses APIs, including streaming (SSE) token usage which is not simple to capture correctly in APIM. | All infrastructures |
-| [Dynamic CORS][sample-dynamic-cors] | Dynamic per-API CORS origin validation using custom policy fragments and a maintainable origin mapping. | All infrastructures |
-| [Egress Control][sample-egress-control] | Control APIM outbound internet traffic by routing it through a Network Virtual Appliance (NVA) in a hub/spoke topology. | appgw-apim, appgw-apim-pe |
-| [General][sample-general] | Basic demo of APIM sample setup and policy usage. | All infrastructures |
-| [Load Balancing][sample-load-balancing] | Priority and weighted load balancing across backends. | apim-aca, afd-apim-pe |
-| [OAuth 3rd-Party][sample-oauth-3rd-party] | Authenticate with APIM which then uses its Credential Manager with Spotify's REST API. | All infrastructures |
-| [Secure Blob Access][sample-secure-blob-access] | Secure blob access via the [valet key pattern][valet-key-pattern]. | All infrastructures |
+| Sample Name | Description | Supported Infrastructure(s) |
+| :--- | :--- | :--- |
+| [AuthX][sample-authx] | Authentication and role-based authorization in a mock HR API. | All infrastructures |
+| [AuthX Pro][sample-authx-pro] | Authentication and role-based authorization in a mock product with multiple APIs and policy fragments. | All infrastructures |
+| [Azure Maps][sample-azure-maps] | Proxying calls to Azure Maps with APIM policies. | All infrastructures |
+| [Costing][sample-costing] | Track and allocate API costs per business unit using APIM subscriptions, Entra ID application tracking, and AI Gateway token/PTU tracking across **both** Azure OpenAI Chat Completions and Responses APIs, including streaming (SSE) token usage which is not simple to capture correctly in APIM. | All infrastructures |
+| [Dynamic CORS][sample-dynamic-cors] | Dynamic per-API CORS origin validation using custom policy fragments and a maintainable origin mapping. | All infrastructures |
+| [Egress Control][sample-egress-control] | Control APIM outbound internet traffic by routing it through a Network Virtual Appliance (NVA) in a hub/spoke topology. | appgw-apim, appgw-apim-pe |
+| [General][sample-general] | Basic demo of APIM sample setup and policy usage. | All infrastructures |
+| [Inference Failover][sample-inference-failover] | Route compatible Azure OpenAI models through priority and weighted APIM backend pools with focused LLM failover and token telemetry. | All infrastructures |
+| [Load Balancing][sample-load-balancing] | Priority and weighted load balancing across backends. | apim-aca, afd-apim-pe |
+| [OAuth 3rd-Party][sample-oauth-3rd-party] | Authenticate with APIM which then uses its Credential Manager with Spotify's REST API. | All infrastructures |
+| [Secure Blob Access][sample-secure-blob-access] | Secure blob access via the [valet key pattern][valet-key-pattern]. | All infrastructures |
+
 ### Compatibility Matrices
@@ -383,6 +385,7 @@ _For much more API Management content, please also check out [APIM Love](https:/
 [sample-costing]: ./samples/costing/README.md
 [sample-dynamic-cors]: ./samples/dynamic-cors/README.md
 [sample-general]: ./samples/general/README.md
+[sample-inference-failover]: ./samples/inference-failover/README.md
 [sample-load-balancing]: ./samples/load-balancing/README.md
 [sample-egress-control]: ./samples/egress-control/README.md
 [sample-oauth-3rd-party]: ./samples/oauth-3rd-party/README.md
diff --git a/assets/APIM-Samples-Slide-Deck.html b/assets/APIM-Samples-Slide-Deck.html
index 2d5c2eff..ac9a8442 100644
--- a/assets/APIM-Samples-Slide-Deck.html
+++ b/assets/APIM-Samples-Slide-Deck.html
@@ -1015,7 +1015,7 @@

📚 Samples