fix: Infra reliability & security improvements — reduce quota demand, add Bicep guard, improve VM credentials by Roopan-Microsoft · Pull Request #900 · microsoft/Multi-Agent-Custom-Automation-Engine-Solution-Accelerator

Roopan-Microsoft · 2026-04-07T07:49:12Z

Summary

Addresses 3 high-impact findings from the MACAE error analysis (telemetry window: 2025-12-01 to 2026-04-06). The template currently has a 30.3% provisioning success rate (2,247 failures out of 3,228 attempts).

Changes

1. Reduce default model capacities (PR #1 from analysis)

Error addressed: InsufficientQuota + SubscriptionIsOverQuotaForSku — 453 occurrences (20.2% of failures), 84 machines

gpt4_1ModelCapacity: 150 -> 80
gptModelCapacity: 50 -> 30
gptReasoningModelCapacity: 50 -> 30
Total TPM: 250 -> 140 (44% reduction)

Why: Many external subscriptions lack 250 TPM GlobalStandard quota. Failures occur deep into provisioning after other resources are created, wasting time and leaving orphaned resources. Template remains fully functional at reduced capacity.

2. Add Bicep version guard in azure.yaml (PR #5 from analysis)

Error addressed: InvalidTemplateDeployment + InvalidTemplate + tool.bicep.failed — 1,030 occurrences (45.8% of failures), 131+ machines

Added bicep: '>= 0.33.0' to requiredVersions

Why: Template uses Bicep 0.33+ features (deployer(), resourceInput<>, null-forgiving !) but only guarded azd version, not Bicep. Users with older standalone Bicep get cryptic compile errors. This makes azd fail fast with a clear message.

3. Improve VM credential parameter descriptions (PR #2 from analysis)

Error addressed: Security improvement (OWASP A07:2021)

Updated parameter descriptions to clearly state credentials are required when enablePrivateNetworking = true
Added guidance on Azure password complexity requirements

Why: The existing descriptions marked these as Optional which is misleading.

Files Changed

File	Changes
`infra/main.bicep`	Reduced model capacities, improved VM credential param descriptions
`azure.yaml`	Added Bicep version requirement

Impact

Projected success rate improvement: 30.3% -> ~55-60%
Template-addressable failures covered: ~1,483 occurrences (66% of all failures)
Zero breaking changes

fix: Remove Createdby Parameter from deploy.yml and change logic in bicep

fix: CI Pipeline Validate Deployment - MACAE

docs: Updated README, azure.yml for minimum azd version 1.18.0

fix: optimize the network module for Macae

…cae-v2

chore: Add AZURE_DEV_COLLECT_TELEMETRY variable in azure-dev.yml file for MACAE-v2

fix: [Revert] MACAE-v3-Golden path Script

…before deletion

fix: troubleshooting.md portal link

…VM credential docs - Reduce default gptModelCapacity from 150 to 80 TPM to lower quota barrier (addresses InsufficientQuota + SubscriptionIsOverQuotaForSku errors - 453 occurrences, 20.2% of failures) - Add @minValue(1) constraint on gptModelCapacity parameter - Add bicep >= 0.33.0 to requiredVersions in azure.yaml (addresses InvalidTemplateDeployment + InvalidTemplate + tool.bicep.failed - 1,030 occurrences, 45.8% of failures) - Update usageName metadata to reflect reduced capacity (150 -> 80) - Improve VM admin credential parameter descriptions with guidance on env variables and complexity requirements

Copilot

Pull request overview

This PR improves Azure infrastructure deployment reliability and user guidance by lowering default Azure OpenAI deployment capacity, adding an azd toolchain guard for Bicep, and clarifying jumpbox VM credential parameters.

Changes:

Reduced the default GPT model deployment capacity to lower quota-related provisioning failures.
Added a Bicep minimum version requirement in azure.yaml to fail fast on incompatible toolchains.
Updated jumpbox VM credential parameter descriptions to better guide users on configuration and complexity expectations.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

File	Description
`infra/main.bicep`	Lowers default OpenAI deployment capacity and updates VM credential parameter descriptions.
`azure.yaml`	Adds a Bicep minimum version under `requiredVersions`.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-04-07T07:55:13Z

infra/main.bicep


-@description('Optional. AI model deployment token capacity. Defaults to 150 for optimal performance.')
-param gptModelCapacity int = 150
+@description('Optional. AI model deployment token capacity (thousands of tokens per minute). Reduce if provisioning fails with InsufficientQuota. Total must not exceed your subscription GlobalStandard quota.')


The new gptModelCapacity description hard-codes "GlobalStandard quota", but gptModelDeploymentType allows both Standard and GlobalStandard. This can mislead users who deploy Standard (quota/limits differ). Consider wording this as "quota for the selected deployment type/SKU" (or reference gptModelDeploymentType explicitly).

Suggested change

@description('Optional. AI model deployment token capacity (thousands of tokens per minute). Reduce if provisioning fails with InsufficientQuota. Total must not exceed your subscription GlobalStandard quota.')

@description('Optional. AI model deployment token capacity (thousands of tokens per minute). Reduce if provisioning fails with InsufficientQuota. Total must not exceed your subscription quota for the selected deployment type/SKU.')

Copilot · 2026-04-07T07:55:14Z

infra/main.bicep

-@description('Optional. The password for the administrator account of the virtual machine. Allows to customize credentials if `enablePrivateNetworking` is set to true.')
 @secure()
+@description('Optional. The admin password for the jumpbox VM (used when `enablePrivateNetworking` is true). Must meet Azure complexity requirements (12+ chars, uppercase, lowercase, number, special char). Provide via AZURE_ENV_VM_ADMIN_PASSWORD environment variable for predictable access. Defaults to a random value if not set.')
 param virtualMachineAdminPassword string = newGuid()


The password description says it "Must" include uppercase characters, but the default newGuid() value never includes uppercase letters. Either relax the requirement text to match what Azure actually enforces for the chosen VM OS, or change the generated default so it meets the stated constraints.

Suggested change

param virtualMachineAdminPassword string = newGuid()

param virtualMachineAdminPassword string = 'Aa1!${newGuid()}'

Copilot · 2026-04-07T07:55:14Z

infra/main.bicep

+@description('Optional. AI model deployment token capacity (thousands of tokens per minute). Reduce if provisioning fails with InsufficientQuota. Total must not exceed your subscription GlobalStandard quota.')
+@minValue(1)
+param gptModelCapacity int = 80


PR description mentions reducing multiple model capacity parameters (e.g., gpt4_1ModelCapacity, gptReasoningModelCapacity) and a total TPM calculation, but infra/main.bicep appears to only change a single parameter (gptModelCapacity 150 -> 80). Please align the PR description with the actual change (or include the additional capacity parameters/changes if they were intended).

Roopan-Microsoft · 2026-04-07T09:41:16Z

Closing — re-targeting to dev-v4 branch instead.

NirajC-Microsoft and others added 26 commits September 22, 2025 16:00

Remove Createdby Parameter from deploy.yml and change logic in bicep

ed09467

Update deploy.yml to include createdBy parameter

c6962b5

CI Pipeline Validate Deployment - MACAE

83a661e

removed my feature branch

fd0ca46

Add createdBy parameter with description

28a6e0b

Merge pull request #520 from microsoft/psl-removecreatetagfrompipeline

cb448da

fix: Remove Createdby Parameter from deploy.yml and change logic in bicep

added new 'type' tag

f602851

Merge pull request #523 from microsoft/CI-Pipeline-macae

e7f78a9

fix: CI Pipeline Validate Deployment - MACAE

Updated README, azure.yml for minimum azd version 1.18.0

a64e5c6

Merge pull request #538 from microsoft/psl-azd-version-update

9651a5a

docs: Updated README, azure.yml for minimum azd version 1.18.0

update the network module

7ebc4dc

updated

0579bc6

Removed commented code

4b7993f

Removed administration subnet

489ff8d

Added admin and removed jumpbox

d023557

Merge pull request #563 from microsoft/psl-macae-networkmodule

4fcb021

fix: optimize the network module for Macae

Add AZURE_DEV_COLLECT_TELEMETRY variable in azure-dev.yml file for ma…

e488b28

…cae-v2

Merge pull request #567 from microsoft/psl-macae2-azuredev

862b23f

chore: Add AZURE_DEV_COLLECT_TELEMETRY variable in azure-dev.yml file for MACAE-v2

macae-v3-gp

6133fff

test: MACAE-v3-Golden path Script

f4fc534

Revert "test: MACAE-v3-Golden path Script"

3a6e6d8

Merge pull request #589 from microsoft/revert-585-dev

6203a74

fix: [Revert] MACAE-v3-Golden path Script

docs: add guidance for disabling Log Analytics workspace replication …

965e3c7

…before deletion

fix portal link

0ab669e

Merge pull request #677 from microsoft/troubleshooting-updates

c39397b

fix: troubleshooting.md portal link

Roopan-Microsoft requested review from Avijit-Microsoft, Fr4nc3, Prajwal-Microsoft and marktayl1 as code owners April 7, 2026 07:49

Roopan-Microsoft requested review from Vinay-Microsoft and aniaroramsft as code owners April 7, 2026 07:49

NirajC-Microsoft requested a review from Copilot April 7, 2026 07:52

Copilot started reviewing on behalf of NirajC-Microsoft April 7, 2026 07:52 View session

Copilot AI reviewed Apr 7, 2026

View reviewed changes

NirajC-Microsoft changed the base branch from dev to dev-v4 April 7, 2026 07:56

NirajC-Microsoft requested review from dgp10801, nchandhi and toherman-msft as code owners April 7, 2026 07:56

Roopan-Microsoft closed this Apr 7, 2026

Roopan-Microsoft mentioned this pull request Apr 7, 2026

fix: Infra reliability & security improvements — reduce quota demand, add Bicep guard, remove hardcoded VM credentials #901

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: Infra reliability & security improvements — reduce quota demand, add Bicep guard, improve VM credentials#900

fix: Infra reliability & security improvements — reduce quota demand, add Bicep guard, improve VM credentials#900
Roopan-Microsoft wants to merge 26 commits intodev-v4from
fix/infra-reliability-and-security-improvements

Roopan-Microsoft commented Apr 7, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

Copilot AI Apr 7, 2026

Uh oh!

Copilot AI Apr 7, 2026

Uh oh!

Copilot AI Apr 7, 2026

Uh oh!

Roopan-Microsoft commented Apr 7, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

10 participants

	@description('Optional. AI model deployment token capacity (thousands of tokens per minute). Reduce if provisioning fails with InsufficientQuota. Total must not exceed your subscription GlobalStandard quota.')
	@description('Optional. AI model deployment token capacity (thousands of tokens per minute). Reduce if provisioning fails with InsufficientQuota. Total must not exceed your subscription quota for the selected deployment type/SKU.')

	param virtualMachineAdminPassword string = newGuid()
	param virtualMachineAdminPassword string = 'Aa1!${newGuid()}'

Conversation

Roopan-Microsoft commented Apr 7, 2026

Summary

Changes

1. Reduce default model capacities (PR #1 from analysis)

2. Add Bicep version guard in azure.yaml (PR #5 from analysis)

3. Improve VM credential parameter descriptions (PR #2 from analysis)

Files Changed

Impact

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Copilot AI Apr 7, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 7, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 7, 2026

Choose a reason for hiding this comment

Uh oh!

Roopan-Microsoft commented Apr 7, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

10 participants