@james-tn james-tn commented Jan 9, 2026

Enterprise Security Infrastructure for Azure OpenAI Workshop

Summary

This PR adds enterprise-grade security features to the Azure infrastructure deployment, aligning Terraform and Bicep configurations with best practices for production workloads.


🔒 Network Security

  • VNet Integration
    • Container Apps Environment now runs inside a dedicated VNet: 10.10.0.0/16
  • Private Endpoints
    • Added private endpoints for:
      • Cosmos DB
      • Azure OpenAI
    • Eliminates public network exposure
  • Private DNS Zones
    • Configured:
      • privatelink.documents.azure.com
      • privatelink.openai.azure.com
  • Internal MCP Service
    • New option to make the MCP service internal-only
    • Accessible only from within the Container Apps environment
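One way to confirm that the private endpoints and private DNS zones listed above are taking effect is to resolve a service hostname from inside the VNet and check that the answer falls within 10.10.0.0/16. The sketch below assumes the private endpoints receive addresses from that VNet range; the hostname passed to `resolves_privately` is whatever your Cosmos DB or Azure OpenAI account uses, and the function names are illustrative rather than part of this PR.

```python
import ipaddress
import socket

# VNet range taken from the PR description.
VNET_CIDR = ipaddress.ip_network("10.10.0.0/16")


def is_in_vnet(address: str, vnet: ipaddress.IPv4Network = VNET_CIDR) -> bool:
    """Return True if the IP address string falls inside the VNet range."""
    return ipaddress.ip_address(address) in vnet


def resolves_privately(hostname: str) -> bool:
    """Resolve hostname and check that every returned IPv4 address is in the VNet.

    Only meaningful when run from inside the VNet, where the private DNS zone
    overrides public resolution.
    """
    infos = socket.getaddrinfo(
        hostname, 443, family=socket.AF_INET, proto=socket.IPPROTO_TCP
    )
    addresses = {info[4][0] for info in infos}
    return bool(addresses) and all(is_in_vnet(a) for a in addresses)
```

Run from a container inside the environment, `resolves_privately("<account>.documents.azure.com")` should return True once the privatelink zone is linked; from a public network the same call would be expected to return False.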

🔐 Identity & Access

  • Managed Identity Authentication
    • Cosmos DB and Azure OpenAI accessed using a user-assigned managed identity
    • No API keys required
  • Key Vault Removed
    • No longer needed due to managed identity usage
  • RBAC Roles
    • Cosmos DB Built-in Data Contributor
    • Cognitive Services OpenAI User
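A minimal sketch of what keyless access can look like on the application side, assuming the backend uses the `azure-identity` and `openai` Python packages. The environment variable names `AZURE_OPENAI_ENDPOINT` and `AZURE_CLIENT_ID` are assumptions for illustration; only `AZURE_OPENAI_API_VERSION` appears in this PR. This is not necessarily how the workshop backend is wired.

```python
import os
from typing import Mapping

# Token scope used for Azure OpenAI when authenticating with Microsoft Entra ID.
OPENAI_SCOPE = "https://cognitiveservices.azure.com/.default"


def settings_from_env(env: Mapping[str, str] = os.environ) -> dict:
    """Collect connection settings; note that no API key is read."""
    return {
        "endpoint": env["AZURE_OPENAI_ENDPOINT"],        # assumed variable name
        "client_id": env["AZURE_CLIENT_ID"],             # user-assigned identity
        "api_version": env["AZURE_OPENAI_API_VERSION"],  # named in this PR
    }


def build_openai_client(endpoint: str, client_id: str, api_version: str):
    """Create an Azure OpenAI client that authenticates with a managed identity.

    The SDK imports are local so this module can be loaded without the Azure
    packages installed (for example in unit tests).
    """
    from azure.identity import ManagedIdentityCredential, get_bearer_token_provider
    from openai import AzureOpenAI

    credential = ManagedIdentityCredential(client_id=client_id)
    token_provider = get_bearer_token_provider(credential, OPENAI_SCOPE)
    return AzureOpenAI(
        azure_endpoint=endpoint,
        azure_ad_token_provider=token_provider,
        api_version=api_version,
    )
```

Calling `build_openai_client(**settings_from_env())` inside the container app would then obtain tokens through the identity that holds the Cognitive Services OpenAI User role, so no secret needs to be stored or rotated.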

🛠️ Infrastructure Improvements

  • Fixed deploy.ps1
    • Script no longer overwrites existing tfvars files
  • Embedding Model Added
    • Support for text-embedding-ada-002
  • Subnet Sizing Fix
    • Container Apps subnet set to /23 (the minimum for Consumption-only environments; workload profiles environments accept subnets as small as /27)
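The subnet arithmetic can be checked with Python's `ipaddress` module. The sketch below assumes the Container Apps subnet is carved from the start of the 10.10.0.0/16 VNet; the actual subnet address in the Terraform and Bicep files may differ.

```python
import ipaddress

# Azure reserves five addresses in every subnet.
AZURE_RESERVED_PER_SUBNET = 5


def subnet_summary(subnet_cidr: str, vnet_cidr: str) -> dict:
    """Report how a subnet relates to its parent VNet."""
    subnet = ipaddress.ip_network(subnet_cidr)
    vnet = ipaddress.ip_network(vnet_cidr)
    return {
        "fits_in_vnet": subnet.subnet_of(vnet),
        "total_addresses": subnet.num_addresses,
        "usable_addresses": subnet.num_addresses - AZURE_RESERVED_PER_SUBNET,
        # How many subnets of this size the VNet could hold in total.
        "subnets_of_this_size": 2 ** (subnet.prefixlen - vnet.prefixlen),
    }


# Hypothetical placement of the /23 inside the VNet from the PR description.
summary = subnet_summary("10.10.0.0/23", "10.10.0.0/16")
```

A /23 holds 512 addresses, so the /16 VNet leaves room for many more subnets of the same size, for example one dedicated to private endpoints.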

⚙️ Configuration Options

James N. and others added 30 commits November 14, 2025 09:38
Enhanced Agentic AI with Secure Azure Deployment
…ough an exception

Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
…ted NPM libraries and added Dockerfile for containerization
* WIP: Save local changes before switching to int-agentic

* Fix WebSocket reconnect issue and Vite build compatibility

- Add intentionalClose flag to WebSocket manager to prevent auto-reconnect on intentional close
- Fix Dockerfile to copy from Vite 'dist' instead of CRA 'build' directory
- Update backend static file serving to handle both Vite (assets/) and CRA (static/) structures
- Add catch-all exception handler for WebSocket disconnections in backend

---------

Co-authored-by: James N. <james.nguyen@microsoft.com>
tjsullivan1 and others added 12 commits December 19, 2025 21:46
Updated environment variable handling for jobs based on event types and branch names.
Added commands to ensure key vault is reachable and update its networking settings.
Add checks for existing key vault before updating settings.
Updated Key Vault role assignment to use user assigned identity and added a user assigned managed identity resource for the backend container app.
Infrastructure Automation with Testing
* WIP: Save local changes before switching to int-agentic

* Fix WebSocket reconnect issue and Vite build compatibility

- Add intentionalClose flag to WebSocket manager to prevent auto-reconnect on intentional close
- Fix Dockerfile to copy from Vite 'dist' instead of CRA 'build' directory
- Update backend static file serving to handle both Vite (assets/) and CRA (static/) structures
- Add catch-all exception handler for WebSocket disconnections in backend

* update authentication and bicep deployment to use AAD authentication instead of key

* complete terraform deployment

* update DEPLOYMENT and Terraform

* update DEPLOYMENT and Terraform

* Changed AZURE_OPENAI_API_VERSION to use a variable

* Reverted the OIDC changes on providers.tf

* Reverted the OIDC changes on providers.tf

* Removing key vault reference from orchestration workflow

* Removing key vault reference and OpenAI secret key from infrastructure workflow. I have also commented out all the tests for the model endpoint, since they currently rely on key-based access.

* changing docker to build off new image

* changing docker to build off new image

* changing docker to build off new image

* Making backend config optionally remote in the proper way

* Reverting backend change, seems to have broken state connection

* adding a local provider file so I can have flexible backends

* upgrade version of agent-framework and allow mcp in internal communication to be insecure

* Updated to work with both local and remote state

* optimize reflection agent code and remove workflow reflection agent

* add github workflow

* update github workflow to use repo level variables

* update github workflow to use repo level variables

* update github workflow to use repo level variables

* update github workflow to use repo level variables

* update test cases and test timeout; exclude MCP test because MCP is deployed internally

* move test to after deployment

* move test to after deployment

* fix api version

* fix api version

* fix test run

* fix: Use placeholder image for Container Apps initial deployment

- Use mcr.microsoft.com/k8se/quickstart:latest as placeholder image
- Add lifecycle ignore_changes for container image (managed by update-containers)
- Solves chicken-and-egg problem: Container Apps created before images exist in ACR
- update-containers.yml sets real images after Docker builds complete

* fix: Remove pull_request triggers from Docker workflows

- Docker workflows should only run via workflow_call from orchestrate.yml
- Prevents duplicate/orphan runs that occur before infrastructure exists
- Manual dispatch still available for ad-hoc builds

* feat: Add james-dev to destroy-infrastructure condition

* feat: Update Bicep for feature parity with Terraform

- Add placeholder image support (mcr.microsoft.com/k8se/quickstart:latest)
- Fix MCP allowInsecure when mcpInternalOnly is true
- Add readiness probe to application container (/docs endpoint)
- Add missing env vars: AZURE_AI_AGENT_MODEL_DEPLOYMENT_NAME, AZURE_OPENAI_EMBEDDING_DEPLOYMENT
- Make AZURE_OPENAI_API_VERSION configurable via parameter
- Align naming convention with environment suffix
- Change image name from workshop-app to backend-app for consistency
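The readiness probe on the /docs endpoint also suggests a way for callers to wait for the backend: poll that endpoint until it answers instead of relying on the fixed 30-second sleep used in the integration test workflow. The following is a rough sketch using only the standard library; `wait_until_ready`, its parameters, and the defaults are illustrative and not part of this PR.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_ok(url: str, timeout: float = 5.0) -> bool:
    """Return True if a GET request to url succeeds with a 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, timeout, or an HTTP error status.
        return False


def wait_until_ready(
    url: str,
    probe: Callable[[str], bool] = http_ok,
    attempts: int = 10,
    delay: float = 3.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll url until the probe succeeds or the attempts run out."""
    for attempt in range(attempts):
        if probe(url):
            return True
        if attempt < attempts - 1:
            sleep(delay)  # no sleep after the final failed attempt
    return False
```

A test setup step could call `wait_until_ready(backend_endpoint + "/docs")` and abort early when it returns False. The `probe` and `sleep` parameters exist so the loop can be exercised without network access or real waiting.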

* docs: enhance README with Mermaid diagrams and enterprise deployment guide

- Replace ASCII architecture diagrams with interactive Mermaid diagrams
- Add comprehensive enterprise security sections (VNet, Private Endpoints, Managed Identity)
- Document security profiles (Dev/Staging/Production)
- Add CI/CD with GitHub Actions OIDC section linking to GITHUB_ACTIONS_SETUP.md
- Update main README with enterprise deployment table linking to all guides
- Add data flow and authentication flow sequence diagrams
- Include troubleshooting guide with common issues

* Updated deployment to reference tfvars file for local file/iteration value

---------

Co-authored-by: James N. <james.nguyen@microsoft.com>
Co-authored-by: Tim Sullivan <timothyj.sullivan1@gmail.com>
@james-tn james-tn requested a review from tjsullivan1 January 9, 2026 19:14
Comment on lines +40 to +78
    name: Run Integration Tests
    runs-on: ubuntu-latest
    # No environment needed - uses repo-level variables

    steps:
      - uses: actions/checkout@v4

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.12'

      - name: Install test dependencies
        run: |
          pip install -r tests/requirements.txt

      - name: Wait for Container Apps to warm up
        run: |
          echo "Waiting 30 seconds for Container Apps to be ready..."
          sleep 30

      - name: Run integration tests
        run: |
          cd tests
          pytest -v -m "integration" --tb=short
        continue-on-error: true  # Report results but don't fail the workflow
        env:
          BACKEND_API_ENDPOINT: ${{ inputs.backend_endpoint }}
          MCP_ENDPOINT: ${{ inputs.mcp_endpoint }}
          MCP_INTERNAL_ONLY: ${{ inputs.mcp_internal_only && 'true' || 'false' }}

      - name: Test Summary
        if: always()
        run: |
          echo "## Integration Test Results" >> $GITHUB_STEP_SUMMARY
          echo "" >> $GITHUB_STEP_SUMMARY
          echo "- Backend Endpoint: ${{ inputs.backend_endpoint }}" >> $GITHUB_STEP_SUMMARY
          echo "- MCP Endpoint: ${{ inputs.mcp_endpoint || 'Internal (skipped)' }}" >> $GITHUB_STEP_SUMMARY
          echo "- Environment: ${{ inputs.environment }}" >> $GITHUB_STEP_SUMMARY
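The workflow passes `MCP_INTERNAL_ONLY` to the test run as the string 'true' or 'false', alongside `MCP_ENDPOINT`. On the Python side, a test suite could turn those two variables into a single decision about whether MCP tests are runnable from the GitHub runner. The helper below is a sketch of that idea; the function name and the exact parsing rules are assumptions, not code from the repository's tests.

```python
import os
from typing import Mapping


def mcp_tests_enabled(env: Mapping[str, str] = os.environ) -> bool:
    """Decide whether MCP tests can run from the current machine.

    They are skipped when the MCP service is internal-only or when no
    endpoint was supplied, since the runner cannot reach it in either case.
    """
    internal_only = env.get("MCP_INTERNAL_ONLY", "false").strip().lower() == "true"
    has_endpoint = bool(env.get("MCP_ENDPOINT", "").strip())
    return has_endpoint and not internal_only
```

In a pytest module this could back a marker such as `pytest.mark.skipif(not mcp_tests_enabled(), reason="MCP is internal-only")`, so skipped tests show up explicitly in the report rather than failing on connection errors.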

Check warning

Code scanning / CodeQL

Workflow does not contain permissions Medium

Actions job or workflow does not limit the permissions of the GITHUB_TOKEN. Consider setting an explicit permissions block, using the following as a minimal starting point: {contents: read}

Copilot Autofix

AI 1 day ago

In general, the fix is to add an explicit permissions section to the workflow (either at the root or at the job level) to limit the GITHUB_TOKEN to the minimal scopes required. For this integration test workflow, the steps only require read access to the repository contents (to allow actions/checkout to function). They do not interact with issues, pull requests, or other writable resources, so contents: read is sufficient.

The best, least‑disruptive fix is to add a permissions block at the workflow root, just after the on: block and before jobs:. This applies to all jobs in the workflow (currently only integration-tests) without changing any existing behavior. Concretely, in .github/workflows/integration-tests.yml, between the workflow_dispatch configuration (lines 24–37) and the jobs: key (line 38), insert:

permissions:
  contents: read

No imports or additional definitions are needed because this is a GitHub Actions YAML file, not application code.
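To catch the same warning in other workflow files before CodeQL does, a small script could scan each file for a top-level permissions key. The sketch below is a deliberately simple text-based check using only the standard library; it assumes unquoted top-level keys and does not detect permissions set at the job level, so treat it as a rough lint rather than a YAML-aware validator.

```python
import re

# Matches unquoted keys that start in column 0, e.g. "on:", "jobs:".
TOP_LEVEL_KEY = re.compile(r"^([A-Za-z_][\w-]*):", re.MULTILINE)


def top_level_keys(workflow_text: str) -> list[str]:
    """Return the top-level mapping keys of a workflow file, in order."""
    return TOP_LEVEL_KEY.findall(workflow_text)


def has_workflow_permissions(workflow_text: str) -> bool:
    """Return True if the workflow declares permissions at the root."""
    return "permissions" in top_level_keys(workflow_text)
```

Applied to the text of .github/workflows/integration-tests.yml, `has_workflow_permissions` would be expected to return False before the suggested patch and True after it.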

Suggested changeset 1
.github/workflows/integration-tests.yml

Autofix patch

Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/.github/workflows/integration-tests.yml b/.github/workflows/integration-tests.yml
--- a/.github/workflows/integration-tests.yml
+++ b/.github/workflows/integration-tests.yml
@@ -35,6 +35,9 @@
         description: 'MCP service endpoint URL (optional if internal)'
         required: false
 
+permissions:
+  contents: read
+
 jobs:
   integration-tests:
     name: Run Integration Tests
EOF