Skip to content

Conversation

@devin-ai-integration
Copy link
Contributor

Summary

Fixes #4260

This PR adds support for Azure endpoints that don't end with openai.azure.com, such as cognitiveservices.azure.com, services.ai.azure.com, and other *.azure.com domains (like cservices.azure.com mentioned in the issue).

The core issue was that the Azure provider only recognized endpoints containing openai.azure.com and would auto-construct deployment paths for all Azure endpoints, which broke non-standard Azure endpoint formats.

Changes:

  • Add _is_azure_endpoint() method to detect various Azure endpoint formats
  • Add is_azure_endpoint flag to track if endpoint is any Azure endpoint
  • Support base_url parameter as alias for endpoint (consistency with other providers)
  • Update _validate_and_fix_endpoint() to only auto-construct deployment path for openai.azure.com endpoints, leaving other Azure endpoints unchanged

Review & Testing Checklist for Human

  • Verify the _is_azure_endpoint() fallback logic - The method falls back to checking for .azure.com in the endpoint. Confirm this is not too broad and won't incorrectly match non-Azure endpoints.
  • Test with an actual non-openai.azure.com endpoint - The issue reporter used cservices.azure.com. If possible, test with a real Azure Cognitive Services or similar endpoint to verify the fix works end-to-end.
  • Verify base_url parameter doesn't conflict with base class - The BaseLLM class also stores base_url. Confirm the new handling in AzureCompletion works correctly.

Recommended test plan:

  1. Create an LLM with model="azure/gpt-4" and AZURE_ENDPOINT set to a cognitiveservices.azure.com endpoint
  2. Verify the endpoint is NOT modified (no /openai/deployments/ path added)
  3. Verify is_azure_endpoint is True and is_azure_openai_endpoint is False
  4. Make an actual API call if possible to confirm the endpoint works

Notes

  • Tests were added but couldn't be verified locally due to pytest plugin conflicts; CI will run them
  • The is_azure_endpoint flag is added for proper API handling but isn't actively used in this PR - it's available for future use or debugging

Link to Devin run: https://app.devin.ai/sessions/cf337a00c2114937a9f861e564969cea
Requested by: João

Fixes #4260

This PR adds support for Azure endpoints that don't end with openai.azure.com,
such as cognitiveservices.azure.com, services.ai.azure.com, and other
*.azure.com domains.

Changes:
- Add _is_azure_endpoint() method to detect various Azure endpoint formats
- Add is_azure_endpoint flag to track if endpoint is any Azure endpoint
- Support base_url parameter as alias for endpoint (consistency with other providers)
- Update _validate_and_fix_endpoint() to only auto-construct deployment path
  for openai.azure.com endpoints, leaving other Azure endpoints unchanged
- Add comprehensive tests for the new functionality

Co-Authored-By: João <joao@crewai.com>
@devin-ai-integration
Copy link
Contributor Author

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

  • Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
  • Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

⚙️ Control Options:

  • Disable automatic comment and CI monitoring

if domain in endpoint:
return True
# Also check for generic .azure.com pattern (e.g., cservices.azure.com)
return ".azure.com" in endpoint

Check failure

Code scanning / CodeQL

Incomplete URL substring sanitization High

The string
.azure.com
may be at an arbitrary position in the sanitized URL.

Copilot Autofix

AI 9 days ago

In general, the problem should be fixed by parsing the endpoint as a URL, extracting its hostname, and performing checks against that hostname (and possibly its subdomains), instead of checking the raw endpoint string for substrings like "openai.azure.com" or ".azure.com". This ensures we classify and normalize only real Azure domains and not arbitrary strings that happen to contain those substrings.

Concretely for this file:

  1. Add an import for urlparse from Python’s standard urllib.parse module at the top of the file (we’re allowed to add well-known standard-library imports).
  2. Update _is_azure_endpoint to:
    • Parse the endpoint with urlparse.
    • Extract hostname (lowercased).
    • Compare this hostname against an allowlist of exact Azure hostnames (e.g., openai.azure.com, cognitiveservices.azure.com, services.ai.azure.com) and a generic *.azure.com check implemented as hostname == "azure.com" or hostname.endswith(".azure.com").
    • Remove all raw in endpoint substring checks.
  3. Update _validate_and_fix_endpoint to determine whether the host is an Azure OpenAI host by parsing the URL and checking hostname == "openai.azure.com" or ending with .openai.azure.com (to support resource-specific subdomains like <resource>.openai.azure.com). Use this boolean instead of "openai.azure.com" in endpoint. This avoids misclassifying URLs like https://evil-openai.azure.com.attacker.com.
  4. Keep existing functionality: the method should still auto-construct the deployment path only when the endpoint is an Azure OpenAI endpoint and doesn’t already contain /openai/deployments/. The only behavioral change is stricter and correct host recognition.

No new third-party dependencies are needed; we only use the standard library.

Suggested changeset 1
lib/crewai/src/crewai/llms/providers/azure/completion.py

Autofix patch

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/lib/crewai/src/crewai/llms/providers/azure/completion.py b/lib/crewai/src/crewai/llms/providers/azure/completion.py
--- a/lib/crewai/src/crewai/llms/providers/azure/completion.py
+++ b/lib/crewai/src/crewai/llms/providers/azure/completion.py
@@ -4,6 +4,7 @@
 import logging
 import os
 from typing import TYPE_CHECKING, Any, TypedDict
+from urllib.parse import urlparse
 
 from pydantic import BaseModel
 from typing_extensions import Self
@@ -188,8 +189,8 @@
         """Check if the endpoint is an Azure endpoint.
 
         Azure endpoints can have various domain formats:
-        - openai.azure.com (Azure OpenAI Service)
-        - cognitiveservices.azure.com (Azure AI Services / Cognitive Services)
+        - <resource-name>.openai.azure.com (Azure OpenAI Service)
+        - <resource-name>.cognitiveservices.azure.com (Azure AI Services / Cognitive Services)
         - services.ai.azure.com (Azure AI Services)
         - Other *.azure.com domains
 
@@ -199,18 +200,24 @@
         Returns:
             True if the endpoint is an Azure endpoint, False otherwise
         """
-        azure_domains = [
+        parsed = urlparse(endpoint)
+        hostname = (parsed.hostname or "").lower()
+        if not hostname:
+            return False
+
+        azure_domains = {
             "openai.azure.com",
             "cognitiveservices.azure.com",
             "services.ai.azure.com",
-        ]
-        # Check for known Azure domains
-        for domain in azure_domains:
-            if domain in endpoint:
-                return True
-        # Also check for generic .azure.com pattern (e.g., cservices.azure.com)
-        return ".azure.com" in endpoint
+        }
 
+        # Check for known Azure hostnames
+        if hostname in azure_domains:
+            return True
+
+        # Also check for generic .azure.com pattern (e.g., <resource>.azure.com)
+        return hostname == "azure.com" or hostname.endswith(".azure.com")
+
     @staticmethod
     def _validate_and_fix_endpoint(endpoint: str, model: str) -> str:
         """Validate and fix Azure endpoint URL format.
@@ -228,9 +231,18 @@
         Returns:
             Validated and potentially corrected endpoint URL
         """
-        # Only auto-construct deployment path for Azure OpenAI endpoints (openai.azure.com)
+        parsed = urlparse(endpoint)
+        hostname = (parsed.hostname or "").lower()
+
+        # Only auto-construct deployment path for Azure OpenAI endpoints
+        # (e.g., <resource-name>.openai.azure.com)
+        is_openai_host = bool(
+            hostname
+            and (hostname == "openai.azure.com" or hostname.endswith(".openai.azure.com"))
+        )
+
         # Other Azure endpoints (cognitiveservices.azure.com, etc.) should be used as-is
-        if "openai.azure.com" in endpoint and "/openai/deployments/" not in endpoint:
+        if is_openai_host and "/openai/deployments/" not in endpoint:
             endpoint = endpoint.rstrip("/")
 
             if not endpoint.endswith("/openai/deployments"):
EOF
Copilot is powered by AI and may make mistakes. Always verify output.
)

# endpoint should take precedence
assert "explicit-endpoint.openai.azure.com" in llm.endpoint

Check failure

Code scanning / CodeQL

Incomplete URL substring sanitization High test

The string
explicit-endpoint.openai.azure.com
may be at an arbitrary position in the sanitized URL.

Copilot Autofix

AI 9 days ago

In general, to avoid incomplete URL substring sanitization, URLs should be parsed and validated based on structured components (hostname, scheme, path), or compared exactly against expected values, rather than using substring checks like in or endswith on the raw URL string.

For this specific case, the test test_azure_endpoint_takes_precedence_over_base_url currently asserts that the chosen endpoint contains "explicit-endpoint.openai.azure.com". Since the intention is to verify that the endpoint argument wins over base_url, the best fix is to assert equality against the exact endpoint string passed in: "https://explicit-endpoint.openai.azure.com". This keeps the behavior check intact while removing the substring pattern that CodeQL flags. No additional parsing is needed here because we know the full expected string.

Concretely, in lib/crewai/tests/llms/azure/test_azure.py, around line 1338, replace:

# endpoint should take precedence
assert "explicit-endpoint.openai.azure.com" in llm.endpoint

with:

# endpoint should take precedence
assert llm.endpoint == "https://explicit-endpoint.openai.azure.com"

No new imports or helper methods are required.

Suggested changeset 1
lib/crewai/tests/llms/azure/test_azure.py

Autofix patch

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/lib/crewai/tests/llms/azure/test_azure.py b/lib/crewai/tests/llms/azure/test_azure.py
--- a/lib/crewai/tests/llms/azure/test_azure.py
+++ b/lib/crewai/tests/llms/azure/test_azure.py
@@ -1335,7 +1335,7 @@
     )
 
     # endpoint should take precedence
-    assert "explicit-endpoint.openai.azure.com" in llm.endpoint
+    assert llm.endpoint == "https://explicit-endpoint.openai.azure.com"
 
 
 def test_azure_non_openai_endpoint_model_parameter_included():
EOF
@@ -1335,7 +1335,7 @@
)

# endpoint should take precedence
assert "explicit-endpoint.openai.azure.com" in llm.endpoint
assert llm.endpoint == "https://explicit-endpoint.openai.azure.com"


def test_azure_non_openai_endpoint_model_parameter_included():
Copilot is powered by AI and may make mistakes. Always verify output.
@devin-ai-integration
Copy link
Contributor Author

Closing due to inactivity for more than 7 days. Configure here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[BUG] crewai.LLM does not support Azure endpoints other than those ending with openai.azure.com

0 participants