-
Notifications
You must be signed in to change notification settings - Fork 5.8k
fix: support Azure endpoints other than openai.azure.com #4261
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: support Azure endpoints other than openai.azure.com #4261
Conversation
Fixes #4260 This PR adds support for Azure endpoints that don't end with openai.azure.com, such as cognitiveservices.azure.com, services.ai.azure.com, and other *.azure.com domains. Changes: - Add _is_azure_endpoint() method to detect various Azure endpoint formats - Add is_azure_endpoint flag to track if endpoint is any Azure endpoint - Support base_url parameter as alias for endpoint (consistency with other providers) - Update _validate_and_fix_endpoint() to only auto-construct deployment path for openai.azure.com endpoints, leaving other Azure endpoints unchanged - Add comprehensive tests for the new functionality Co-Authored-By: João <joao@crewai.com>
🤖 Devin AI EngineerI'll be helping with this pull request! Here's what you should know: ✅ I will automatically:
Note: I can only respond to comments from users who have write access to this repository. ⚙️ Control Options:
|
| if domain in endpoint: | ||
| return True | ||
| # Also check for generic .azure.com pattern (e.g., cservices.azure.com) | ||
| return ".azure.com" in endpoint |
Check failure
Code scanning / CodeQL
Incomplete URL substring sanitization High
.azure.com
Show autofix suggestion
Hide autofix suggestion
Copilot Autofix
AI 9 days ago
In general, the problem should be fixed by parsing the endpoint as a URL, extracting its hostname, and performing checks against that hostname (and possibly its subdomains), instead of checking the raw endpoint string for substrings like "openai.azure.com" or ".azure.com". This ensures we classify and normalize only real Azure domains and not arbitrary strings that happen to contain those substrings.
Concretely for this file:
- Add an import for
urlparsefrom Python’s standardurllib.parsemodule at the top of the file (we’re allowed to add well-known standard-library imports). - Update
_is_azure_endpointto:- Parse the endpoint with
urlparse. - Extract
hostname(lowercased). - Compare this hostname against an allowlist of exact Azure hostnames (e.g.,
openai.azure.com,cognitiveservices.azure.com,services.ai.azure.com) and a generic*.azure.comcheck implemented ashostname == "azure.com" or hostname.endswith(".azure.com"). - Remove all raw
in endpointsubstring checks.
- Parse the endpoint with
- Update
_validate_and_fix_endpointto determine whether the host is an Azure OpenAI host by parsing the URL and checkinghostname == "openai.azure.com"or ending with.openai.azure.com(to support resource-specific subdomains like<resource>.openai.azure.com). Use this boolean instead of"openai.azure.com" in endpoint. This avoids misclassifying URLs likehttps://evil-openai.azure.com.attacker.com. - Keep existing functionality: the method should still auto-construct the deployment path only when the endpoint is an Azure OpenAI endpoint and doesn’t already contain
/openai/deployments/. The only behavioral change is stricter and correct host recognition.
No new third-party dependencies are needed; we only use the standard library.
-
Copy modified line R7 -
Copy modified lines R192-R193 -
Copy modified lines R203-R208 -
Copy modified line R212 -
Copy modified lines R214-R220 -
Copy modified lines R234-R243 -
Copy modified line R245
| @@ -4,6 +4,7 @@ | ||
| import logging | ||
| import os | ||
| from typing import TYPE_CHECKING, Any, TypedDict | ||
| from urllib.parse import urlparse | ||
|
|
||
| from pydantic import BaseModel | ||
| from typing_extensions import Self | ||
| @@ -188,8 +189,8 @@ | ||
| """Check if the endpoint is an Azure endpoint. | ||
|
|
||
| Azure endpoints can have various domain formats: | ||
| - openai.azure.com (Azure OpenAI Service) | ||
| - cognitiveservices.azure.com (Azure AI Services / Cognitive Services) | ||
| - <resource-name>.openai.azure.com (Azure OpenAI Service) | ||
| - <resource-name>.cognitiveservices.azure.com (Azure AI Services / Cognitive Services) | ||
| - services.ai.azure.com (Azure AI Services) | ||
| - Other *.azure.com domains | ||
|
|
||
| @@ -199,18 +200,24 @@ | ||
| Returns: | ||
| True if the endpoint is an Azure endpoint, False otherwise | ||
| """ | ||
| azure_domains = [ | ||
| parsed = urlparse(endpoint) | ||
| hostname = (parsed.hostname or "").lower() | ||
| if not hostname: | ||
| return False | ||
|
|
||
| azure_domains = { | ||
| "openai.azure.com", | ||
| "cognitiveservices.azure.com", | ||
| "services.ai.azure.com", | ||
| ] | ||
| # Check for known Azure domains | ||
| for domain in azure_domains: | ||
| if domain in endpoint: | ||
| return True | ||
| # Also check for generic .azure.com pattern (e.g., cservices.azure.com) | ||
| return ".azure.com" in endpoint | ||
| } | ||
|
|
||
| # Check for known Azure hostnames | ||
| if hostname in azure_domains: | ||
| return True | ||
|
|
||
| # Also check for generic .azure.com pattern (e.g., <resource>.azure.com) | ||
| return hostname == "azure.com" or hostname.endswith(".azure.com") | ||
|
|
||
| @staticmethod | ||
| def _validate_and_fix_endpoint(endpoint: str, model: str) -> str: | ||
| """Validate and fix Azure endpoint URL format. | ||
| @@ -228,9 +231,18 @@ | ||
| Returns: | ||
| Validated and potentially corrected endpoint URL | ||
| """ | ||
| # Only auto-construct deployment path for Azure OpenAI endpoints (openai.azure.com) | ||
| parsed = urlparse(endpoint) | ||
| hostname = (parsed.hostname or "").lower() | ||
|
|
||
| # Only auto-construct deployment path for Azure OpenAI endpoints | ||
| # (e.g., <resource-name>.openai.azure.com) | ||
| is_openai_host = bool( | ||
| hostname | ||
| and (hostname == "openai.azure.com" or hostname.endswith(".openai.azure.com")) | ||
| ) | ||
|
|
||
| # Other Azure endpoints (cognitiveservices.azure.com, etc.) should be used as-is | ||
| if "openai.azure.com" in endpoint and "/openai/deployments/" not in endpoint: | ||
| if is_openai_host and "/openai/deployments/" not in endpoint: | ||
| endpoint = endpoint.rstrip("/") | ||
|
|
||
| if not endpoint.endswith("/openai/deployments"): |
| ) | ||
|
|
||
| # endpoint should take precedence | ||
| assert "explicit-endpoint.openai.azure.com" in llm.endpoint |
Check failure
Code scanning / CodeQL
Incomplete URL substring sanitization High test
explicit-endpoint.openai.azure.com
Show autofix suggestion
Hide autofix suggestion
Copilot Autofix
AI 9 days ago
In general, to avoid incomplete URL substring sanitization, URLs should be parsed and validated based on structured components (hostname, scheme, path), or compared exactly against expected values, rather than using substring checks like in or endswith on the raw URL string.
For this specific case, the test test_azure_endpoint_takes_precedence_over_base_url currently asserts that the chosen endpoint contains "explicit-endpoint.openai.azure.com". Since the intention is to verify that the endpoint argument wins over base_url, the best fix is to assert equality against the exact endpoint string passed in: "https://explicit-endpoint.openai.azure.com". This keeps the behavior check intact while removing the substring pattern that CodeQL flags. No additional parsing is needed here because we know the full expected string.
Concretely, in lib/crewai/tests/llms/azure/test_azure.py, around line 1338, replace:
# endpoint should take precedence
assert "explicit-endpoint.openai.azure.com" in llm.endpointwith:
# endpoint should take precedence
assert llm.endpoint == "https://explicit-endpoint.openai.azure.com"No new imports or helper methods are required.
-
Copy modified line R1338
| @@ -1335,7 +1335,7 @@ | ||
| ) | ||
|
|
||
| # endpoint should take precedence | ||
| assert "explicit-endpoint.openai.azure.com" in llm.endpoint | ||
| assert llm.endpoint == "https://explicit-endpoint.openai.azure.com" | ||
|
|
||
|
|
||
| def test_azure_non_openai_endpoint_model_parameter_included(): |
|
Closing due to inactivity for more than 7 days. Configure here. |
Summary
Fixes #4260
This PR adds support for Azure endpoints that don't end with
openai.azure.com, such ascognitiveservices.azure.com,services.ai.azure.com, and other*.azure.comdomains (likecservices.azure.commentioned in the issue).The core issue was that the Azure provider only recognized endpoints containing
openai.azure.comand would auto-construct deployment paths for all Azure endpoints, which broke non-standard Azure endpoint formats.Changes:
_is_azure_endpoint()method to detect various Azure endpoint formatsis_azure_endpointflag to track if endpoint is any Azure endpointbase_urlparameter as alias forendpoint(consistency with other providers)_validate_and_fix_endpoint()to only auto-construct deployment path foropenai.azure.comendpoints, leaving other Azure endpoints unchangedReview & Testing Checklist for Human
_is_azure_endpoint()fallback logic - The method falls back to checking for.azure.comin the endpoint. Confirm this is not too broad and won't incorrectly match non-Azure endpoints.cservices.azure.com. If possible, test with a real Azure Cognitive Services or similar endpoint to verify the fix works end-to-end.base_urlparameter doesn't conflict with base class - TheBaseLLMclass also storesbase_url. Confirm the new handling inAzureCompletionworks correctly.Recommended test plan:
model="azure/gpt-4"andAZURE_ENDPOINTset to acognitiveservices.azure.comendpoint/openai/deployments/path added)is_azure_endpointisTrueandis_azure_openai_endpointisFalseNotes
is_azure_endpointflag is added for proper API handling but isn't actively used in this PR - it's available for future use or debuggingLink to Devin run: https://app.devin.ai/sessions/cf337a00c2114937a9f861e564969cea
Requested by: João