fix(claude): pick models from workspace's serving endpoints (GDS-aware) #27
Open
`setup_claude.py` hardcoded ANTHROPIC_DEFAULT_OPUS_MODEL=opus-4-7 regardless of what the workspace actually serves. On workspaces in geos that don't have opus-4-7 (e.g. AU's adb-7405613340366915.15, which serves only opus-4-6 / sonnet-4-6 / sonnet-4-5 / haiku-4-5), every opus-tier call fails with ENDPOINT_NOT_FOUND.

Adds `utils.discover_serving_endpoints()` to query the workspace's `/api/2.0/serving-endpoints` and return the READY model names. Workspace direct-serving endpoints reflect Databricks Geo Designated Services (GDS) policy, so using this list as the validation oracle gets GDS compliance for free, with no policy parsing needed.

`setup_claude.py` now picks each tier (opus / sonnet / haiku) by walking a priority chain against the discovered list. It falls back to the original env-set default if discovery fails (e.g. workspace unreachable at startup), so behaviour matches main when discovery isn't available. It logs the substitution when it happens.

Verified against live daveok (AU geo, no opus-4-7):

    Active model: databricks-claude-opus-4-6 (was opus-4-7)
    Opus tier:    databricks-claude-opus-4-6
    Sonnet tier:  databricks-claude-sonnet-4-6
    Haiku tier:   databricks-claude-haiku-4-5

setup_codex / setup_hermes / setup_gemini follow the same pattern; filed as follow-up so this PR stays single-agent surgical.

Co-authored-by: Isaac
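For reviewers who want the discovery step in concrete form, here is a minimal sketch of what a helper with the `discover_serving_endpoints(host, token, timeout=5.0)` signature might look like. It is not the code in this PR. It assumes the list response has the shape `{"endpoints": [{"name": ..., "state": {"ready": "READY"}}]}` and uses only the standard library; `ready_endpoint_names` is a hypothetical name introduced here so the parsing can be exercised without a network call.

```python
import json
import urllib.request


def ready_endpoint_names(payload: dict) -> set[str]:
    """Return the names of endpoints marked READY in a list response."""
    names = set()
    for endpoint in payload.get("endpoints") or []:
        state = endpoint.get("state") or {}
        # Assumed response shape: {"name": ..., "state": {"ready": "READY"}}
        if state.get("ready") == "READY" and endpoint.get("name"):
            names.add(endpoint["name"])
    return names


def discover_serving_endpoints(host: str, token: str, timeout: float = 5.0) -> set[str]:
    """Query the workspace for READY endpoint names; empty set on any failure."""
    try:
        url = host.rstrip("/") + "/api/2.0/serving-endpoints"
        request = urllib.request.Request(
            url, headers={"Authorization": f"Bearer {token}"}
        )
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return ready_endpoint_names(json.load(response))
    except (OSError, ValueError, AttributeError, TypeError):
        # Network errors, bad URLs, malformed JSON, or an unexpected payload
        # all collapse to "nothing discovered" so the caller can fall back.
        return set()
```

Returning an empty set rather than raising keeps the failure path identical to the "no match" path, which is what lets the caller keep its env-set default without special-casing errors.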
@datasciencemonkey — flagging for priority review. P0: every non-US-geo workspace breaks Claude Code's Opus tier without this. Helper is small ( |
Priority
P0: Claude Code's Opus tier is broken on every non-US-geo workspace. Workspaces in AU / EU / etc. serve `databricks-claude-opus-4-6` but not `opus-4-7`; `setup_claude.py` hardcodes the latter, so selecting Opus 404s. App-side workaround for the gateway-side issue tracked at #8.

Summary
Fix for #26. Setup scripts had hardcoded model names that don't survive Geo Designated Services restrictions. This PR teaches `setup_claude.py` to query the workspace's `/api/2.0/serving-endpoints`, treat that list as the GDS-respecting "what's actually served here" oracle, and pick a working model in each tier.

Changes
`utils.py` (+52, no logic changes elsewhere):
- `discover_serving_endpoints(host, token, timeout=5.0) -> set[str]`: returns READY endpoint names. Returns an empty set on any failure (preserves the caller's fallback behaviour).
- `pick_in_geo_model(preferred, available, fallback) -> str`: returns the first preferred entry in `available`, else `fallback`.

`setup_claude.py` (+38/-5):
- `ANTHROPIC_MODEL`, `ANTHROPIC_DEFAULT_OPUS_MODEL`, `ANTHROPIC_DEFAULT_SONNET_MODEL`, and `ANTHROPIC_DEFAULT_HAIKU_MODEL` are picked from the discovered list, walking a per-tier priority chain.

Test Evidence (verified on the live deployment 2026-05-06)
Smoke test against `daveok` (AU geo workspace):

Before: `/model` shows opus-4-7 → 404 on every Opus call.
After: opus-4-6 used as Opus default → works.

Test plan
Out of scope
`setup_codex.py`, `setup_hermes.py`, `setup_gemini.py`, `setup_opencode.py` need the same treatment. Will follow as separate single-agent PRs once this helper lands and the semantics are reviewed. Single-agent scope here keeps the helper introduction reviewable on its own.

Closes #26
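Since the follow-up PRs hinge on the helper's semantics, a minimal sketch of the selection rule may help the review. It follows the `pick_in_geo_model(preferred, available, fallback)` signature given above but is not the PR's code. The priority chain is an illustrative assumption, and the `databricks-claude-` prefix on names other than opus-4-6, sonnet-4-6, and haiku-4-5 is assumed by analogy.

```python
from collections.abc import Iterable


def pick_in_geo_model(preferred: Iterable[str], available: set[str], fallback: str) -> str:
    """Return the first preferred name that the workspace serves, else the fallback."""
    for name in preferred:
        if name in available:
            return name
    return fallback


# Illustrative priority chain, newest first (an assumption, not the PR's actual list).
OPUS_CHAIN = ["databricks-claude-opus-4-7", "databricks-claude-opus-4-6"]

# Endpoints reported for the AU workspace in this PR (prefix assumed where not stated).
au_available = {
    "databricks-claude-opus-4-6",
    "databricks-claude-sonnet-4-6",
    "databricks-claude-sonnet-4-5",
    "databricks-claude-haiku-4-5",
}

opus = pick_in_geo_model(OPUS_CHAIN, au_available, fallback=OPUS_CHAIN[0])
# A failed discovery yields an empty set, so the caller keeps its original default.
unreachable = pick_in_geo_model(OPUS_CHAIN, set(), fallback=OPUS_CHAIN[0])

print(opus)         # databricks-claude-opus-4-6
print(unreachable)  # databricks-claude-opus-4-7
```

Because the helper takes the chain and fallback as arguments, each other setup script could supply its own per-tier chain without changing the helper.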
This pull request and its description were written by Isaac.