Fix Codex GPT model selection#120
Conversation
rohita5l
left a comment
There was a problem hiding this comment.
Good catch on the sort bug.
This adds a second GPT-version parser. _openai_model_id and the new _gpt_version_key are both pulling major/minor/patch out of a model id, but they do it differently (one's hyphen-only on the raw string, the other splits on / and takes . or -). They'll drift apart eventually. I'd pull the parsing into one helper and build both the OpenAI id and the sort key off it:
_GPT_RE = re.compile(r"(?:databricks-)?gpt-(\d+)(?:[.-](\d+))?(?:[.-](\d+))?(-.+|[a-z].*)?")
def _parse_gpt(model: str | None):
"""(major, minor, patch, suffix) from a Databricks- or OpenAI-style GPT id.
minor/patch are None when absent; None for non-GPT ids."""
if not model:
return None
m = _GPT_RE.fullmatch(model.split("/")[-1])
if not m:
return None
major, minor, patch, suffix = m.groups()
return (
int(major),
int(minor) if minor is not None else None,
int(patch) if patch is not None else None,
suffix or "",
)
def _openai_model_id(model: str | None) -> str | None:
parsed = _parse_gpt(model)
if parsed is None:
return model
major, minor, patch, suffix = parsed
version = str(major)
if minor is not None:
version += f".{minor}"
if patch is not None:
version += f".{patch}"
return f"gpt-{version}{suffix}"
def default_model(state: dict) -> str | None:
codex_models = state.get("codex_models") or []
if not codex_models:
return None
def _key(mid: str):
parsed = _parse_gpt(mid)
if parsed is None:
return (0, 0, 0, 0)
major, minor, patch, suffix = parsed
return (major, minor or 0, patch or 0, 0 if suffix else 1)
return max(codex_models, key=_key)
Couple of other things while I'm here:
The try/except ValueError around max can't actually fire, you've already returned early on an empty list, and _key catches its own errors, so there's nothing left to raise. Safe to drop.
I think gpt-4o-style ids slip through. The suffix in the current regex needs a leading hyphen, so databricks-gpt-4o doesn't match and gets returned as-is, prefix and all. Codex would choke on that. The [a-z].* branch covers it. Might be worth a test since everything here is gpt-5 family.
Also default_model strips mid.split("/")[-1] but _openai_model_id doesn't, so a namespaced id would map in one place and not the other. Going through _parse_gpt evens that out, assuming ids can show up with a / at all ?
Summary
databricks-gpt-5-5to OpenAI-style model IDs likegpt-5.5in Codex config metadata so we don't get the warning signsbefore:

after:

Tests
uv run pytest tests/test_agent_codex.py