fix(inbox): validate persisted model before one-click cloud tasks#2728
fix(inbox): validate persisted model before one-click cloud tasks#2728andrewm4894 wants to merge 4 commits into
Conversation
One-click cloud-task flows (scout chat, Responder setup, home quick actions, Discuss/Create-PR) used the persisted `lastUsedModel` directly without checking it was still offered by the gateway. A stale id — e.g. a model later de-listed for the org — would be sent verbatim and the run failed with a gateway 403. The TaskInput model picker already guards this and silently falls back to the server default, which is why the picker showed Opus while the scout still tried the de-listed model. Centralise the guard in the model resolver: the persisted/pinned model is now passed as a *preference* and only honoured when it appears in the gateway's current options, otherwise the server default is used. Generated-By: PostHog Code Task-Id: f29142f6-4183-48a6-9046-8cba7f8181de
|
React Doctor found no issues in the changed files. 🎉 Reviewed by React Doctor for commit |
Prompt To Fix All With AIFix the following 3 code review issues. Work through them one at a time, proposing concise fixes.
---
### Issue 1 of 3
packages/ui/src/features/home/hooks/useRunWorkstreamAction.ts:107-115
**Transient resolver failure now hard-blocks runs that previously succeeded**
Before this PR, `resolveDefaultModel` was only called when `!model`, so a transient network error on `getPreviewConfigOptions` only blocked users with no persisted model — users with a `lastUsedModel` were unaffected. Now the resolver is always called when `cloudRegion` is set, but both `desktop-services.ts` and `resolveDefaultModel.ts` swallow all exceptions and return `undefined`. A single `getPreviewConfigOptions` timeout therefore turns `model = undefined`, surfaces "Couldn't start task", and forces a fallback to the task-input screen — even when `preferredModel` was a currently-valid id.
The same regression exists in `useInboxCloudTaskRunner.ts` via the `resolveDefaultModel` wrapper. Consider `model = resolvedModel ?? preferredModel` at both call sites so a gateway outage degrades gracefully (same risk profile as the old code) rather than failing hard.
### Issue 2 of 3
packages/core/src/inbox/reportTaskCreation.ts:39-43
**Shallow option lookup misses nested model groups**
`PreviewConfigChoice` is declared to support recursive nesting (`options?: PreviewConfigChoice[]`), and the comment in the interface says "either flat or wrapped in a labelled group." However the lookup `modelOption.options?.some((o) => o.value === preferredModel)` only inspects the top level. For a grouped structure like `[{ options: [{ value: "claude-opus-4-8" }] }]`, `o.value` is `undefined` for the outer entry and the preferred model would not be found, causing an unnecessary fallback to the server default. If the gateway can return grouped options, this check should recurse or flatten first.
### Issue 3 of 3
packages/core/src/inbox/reportTaskCreation.test.ts:20-55
**Tests could be parameterised**
The five `it(...)` cases for `selectModelFromOptions` all follow the pattern `(options, preferredModel?) → expectedResult` and differ only in input/output values. Per the team's preference, these should use `it.each` (or equivalent) so a new model-availability variant only adds a table row rather than a full `it` block.
Reviews (1): Last reviewed commit: "fix(inbox): validate persisted model bef..." | Re-trigger Greptile |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: f0a2e5417b
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
…e-model-cloud-tasks
…ouped options Address bot review feedback on one-click cloud task model resolution: - Fall back to the preferred/persisted model when the resolver returns undefined (transient gateway failure) instead of hard-blocking the run, at both call sites (useRunWorkstreamAction, useInboxCloudTaskRunner). - Flatten nested/grouped preview-config options before validating the preferred model, mirroring the TaskInput picker's flattenConfigValues. - Drop the persisted reasoning effort when the resolver swaps in a fallback model, since the tier may be unsupported for the new model and the cloud runtime rejects the pair. - Fix the useDiscussReport test DI mock to return a real model resolver (previously every run silently blocked as missingModel). - Parameterise selectModelFromOptions tests and cover grouped options.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: b1b6beeef0
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
…vice Apply the same degrade-gracefully treatment to the two remaining call sites Codex flagged on re-review: - ConfigureAgentsSection setup flow: fall back to the persisted model on transient resolver failure, and drop the persisted effort when the resolver swaps in a fallback model (unsupported tier would be rejected by the cloud runtime). - SignalReportTaskService: fall back to the explicit modelOverride when the resolver returns undefined, so a valid override-driven run isn't blocked by a gateway outage. Added a covering test.
Problem
Creating a scout (and other one-click cloud-task flows) intermittently failed with a gateway
403 — <model> not available, even though the model picker showed Opus selected. The one-click flows read the persistedlastUsedModeldirectly and sent it verbatim, with no check that the gateway still offers it. If that persisted id was a model later de-listed for the org (e.g.claude-fable-5), the run 403'd. The TaskInput picker already guards this — it falls back to the server default when the saved model isn't in the available options — which is exactly why the picker displayed Opus while the scout still tried the de-listed model.Changes
useInboxCloudTaskRunner), Responder setup (ConfigureAgentsSection), home quick actions (useRunWorkstreamAction), and Discuss/Create-PR (SignalReportTaskService).flattenConfigValues— a model wrapped in a labelled group is no longer treated as unavailable.undefined, so each call site now falls back to the preferred id (resolvedModel ?? preferredModel/?? lastUsedModel/?? modelOverride). A gateway hiccup keeps the old risk profile (a stale id may still 403) rather than failing every run whose persisted model was valid.packages/agent/src/server/bin.ts; we let the runtime pick its default instead.How did you test this?
selectModelFromOptions(parameterised: preferred kept when offered, falls back when de-listed, ignores empty/null, recognises a model nested in a group).SignalReportTaskServicetests cover the override forwarded as a preference and the transient-failure fallback to the explicit override.useDiscussReportDI mock that was silently failing the valid-create case (the resolver service was mocked as{ createTask }, so every run exited asmissing-model).pnpm --filter @posthog/coreand--filter @posthog/uivitest: all green.pnpm typecheck(core, ui, apps/code) and Biome lint on changed files: clean.Automatic notifications
Created with PostHog Code from a Slack thread