Skip to content

fix(inbox): validate persisted model before one-click cloud tasks#2728

Open
andrewm4894 wants to merge 4 commits into
mainfrom
posthog-code/fix-stale-model-cloud-tasks
Open

fix(inbox): validate persisted model before one-click cloud tasks#2728
andrewm4894 wants to merge 4 commits into
mainfrom
posthog-code/fix-stale-model-cloud-tasks

Conversation

@andrewm4894

@andrewm4894 andrewm4894 commented Jun 17, 2026

Copy link
Copy Markdown
Member

Problem

Creating a scout (and other one-click cloud-task flows) intermittently failed with a gateway 403 — <model> not available, even though the model picker showed Opus selected. The one-click flows read the persisted lastUsedModel directly and sent it verbatim, with no check that the gateway still offers it. If that persisted id was a model later de-listed for the org (e.g. claude-fable-5), the run 403'd. The TaskInput picker already guards this — it falls back to the server default when the saved model isn't in the available options — which is exactly why the picker displayed Opus while the scout still tried the de-listed model.

Changes

  • The model resolver now takes the persisted/pinned model as a preference: it's honoured only when it appears in the gateway's current options, otherwise the adapter's server default is used. This mirrors the picker's existing guard.
  • Routed all one-click flows through that guard: scout chat (useInboxCloudTaskRunner), Responder setup (ConfigureAgentsSection), home quick actions (useRunWorkstreamAction), and Discuss/Create-PR (SignalReportTaskService).
  • The validity check flattens nested/grouped model options before comparing, matching the picker's flattenConfigValues — a model wrapped in a labelled group is no longer treated as unavailable.
  • Resolver failures degrade gracefully instead of hard-blocking: the resolver swallows transient gateway errors and returns undefined, so each call site now falls back to the preferred id (resolvedModel ?? preferredModel / ?? lastUsedModel / ?? modelOverride). A gateway hiccup keeps the old risk profile (a stale id may still 403) rather than failing every run whose persisted model was valid.
  • Persisted reasoning effort is dropped when the resolver swaps in a fallback model (the model-bearing UI flows). The old effort tier may be unsupported for the new model, which the cloud runtime rejects in packages/agent/src/server/bin.ts; we let the runtime pick its default instead.

How did you test this?

  • Unit tests for the guard in selectModelFromOptions (parameterised: preferred kept when offered, falls back when de-listed, ignores empty/null, recognises a model nested in a group).
  • SignalReportTaskService tests cover the override forwarded as a preference and the transient-failure fallback to the explicit override.
  • Fixed the useDiscussReport DI mock that was silently failing the valid-create case (the resolver service was mocked as { createTask }, so every run exited as missing-model).
  • pnpm --filter @posthog/core and --filter @posthog/ui vitest: all green.
  • pnpm typecheck (core, ui, apps/code) and Biome lint on changed files: clean.

Automatic notifications

  • Publish to changelog?
  • Alert Sales and Marketing teams?

Created with PostHog Code from a Slack thread

One-click cloud-task flows (scout chat, Responder setup, home quick
actions, Discuss/Create-PR) used the persisted `lastUsedModel` directly
without checking it was still offered by the gateway. A stale id — e.g. a
model later de-listed for the org — would be sent verbatim and the run
failed with a gateway 403. The TaskInput model picker already guards this
and silently falls back to the server default, which is why the picker
showed Opus while the scout still tried the de-listed model.

Centralise the guard in the model resolver: the persisted/pinned model is
now passed as a *preference* and only honoured when it appears in the
gateway's current options, otherwise the server default is used.

Generated-By: PostHog Code
Task-Id: f29142f6-4183-48a6-9046-8cba7f8181de
@github-actions

github-actions Bot commented Jun 17, 2026

Copy link
Copy Markdown

React Doctor found no issues in the changed files. 🎉

Reviewed by React Doctor for commit a422ac6.

@andrewm4894 andrewm4894 marked this pull request as ready for review June 17, 2026 09:10
@andrewm4894 andrewm4894 self-assigned this Jun 17, 2026
@greptile-apps

greptile-apps Bot commented Jun 17, 2026

Copy link
Copy Markdown
Contributor
Prompt To Fix All With AI
Fix the following 3 code review issues. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 3
packages/ui/src/features/home/hooks/useRunWorkstreamAction.ts:107-115
**Transient resolver failure now hard-blocks runs that previously succeeded**

Before this PR, `resolveDefaultModel` was only called when `!model`, so a transient network error on `getPreviewConfigOptions` only blocked users with no persisted model — users with a `lastUsedModel` were unaffected. Now the resolver is always called when `cloudRegion` is set, but both `desktop-services.ts` and `resolveDefaultModel.ts` swallow all exceptions and return `undefined`. A single `getPreviewConfigOptions` timeout therefore turns `model = undefined`, surfaces "Couldn't start task", and forces a fallback to the task-input screen — even when `preferredModel` was a currently-valid id.

The same regression exists in `useInboxCloudTaskRunner.ts` via the `resolveDefaultModel` wrapper. Consider `model = resolvedModel ?? preferredModel` at both call sites so a gateway outage degrades gracefully (same risk profile as the old code) rather than failing hard.

### Issue 2 of 3
packages/core/src/inbox/reportTaskCreation.ts:39-43
**Shallow option lookup misses nested model groups**

`PreviewConfigChoice` is declared to support recursive nesting (`options?: PreviewConfigChoice[]`), and the comment in the interface says "either flat or wrapped in a labelled group." However the lookup `modelOption.options?.some((o) => o.value === preferredModel)` only inspects the top level. For a grouped structure like `[{ options: [{ value: "claude-opus-4-8" }] }]`, `o.value` is `undefined` for the outer entry and the preferred model would not be found, causing an unnecessary fallback to the server default. If the gateway can return grouped options, this check should recurse or flatten first.

### Issue 3 of 3
packages/core/src/inbox/reportTaskCreation.test.ts:20-55
**Tests could be parameterised**

The five `it(...)` cases for `selectModelFromOptions` all follow the pattern `(options, preferredModel?) → expectedResult` and differ only in input/output values. Per the team's preference, these should use `it.each` (or equivalent) so a new model-availability variant only adds a table row rather than a full `it` block.

Reviews (1): Last reviewed commit: "fix(inbox): validate persisted model bef..." | Re-trigger Greptile

Comment thread packages/ui/src/features/home/hooks/useRunWorkstreamAction.ts
Comment thread packages/core/src/inbox/reportTaskCreation.ts
Comment thread packages/core/src/inbox/reportTaskCreation.test.ts

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: f0a2e5417b

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread packages/core/src/inbox/reportTaskCreation.ts Outdated
Comment thread packages/ui/src/features/inbox/hooks/useInboxCloudTaskRunner.ts
Comment thread packages/ui/src/features/inbox/hooks/useInboxCloudTaskRunner.ts Outdated
…ouped options

Address bot review feedback on one-click cloud task model resolution:
- Fall back to the preferred/persisted model when the resolver returns
  undefined (transient gateway failure) instead of hard-blocking the run,
  at both call sites (useRunWorkstreamAction, useInboxCloudTaskRunner).
- Flatten nested/grouped preview-config options before validating the
  preferred model, mirroring the TaskInput picker's flattenConfigValues.
- Drop the persisted reasoning effort when the resolver swaps in a
  fallback model, since the tier may be unsupported for the new model
  and the cloud runtime rejects the pair.
- Fix the useDiscussReport test DI mock to return a real model resolver
  (previously every run silently blocked as missingModel).
- Parameterise selectModelFromOptions tests and cover grouped options.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: b1b6beeef0

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread packages/ui/src/features/inbox/components/ConfigureAgentsSection.tsx Outdated
Comment thread packages/ui/src/features/inbox/components/ConfigureAgentsSection.tsx Outdated
Comment thread packages/core/src/inbox/signalReportTaskService.ts Outdated
…vice

Apply the same degrade-gracefully treatment to the two remaining call
sites Codex flagged on re-review:
- ConfigureAgentsSection setup flow: fall back to the persisted model on
  transient resolver failure, and drop the persisted effort when the
  resolver swaps in a fallback model (unsupported tier would be rejected
  by the cloud runtime).
- SignalReportTaskService: fall back to the explicit modelOverride when
  the resolver returns undefined, so a valid override-driven run isn't
  blocked by a gateway outage. Added a covering test.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant