fix(ui): recover legacy sessions with export fallback #444
Some historical OpenCode sessions still appear in session.list but fail to open in CodeNomad because the strict v2 session.messages response validation rejects transcripts that are missing assistant metadata such as info.agent. This left the session visible in the UI while the transcript pane stayed empty. When that specific schema-validation failure happens, CodeNomad now falls back to an opencode export for the affected session and hydrates the transcript from the CLI-compatible export payload instead of abandoning the load. The fallback is scoped to malformed legacy transcripts so normal session loads still use the standard API path. Validation: npm run typecheck --workspace @codenomad/ui; npm run build --workspace @codenomad/tauri-app. Note: npm run typecheck --workspace @neuralnomads/codenomad currently fails on this origin/dev base because of unrelated server dependency/type drift.
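A minimal sketch of the fallback flow described above, in TypeScript. The names `loadTranscript`, `TranscriptSources`, `fetchMessages`, `exportSession`, and `isLegacyError` are hypothetical and are not taken from this PR's code; the real implementation may be structured differently. The point is only that the export path runs when, and only when, the normal request fails with the known legacy error.

```typescript
// Hypothetical message shape; the real transcript type is richer.
type Message = { id: string };

// Hypothetical dependencies, injected so the flow can be exercised in isolation.
interface TranscriptSources {
  // Normal path: the session.messages request.
  fetchMessages(sessionId: string): Promise<Message[]>;
  // Fallback path: transcript obtained from `opencode export <sessionId>`.
  exportSession(sessionId: string): Promise<Message[]>;
  // True only for the known legacy validation failure.
  isLegacyError(error: unknown): boolean;
}

async function loadTranscript(
  sessionId: string,
  sources: TranscriptSources,
): Promise<Message[]> {
  try {
    // Normal session loads never touch the export path.
    return await sources.fetchMessages(sessionId);
  } catch (error) {
    // Anything other than the known legacy failure surfaces unchanged.
    if (!sources.isLegacyError(error)) throw error;
    return sources.exportSession(sessionId);
  }
}
```

Keeping the predicate separate from the flow means the scope of the fallback can be tightened without changing how transcripts are loaded.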
|
Hey @pascalandr
Interesting. Is it failing on the OpenCode end or the CodeNomad end?
|
Only in CodeNomad (any version on Windows); OpenCode has no problem with the malformed metadata.
|
If I understand correctly, this tries to export and import the old style session. |
Right, still working on it, but there's no import, only export, because this is the only working way:
The initial legacy-session recovery path worked, but it was still too broad and too heavy for very large transcripts. Tighten the fallback to the exact missing-agent validation signature, stream export output through a temp file instead of buffering the full payload in memory, and abort stalled exports so the recovery path cannot hang indefinitely. Also add focused coverage for the legacy error matcher so unrelated schema regressions keep surfacing normally instead of being hidden behind the fallback.
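A minimal sketch of what a matcher tightened to the missing-agent signature might look like. The error shape (`status` plus `message`), the function name `isLegacyMissingAgentError`, and the sample message strings used with it are assumptions for illustration; only the `400` status and the `["info"]["agent"]` path come from this thread.

```typescript
// Assumed error shape; the actual SDK error object may differ.
interface ApiErrorLike {
  status?: number;
  message?: string;
}

// Path of the field that legacy transcripts are missing.
const LEGACY_AGENT_PATH = '["info"]["agent"]';

function isLegacyMissingAgentError(error: unknown): boolean {
  if (typeof error !== "object" || error === null) return false;
  const { status, message } = error as ApiErrorLike;
  // Require both the 400 status and the exact field path, so other
  // validation failures are not mistaken for the legacy case.
  return (
    status === 400 &&
    typeof message === "string" &&
    message.includes(LEGACY_AGENT_PATH)
  );
}
```

Because the check requires the exact path, a validation error on any other field returns `false` and keeps surfacing as a normal failure.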
…ession-load-clean
|
I want you to perform the gatekeeper checks for the latest PR state and check for regressions.
|
Gatekeeper review for latest PR state. I understood the request as: review PR #444 ruthlessly for regressions and implementation quality, especially around the legacy
Blocking finding
Non-blocking but should fix before merge if touching this area
Implementation direction
The export fallback itself is reasonable as a narrow workaround if
Validation attempted
Those validation failures look environmental/base-dependency related rather than conclusively caused by this PR, but they mean I could not prove the branch clean locally.
Verdict
Request changes. The fallback is too close to merge to ignore the worktree-context regression; it risks fixing root legacy sessions while leaving worktree sessions broken or incorrectly hydrated.
The legacy-session export fallback was correctly recovering malformed root transcripts, but it could still read the wrong session data for worktree-backed sessions because the recovery export always ran from the workspace root. Make the fallback follow the same worktree context as the normal transcript request so legacy worktree sessions resolve against the correct directory. Tighten the recovery path further by failing hard when the export payload does not contain a transcript array instead of silently presenting an empty session, and keep the matcher/tests focused on the reproduced missing-agent validation failure. Validation: npx tsx --test packages/ui/src/stores/session-message-fallback.test.ts; npm run typecheck --workspace @codenomad/ui; npm run build --workspace @codenomad/tauri-app
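A minimal sketch of the "fail hard" check on the export payload. The function name `extractTranscript` and the `messages` field name are assumptions; the thread only says the payload must contain a transcript array and does not name the field.

```typescript
// Returns the transcript array from an export payload, or throws.
// The field name `messages` is an assumption made for this sketch.
function extractTranscript(payload: unknown): unknown[] {
  if (typeof payload !== "object" || payload === null) {
    throw new Error("export payload is not an object");
  }
  const messages = (payload as { messages?: unknown }).messages;
  if (!Array.isArray(messages)) {
    // Throwing here keeps a malformed export from being shown as an
    // empty session.
    throw new Error("export payload does not contain a transcript array");
  }
  return messages;
}
```

An export that legitimately contains zero messages still passes, since an empty array is a valid transcript; only a missing or non-array field is rejected.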
|
PR builds are available as GitHub Actions artifacts: https://github.com/NeuralNomadsAI/CodeNomad/actions/runs/25870026173 Artifacts expire in 7 days.
|
|
@CodeNomadBot |
|
Gatekeeper review for the updated PR state. I understood “same review” as: re-run the ruthless PR gatekeeper pass after the latest updates, focused on regressions and implementation quality. Assumption: review-only; I did not change code.
Findings
No blocking findings in the latest state. The previous blocking issue I raised is addressed: the fallback now carries
Non-blocking quality notes
Implementation assessment
The current approach is now appropriately narrow:
I do not see a better lower-risk implementation unless OpenCode exposes a tolerant/raw transcript endpoint. Given the stated failure occurs before CodeNomad receives message data, client-side hydration compatibility alone would not solve this class of legacy sessions.
Validation attempted
Verdict
No blocking regressions found in the updated PR. I would still tighten the pre-aborted signal handling and temp filename before merge if you want the fallback hardening to be fully boring, but I would not block on either unless the project treats local route hardening as mandatory for this release.
|
hey @pascalandr
|
|
To clarify: OpenCode itself still reads these sessions fine; the incompatibility is in the SDK used by CodeNomad, where |
|
You may be highlighting some deeper issue here; OpenCode Desktop uses the same SDK that CodeNomad does. a. Either doing something incorrect I think instead of trying a band-aid fix, we should investigate why it fails in CodeNomad. Would you be able to
|
|
Oh, I never spoke about OpenCode Desktop, did you?
|
TUI also uses the same SDK, IIRC |
|
Well, as I said already, these sessions open with zero issues or messages in the OpenCode CLI and don't open at all in CodeNomad. That's all I can say; the agent found a solution by testing against the SDK. #444 (comment)
|
The CLI / TUI also work via the SDK, so if they can open the session, there is definitely some issue in CodeNomad. Would you be able to send me one of your problematic sessions by exporting it?
|
Once the session is exported, it isn't broken anymore in CodeNomad. So I agree we don't understand well what's happening here. I'm working on another solution: repairing the sessions instead of supporting them.
|
We can understand the issue better if you try to load the session in OpenCode Desktop and keep the dev tools open to capture the request / response. Once we have that, we can see how the request / response CodeNomad makes differs. A curl request can also be used to query the session messages to understand what's going on.
|
OK, so I'm going to install OpenCode Desktop.
|
Error: Missing key
|
And still no issue in the TUI, so I guess the SDK is not used the same way in each of them after all.
|
Superseded by #450
Summary
- … session.list but fail to open in CodeNomad because the transcript endpoint rejects their historical message payload
- … opencode export only for that known legacy validation failure
- … session.messages flow and harden the fallback so it does not become a new source of hangs or memory spikes
Problem
The broken sessions are not missing from OpenCode entirely.
They still show up in session.list, so CodeNomad can discover them and render them in the session sidebar. The failure happens later, when CodeNomad tries to hydrate the transcript by calling the OpenCode v2 session.messages endpoint.
For some historical sessions, at least one assistant message does not satisfy the current v2 response schema anymore. The concrete legacy shape we reproduced is a missing info.agent field inside a transcript message. When that happens, the OpenCode endpoint rejects the entire session.messages response during validation and returns a 400 body-validation error instead of returning partial transcript data.
That means CodeNomad never receives a message list to hydrate at all. This is why the session can still be visible in the UI while opening it fails.
The error we reproduced on the affected sessions is:
Why This Is Not Just A UI Hydration Fix
A UI-only compatibility layer would only help if CodeNomad received the old transcript payload and merely failed to render or normalize it.
That is not what happens here.
The blocking failure occurs inside the OpenCode session.messages API path itself, before CodeNomad can hydrate anything. Since the endpoint returns a validation error instead of message data, CodeNomad cannot repair the payload in memory because it never gets access to the transcript through the normal API route.
That is why this PR uses opencode export as a read fallback for a very specific legacy failure mode rather than trying to reinterpret session.messages client-side.
Recovery Strategy
The new behavior is:
- … session.messages request path.
- … info.agent, CodeNomad falls back to opencode export <sessionId>.
Important scope limits:
Hardening In This PR
Because the affected transcripts can be very large, the fallback path is also hardened so the recovery mechanism does not introduce a second failure mode.
Specifically:
- … ["info"]["agent"]), so unrelated schema regressions still surface normally
Files Changed
- packages/ui/src/stores/session-message-source.ts
- packages/ui/src/stores/session-message-fallback.ts
- packages/server/src/workspaces/manager.ts
- packages/server/src/server/routes/workspaces.ts
- packages/ui/src/stores/session-api.ts
- packages/ui/src/stores/session-state.ts
- packages/ui/src/lib/api-client.ts
Validation
- npm run typecheck --workspace @codenomad/ui
- npx tsx --test packages/ui/src/stores/session-message-fallback.test.ts
- npm run build --workspace @codenomad/tauri-app
Notes
- npm run typecheck --workspace @neuralnomads/codenomad currently fails on the moving dev base because of unrelated server dependency/type drift, not because of this change
- … packages/server/src/workspaces/manager.ts … packages/ui/src/stores/session-api.ts … packages/ui/src/stores/session-state.ts