[No QA] Fix lost/duplicated requests in PersistedRequests (Issues 2, 3, 4, 5)#84622
Conversation
| Log.info('[PersistedRequests] hit Onyx connect callback', false, {isValNullish: val == null}); | ||
| Log.info('[PersistedRequests] hit Onyx connect callback', false, {isValNullish: val == null, isInitialized}); | ||
|
|
||
| if (isInitialized) { |
There was a problem hiding this comment.
this is interesting, makes sense to me tbh
…szymonzalarski/investigate-and-fix-sequential-queue-issues-issue-4
3a608e0 to
8cfba53
Compare
…szymonzalarski/investigate-and-fix-sequential-queue-issues-issue-4
…e-and-fix-sequential-queue-issues-issue-4
…e-and-fix-sequential-queue-issues-issue-4
… make in-memory state authoritative after init
|
@codex review |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 6b867738cf
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
|
@ChavdaSachin Please copy/paste the Reviewer Checklist from here into a new comment on this PR and complete it. If you have the K2 extension, you can simply click: [this button] |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 6b867738cf
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| if (isInitialized) { | ||
| // After initialization, in-memory is authoritative. | ||
| // Ignore disk callbacks to prevent out-of-order Onyx.set() from | ||
| // overwriting the correct in-memory state (Bug #80759 Issue 4). | ||
| return; |
There was a problem hiding this comment.
Continue syncing queue updates after initialization
Returning early once isInitialized is true makes this connection ignore every later PERSISTED_REQUESTS update, so leader tabs stop seeing writes enqueued by other tabs. Fresh evidence in this commit is that request lifecycle paths now persist queue changes via Onyx.multiSet, while queue execution is still leader-only in src/libs/Network/SequentialQueue.ts (isClientTheLeader() gate), which means non-leader writes can remain unprocessed until reload or leader change because the leader’s in-memory persistedRequests is never updated.
Useful? React with 👍 / 👎.
|
@ChavdaSachin yes please, can you also do the testing? |
…e-and-fix-sequential-queue-issues-issue-4
|
🚧 @mjasikowski has triggered a test Expensify/App build. You can view the workflow run here. |
|
🧪🧪 Use the links below to test this adhoc build on Android, iOS, and Web. Happy testing! 🧪🧪
|
|
✋ This PR was not deployed to staging yet because QA is ongoing. It will be automatically deployed to staging after the next production release. |
|
🚀 Deployed to staging by https://github.com/mjasikowski in version: 9.3.40-0 🚀
|
|
🚀 Deployed to staging by https://github.com/mjasikowski in version: 9.3.40-0 🚀
|
The Promise.resolve().then(() => flush(false)) workaround was needed because stale Onyx.set() callbacks could overwrite in-memory persistedRequests during process() (Issues 3c/4 in Expensify#80759). PR Expensify#84622 fixed this at the source: PersistedRequests now treats in-memory state as authoritative after initialization, making the deferral unnecessary. Call flush(false) directly. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
Deploy Blocker #85669 was identified to be related to this PR. |
|
Deploy Blocker #85673 was identified to be related to this PR. |
|
Deploy Blocker #85671 was identified to be related to this PR |
|
@jponikarchuk @lanitochka17 on it |
|
These 3 deploy blockers are all about file upload/attachment failures, not about the sequential queue persistence pipeline. The root cause is PR #70740 ("refactor: Improve and simplify attachment and file validation"), merged today, which modifies Issue #85671 directly links PR #70740 in its "regression testing" field. Our PR #84622 only touches |
|
@szymonzalarski98 QA found these regressions using the binary search method. This is the first PR based on the chronological order of being merged that has this regression |
|
so either there is some issue with the builds cc @LukasMod or order, or not sure what is wrong here 😄 |
|
Deploy Blocker ##85708 was identified to be related to this PR. |
|
I reverted the current PR locally and could confirm #85708 is caused by the current PR, haven't tested the other blockers but they are likely to be related to current PR as well based on the similarity(file upload related issues). My copilot came up with the changes which fixed the issue #85708, but I haven't reviewed the code (gotta go offline now 💤). |
|
@ChavdaSachin thank you for letting me know, I'm working on a fix for those issues |
|
Deploy Blocker #85755 was identified to be related to this PR. |
|
Hey, here is a PR with fix #85920 |
Explanation of Change
Sequential Queue Issues — Status Report
Fixed: Issues 2, 3, 4, 5
Issue 3: processNextRequest() and rollbackOngoingRequest() don't persist to disk
File:
src/libs/actions/PersistedRequests.tsProblem: When a request moves from the queue to
ongoingRequest, neither the updated queue nor the ongoing request is written to disk. If the app crashes mid-flight, the ongoing request is lost (3a) and the queue on disk is stale (3c).Fix: Both
processNextRequest()androllbackOngoingRequest()now useOnyx.multiSet()to atomically persist the updated queue and the ongoing request to disk.Issue 4: Connect callback overwrites in-memory state with stale disk data
File:
src/libs/actions/PersistedRequests.tsProblem: The
connectWithoutViewcallback blindly overwrites the in-memorypersistedRequestsarray with whatever Onyx returns from disk. When multipleOnyx.set()calls resolve out of order, the last callback wins — potentially reverting the in-memory state to a stale snapshot and permanently losing requests.Fix: After initialization, the callback is a no-op for non-null values:
if (isInitialized && val != null) { return; }. In-memory state is authoritative.Onyx.clear()(which sendsnull) is still allowed through so the module resets properly for tests and app restarts.Issue 5: Conflict resolver sees stale queue
Fixed automatically by Issue 4. Since in-memory state is no longer overwritten by stale disk callbacks, the conflict resolver in
prepareRequest()andpush()always sees the complete, up-to-date queue.Issue 2: Nothing in the write pipeline is awaited
Files:
src/libs/actions/PersistedRequests.ts,src/libs/Network/SequentialQueue.ts,src/libs/API/index.tsProblem:
API.write()returnedPromise.resolve()immediately.PersistedRequests.save()was void (didn't return theOnyx.set()promise). The caller had zero guarantee that the request was persisted to disk.Fix:
save()now returnsPromise<void>(theOnyx.set()promise)push()returns apersistencePromisefromsave()(but callsflush()synchronously to preserve write-before-read ordering)processRequest()is nowasyncand awaitspushToSequentialQueue()await API.write()now guarantees the request is persisted to diskNot Fixed: Issue 1 — Optimistic data is applied before the request is persisted
The Problem
In
API.write(),prepareRequest()appliesOnyx.update(optimisticData)BEFORE the request is persisted to disk. If the app crashes after the optimistic UI update but beforeOnyx.set(PERSISTED_REQUESTS)completes, the user sees stale optimistic UI with no backing request to reconcile it.Why We Can't Fix It Now
The fix requires moving
Onyx.update(optimisticData)to AFTERpush()inwrite(). However, conflict resolvers have Onyx side effects insidecheckAndFixConflictingRequest.Specifically,
resolveCommentDeletionConflictsinsrc/libs/actions/RequestConflictUtils.ts:151callsOnyx.update(rollbackData)— which sets a report action tonull— inside the conflict resolver callback.Current flow (works):
prepareRequest()→ conflict resolver →Onyx.update(rollback)— sets action tonullprepareRequest()→Onyx.update(optimisticData)— setspendingAction: "delete"push()→ conflict resolver again →Onyx.update(rollback)Steps 1+2 are in the same synchronous tick → Onyx batches them together →
nullwins → action is deleted. Tests pass.Flow after moving optimistic data after push (breaks):
prepareRequest()→ conflict resolver →Onyx.update(rollback)— sets action tonullpush()→save()→Onyx.set(PERSISTED_REQUESTS)push()→ conflict resolver again →Onyx.update(rollback)write()→Onyx.update(optimisticData)— setspendingAction: "delete"Steps 1 and 4 are now in different Onyx batches (separated by push operations). The
nullfrom step 1 is consumed in batch A. The optimistic data in step 4 creates the action anew with{pendingAction: "delete"}instead of it staying deleted. 4 ReportTest tests fail.Plan for Future Fix
The root cause is that
resolveCommentDeletionConflicts(and potentially other conflict resolvers) callOnyx.update()as a side effect inside the callback. This couples Onyx state mutations to the conflict resolution logic, making it impossible to reorder the pipeline.Required refactoring:
ConflictActionDatatype to include an optionalrollbackDatafieldOnyx.update()directlywrite()apply rollback data and optimistic data together in a controlled order, after persistThis changes the conflict resolver interface and affects all resolvers — it should be done in a separate PR.
Fixed Issues
$ #80759
PROPOSAL:
Tests
npx jest tests/unit/PersistedRequests.ts --no-coverage— all pass, including:Issue 3a: verifiesPERSISTED_ONGOING_REQUESTSis written to disk afterprocessNextRequest()(viaOnyxUtils.get())Issue 3c: verifiesPERSISTED_REQUESTSon disk matches in-memory afterprocessNextRequest()Issue 4: verifies rapid concurrent saves don't lose requests due to out-of-order Onyx.set() resolutionnpx jest tests/unit/APITest.ts --no-coverage— all pass, including:Issue 5: verifies conflict resolver sees complete queue state after out-of-order writesnpx jest tests/unit/MiddlewareTest.ts tests/unit/SequentialQueueTest.ts --no-coverage— all pass (no regressions)Offline tests
QA Steps
PR Author Checklist
### Fixed Issuessection aboveTestssectionOffline stepssectionQA stepssectiontoggleReportand notonIconClick)src/languages/*files and using the translation methodSTYLE.md) were followedAvatar, I verified the components usingAvatarare working as expected)StyleUtils.getBackgroundAndBorderStyle(theme.componentBG))npm run compress-svg)Avataris modified, I verified thatAvataris working as expected in all cases)Designlabel and/or tagged@Expensify/designso the design team can review the changes.ScrollViewcomponent to make it scrollable when more elements are added to the page.mainbranch was merged into this PR after a review, I tested again and verified the outcome was still expected according to theTeststeps.Screenshots/Videos
Android: Native
Android: mWeb Chrome
iOS: Native
iOS: mWeb Safari
MacOS: Chrome / Safari