Skip to content

HOLD fix network state detection#84760

Open
adhorodyski wants to merge 40 commits intoExpensify:mainfrom
callstack-internal:fix-network-state-detection
Open

HOLD fix network state detection#84760
adhorodyski wants to merge 40 commits intoExpensify:mainfrom
callstack-internal:fix-network-state-detection

Conversation

@adhorodyski
Copy link
Contributor

@adhorodyski adhorodyski commented Mar 10, 2026

@mjasikowski @mountiny @TMisiukiewicz @OlimpiaZurek

Explanation of Change

The existing RecheckConnection system was ad-hoc: a single middleware that re-ran NetInfo.fetch() after every failed request with no coordination, a grab-bag of isOffline/shouldForceOffline/shouldFailAllRequests flags scattered across Onyx, and no unified model for what "offline" means. This PR replaces it with a structured network state detection architecture:

Central state machine (NetworkState.ts) — single source of truth for offline status with three hard-stop triggers: no radio, sustained API failures, and force-offline. All offline decisions flow through updateState() instead of being scattered across files.

Sustained failure detection (FailureTracker.ts + FailureTracking middleware) — replaces the old RecheckConnection middleware. Counts consecutive API failures with a dual threshold (3 failures over 10s) to distinguish transient errors from genuine outages. One successful request clears everything.

Simplified Onyx network state — the Network onyx type is cleaned up: shouldForceOffline stays (debug tool), shouldFailAllRequests and shouldSimulatePoorConnection stay (test tools), but all the derived/redundant flags are removed. isOffline is now set exclusively by NetworkState.

NetInfo as the reachability layer — NetInfo is configured with useNativeReachability: false so it uses JS fetch polling (api/Ping) on all platforms instead of trusting native OS reachability. This aligns behavior across web and mobile: polls every 60s when reachable, every 5s when unreachable. The isInternetReachable transition listener detects recovery.

Reconnect action (Reconnect.ts) — extracted from the old NetworkConnection into a focused module. Called on recovery and app foreground to sync data via openApp/reconnectApp.

Synchronous offline reads (useNetwork + useSyncExternalStore)useNetwork now reads isOffline synchronously from NetworkState via useSyncExternalStore instead of subscribing to Onyx. This eliminates the render-lag from async Onyx reads and ensures components see the correct offline state immediately.

SequentialQueue offline→online flushSequentialQueue subscribes to NetworkState directly and calls flush() on offline→online transitions. This replaces the old pause()/unpause() coupling with a cleaner dependency direction (SequentialQueue → NetworkState).

NetworkConnection.ts deleted — all its responsibilities (NetInfo configuration, Onyx subscriptions, poor connection simulation, getDBTimeWithSkew) are consolidated into NetworkState.ts, eliminating the split between the two modules.

Background location task safety — the background location tracking task uses NetInfo.fetch() directly instead of the in-memory NetworkState.isOffline(), so it works correctly in Android headless JS contexts where module-level state hasn't been populated.

Architecture documentation added at contributingGuides/NETWORK_STATE_DETECTION.md.

Fixed Issues

$ #80759
PROPOSAL:

Tests

1. Basic offline/online detection (Web)

  1. Open the app on web (https://dev.new.expensify.com:8082/)
  2. Log in with any account
  3. Open browser DevTools → Network tab
  4. Set network to Offline (DevTools throttling or disconnect WiFi)
  5. Verify the app shows the offline indicator within ~15 seconds
  6. Verify the JS console shows [NetworkState] Hard stop: NO_RADIO log
  7. Re-enable network
  8. Verify the offline indicator disappears
  9. Verify the JS console shows [NetworkState] Internet reachability restored log
  10. Verify no errors appear in the JS console

2. Basic offline/online detection (Mobile — iOS & Android)

  1. Open the app on a physical device or simulator
  2. Log in with any account
  3. Enable Airplane Mode
  4. Verify the app shows the offline indicator
  5. Disable Airplane Mode
  6. Verify the offline indicator disappears and the app reconnects (data refreshes)
  7. Verify no errors appear in the JS console

3. Sustained failure detection (Web)

  1. Open the app on web, log in
  2. Open DevTools → Console, watch for [FailureTracker] logs
  3. Open the Test Tool Menu (Settings → Troubleshoot → Test tools or shake gesture)
  4. Toggle "Simulate failing network requests" ON
  5. Perform actions that trigger API requests (send a chat message, navigate between reports)
  6. Observe [FailureTracker] Failure #1, #2, #3 logs in the console
  7. After ≥3 failures spanning ≥10 seconds, verify [FailureTracker] Sustained failure threshold reached appears
  8. Verify the app transitions to offline state (offline indicator shows)
  9. Toggle "Simulate failing network requests" OFF
  10. Perform another action that triggers an API request
  11. Verify [FailureTracker] Success after N failures — resetting tracker appears
  12. Verify the app returns to online state

4. Force offline toggle (Test Tool Menu)

  1. Open the app, log in
  2. Open the Test Tool Menu
  3. Toggle "Force offline" ON
  4. Verify the app immediately shows the offline indicator
  5. Verify pending requests are paused (send a chat message — it should appear optimistically but not sync)
  6. Toggle "Force offline" OFF
  7. Verify the app goes back online and the queued message syncs

5. App foreground recovery

  1. Open the app on mobile, log in
  2. Verify some data is loaded (reports list visible)
  3. Background the app (press Home)
  4. Wait ~30 seconds
  5. Bring the app to foreground
  6. Verify the console shows [NetworkState] App became active
  7. Verify data is refreshed (any new reports/messages appear)

6. Recovery after radio loss + restore

  1. Open the app on mobile, log in
  2. Turn off WiFi (but keep cellular on, or vice versa)
  3. Verify the app may briefly show offline then recover via the other radio
  4. Turn off ALL radios (Airplane Mode)
  5. Verify the app goes offline
  6. Turn Airplane Mode off
  7. Verify the app recovers: offline indicator clears, data syncs, sequential queue flushes

7. Transient failures do NOT trigger offline

  1. Open the app on web, log in
  2. Using DevTools, block a specific API endpoint briefly (e.g., block for 2-3 seconds then unblock)
  3. Trigger 2-3 quick API calls that fail
  4. Verify the app does NOT go offline (failures are under both the 3-count AND 10-second thresholds)
  5. Verify [FailureTracker] Failure #N logs appear but NO Sustained failure threshold reached

Offline tests

8. Optimistic updates while offline

  1. Force the app offline (Test Tool Menu → Force offline)
  2. Send a chat message in any report
  3. Verify the message appears immediately (optimistic update)
  4. Create an expense/money request
  5. Verify it appears optimistically
  6. Turn Force offline OFF
  7. Verify queued requests flush and data syncs with the server

9. Sequential queue behavior

  1. Force the app offline
  2. Perform multiple write actions (send 3 messages, create an expense)
  3. Verify each action appears optimistically
  4. Turn Force offline OFF
  5. Verify all queued actions process in order (messages appear server-side in the correct order)

QA Steps

10. End-to-end connectivity on staging

  1. Log in on staging on each platform (iOS native, Android native, mWeb Safari, mWeb Chrome, Desktop Chrome/Safari)
  2. Navigate around the app (open reports, switch workspaces, view search)
  3. Verify no unexpected offline indicators appear
  4. Verify no console errors related to network state

11. Real network interruption on staging

  1. Log in on staging
  2. Disconnect network for ~20 seconds
  3. Verify offline indicator appears
  4. Reconnect network
  5. Verify the app recovers and data syncs
  6. Verify no duplicate or lost messages

12. High traffic account regression

  1. Log in with a High Traffic account on staging
  2. Navigate through reports, search, workspaces
  3. Verify no long loading states or unexpected offline indicators
  4. Verify the app remains responsive and data loads as expected

PR Author Checklist

  • I linked the correct issue in the ### Fixed Issues section above
  • I wrote clear testing steps that cover the changes made in this PR
    • I added steps for local testing in the Tests section
    • I added steps for the expected offline behavior in the Offline steps section
    • I added steps for Staging and/or Production testing in the QA steps section
    • I added steps to cover failure scenarios (i.e. verify an input displays the correct error message if the entered data is not correct)
    • I turned off my network connection and tested it while offline to ensure it matches the expected behavior (i.e. verify the default avatar icon is displayed if app is offline)
    • I tested this PR with a High Traffic account against the staging or production API to ensure there are no regressions (e.g. long loading states that impact usability).
  • I included screenshots or videos for tests on all platforms
  • I ran the tests on all platforms & verified they passed on:
    • Android: Native
    • Android: mWeb Chrome
    • iOS: Native
    • iOS: mWeb Safari
    • MacOS: Chrome / Safari
  • I verified there are no console errors (if there's a console error not related to the PR, report it or open an issue for it to be fixed)
  • I followed proper code patterns (see Reviewing the code)
    • I verified that any callback methods that were added or modified are named for what the method does and never what callback they handle (i.e. toggleReport and not onIconClick)
    • I verified that comments were added to code that is not self explanatory
    • I verified that any new or modified comments were clear, correct English, and explained "why" the code was doing something instead of only explaining "what" the code was doing.
    • I verified any copy / text shown in the product is localized by adding it to src/languages/* files and using the translation method
      • If any non-english text was added/modified, I used JaimeGPT to get English > Spanish translation. I then posted it in #expensify-open-source and it was approved by an internal Expensify engineer. Link to Slack message:
    • I verified all numbers, amounts, dates and phone numbers shown in the product are using the localization methods
    • I verified any copy / text that was added to the app is grammatically correct in English. It adheres to proper capitalization guidelines (note: only the first word of header/labels should be capitalized), and is either coming verbatim from figma or has been approved by marketing (in order to get marketing approval, ask the Bug Zero team member to add the Waiting for copy label to the issue)
    • I verified proper file naming conventions were followed for any new files or renamed files. All non-platform specific files are named after what they export and are not named "index.js". All platform-specific files are named for the platform the code supports as outlined in the README.
    • I verified the JSDocs style guidelines (in STYLE.md) were followed
  • If a new code pattern is added I verified it was agreed to be used by multiple Expensify engineers
  • I followed the guidelines as stated in the Review Guidelines
  • I tested other components that can be impacted by my changes (i.e. if the PR modifies a shared library or component like Avatar, I verified the components using Avatar are working as expected)
  • I verified all code is DRY (the PR doesn't include any logic written more than once, with the exception of tests)
  • I verified any variables that can be defined as constants (ie. in CONST.ts or at the top of the file that uses the constant) are defined as such
  • I verified that if a function's arguments changed that all usages have also been updated correctly
  • If any new file was added I verified that:
    • The file has a description of what it does and/or why is needed at the top of the file if the code is not self explanatory
  • If a new CSS style is added I verified that:
    • A similar style doesn't already exist
    • The style can't be created with an existing StyleUtils function (i.e. StyleUtils.getBackgroundAndBorderStyle(theme.componentBG))
  • If new assets were added or existing ones were modified, I verified that:
    • The assets are optimized and compressed (for SVG files, run npm run compress-svg)
    • The assets load correctly across all supported platforms.
  • If the PR modifies code that runs when editing or sending messages, I tested and verified there is no unexpected behavior for all supported markdown - URLs, single line code, code blocks, quotes, headings, bold, strikethrough, and italic.
  • If the PR modifies a generic component, I tested and verified that those changes do not break usages of that component in the rest of the App (i.e. if a shared library or component like Avatar is modified, I verified that Avatar is working as expected in all cases)
  • If the PR modifies a component related to any of the existing Storybook stories, I tested and verified all stories for that component are still working as expected.
  • If the PR modifies a component or page that can be accessed by a direct deeplink, I verified that the code functions as expected when the deeplink is used - from a logged in and logged out account.
  • If the PR modifies the UI (e.g. new buttons, new UI components, changing the padding/spacing/sizing, moving components, etc) or modifies the form input styles:
    • I verified that all the inputs inside a form are aligned with each other.
    • I added Design label and/or tagged @Expensify/design so the design team can review the changes.
  • If a new page is added, I verified it's using the ScrollView component to make it scrollable when more elements are added to the page.
  • I added unit tests for any new feature or bug fix in this PR to help automatically prevent regressions in this user flow.
  • If the main branch was merged into this PR after a review, I tested again and verified the outcome was still expected according to the Test steps.

Screenshots/Videos

Android: Native
Android: mWeb Chrome
iOS: Native
iOS: mWeb Safari
MacOS: Chrome / Safari

adhorodyski and others added 8 commits March 10, 2026 11:52
Introduce a hard stop model for offline detection: OS radio (NetInfo),
sustained API failure tracking, and recovery probes with exponential
backoff. This replaces the old RecheckConnection middleware with a more
reliable system that distinguishes between no-radio and server-unreachable
scenarios, and actively probes for recovery before going back online.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
shouldForceOffline is a debug tool that should keep the app offline
unconditionally. Previously, toggling it ON would start the recovery
probe, which would immediately succeed (server is reachable), fail to
clear shouldForceOffline, and restart — causing an infinite loop.

Now updateState() only starts the probe for real connectivity triggers
(noRadio, sustainedFailures) and stops it when shouldForceOffline is
the cause. Also skips the foreground probe in that case.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Instead of subscribing to NetInfo from a useLayoutEffect in Expensify.tsx
(which required prop-drilling accountID), use a module-level Onyx
connection to SESSION in NetworkConnection.ts. This ensures the NetInfo
subscription lifecycle is tied to accountID changes directly and avoids
unnecessary re-renders.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When NO_RADIO cleared (radio came back), noRadioActive was set to false
but updateState() was never called, so Onyx isOffline stayed true.

Now updateState() is always called when NO_RADIO clears. If no other
hard stops are active the app goes online immediately. If sustained
failures are still active, it stays offline and fires an immediate probe.

Also removes temporary [NETWORK_DETECTION] log prefixes.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Store radio state as a direct positive boolean (`hasRadio = true`)
instead of inverted (`noRadioActive = false`), eliminating the
double-negation in setNoRadio. Also fixes incorrect boolean values
in the architecture docs and removes [NETWORK_DETECTION] log prefixes.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace the custom RecoveryProbe (~115 lines) with NetInfo's built-in
JS fetch polling by setting useNativeReachability: false. This aligns
behavior across web and mobile: poll api/Ping every 60s when reachable,
every 5s when unreachable.

The NetInfo listener now tracks isInternetReachable transitions and
calls onReachabilityRestored() when connectivity returns, replacing
the probe-based recovery flow.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@melvin-bot
Copy link

melvin-bot bot commented Mar 10, 2026

Hey! I see that you made changes to our Form component. Make sure to update the docs in FORMS.md accordingly. Cheers!

adhorodyski and others added 2 commits March 10, 2026 17:14
FailureTracker now uses a listener Set instead of importing NetworkState
directly, breaking the circular dependency. NetworkState registers as
a consumer via onSustainedFailureChange().

When onReachabilityRestored() fires, FailureTracker counters are reset
to zero. This prevents stale failure history from immediately
re-triggering a hard stop after recovery on flapping networks.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
After successful re-authentication on READ requests, call reconnect()
to re-sync app data — restoring behavior that was lost when the old
triggerReconnectionCallbacks was removed.

Add FailureTracker unit tests covering threshold logic, counter reset,
and listener lifecycle. Bring back the re-auth reconnect test adapted
for the new reconnect() function.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@adhorodyski
Copy link
Contributor Author

Bug analysis: App stuck in offline/skeleton state despite working internet

Reported in: Slack thread

Symptoms: App permanently stuck showing skeletons and offline indicator. Ping endpoint returns 200 in DevTools, but the app stays offline. Resolved only by clearing cache and restart. Logs showed "a couple of network state changes which led to being stuck in the offline state."


How the old system (current main) gets stuck:

The old architecture has a catch-22:

  1. Network flaps → isOffline set to true in Onyx via scattered code paths
  2. isOffline: true → SequentialQueue pauses → no outgoing requests
  3. RecheckConnection middleware only fires when requests are made — but no requests are going out because the queue is paused
  4. The 60s recheck interval only calls NetInfo.fetch(), which on web just reads navigator.onLine (returns true) — doesn't check actual server reachability
  5. Dead end: app thinks it's offline → doesn't send requests → can't discover it's back online

Additionally, networkStatus could get stuck in UNKNOWN, and multiple code paths could independently write isOffline to Onyx without coordination.

How this PR prevents it:

  1. NetInfo's reachability polling runs independently of our request queue — it doesn't care if SequentialQueue is paused. Polls api/Ping every 5s when unreachable.
  2. When Ping succeeds → isInternetReachable transitions to trueonReachabilityRestored() forcefully clears ALL hard stop flags, resets FailureTracker counters, unpauses queue, and triggers reconnect(). Recovery within ~5s max.
  3. isOffline is now only set by NetworkState.updateState() — no scattered writers, no inconsistent state.
  4. Even stale Onyx state from a previous session is cleared on startup: module-level state initializes clean, and the first NetInfo listener event (fires immediately on subscribe) triggers updateState()isOffline = false.

Verdict: This PR breaks the catch-22 that caused this bug. The old system relied on request-driven rechecks (dead when the queue is paused). Ours uses NetInfo's independent polling that runs regardless of app state.

adhorodyski and others added 5 commits March 11, 2026 12:17
…tion

# Conflicts:
#	src/libs/Navigation/AppNavigator/AuthScreensInitHandler.tsx
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Tests now use NetworkState.setHasRadio() instead of directly manipulating
ONYXKEYS.NETWORK via Onyx, which ensures they go through the real
production code path (pause/unpause + Onyx merge).

Source fixes:
- Reconnect.ts: named imports instead of namespace import
- FailureTracking.ts: Number() cast for jsonCode comparison
- useNetwork.ts: wrap ref mutation in useEffect
- NetworkTest.tsx: eslint-disable for required namespace import
- AuthScreensInitHandlerTest.tsx: remove obsolete onReconnect tests

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@adhorodyski
Copy link
Contributor Author

I will provide test steps on tomorrow to unblock this going into full review.

@adhorodyski
Copy link
Contributor Author

Provided repro steps, resolving failing tests now.

@adhorodyski
Copy link
Contributor Author

I'm still working on the failing tests around SQ. cc @mjasikowski

adhorodyski and others added 2 commits March 13, 2026 19:26
Two issues caused 7 test failures after replacing onReconnection(flush)
with NetworkState pause/unpause:

1. unpause() called flush(false) synchronously, before pending Onyx.set
   callbacks from save()/update() during offline push() settled. Stale
   callbacks overwrote in-memory persistedRequests during process(),
   causing duplicate API calls (exposes known Issues 3c/4 in Expensify#80759).
   Fix: defer flush to next macrotask via setTimeout(0).

2. .finally() used isQueuePaused instead of isOffline() to resolve
   isReadyPromise. isQueuePaused is true for both offline AND
   shouldPauseQueue pauses — the latter should NOT resolve (WRITEs
   still pending). Fix: restore isOffline() check.

Test adjustments:
- ReportTest: await queue processing before going offline (sync pause()
  blocks immediately unlike old async Onyx.set)
- APITest: account for deferred flush timing in reauthentication test

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
setTimeout(0) races with waitForBatchedUpdates() (also setTimeout(0)),
causing MiddlewareTest failures where flush hasn't fired when assertions
run. Switch to Promise.resolve().then() which is a microtask — drained
by waitForBatchedUpdates() before its macrotask resolves. Same FIFO
ordering guarantee for pending Onyx callbacks, no test timing issues.

Also add "macrotask" to cspell.json and revert the extra
waitForBatchedUpdates() in APITest that was only needed for setTimeout.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@adhorodyski adhorodyski marked this pull request as ready for review March 13, 2026 18:58
@adhorodyski adhorodyski requested review from a team and tgolen as code owners March 13, 2026 18:58
@adhorodyski
Copy link
Contributor Author

@abzokhattab addressed everything

The test relied on a fire-and-forget Onyx.merge (triggered by
NetworkState.setHasRadio) propagating before the next .then() in the
chain. Add waitForBatchedUpdates to make the timing deterministic.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@adhorodyski adhorodyski requested a review from abzokhattab March 19, 2026 10:56
@mjasikowski
Copy link
Contributor

@codex review
please work this time

Copy link

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 0598e72329

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

@abzokhattab
Copy link
Contributor

rechecking

@adhorodyski
Copy link
Contributor Author

Working on conflicts & AI comments now.

adhorodyski and others added 2 commits March 19, 2026 16:49
Switch from a denylist (skip cancelled/duplicate, record everything else)
to an allowlist: only FAILED_TO_FETCH and EXPENSIFY_SERVICE_INTERRUPTED
count as connectivity failures. This prevents 429 throttle responses and
other non-connectivity HTTP errors from incorrectly tripping the
sustained failure hard stop.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@adhorodyski adhorodyski changed the title Replace RecheckConnection with three-layer network state detection HOLD fix network state detection Mar 20, 2026
@adhorodyski
Copy link
Contributor Author

@mjasikowski still working on a sync version of isOffline that is not tied to Onyx, will push some updates soon

adhorodyski and others added 2 commits March 20, 2026 18:27
Replace the async Onyx-based isOffline with a synchronous in-memory
state machine in NetworkState.ts. All consumers now read isOffline()
directly instead of subscribing to ONYXKEYS.NETWORK.

Key changes:
- NetworkState.ts is the single source of truth for offline status
- useNetwork uses useSyncExternalStore for React integration
- SequentialQueue subscribes to NetworkState for offline→online flush
- Background location task uses NetInfo.fetch() for headless JS safety
- Delete NetworkConnection.ts (consolidated into NetworkState.ts)
- Reconnect.ts owns app foreground + reachability recovery listeners
- Fix lastOfflineAt to update on every offline transition
- Fix poor connection simulation timer leak

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…tion

# Conflicts:
#	src/components/Form/FormProvider.tsx
@adhorodyski
Copy link
Contributor Author

@codex review

adhorodyski and others added 2 commits March 20, 2026 18:41
Restore the original two separate if-guards so each condition has its
own early return and distinct log message.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Same treatment as process() — restore two separate if-guards with
distinct log messages.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Copy link

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 036abe49b4

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

adhorodyski and others added 6 commits March 20, 2026 18:55
Previously only jsonCode 200 was counted as success. Non-200 server
responses (407 auth required, 429 throttled, etc.) were ignored,
leaving the failure counter dirty. This could cause false hard stops
when auth errors preceded a transient network failure.

Any resolved response proves connectivity — record success for all.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add section explaining the three layers of natural backoff that
protect against thundering herd on server recovery. Fix references
to deleted NetworkConnection.ts — all functionality is in
NetworkState.ts now.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Prevents unnecessary re-renders of useNetwork() consumers when
multiple hard stop triggers fire while already offline.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Early return when failureCount is already 0 — avoids logging,
counter writes, and listener notifications on every successful
request when there's nothing to reset.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Each browser tab detects connectivity independently via module-level
state rather than shared Onyx persistence. This is intentional.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Skip triggering flush when offline — the request is already persisted
and will be sent when back online. Matches the original behavior.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@adhorodyski adhorodyski force-pushed the fix-network-state-detection branch from 1fc28fb to bb5e511 Compare March 20, 2026 19:40
adhorodyski and others added 3 commits March 20, 2026 20:46
…tion

# Conflicts:
#	src/libs/Navigation/AppNavigator/AuthScreensInitHandler.tsx
The NetInfo listener was early-returning when shouldForceOffline was true,
which meant hasRadio and prevIsInternetReachable stopped tracking reality.
When force-offline was turned off, updateState() used stale values.

Now the listener always updates hasRadio and reachability tracking. Only
the onReachabilityRestored() reconnect side effect is gated behind
!shouldForceOffline.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Module-level listeners (reachability, foreground) fire even after logout.
Guard reconnect() with a session check so we don't issue openApp/reconnectApp
requests with no active session.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@adhorodyski
Copy link
Contributor Author

@MelvinBot review

adhorodyski and others added 2 commits March 20, 2026 21:07
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…int errors

The module-level isOffline export shadowed function parameters of the
same name in IOU/index.ts, PolicyUtils.ts, and ReportActionsUtils.ts,
causing 10 @typescript-eslint/no-shadow errors in CI.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants