Skip to content

fix(ws): remove periodic status broadcast; align push pattern with BT#62

Open
skialpine wants to merge 2 commits into
decentespresso:mainfrom
skialpine:fix/ws-remove-periodic-broadcasts
Open

fix(ws): remove periodic status broadcast; align push pattern with BT#62
skialpine wants to merge 2 commits into
decentespresso:mainfrom
skialpine:fix/ws-remove-periodic-broadcasts

Conversation

@skialpine
Copy link
Copy Markdown
Contributor

Summary

The /snapshot WebSocket was the only protocol pushing periodic state snapshots. BT notifies only weight + on-event; USB binary follows the same shape. WiFi did a periodic type:\"status\" frame every 5 s when events_enabled — for state that nothing on-device consumed and that was the dominant heap-pressure source in PR #61's 8 h soak.

After this PR, clients get state on demand via {\"command\":\"status\"} — same model as BT's 0x22 voltage request, same per-client reply path that already exists. No protocol field is removed; the status frame shape is unchanged, it's just no longer pushed automatically.

Why

  • No consumers: the on-device web apps (web_apps/{Quality_Control_Assistant,Weigh_Save,dosing_assistant,index.html}) have zero references to battery_percent, timer_seconds, charging, low_power, etc. Grepped. Verified.
  • Heap pressure: in the 8 h soak that motivated fix: tighten WS heap defenses (cap=4, broadcast floor 32K) #61, status broadcasts were generating ~5,000 OOM-gate skip events under marginal RSSI. Removing the periodic push removes the largest WiFi-only allocation pattern.
  • Cross-protocol consistency: BT clients ask for battery; WiFi clients had it pushed. Now both ask. Easier-to-write apps that work the same regardless of transport.

What's preserved

  • sendWebsocketStatus(client, ...) — per-client reply to {\"command\":\"status\"} / battery / info. Unchanged.
  • sendWebsocketButton, sendWebsocketPowerOff — event-driven broadcasts. Still gated on b_websocketEventsEnabled. Unchanged.
  • Status frame shape (16 fields including protocol_version, firmware_version, battery, timer state, etc.). Unchanged.
  • The default-2Hz weight stream, rate negotiation, all WS commands. Unchanged.

What's removed

  • sendWebsocketStatusAll() function (no callers after this PR)
  • The periodic-tick block in loop() that called it every 5 s
  • t_lastWebsocketStatusUpdate global (no period timer to track)
  • WEBSOCKET_STATUS_NOTIFY_INTERVAL_MS constant
  • WS event handler no longer initialises/resets the period timer
  • README sentence claiming periodic status; "events on" paragraph updated

Backward compat

The events on command still works and still controls button + power-off event broadcasts. Clients that were relying on the 5 s periodic status push need to poll {\"command\":\"status\"} themselves (any cadence they want). Existing clients that ignore status frames they didn't ask for are unaffected.

Test plan

  • Build clean (pio run -e esp32s3) — RAM 16.8%, Flash 45.5%
  • Hardware smoke test: 20 s window after events on + rate 10k201 weight frames at 10 Hz, 0 status frames (would have caught 4 periodic status frames if leaking)
  • On-request {\"command\":\"status\"} still answers with the full 16-field frame
  • Verify decentespresso app and on-device web UI still work (no consumers, low risk)

🤖 Generated with Claude Code

skialpine and others added 2 commits May 27, 2026 08:54
The /snapshot WebSocket was the only protocol pushing periodic state
snapshots. BT notifies only weight + on-event (power_off, buttons); USB
binary follows the same shape. WiFi did a periodic status frame every
5 s when events_enabled, which:

  - allocated ~310 B x N clients every 5 s for state the on-device web
    apps never consumed (verified: zero references to battery_percent,
    timer_seconds, charging, low_power, etc. in web_apps/)
  - was the dominant heap-pressure source in the 8 h soak that
    motivated PR decentespresso#61
  - was inconsistent with BT, surprising clients that interoperate
    across both transports

After this change, clients request status on demand via
{"command":"status"} -- same model as BT's 0x22 voltage request, same
per-client reply path that already exists (sendWebsocketStatus, kept
intact). The events_enabled flag now gates only button events and
power_off broadcasts -- the cross-protocol event types that BT also
notifies, just without an opt-in over there. Weight, button, and
power_off push behavior on WiFi is unchanged.

Cleanup follows:
  - sendWebsocketStatusAll() removed (no callers; was the periodic
    broadcast helper)
  - t_lastWebsocketStatusUpdate global removed (no periodic tick to
    track)
  - WEBSOCKET_STATUS_NOTIFY_INTERVAL_MS constant removed
  - WS event handler no longer initialises/resets the period timer
  - README updated: status frame is on-request only; the "events on"
    paragraph no longer mentions periodic status; cross-references the
    BT pattern explicitly

Verified on hardware:
  20 s window, events on, rate 10k -> 201 weight frames (10.05 Hz)
  + 0 status frames. On-request {"command":"status"} still answers.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three findings from the code review on cb569d3:

1. include/websocket.h:166 heap-gate doc-comment claimed "Skipping a frame
   is invisible (the next weight frame is <=500 ms away, status <=5 s);
   crashing is not." The "status <=5 s" recovery guarantee no longer
   holds after this PR removes the only periodic status broadcaster.
   Rewrote to describe what's actually still gated by wsBroadcastHeapOk
   (weight + button + power_off).

2. include/websocket.h:243-247 had an addendum paragraph framing the
   surviving printfAll comment as "Note: status is no longer broadcast
   periodically...". The historical framing reads as documentation for
   a function that no longer exists; dropping it leaves the printfAll
   technical commentary applying cleanly to sendWebsocketWeightAll
   (which is what it now sits above).

3. sendWebsocketStatus (include/websocket.h:224-225) reads stopWatch
   directly from the AsyncTCP task. CLAUDE.md threading model #1
   footgun: "stopWatch.* -- No -- multi-field (running flag + start ts +
   accumulator) and also mutated from loop(), BLE and USB; a status-
   frame read can tear a write across tasks." The bug is pre-existing
   on main (not introduced by this PR), but CLAUDE.md's "Fixing bugs you
   find along the way" applies. Mirroring the snapshot pattern used by
   PR decentespresso#57: new g_timerRunning/g_timerElapsed volatiles in parameter.h,
   refreshed once per main-loop pass in src/hds.ino's WiFi-housekeeping
   block, read by sendWebsocketStatus. (PR decentespresso#57 adds the same snapshot
   independently; merge order between decentespresso#57 and decentespresso#62 will produce a trivial
   same-lines conflict that git auto-resolves.)

Verified on hardware:
  flash + 12s window with events on + rate 10k -> 121 weight frames at
  10 Hz, 0 periodic status frames; on-request {"command":"status"}
  returns the full 16-field frame with timer_running/timer_seconds
  populated from the snapshot.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@skialpine
Copy link
Copy Markdown
Contributor Author

Code review

Found 3 issues in cb569d3, all addressed in 7c3b004:

  1. Heap-gate doc comment in include/websocket.h still claimed "Skipping a frame is invisible (the next weight frame is <=500 ms away, status <=5 s); crashing is not." This PR removes the only periodic status broadcaster (sendWebsocketStatusAll), so the "status <=5 s" recovery guarantee no longer holds. Misleads future readers about what the gate actually protects.

https://github.com/decentespresso/openscale/blob/cb569d3b6c20d4cd1da3acb5fe9ffdaa6f4ea29f/include/websocket.h#L164-L172

  1. The multi-line "Broadcast via printfAll()..." preamble was originally written to introduce sendWebsocketStatusAll. After deletion it sits above sendWebsocketWeightAll with an appended "Note: status is no longer broadcast periodically..." paragraph. The technical claims (mutex, no availableForWriteAll, setCloseClientOnQueueFull) still apply to weight broadcasts, but the historical framing reads oddly. Dropped the addendum so the comment reads as native documentation for sendWebsocketWeightAll.

https://github.com/decentespresso/openscale/blob/cb569d3b6c20d4cd1da3acb5fe9ffdaa6f4ea29f/include/websocket.h#L242-L256

  1. (pre-existing, CLAUDE.md Missing files so_boot_menu.h and so_wifi_ota.h #1 footgun) sendWebsocketStatus is invoked from handleWebsocketControlCommand on the AsyncTCP task and reads stopWatch.isRunning() / stopWatch.elapsed() directly. CLAUDE.md threading model: "stopWatch.* -- No -- multi-field (running flag + start ts + accumulator) and also mutated from loop(), BLE and USB; a status-frame read can tear a write across tasks". After removing the periodic broadcast this is now the sole path that reads stopWatch from the WS side, so the tear window matters more. Mirrored PR Scale telemetry: SoC temp, weight-stall watchdog, reset reason #57's g_timerRunning/g_timerElapsed snapshot pattern (refreshed once per main-loop pass; status frame reads snapshots).

https://github.com/decentespresso/openscale/blob/cb569d3b6c20d4cd1da3acb5fe9ffdaa6f4ea29f/include/websocket.h#L222-L228

Note: PR #57 adds the same g_timer* snapshot independently. Whichever lands second produces a trivial same-lines conflict that git auto-resolves.


1 below-threshold issue (50-74)
  • [OK to skip] Comment at include/websocket.h:243-246 says "BT has no periodic state push" — technically accurate for state (battery/gyro/voltage are request-only on BT) but could be misread as "BT is entirely request-driven", which is false (weight + button + power_off are pushed). Wording suggestion, no behavior implication (score: 25 — actually below the report threshold but worth a glance if you're in this area).

Verified on hardware after the fix commit: 12 s window with events on + rate 10k → 121 weight frames, 0 periodic status frames; on-request {\"command\":\"status\"} returns the full frame with timer_running/timer_seconds populated from the snapshot.

🤖 Generated with Claude Code

If this code review was useful, please react with 👍. Otherwise, react with 👎.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant