Skip to content

fix: decode wire-format fields the firmware emits but the reader drops (AUTOADD_CONFIG, LOGIN_SUCCESS, ACK, RAW_DATA, DEFAULT_FLOOD_SCOPE)#87

Open
mwolter805 wants to merge 1 commit into
meshcore-dev:mainfrom
mwolter805:fix/wire-format-parity-bundle
Open

fix: decode wire-format fields the firmware emits but the reader drops (AUTOADD_CONFIG, LOGIN_SUCCESS, ACK, RAW_DATA, DEFAULT_FLOOD_SCOPE)#87
mwolter805 wants to merge 1 commit into
meshcore-dev:mainfrom
mwolter805:fix/wire-format-parity-bundle

Conversation

@mwolter805
Copy link
Copy Markdown

Background

This started while wiring the AUTOADD_CONFIG response into a Home Assistant MeshCore integration built on this SDK — the auto-add config dialog was missing max_hops because the reader stopped before that trailing byte. Rather than patch the one field downstream, it seemed worth checking the SDK's other firmware-facing decoders the same way, by diffing every frame the companion-radio firmware emits against what reader.py actually reads. That surfaced four more frames with the same class of gap. This PR fixes all five at the source so every SDK consumer benefits, not just the one integration that happened to trip over AUTOADD_CONFIG.

Summary

A sweep of every RESP_CODE_* / PUSH_CODE_* the companion-radio firmware emits against reader.py's decoders found five frames where the SDK reads fewer (or wrong) bytes than the firmware writes. Four are trailing-field drops where the firmware grew the frame and the reader never caught up; one (RAW_DATA) reads the wrong bytes as the payload. All five are fixed here using the existing BATT_AND_STORAGE defensive-read pattern (up-front minimum-length check plus per-field cumulative-length gates keyed on len(data) >= N), so older firmware that doesn't emit the trailing fields continues to decode without raising.

Each fix ships with a legacy-frame and a modern-frame unit test, making the firmware-version dependency explicit.

Fixes

AUTOADD_CONFIGmax_hops trailing byte. The firmware grew this response from 1 byte to 2 bytes (config + max_hops) in companion-v1.14.0+. The reader read only config. Adds a length-gated 1-byte max_hops read; the legacy 1-byte frame decodes to just config with no exception.

LOGIN_SUCCESS — three trailing fields. The new-style RESP_SERVER_LOGIN_OK payload is 13 bytes (permissions + pubkey_prefix + server_timestamp 4B + acl_permissions 1B + fw_ver_level 1B), but the reader stopped after pubkey_prefix. Adds per-field gated reads for the three trailing fields. The legacy 8-byte "OK" frame is unaffected.

Firmware history: server_timestamp introduced in 0e90b731 (2025-05-26); acl_permissions in 7947e8a2 (2025-07-15); fw_ver_level in 418ae08b (2025-09-25). All three first shipped in companion-v1.10.0 — roughly 8 months of the firmware emitting fields the SDK never read.

ACKtrip_time trailing field. The firmware emits PUSH_CODE_SEND_CONFIRMED as 9 bytes (ack_hash 4B + trip_time 4B, the round-trip latency in ms), but the reader read only ack_hash. Adds a length-gated 4-byte trip_time read.

Firmware history: present since d9dc76f1 (2025-01-28), first release companion-v1.0.0a — about 16 months on the wire. The SDK key is trip_time (matching the firmware variable name); happy to rename to trip_time_ms if you prefer the unit suffix.

RAW_DATA — byte realignment (field-semantics change). The firmware emits code + snr + rssi + reserved(0xFF) + payload(N). The reader read SNR and RSSI correctly, then read a fixed 4 bytes starting at the reserved 0xFF — so subscribers got payload = "ff" + first 3 bytes of the real payload, truncated regardless of frame size. The fix advances past the reserved byte and reads the remaining variable-length payload.

This changes the bytes delivered in EventType.RAW_DATA.payload. Firmware history: the 0xFF-reserved layout was introduced in 274bd6dd (2025-02-23, first release companion-v1.0.0a); the SDK's misaligned reader dates to 8f0ecd7 (2025-04-08) — ~13 months of mis-decoding. A public GitHub consumer survey of the SDK's downstream users found no consumer that would break: the top consumers either check payload keys the SDK never emitted (dead-code handlers) or store the value opaquely. Flagging it here so the CHANGELOG can call out the field-semantics change for any private/unsurveyed consumer.

DEFAULT_FLOOD_SCOPE — over-read guard (no observable change). On the 1-byte sentinel frame the handler read 47 bytes past the end. The fix gates the read on len(data) >= 48; the sentinel frame now dispatches an empty payload. Behavior is invisible to consumers (the previous over-read produced empty strings, functionally equivalent to absent keys). One migration note for the CHANGELOG: a consumer treating payload["scope_name"] == "" as a sentinel should switch to a key-presence check ("scope_name" in payload).

Tests

tests/unit/test_protocol_surface_gaps.py adds 10 tests — a legacy-frame + modern-frame pair per fix:

  • AUTOADD_CONFIG: 2-byte frame yields max_hops; 1-byte frame omits it.
  • LOGIN_SUCCESS: 13-byte frame yields all fields; 8-byte legacy frame yields only the base fields.
  • ACK: 9-byte frame yields trip_time; 4-byte legacy frame omits it.
  • RAW_DATA: full payload hex matches the firmware bytes exactly (no leading ff, no truncation); SNR/RSSI parsed correctly.
  • DEFAULT_FLOOD_SCOPE: 48-byte frame populates the scope fields; 1-byte sentinel yields an empty payload, no exception.

Baseline observation pass before applying fixes confirmed each test failed in exactly the documented pre-fix pattern. Full unit suite: 151 passed.

Notes for the maintainer

  • Version bump is your call — the RAW_DATA field-semantics change is the only behavior change a consumer could notice; everything else is purely additive trailing fields gated for backward compatibility.
  • ACK.trip_time naming (trip_time vs trip_time_ms) — see above; either is forward-compatible since no consumer read the field before.
  • Submitted alongside a sibling PR adding the CHANNEL_DATA_RECV (RESP_CODE 27) packet type. Both branch off v2.3.7 and both touch reader.py, so whichever merges second will need a trivial rebase.

…orrect RAW_DATA payload framing

Five wire-format parity fixes in reader.py, all backed by per-finding
unit tests in tests/unit/test_protocol_surface_gaps.py that pair a
legacy-firmware frame against a modern-firmware frame.

AUTOADD_CONFIG (PacketType 25): companion-v1.14.0 firmware emits a
trailing max_hops byte (introduced in firmware commit 00566741); the
SDK was dropping it. Adds a defensive `if len(data) >= 3` trailing-byte
read. Pre-v1.14.0 firmware emits a 1-byte response and continues to
decode to `{"config": ...}` with no `max_hops` key.

LOGIN_SUCCESS (PacketType 0x85): the new-style RESP_SERVER_LOGIN_OK
path emits three trailing fields the SDK was silently dropping —
server_timestamp (4B, firmware commit 0e90b731 / companion-v1.10.0),
acl_permissions (1B, firmware commit 7947e8a2 / companion-v1.10.0),
and fw_ver_level (1B, firmware commit 418ae08b / companion-v1.10.0).
Adds three per-field length gates (>=12, >=13, >=14) so each shows up
if and only if the firmware emitted it. Legacy 8-byte "OK"-path frames
decode unchanged.

ACK (PacketType 0x82): firmware emits a trailing 4-byte trip_time
(round-trip latency in ms) since companion-v1.0.0a (firmware commit
d9dc76f1, Jan 2025) — on the wire for ~16 months but never surfaced
by the SDK. Adds an `if len(data) >= 9` trailing read. Legacy 5-byte
ACK frames decode unchanged.

RAW_DATA (PacketType 0x84): firmware emits
`code(1) + snr(1) + rssi(1) + reserved(1, 0xFF) + payload(N)`, but
the SDK was reading SNR/RSSI then `payload = dbuf.read(4).hex()` —
which swallowed the reserved 0xFF byte as the first payload byte
AND truncated payload to a fixed 4 bytes regardless of frame size.
Replaces the broken read with an explicit 1-byte cursor advance
(discarding the reserved byte) plus a `dbuf.read().hex()` for the
remainder. Note: this is a field-semantics change to `payload` — the
key name and type don't change, but the bytes it carries are now the
firmware's actual payload bytes instead of a 4-byte misaligned slice.
Verified via primary-source code-read of all four top-ranked public
GitHub consumers that no current consumer is impacted (three have a
key-mismatch bug that never reads `payload["payload"]`; the fourth
stores it opaquely). Distinct CHANGELOG entry below.

DEFAULT_FLOOD_SCOPE (PacketType 28): firmware emits either a 48-byte
populated frame or a 1-byte sentinel frame when no scope is set. The
SDK was unconditionally `dbuf.read(31)` + `dbuf.read(16)`, which on
the sentinel frame returns empty bytes and dispatches
`{"scope_name": "", "scope_key": ""}`. Adds an `if len(data) >= 48`
guard so the sentinel dispatches `{}` instead. Consumers detecting
"no scope" via `event.payload["scope_name"] == ""` must migrate to
`"scope_name" in event.payload` (key-presence check); no known
consumer relies on the empty-string sentinel today.

CHANGELOG:
  - Decode max_hops trailing byte in AUTOADD_CONFIG (companion-v1.14.0+).
  - Decode server_timestamp / acl_permissions / fw_ver_level trailing
    fields in LOGIN_SUCCESS (companion-v1.10.0+).
  - Decode trip_time trailing field in ACK (companion-v1.0.0a+).
  - **Field-semantics change**: RAW_DATA `payload` now carries the
    firmware's actual packet payload bytes (no leading reserved byte,
    no fixed 4-byte truncation). Consumers that previously read
    `event.payload["payload"]` will see different bytes; verified no
    current public consumer is impacted.
  - DEFAULT_FLOOD_SCOPE now dispatches `{}` on the 1-byte sentinel
    frame instead of `{scope_name: "", scope_key: ""}`. Consumers
    should detect "no scope" via key presence.

Tests: 10 new tests in tests/unit/test_protocol_surface_gaps.py
(legacy + modern frame pair per finding), full suite passes (151).
Baseline observation pass before fixes: 6 of the 10 new tests failed
in exactly the documented bug patterns (4 legacy pairs already
passed). Post-fix: all 26 tests in the file pass.

Why: companion-radio firmware has been emitting these fields/payloads
on the wire for between 2 and 16 months; SDK consumers (integrations,
bots, dashboards) lose access to data that is on the wire today.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant