fix: decode wire-format fields the firmware emits but the reader drops (AUTOADD_CONFIG, LOGIN_SUCCESS, ACK, RAW_DATA, DEFAULT_FLOOD_SCOPE)#87
Open
mwolter805 wants to merge 1 commit into
Conversation
…orrect RAW_DATA payload framing
Five wire-format parity fixes in reader.py, all backed by per-finding
unit tests in tests/unit/test_protocol_surface_gaps.py that pair a
legacy-firmware frame against a modern-firmware frame.
AUTOADD_CONFIG (PacketType 25): companion-v1.14.0 firmware emits a
trailing max_hops byte (introduced in firmware commit 00566741); the
SDK was dropping it. Adds a defensive `if len(data) >= 3` trailing-byte
read. Pre-v1.14.0 firmware emits a 1-byte response and continues to
decode to `{"config": ...}` with no `max_hops` key.
LOGIN_SUCCESS (PacketType 0x85): the new-style RESP_SERVER_LOGIN_OK
path emits three trailing fields the SDK was silently dropping —
server_timestamp (4B, firmware commit 0e90b731 / companion-v1.10.0),
acl_permissions (1B, firmware commit 7947e8a2 / companion-v1.10.0),
and fw_ver_level (1B, firmware commit 418ae08b / companion-v1.10.0).
Adds three per-field length gates (>=12, >=13, >=14) so each shows up
if and only if the firmware emitted it. Legacy 8-byte "OK"-path frames
decode unchanged.
ACK (PacketType 0x82): firmware emits a trailing 4-byte trip_time
(round-trip latency in ms) since companion-v1.0.0a (firmware commit
d9dc76f1, Jan 2025) — on the wire for ~16 months but never surfaced
by the SDK. Adds an `if len(data) >= 9` trailing read. Legacy 5-byte
ACK frames decode unchanged.
RAW_DATA (PacketType 0x84): firmware emits
`code(1) + snr(1) + rssi(1) + reserved(1, 0xFF) + payload(N)`, but
the SDK was reading SNR/RSSI then `payload = dbuf.read(4).hex()` —
which swallowed the reserved 0xFF byte as the first payload byte
AND truncated payload to a fixed 4 bytes regardless of frame size.
Replaces the broken read with an explicit 1-byte cursor advance
(discarding the reserved byte) plus a `dbuf.read().hex()` for the
remainder. Note: this is a field-semantics change to `payload` — the
key name and type don't change, but the bytes it carries are now the
firmware's actual payload bytes instead of a 4-byte misaligned slice.
Verified via primary-source code-read of all four top-ranked public
GitHub consumers that no current consumer is impacted (three have a
key-mismatch bug that never reads `payload["payload"]`; the fourth
stores it opaquely). Distinct CHANGELOG entry below.
DEFAULT_FLOOD_SCOPE (PacketType 28): firmware emits either a 48-byte
populated frame or a 1-byte sentinel frame when no scope is set. The
SDK was unconditionally `dbuf.read(31)` + `dbuf.read(16)`, which on
the sentinel frame returns empty bytes and dispatches
`{"scope_name": "", "scope_key": ""}`. Adds an `if len(data) >= 48`
guard so the sentinel dispatches `{}` instead. Consumers detecting
"no scope" via `event.payload["scope_name"] == ""` must migrate to
`"scope_name" in event.payload` (key-presence check); no known
consumer relies on the empty-string sentinel today.
CHANGELOG:
- Decode max_hops trailing byte in AUTOADD_CONFIG (companion-v1.14.0+).
- Decode server_timestamp / acl_permissions / fw_ver_level trailing
fields in LOGIN_SUCCESS (companion-v1.10.0+).
- Decode trip_time trailing field in ACK (companion-v1.0.0a+).
- **Field-semantics change**: RAW_DATA `payload` now carries the
firmware's actual packet payload bytes (no leading reserved byte,
no fixed 4-byte truncation). Consumers that previously read
`event.payload["payload"]` will see different bytes; verified no
current public consumer is impacted.
- DEFAULT_FLOOD_SCOPE now dispatches `{}` on the 1-byte sentinel
frame instead of `{scope_name: "", scope_key: ""}`. Consumers
should detect "no scope" via key presence.
Tests: 10 new tests in tests/unit/test_protocol_surface_gaps.py
(legacy + modern frame pair per finding), full suite passes (151).
Baseline observation pass before fixes: 6 of the 10 new tests failed
in exactly the documented bug patterns (4 legacy pairs already
passed). Post-fix: all 26 tests in the file pass.
Why: companion-radio firmware has been emitting these fields/payloads
on the wire for between 2 and 16 months; SDK consumers (integrations,
bots, dashboards) lose access to data that is on the wire today.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Background
This started while wiring the
AUTOADD_CONFIGresponse into a Home Assistant MeshCore integration built on this SDK — the auto-add config dialog was missingmax_hopsbecause the reader stopped before that trailing byte. Rather than patch the one field downstream, it seemed worth checking the SDK's other firmware-facing decoders the same way, by diffing every frame the companion-radio firmware emits against whatreader.pyactually reads. That surfaced four more frames with the same class of gap. This PR fixes all five at the source so every SDK consumer benefits, not just the one integration that happened to trip overAUTOADD_CONFIG.Summary
A sweep of every
RESP_CODE_*/PUSH_CODE_*the companion-radio firmware emits againstreader.py's decoders found five frames where the SDK reads fewer (or wrong) bytes than the firmware writes. Four are trailing-field drops where the firmware grew the frame and the reader never caught up; one (RAW_DATA) reads the wrong bytes as the payload. All five are fixed here using the existingBATT_AND_STORAGEdefensive-read pattern (up-front minimum-length check plus per-field cumulative-length gates keyed onlen(data) >= N), so older firmware that doesn't emit the trailing fields continues to decode without raising.Each fix ships with a legacy-frame and a modern-frame unit test, making the firmware-version dependency explicit.
Fixes
AUTOADD_CONFIG—max_hopstrailing byte. The firmware grew this response from 1 byte to 2 bytes (config+max_hops) in companion-v1.14.0+. The reader read onlyconfig. Adds a length-gated 1-bytemax_hopsread; the legacy 1-byte frame decodes to justconfigwith no exception.LOGIN_SUCCESS— three trailing fields. The new-styleRESP_SERVER_LOGIN_OKpayload is 13 bytes (permissions+pubkey_prefix+server_timestamp4B +acl_permissions1B +fw_ver_level1B), but the reader stopped afterpubkey_prefix. Adds per-field gated reads for the three trailing fields. The legacy 8-byte"OK"frame is unaffected.Firmware history:
server_timestampintroduced in0e90b731(2025-05-26);acl_permissionsin7947e8a2(2025-07-15);fw_ver_levelin418ae08b(2025-09-25). All three first shipped in companion-v1.10.0 — roughly 8 months of the firmware emitting fields the SDK never read.ACK—trip_timetrailing field. The firmware emitsPUSH_CODE_SEND_CONFIRMEDas 9 bytes (ack_hash4B +trip_time4B, the round-trip latency in ms), but the reader read onlyack_hash. Adds a length-gated 4-bytetrip_timeread.Firmware history: present since
d9dc76f1(2025-01-28), first release companion-v1.0.0a — about 16 months on the wire. The SDK key istrip_time(matching the firmware variable name); happy to rename totrip_time_msif you prefer the unit suffix.RAW_DATA— byte realignment (field-semantics change). The firmware emitscode + snr + rssi + reserved(0xFF) + payload(N). The reader read SNR and RSSI correctly, then read a fixed 4 bytes starting at the reserved0xFF— so subscribers gotpayload = "ff" + first 3 bytes of the real payload, truncated regardless of frame size. The fix advances past the reserved byte and reads the remaining variable-length payload.This changes the bytes delivered in
EventType.RAW_DATA.payload. Firmware history: the0xFF-reserved layout was introduced in274bd6dd(2025-02-23, first release companion-v1.0.0a); the SDK's misaligned reader dates to8f0ecd7(2025-04-08) — ~13 months of mis-decoding. A public GitHub consumer survey of the SDK's downstream users found no consumer that would break: the top consumers either check payload keys the SDK never emitted (dead-code handlers) or store the value opaquely. Flagging it here so the CHANGELOG can call out the field-semantics change for any private/unsurveyed consumer.DEFAULT_FLOOD_SCOPE— over-read guard (no observable change). On the 1-byte sentinel frame the handler read 47 bytes past the end. The fix gates the read onlen(data) >= 48; the sentinel frame now dispatches an empty payload. Behavior is invisible to consumers (the previous over-read produced empty strings, functionally equivalent to absent keys). One migration note for the CHANGELOG: a consumer treatingpayload["scope_name"] == ""as a sentinel should switch to a key-presence check ("scope_name" in payload).Tests
tests/unit/test_protocol_surface_gaps.pyadds 10 tests — a legacy-frame + modern-frame pair per fix:AUTOADD_CONFIG: 2-byte frame yieldsmax_hops; 1-byte frame omits it.LOGIN_SUCCESS: 13-byte frame yields all fields; 8-byte legacy frame yields only the base fields.ACK: 9-byte frame yieldstrip_time; 4-byte legacy frame omits it.RAW_DATA: full payload hex matches the firmware bytes exactly (no leadingff, no truncation); SNR/RSSI parsed correctly.DEFAULT_FLOOD_SCOPE: 48-byte frame populates the scope fields; 1-byte sentinel yields an empty payload, no exception.Baseline observation pass before applying fixes confirmed each test failed in exactly the documented pre-fix pattern. Full unit suite: 151 passed.
Notes for the maintainer
RAW_DATAfield-semantics change is the only behavior change a consumer could notice; everything else is purely additive trailing fields gated for backward compatibility.ACK.trip_timenaming (trip_timevstrip_time_ms) — see above; either is forward-compatible since no consumer read the field before.CHANNEL_DATA_RECV(RESP_CODE 27) packet type. Both branch offv2.3.7and both touchreader.py, so whichever merges second will need a trivial rebase.