
Harden SD/I2C failure paths against on-track reboots & corruption #70

Merged
TheAngryRaven merged 1 commit into BETA from claude/code-quality-edge-cases-dpbi7p on Jun 27, 2026

Conversation

@TheAngryRaven

Owner

Summary

A code-quality sweep prompted by spontaneous on-track reboots (suspected failing SD card tray). The reboot itself is the 4 s hardware watchdog doing its job when SdFat blocks on a card that's physically dropping out mid-write — that can't be fully prevented in software. But four spots turned a flaky card or a glitched I2C bus into a reboot loop or silent corruption; this fixes those:

  • i2cBusRecover() could boot-loop. It re-inits Wire + the OLED over I2C but never fed the watchdog (unlike the GPS baud-recovery path), so under sustained ignition EMI it could trip the WDT while recovering and re-trip on reboot. Now pets the watchdog before each blocking re-init step, and safeDisplayUpdate() recovers a hung bus inline instead of one frame late.
  • Truncated DOVEX header went unnoticed. The 1 KB header pre-fill ignored each write() return, so a card dropping sectors during log creation was still marked "ready" and streamed rows into a truncated region. Now verifies every write and aborts log init cleanly (retries next second).
  • Lost lap times on mid-session write failure. A failed data-row write stopped logging cleanly but never wrote the header, losing the session's lap list. Now attempts writeDovexHeader() before closing.
  • buildTrackList() bypassed the SD mutex. It was the lone SD consumer touching SdFat directly, which left a window where a BLE track upload/delete completing during a logging teardown could hit SdFat from two tasks. It now holds SD_ACCESS_TRACK_PARSE for the directory walk (both callers release first, so no self-deadlock) and checks that the directory open succeeded.
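
The watchdog fix in the first bullet can be illustrated with a host-side sketch. This is not the firmware's actual i2cBusRecover(); `Watchdog`, `RecoveryStep`, and `runRecovery` are hypothetical names, and the step bodies stand in for the Wire and OLED re-init calls. It only shows the pattern of feeding before each blocking step.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical stand-in for the hardware watchdog. On the nRF52840 the real
// feed is a register write: NRF_WDT->RR[0] = WDT_RR_RR_Reload;
struct Watchdog {
    int feeds = 0;
    void feed() { ++feeds; }
};

// One blocking re-init step (for example restarting the I2C peripheral or
// re-initialising the display). Returns true on success.
using RecoveryStep = std::function<bool()>;

// Run the steps in order, feeding the watchdog immediately before each one so
// every step starts with the full timeout available. Stops at the first
// failing step and returns how many steps completed successfully.
std::size_t runRecovery(Watchdog& wdt, const std::vector<RecoveryStep>& steps) {
    std::size_t done = 0;
    for (const RecoveryStep& step : steps) {
        assert(step && "recovery step must be callable");
        wdt.feed();
        if (!step()) break;
        ++done;
    }
    return done;
}
```

Feeding per step only helps if each individual step finishes inside the 4 s window; a single call that blocks longer than that would still reset the board.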

Type of change

  • Bug fix (no user-visible behavior change beyond the fix)
  • New feature / behavior
  • Refactor (no behavior change)
  • Tests only
  • CI / tooling / docs
  • Breaking change (track files, log format, BLE protocol, or a removed mode)

How it was verified

  • Host unit tests pass (ctest --test-dir tests/build) — pure units untouched, ran as a regression check
  • clang-tidy clean — relying on CI
  • Compiles for the XIAO nRF52840 Sense — relying on CI compile-sketch
  • Tested on real hardware — not yet; these are failure-path changes that need a flaky-card / EMI scenario to exercise. Worth a bench test with a marginal card before relying on them.
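
Short of a bench test, the short-write path could in principle be exercised on the host with a fake card. The sketch below assumes the pre-fill loop were factored behind a small writer interface, which this PR does not do; `Writer`, `FlakyCard`, `prefillHeader`, the 64-byte chunk size, and the zero fill are all illustrative assumptions. It relies only on the Arduino-style convention that write() returns the number of bytes actually written.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical writer interface, modelled on Arduino-style write() calls that
// return the number of bytes actually written.
struct Writer {
    virtual std::size_t write(const std::uint8_t* buf, std::size_t len) = 0;
    virtual ~Writer() = default;
};

// Fake card that accepts `budget` bytes in total and then drops the rest.
struct FlakyCard : Writer {
    std::size_t budget;
    std::size_t written = 0;
    explicit FlakyCard(std::size_t b) : budget(b) {}
    std::size_t write(const std::uint8_t*, std::size_t len) override {
        std::size_t n = (len <= budget) ? len : budget;
        budget -= n;
        written += n;
        return n;
    }
};

// Pre-fill `total` bytes in fixed-size chunks, verifying every write.
// Returns false on the first short write so the caller can abort log init
// instead of marking the log ready over a truncated header.
bool prefillHeader(Writer& out, std::size_t total) {
    std::uint8_t zeros[64];
    std::memset(zeros, 0, sizeof zeros);
    std::size_t remaining = total;
    while (remaining > 0) {
        std::size_t chunk = remaining < sizeof zeros ? remaining : sizeof zeros;
        if (out.write(zeros, chunk) != chunk) return false;
        remaining -= chunk;
    }
    return true;
}
```

With a healthy `FlakyCard(4096)`, `prefillHeader(card, 1024)` succeeds; with `FlakyCard(500)` it fails on the chunk that crosses the 500-byte mark. This covers only the return-value check, not the watchdog interaction with a card that stalls rather than short-writes.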

Checklist

  • CHANGELOG.md updated under [Unreleased]
  • ARCHITECTURE.md / CLAUDE.md updated — no module or interface changed
  • New testable logic has a matching test in tests/ — changes are Arduino/SdFat-bound failure paths, not host-testable pure logic
  • Branch is focused — all four fixes are the same concern (SD/I2C failure-path hardening)

Notes

A related finding from the same sweep was left out as a separate concern: auto-race can fire while viewing replay, and live racing + replay share the same lapHistory[] buffer (stale-read hazard). Also worth wiring NRF_POWER->RESETREAS to a debug surface to distinguish watchdog resets (card stalls) from hard faults.
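
As a rough sketch of the RESETREAS idea, the register value could be decoded into a short label for a debug screen or log line. The bit positions follow the nRF52840 POWER.RESETREAS layout; `describeResetReason` and the label strings are hypothetical, and the register read itself is left to the caller (if a SoftDevice is active, the read would likely go through its API, e.g. sd_power_reset_reason_get(), rather than touching NRF_POWER directly).

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Bit positions of the nRF52840 POWER.RESETREAS register (subset).
constexpr std::uint32_t kResetPin  = 1u << 0;   // RESETPIN: reset from pin
constexpr std::uint32_t kWatchdog  = 1u << 1;   // DOG: watchdog timeout
constexpr std::uint32_t kSoftReset = 1u << 2;   // SREQ: software reset request
constexpr std::uint32_t kLockup    = 1u << 3;   // LOCKUP: CPU lock-up
constexpr std::uint32_t kSystemOff = 1u << 16;  // OFF: wake-up from System OFF

// Turn a raw RESETREAS value into a label. The register is cumulative until
// cleared, so several flags can be set at once; they are joined with '+'.
std::string describeResetReason(std::uint32_t reas) {
    // No flags set indicates the on-chip reset generator (power-on or brownout).
    if (reas == 0) return "power-on/brownout";
    std::string out;
    auto add = [&](std::uint32_t bit, const char* name) {
        if (reas & bit) {
            if (!out.empty()) out += '+';
            out += name;
        }
    };
    add(kResetPin, "pin");
    add(kWatchdog, "watchdog");
    add(kSoftReset, "soft-reset");
    add(kLockup, "lockup");
    add(kSystemOff, "system-off");
    assert(reas != 0);  // reached only when at least one bit was set
    if (out.empty()) out = "other";  // a flag this sketch does not decode
    return out;
}
```

Because the flags accumulate across resets, the firmware would need to clear the register (by writing 1s to the set bits) after reading it at boot, otherwise an old watchdog flag would keep showing up.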

🤖 Generated with Claude Code


A code-quality sweep prompted by spontaneous on-track reboots (suspected
failing SD card tray). Fixes four edge cases where a flaky card or a
glitched I2C bus turns into a reboot loop or silent corruption:

- i2cBusRecover(): feed the watchdog before each blocking I2C re-init step
  so the recovery routine can't itself trip the 4 s WDT (boot loop under
  sustained ignition EMI); recover a hung bus inline instead of one frame
  late.
- DOVEX header pre-fill: verify each write() and abort log init cleanly on
  a short write instead of marking logging ready over a truncated header.
- Mid-session write failure: attempt writeDovexHeader() before close so the
  session's lap times survive a dying card.
- buildTrackList(): hold SD_ACCESS_TRACK_PARSE for the directory walk (the
  lone SD consumer that bypassed the mutex) and check the directory open.
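
A minimal host-side sketch of the buildTrackList() change, under stated assumptions: only SD_ACCESS_TRACK_PARSE is named in this PR, so `SdAccess`, `SdGuard`, `SD_ACCESS_LOGGING`, and `buildTrackListSketch` are hypothetical, and the real firmware's mutex API may look quite different. The point is that the directory walk runs only while the token is held, and every exit path, including a failed directory open, releases it.

```cpp
#include <cassert>

// Hypothetical owner tags; only SD_ACCESS_TRACK_PARSE appears in the PR.
enum SdOwner { SD_ACCESS_NONE, SD_ACCESS_LOGGING, SD_ACCESS_TRACK_PARSE };

// Hypothetical single-owner token. It is not re-entrant, which is why callers
// must release before calling into code that acquires it again. On real
// hardware the check-and-set would need to be atomic (e.g. an RTOS mutex).
class SdAccess {
public:
    bool tryAcquire(SdOwner who) {
        assert(who != SD_ACCESS_NONE);
        if (owner_ != SD_ACCESS_NONE) return false;
        owner_ = who;
        return true;
    }
    void release(SdOwner who) {
        if (owner_ == who) owner_ = SD_ACCESS_NONE;
    }
    SdOwner owner() const { return owner_; }
private:
    SdOwner owner_ = SD_ACCESS_NONE;
};

// RAII guard: releases the token on every return path if it was acquired.
class SdGuard {
public:
    SdGuard(SdAccess& sd, SdOwner who)
        : sd_(sd), who_(who), held_(sd.tryAcquire(who)) {}
    ~SdGuard() { if (held_) sd_.release(who_); }
    SdGuard(const SdGuard&) = delete;
    SdGuard& operator=(const SdGuard&) = delete;
    bool held() const { return held_; }
private:
    SdAccess& sd_;
    SdOwner who_;
    bool held_;
};

// Stand-in for the directory walk. Returns -1 if the card is busy, 0 if the
// directory could not be opened, otherwise a pretend track count.
int buildTrackListSketch(SdAccess& sd, bool dirOpens) {
    SdGuard guard(sd, SD_ACCESS_TRACK_PARSE);
    if (!guard.held()) return -1;  // another task owns the card
    if (!dirOpens) return 0;       // open failed: empty list, guard releases
    return 3;                      // pretend three track files were found
}
```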

Host unit tests pass; CHANGELOG updated.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01WkdH4kE9pBKSfRBM7FTG1k
@github-actions


Coverage — host-testable units

📂 Overall coverage

| Metric | Coverage |
|---|---|
| Lines | 🟢 259/260 (99.6%) |
| Functions | 🟢 26/26 (100.0%) |
| Branches | 🟡 237/271 (87.5%) |

📄 File coverage

| File | Lines | Functions | Branches |
|---|---|---|---|
| BirdsEye/crc32.cpp | 🟢 30/30 (100.0%) | 🟢 4/4 (100.0%) | 🟢 24/24 (100.0%) |
| BirdsEye/dovex_header.cpp | 🟢 100/101 (99.0%) | 🟢 6/6 (100.0%) | 🔴 61/92 (66.3%) |
| BirdsEye/filename_validator.cpp | 🟢 14/14 (100.0%) | 🟢 1/1 (100.0%) | 🟢 30/30 (100.0%) |
| BirdsEye/gps_time.cpp | 🟢 45/45 (100.0%) | 🟢 6/6 (100.0%) | 🟢 30/32 (93.8%) |
| BirdsEye/gps_validation.cpp | 🟢 24/24 (100.0%) | 🟢 2/2 (100.0%) | 🟢 66/66 (100.0%) |
| BirdsEye/haversine.cpp | 🟢 8/8 (100.0%) | 🟢 1/1 (100.0%) | ⚫ 0/0 (0.0%) |
| BirdsEye/lap_format.cpp | 🟢 18/18 (100.0%) | 🟢 1/1 (100.0%) | 🟢 9/9 (100.0%) |
| BirdsEye/sd_access_policy.cpp | 🟢 5/5 (100.0%) | 🟢 2/2 (100.0%) | 🟢 10/10 (100.0%) |
| BirdsEye/tach_filter.cpp | 🟢 15/15 (100.0%) | 🟢 3/3 (100.0%) | 🟡 7/8 (87.5%) |

@TheAngryRaven TheAngryRaven merged commit 6920115 into BETA Jun 27, 2026
6 checks passed
