Don't persist inbound committed onions in prod#4599
Don't persist inbound committed onions in prod#4599valentinewallace wants to merge 1 commit intolightningdevkit:mainfrom
Conversation
|
👋 I see @wpaulino was un-assigned. |
| &InboundHTLCState::Committed { update_add_htlc: ref _update_add } => { | ||
| 3u8.write(writer)?; | ||
| inbound_committed_update_adds.push(update_add_htlc); | ||
| #[cfg(test)] | ||
| inbound_committed_update_adds.push(_update_add); |
There was a problem hiding this comment.
Bug: The write side is now gated by #[cfg(test)], so in production inbound_committed_update_adds remains empty and TLV field 75 is never written. However, on the read side (line 16554), inbound_committed_update_adds_opt is initialized to Some(Vec::new()) by the optional_vec macro (when the TLV is absent, it stays as Some(empty_vec)). The deserialization code at line 16554-16564 then enters the if let Some(update_adds) block unconditionally and calls iter.next().ok_or(DecodeError::InvalidValue)? for each committed HTLC — which immediately fails because the iterator is empty.
Impact: Any production node that restarts while it has committed inbound HTLCs will fail to deserialize its channel state with DecodeError::InvalidValue. This is a critical availability bug.
Fix: The deserialization code at line 16554 needs a corresponding guard. For example:
if let Some(update_adds) = inbound_committed_update_adds_opt {
if !update_adds.is_empty() {
let mut iter = update_adds.into_iter();
// ... existing code ...
}
}Or gate the entire read-side block with #[cfg(test)] to match the write side.
|
After thorough re-examination, I now realize my prior critical bug report was incorrect. Here's why: The variable Review SummaryCorrection to prior review: The critical deserialization bug I flagged previously is a false positive. The No new issues found. The PR correctly:
|
A few PRs ago, we started persisting inbound committed HTLC onions in Channels. These onions were persisted to lay groundwork for reconstructing the ChannelManager's pending HTLC maps from them in a future version. However, we've since discovered a different direction where we can instead reconstruct these same maps using persistent monitor events, which may mean that we don't need to persist these onions. Since persisting a bunch of onions on every manager write is a big commitment that we're not fully confident in yet, switch it off for now until we either confirm the monitor events direction and can delete all this onion persisting code OR realize that we definitely do need it.
c50518a to
d68dbf8
Compare
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #4599 +/- ##
==========================================
+ Coverage 86.22% 86.28% +0.05%
==========================================
Files 159 159
Lines 109170 109269 +99
Branches 109170 109269 +99
==========================================
+ Hits 94136 94280 +144
+ Misses 12424 12379 -45
Partials 2610 2610
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
A few PRs ago, we started persisting inbound committed HTLC onions in Channels. These onions were persisted to lay groundwork for reconstructing the
ChannelManager's pending HTLC maps from them in a future version. However, we've since discovered a different direction where we can instead reconstruct these same maps using persistent monitor events, which may mean that we don't need to persist these onions.Since persisting a bunch of onions on every manager write is a big commitment that we're not fully confident in yet, switch it off for now until we either confirm the monitor events direction and can delete all this onion persisting code OR realize that we definitely do need it.