fuzz: remove macros from chanmon_consistency#4571
fuzz: remove macros from chanmon_consistency#4571joostjager wants to merge 11 commits intolightningdevkit:mainfrom
Conversation
|
👋 Thanks for assigning @jkczyz as a reviewer! |
dede551 to
54a3434
Compare
|
Pushed result of experimental automatic commit splitting using a stop hook. |
f370956 to
4e3cce7
Compare
|
Rebased, cleaned up auto-splitted commits, minimized diffs. Currently fuzz CI fails and seems to be a regression. |
4e3cce7 to
66dc37b
Compare
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #4571 +/- ##
=======================================
Coverage 87.16% 87.16%
=======================================
Files 161 161
Lines 109251 109251
Branches 109251 109251
=======================================
+ Hits 95230 95232 +2
+ Misses 11547 11541 -6
- Partials 2474 2478 +4
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
| ); | ||
| for (id, data) in state.pending_monitors.drain(..) { | ||
| self.monitor.chain_monitor.channel_monitor_updated(*chan_id, id).unwrap(); | ||
| if id > state.persisted_monitor_id { |
There was a problem hiding this comment.
Nit: This uses > (strict) while complete_all_pending_monitor_updates at line 1083 uses >=. When watch_channel returns InProgress, persisted_monitor_id is set equal to the pending monitor's id (and persisted_monitor is empty Vec::new()). If this method were ever called on a channel whose initial watch_channel InProgress entry hasn't been completed yet, > would skip updating persisted_monitor, leaving it as an empty vec.
In practice this path isn't reachable today because make_channel always calls complete_all_pending_monitor_updates (with >=) first. But using >= here (matching the other method) would be safer and less surprising for future maintenance.
Same applies to complete_monitor_update at line 1116.
There was a problem hiding this comment.
Leaving as is to keep this PR a strict refactor.
|
Both previously reported issues are still present and unresolved. I've now completed a thorough review of the entire file including:
No new issues found beyond the two already reported. Review SummaryPreviously reported issues (still unresolved):
No new issues found on this re-review pass. |
|
Verified coverage in codecov. It is equal to main. Still reaching the same code paths after the refactor. |
6e05ce1 to
6feccc1
Compare
joostjager
left a comment
There was a problem hiding this comment.
Pair-review with @jkczyz
|
|
||
| struct PaymentTracker { | ||
| pending_payments: [Vec<PaymentId>; 3], | ||
| resolved_payments: [HashMap<PaymentId, Option<PaymentHash>>; 3], |
There was a problem hiding this comment.
The arrays of three, maybe this is indicative of a potential split.
There was a problem hiding this comment.
Added NodePayments struct
| last_htlc_clear_fee: u32, | ||
| } | ||
|
|
||
| impl<'a> std::ops::Deref for HarnessNode<'a> { |
There was a problem hiding this comment.
Was introduced in this PR and now removed?
There was a problem hiding this comment.
It's now left in place, because it saves a lot of .node pass throughs in calls.
| } | ||
|
|
||
| #[derive(Copy, Clone)] | ||
| enum MonitorUpdateSelector { |
There was a problem hiding this comment.
Unnecessary move?
| }, | ||
| 0xfa => { | ||
| bc_link.complete_monitor_updates_for_node(1, &nodes, MonitorUpdateSelector::Last) | ||
| bc_link.complete_monitor_updates_for_node(1, &nodes, MonitorUpdateSelector::Last); |
There was a problem hiding this comment.
Adding semi-colon?
There was a problem hiding this comment.
Generally cleaned up all the match arms for minimal diff
| }, | ||
| 0x5b => { | ||
| harness.payments.send_direct(&harness.nodes, 2, 1, chan_b_id, 100); | ||
| harness.send_direct(2, 1, harness.chan_b_id(), 100); |
There was a problem hiding this comment.
This looks a bit over the top with passing the id obtained from the harness back into it.
There was a problem hiding this comment.
This now moved into the harness
| } | ||
| } | ||
|
|
||
| fn complete_monitor_updates_for_node( |
There was a problem hiding this comment.
This is now on the link, but it is an isolated call to a node. Is there value in this?
There was a problem hiding this comment.
The iteration over channels is actually useful
013356e to
9869de7
Compare
|
@jkczyz, I addressed the points that we identified yesterday, but creating fixups really wasn't possible. Too many (commit) structural changes necessary. I've cleaned up a lot more along the way, along the lines that we already identified. This is the diff since yesterday: https://github.com/lightningdevkit/rust-lightning/compare/a7584c8..9869de76c288bad107079f259500cf99729a3d4b. It is quite a bit, unfortunately. AI is helpful in pointing out the differences. What I did do as an extra validation is take a corpus of pass and failing (under |
|
Rebase conflict is just a formatting thing because #4582 landed. Git cleanly rebases |
Extract the repeated peer-connection and channel-funding setup into small helpers. This leaves the fuzz scenario setup behavior unchanged while making later harness refactors easier to review.
Introduce a small wrapper around each channel manager and its test resources. This keeps node-local state together before moving more operations onto the harness.
Move construction of loggers, keys, monitors, broadcasters, wallets, and fee estimators into node resource setup. This removes ad hoc local closures while preserving the deterministic test inputs used by the fuzzer.
Centralize creation of the three chanmon harness nodes. The fuzzer now initializes the node array through one path, which reduces duplicated setup before the event and payment helpers are split out.
Move persistence, reload, and chain sync state onto each harness node. Keeping serialized managers and heights with the node makes restarts and block updates easier to reason about.
Move the action helpers onto `HarnessNode` methods. Node-local operations now live with the state they mutate, which reduces argument threading through the fuzz loop.
Replace the four directional message vectors with one queue owner. Move per-node queue draining, middle-node routing, and disconnect cleanup into EventQueues so routing behavior lives with the queue state.
Represent each channel pair as a peer link with its channel ids and disconnect state. Link methods now own peer reconnect, disconnect, and monitor-update operations for that channel group.
Move payment bookkeeping into a payment tracker. Sending, resolving, claiming, and stuck-payment assertions now share one state owner instead of borrowing several local maps.
Replace the local test_return macro with a labeled fuzz loop. Keep one invariant check after the loop. Leave harness setup extraction for the next commit.
Collect the chanmon consistency setup, state, and main fuzz flow into a harness. Keep do_test focused on reading fuzz bytes and dispatching actions.
|
🔔 1st Reminder Hey @jkczyz! This PR has been waiting for your review. |
|
🔔 2nd Reminder Hey @jkczyz! This PR has been waiting for your review. |
jkczyz
left a comment
There was a problem hiding this comment.
I went a few rounds with Claude to see if anything could be better organized. Below is what it came up with. The branch is https://github.com/jkczyz/rust-lightning/tree/pr/4571-restructure. Feel free to take anything from there if you'd like.
Summary of changes between pr/4571 and pr/4571-restructure
The original PR is 11 commits; the restructured branch is 14. Same end-state code (modulo small in-commit fixes detailed below), repackaged for easier review.
Net code diff on fuzz/src/chanmon_consistency.rs: +53 / -79 (~26 lines smaller). Every commit builds cleanly under both --cfg fuzzing --cfg secp256k1_fuzz and the same plus --cfg splicing.
Commit-shape changes
Three transitional commits squashed into one
The original PR has three commits that together introduce HarnessNode and adopt it:
- Wrap chanmon nodes in HarnessNode — adds the wrapper struct with three fields plus a
Derefimpl. - Build chanmon node resources — adds
HarnessNode::newplus the rest of the per-node fields (logger, broadcaster, fee_estimator, wallet) and adds parallelArc::clonelocals (monitor_a,keys_manager_a, etc.) that shadow the new fields. - Extract chanmon harness nodes — deletes those parallel locals.
The middle commit introduces a transient parallel-state pattern that the third commit removes. A reviewer reading these in sequence walks past throwaway plumbing twice. They squash cleanly into one commit (Wrap chanmon nodes in HarnessNode) that lands the unified struct in one step. Review surface drops by ~150 lines of intermediate state that no longer needs walking.
Payment-helpers commit split into two
The original Extract chanmon harness payment helpers does two unrelated jobs in one commit:
- Hoists the
process_all_events!macro from inside the0xffarm to the top of the fuzz loop, so it's visible to other arms. - Extracts payment bookkeeping into
PaymentTracker.
These are independent. The hoist is needed because the next commit's payment helpers want to reference the macro from elsewhere; that's the only connection. Splitting them gives reviewers two single-purpose commits:
- Hoist process_all_events out of 0xff arm — relocation only, no semantic change.
- Extract chanmon harness payment tracker — the actual bookkeeping refactor.
The split makes it easy to verify the hoist is purely a relocation (zero semantic change) before reading the larger tracker change.
Build chanmon consistency harness split into three
The original final commit (+1056/-810) does three distinguishable jobs:
- Introduces the
Harness<'a, Out>struct and absorbs the per-node setup logic intoHarness::new. - Converts the
process_msg_events!/process_events!/process_msg_noret!/process_ev_noret!macros into methods onHarness, with extracted helpers (find_destination_node,log_msg_delivery,handle_update_htlcs_event). - Introduces
MppDirectChannelsandMppHopChannelsenums plus theHarness::send_mpp_*methods that consume them.
A reviewer reading +1000 lines as one commit can't easily check whether the macro→method conversion preserves semantics, because it's mixed in with the struct introduction and the MPP enum work. The split:
- Build chanmon harness setup — adds
Harnessstruct andHarness::new.do_testconstructsHarness::new(...)and destructure-moves the fields back into local bindings, so the rest ofdo_testis unchanged. The macros and inline MPP dispatch still live where they were. This commit is a setup-relocation refactor; nothing else moves. - Convert process_msg/process_events to Harness methods — replaces the four message-processing macros with concrete methods on
Harness. Extracts the per-event helpers. Updates dispatch arms to callharness.process_msg_event(...)etc. The_noretnaming is dropped here — methods return()directly. - Introduce Mpp{Direct,Hop}Channels enums — adds the two enums and the methods that take them. Replaces inline MPP dispatch arms with the enum-based calls.
Verified: after the third split commit lands, the file is byte-identical to the original. No semantic change; just three reviewable slices.
In-commit changes (same commit boundaries as original, modified content)
Extract chanmon harness node lifecycle: introduce INITIAL_HTLC_CLEAR_FEERATE constant
The original commit initializes last_htlc_clear_fee with the magic literal 253. The same value is used in FuzzEstimator { ret_val: AtomicU32::new(253) }, but the two are conceptually paired (the manager's perceived initial fee should match what the estimator returns). A named constant INITIAL_HTLC_CLEAR_FEERATE: u32 = 253 makes the relationship visible and the commit's new site self-documenting. Only the commit's new site is updated; the pre-existing FuzzEstimator literal is left alone since it's outside this commit's scope.
Extract chanmon harness node operations: store chan_type on HarnessNode
The original commit's bump_fee_estimate(&mut self, chan_type: ChanType) takes chan_type as a parameter, which means dispatch arms have to pass harness.chan_type (or equivalent) every time. Since chan_type is a run-wide constant — set once from the config byte and never changed — storing it as a field on HarnessNode lets bump_fee_estimate(&mut self) read self.chan_type directly. The dispatch arms simplify to harness.nodes[i].bump_fee_estimate() with no extra argument. Small commit-local refactor; the field follows the existing pattern of other run-wide values stored on the node.
Extract chanmon harness peer links: don't add EventQueues::clear_link here
The original commit introduces clear_link as a method on EventQueues (queue-only operations) but does so inside the peer-links commit because clear_link takes &PeerLink. That places a queue-encapsulation method in a commit nominally about peer-link state. The restructured version inlines the queue-clear logic directly in the peer-link methods that need it (matching (node_a, node_b) against (0,1)/(1,2) and clearing the relevant queue pair). clear_link itself is reintroduced later as a separate consolidation commit (see Move clear_link onto EventQueues below). This separates "introduce PeerLink" from "give EventQueues its own clear_link method."
Extract chanmon harness payment tracker: three small fixes
Three corrections inside the payment-tracker commit:
- Drop the
_noretwrappers. The original commit addssend_noret,send_hop_noret,send_mpp_hop_noret— pure pass-through methods that discard return values.sendreturnsbool(used in end-of-test asserts) and the wrappers exist solely to discard it at match-arm callers. The wrappers can be removed if the few match-arm callers wrap the call with{ payments.send(...); }instead. Naming-leak removed; three methods deleted. - Rename
PaymentTracker.nodes→per_node. The tracker's per-node bookkeeping field is namednodes, which collides with the&[HarnessNode; 3]parameter passed to send methods. Inside method bodies,self.nodes[source_idx].pendingsits next tonodes[source_idx]— two different things spelled the same. Renaming the field toper_noderemoves the collision. - Restore
panic!("Unhandled event: {:?}", event). The original commit drops the formatter from this catch-all panic (becomespanic!("Unhandled event")). For a fuzz harness that's the worst place to lose the variant name in failure output. Restored.
New commits at the tip (consolidations)
Introduce HarnessContext
Run-wide fuzz inputs (out, router, chan_type) are threaded through HarnessNode::new, HarnessNode::reload, and stored separately on Harness. Bundling them into a HarnessContext<'a, Out> type stored on Harness collapses three separate parameters into one (&context) at every call site that consumes them. HarnessNode<'a> is unchanged — HarnessContext is passed as a method parameter, not stored on the node, so the Out generic stays at the method level rather than propagating onto the struct.
Move clear_link onto EventQueues
With EventQueues and PeerLink both defined, clear_link can land where it belongs: as a method on EventQueues (since it's a pure queue operation that the link only identifies which queues to clear). Replaces the inline match-on-(node_a, node_b) blocks (added in the modified peer-links commit) with single-line queues.clear_link(self) calls. Net effect: every commit's diff stays scoped to its own type — the EventQueues commit doesn't introduce PeerLink, the PeerLink commit doesn't introduce queue methods, and a small tip commit consolidates clear_link onto its rightful home once both types exist.
| fee_estimator: Arc<FuzzEstimator>, | ||
| wallet: TestWalletSource, | ||
| persistence_style: ChannelMonitorUpdateStatus, | ||
| serialized_manager: Vec<u8>, |
There was a problem hiding this comment.
Hmm... does serialized_manager belong here? It seems more a property of the test loop rather than of HarnessNode.
| persistence_style: ChannelMonitorUpdateStatus, | ||
| serialized_manager: Vec<u8>, | ||
| height: u32, | ||
| last_htlc_clear_fee: u32, |
There was a problem hiding this comment.
Maybe also for these other fields? Also, it seems persistence_style is only applied once the node is reloaded, though that seems to predate this PR. I wonder if that was intentional.
This PR removes the macro-heavy structure from
chanmon_consistency.rsand rewrites it as explicit Rust code.The main reason is compile time. The macros in this harness slow builds down enough that it becomes very noticeable during iteration. In follow-up force-close fuzzing work, the
chanmon_consistencybuild time increased to around 5 minutes on my machine. That is too expensive for a fuzz target that needs frequent rebuilds.The macros also make the file harder to read and reason about. Replacing them with normal types and methods should make the harness easier to maintain while also reducing the macro expansion cost.