Skip to content

fuzz: remove macros from chanmon_consistency#4571

Open
joostjager wants to merge 11 commits intolightningdevkit:mainfrom
joostjager:fuzz-no-macros
Open

fuzz: remove macros from chanmon_consistency#4571
joostjager wants to merge 11 commits intolightningdevkit:mainfrom
joostjager:fuzz-no-macros

Conversation

@joostjager
Copy link
Copy Markdown
Contributor

@joostjager joostjager commented Apr 20, 2026

This PR removes the macro-heavy structure from chanmon_consistency.rs and rewrites it as explicit Rust code.

The main reason is compile time. The macros in this harness slow builds down enough that it becomes very noticeable during iteration. In follow-up force-close fuzzing work, the chanmon_consistency build time increased to around 5 minutes on my machine. That is too expensive for a fuzz target that needs frequent rebuilds.

The macros also make the file harder to read and reason about. Replacing them with normal types and methods should make the harness easier to maintain while also reducing the macro expansion cost.

@ldk-reviews-bot
Copy link
Copy Markdown

ldk-reviews-bot commented Apr 20, 2026

👋 Thanks for assigning @jkczyz as a reviewer!
I'll wait for their review and will help manage the review process.
Once they submit their review, I'll check if a second reviewer would be helpful.

@joostjager
Copy link
Copy Markdown
Contributor Author

Pushed result of experimental automatic commit splitting using a stop hook.

@joostjager joostjager force-pushed the fuzz-no-macros branch 6 times, most recently from f370956 to 4e3cce7 Compare April 28, 2026 11:26
@joostjager
Copy link
Copy Markdown
Contributor Author

joostjager commented Apr 28, 2026

Rebased, cleaned up auto-splitted commits, minimized diffs. Currently fuzz CI fails and seems to be a regression.

@codecov
Copy link
Copy Markdown

codecov Bot commented Apr 28, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 87.16%. Comparing base (0c7e6e7) to head (0d340f8).

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #4571   +/-   ##
=======================================
  Coverage   87.16%   87.16%           
=======================================
  Files         161      161           
  Lines      109251   109251           
  Branches   109251   109251           
=======================================
+ Hits        95230    95232    +2     
+ Misses      11547    11541    -6     
- Partials     2474     2478    +4     
Flag Coverage Δ
fuzzing-fake-hashes 31.15% <ø> (+<0.01%) ⬆️
fuzzing-real-hashes 23.01% <ø> (+0.08%) ⬆️
tests 86.22% <ø> (-0.02%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@joostjager joostjager marked this pull request as ready for review April 29, 2026 06:32
Comment thread fuzz/src/chanmon_consistency.rs
Comment thread fuzz/src/chanmon_consistency.rs
);
for (id, data) in state.pending_monitors.drain(..) {
self.monitor.chain_monitor.channel_monitor_updated(*chan_id, id).unwrap();
if id > state.persisted_monitor_id {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: This uses > (strict) while complete_all_pending_monitor_updates at line 1083 uses >=. When watch_channel returns InProgress, persisted_monitor_id is set equal to the pending monitor's id (and persisted_monitor is empty Vec::new()). If this method were ever called on a channel whose initial watch_channel InProgress entry hasn't been completed yet, > would skip updating persisted_monitor, leaving it as an empty vec.

In practice this path isn't reachable today because make_channel always calls complete_all_pending_monitor_updates (with >=) first. But using >= here (matching the other method) would be safer and less surprising for future maintenance.

Same applies to complete_monitor_update at line 1116.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Leaving as is to keep this PR a strict refactor.

@ldk-claude-review-bot
Copy link
Copy Markdown
Collaborator

ldk-claude-review-bot commented Apr 29, 2026

Both previously reported issues are still present and unresolved. I've now completed a thorough review of the entire file including:

  • HarnessNode struct and all methods (new, reload, set_persistence_style, complete_monitor, sync_with_chain_state, bump/reset fee, splice_in/out)
  • EventQueues (take_for_node, push/extend, route_from_middle, clear_link, drain_on_disconnect)
  • PeerLink (disconnect, reconnect, disconnect_for_reload)
  • PaymentTracker (next_payment, send, send_hop, send_mpp_, claim_payment, mark_, assertions)
  • Harness (new, process_msg_events, process_events, settle_all, restart_node, all opcode handlers)
  • make_channel, lock_fundings, connect_peers, build_node_config

No new issues found beyond the two already reported.

Review Summary

Previously reported issues (still unresolved):

  1. fuzz/src/chanmon_consistency.rs:755-757set_persistence_style does not propagate to the live TestPersister::update_ret, silently neutering fuzz coverage for InProgress <-> Completed persistence transitions (opcodes 0x00-0x06).

  2. fuzz/src/chanmon_consistency.rs:767 / 812complete_all_monitor_updates and complete_monitor_update use id > (strict) while complete_all_pending_monitor_updates at line 779 uses id >=. Inconsistency is a latent hazard.

No new issues found on this re-review pass.

@joostjager joostjager removed the request for review from valentinewallace April 29, 2026 06:52
@joostjager
Copy link
Copy Markdown
Contributor Author

joostjager commented Apr 29, 2026

Verified coverage in codecov. It is equal to main. Still reaching the same code paths after the refactor.

@joostjager joostjager force-pushed the fuzz-no-macros branch 2 times, most recently from 6e05ce1 to 6feccc1 Compare April 29, 2026 10:35
Copy link
Copy Markdown
Contributor Author

@joostjager joostjager left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pair-review with @jkczyz

Comment thread fuzz/src/chanmon_consistency.rs
Comment thread fuzz/src/chanmon_consistency.rs Outdated

struct PaymentTracker {
pending_payments: [Vec<PaymentId>; 3],
resolved_payments: [HashMap<PaymentId, Option<PaymentHash>>; 3],
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The arrays of three, maybe this is indicative of a potential split.

Copy link
Copy Markdown
Contributor Author

@joostjager joostjager May 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added NodePayments struct

Comment thread fuzz/src/chanmon_consistency.rs Outdated
last_htlc_clear_fee: u32,
}

impl<'a> std::ops::Deref for HarnessNode<'a> {
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was introduced in this PR and now removed?

Copy link
Copy Markdown
Contributor Author

@joostjager joostjager May 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's now left in place, because it saves a lot of .node pass throughs in calls.

Comment thread fuzz/src/chanmon_consistency.rs Outdated
}

#[derive(Copy, Clone)]
enum MonitorUpdateSelector {
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unnecessary move?

Copy link
Copy Markdown
Contributor Author

@joostjager joostjager May 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved back

Comment thread fuzz/src/chanmon_consistency.rs Outdated
},
0xfa => {
bc_link.complete_monitor_updates_for_node(1, &nodes, MonitorUpdateSelector::Last)
bc_link.complete_monitor_updates_for_node(1, &nodes, MonitorUpdateSelector::Last);
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding semi-colon?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally cleaned up all the match arms for minimal diff

Comment thread fuzz/src/chanmon_consistency.rs Outdated
},
0x5b => {
harness.payments.send_direct(&harness.nodes, 2, 1, chan_b_id, 100);
harness.send_direct(2, 1, harness.chan_b_id(), 100);
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks a bit over the top with passing the id obtained from the harness back into it.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This now moved into the harness

Comment thread fuzz/src/chanmon_consistency.rs Outdated
Comment thread fuzz/src/chanmon_consistency.rs
}
}

fn complete_monitor_updates_for_node(
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is now on the link, but it is an isolated call to a node. Is there value in this?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The iteration over channels is actually useful

Comment thread fuzz/src/chanmon_consistency.rs Outdated
@joostjager joostjager self-assigned this Apr 30, 2026
@joostjager joostjager force-pushed the fuzz-no-macros branch 4 times, most recently from 013356e to 9869de7 Compare May 1, 2026 14:45
@joostjager joostjager requested a review from jkczyz May 1, 2026 14:49
@joostjager
Copy link
Copy Markdown
Contributor Author

joostjager commented May 1, 2026

@jkczyz, I addressed the points that we identified yesterday, but creating fixups really wasn't possible. Too many (commit) structural changes necessary. I've cleaned up a lot more along the way, along the lines that we already identified.

This is the diff since yesterday: https://github.com/lightningdevkit/rust-lightning/compare/a7584c8..9869de76c288bad107079f259500cf99729a3d4b. It is quite a bit, unfortunately. AI is helpful in pointing out the differences.

What I did do as an extra validation is take a corpus of pass and failing (under splicing cfg), and compare to main. It results in the exact same passes and fails, so I am reasonable confident that this huge refactor is correct.

@joostjager
Copy link
Copy Markdown
Contributor Author

Rebase conflict is just a formatting thing because #4582 landed. Git cleanly rebases

joostjager added 11 commits May 1, 2026 20:31
Extract the repeated peer-connection and channel-funding setup into
small helpers. This leaves the fuzz scenario setup behavior unchanged
while making later harness refactors easier to review.
Introduce a small wrapper around each channel manager and its test
resources. This keeps node-local state together before moving more
operations onto the harness.
Move construction of loggers, keys, monitors, broadcasters, wallets,
and fee estimators into node resource setup. This removes ad hoc local
closures while preserving the deterministic test inputs used by the
fuzzer.
Centralize creation of the three chanmon harness nodes. The fuzzer now
initializes the node array through one path, which reduces duplicated
setup before the event and payment helpers are split out.
Move persistence, reload, and chain sync state onto each harness node.
Keeping serialized managers and heights with the node makes restarts and
block updates easier to reason about.
Move the action helpers onto `HarnessNode` methods. Node-local
operations now live with the state they mutate, which reduces argument
threading through the fuzz loop.
Replace the four directional message vectors with one queue owner.
Move per-node queue draining, middle-node routing, and disconnect
cleanup into EventQueues so routing behavior lives with the queue
state.
Represent each channel pair as a peer link with its channel ids and
disconnect state. Link methods now own peer reconnect, disconnect, and
monitor-update operations for that channel group.
Move payment bookkeeping into a payment tracker. Sending, resolving,
claiming, and stuck-payment assertions now share one state owner instead
of borrowing several local maps.
Replace the local test_return macro with a labeled fuzz loop.

Keep one invariant check after the loop.

Leave harness setup extraction for the next commit.
Collect the chanmon consistency setup, state, and main fuzz
flow into a harness.

Keep do_test focused on reading fuzz bytes and dispatching
actions.
@ldk-reviews-bot
Copy link
Copy Markdown

🔔 1st Reminder

Hey @jkczyz! This PR has been waiting for your review.
Please take a look when you have a chance. If you're unable to review, please let us know so we can find another reviewer.

@ldk-reviews-bot
Copy link
Copy Markdown

🔔 2nd Reminder

Hey @jkczyz! This PR has been waiting for your review.
Please take a look when you have a chance. If you're unable to review, please let us know so we can find another reviewer.

Copy link
Copy Markdown
Contributor

@jkczyz jkczyz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I went a few rounds with Claude to see if anything could be better organized. Below is what it came up with. The branch is https://github.com/jkczyz/rust-lightning/tree/pr/4571-restructure. Feel free to take anything from there if you'd like.

Summary of changes between pr/4571 and pr/4571-restructure

The original PR is 11 commits; the restructured branch is 14. Same end-state code (modulo small in-commit fixes detailed below), repackaged for easier review.

Net code diff on fuzz/src/chanmon_consistency.rs: +53 / -79 (~26 lines smaller). Every commit builds cleanly under both --cfg fuzzing --cfg secp256k1_fuzz and the same plus --cfg splicing.

Commit-shape changes

Three transitional commits squashed into one

The original PR has three commits that together introduce HarnessNode and adopt it:

  • Wrap chanmon nodes in HarnessNode — adds the wrapper struct with three fields plus a Deref impl.
  • Build chanmon node resources — adds HarnessNode::new plus the rest of the per-node fields (logger, broadcaster, fee_estimator, wallet) and adds parallel Arc::clone locals (monitor_a, keys_manager_a, etc.) that shadow the new fields.
  • Extract chanmon harness nodes — deletes those parallel locals.

The middle commit introduces a transient parallel-state pattern that the third commit removes. A reviewer reading these in sequence walks past throwaway plumbing twice. They squash cleanly into one commit (Wrap chanmon nodes in HarnessNode) that lands the unified struct in one step. Review surface drops by ~150 lines of intermediate state that no longer needs walking.

Payment-helpers commit split into two

The original Extract chanmon harness payment helpers does two unrelated jobs in one commit:

  1. Hoists the process_all_events! macro from inside the 0xff arm to the top of the fuzz loop, so it's visible to other arms.
  2. Extracts payment bookkeeping into PaymentTracker.

These are independent. The hoist is needed because the next commit's payment helpers want to reference the macro from elsewhere; that's the only connection. Splitting them gives reviewers two single-purpose commits:

  • Hoist process_all_events out of 0xff arm — relocation only, no semantic change.
  • Extract chanmon harness payment tracker — the actual bookkeeping refactor.

The split makes it easy to verify the hoist is purely a relocation (zero semantic change) before reading the larger tracker change.

Build chanmon consistency harness split into three

The original final commit (+1056/-810) does three distinguishable jobs:

  1. Introduces the Harness<'a, Out> struct and absorbs the per-node setup logic into Harness::new.
  2. Converts the process_msg_events! / process_events! / process_msg_noret! / process_ev_noret! macros into methods on Harness, with extracted helpers (find_destination_node, log_msg_delivery, handle_update_htlcs_event).
  3. Introduces MppDirectChannels and MppHopChannels enums plus the Harness::send_mpp_* methods that consume them.

A reviewer reading +1000 lines as one commit can't easily check whether the macro→method conversion preserves semantics, because it's mixed in with the struct introduction and the MPP enum work. The split:

  • Build chanmon harness setup — adds Harness struct and Harness::new. do_test constructs Harness::new(...) and destructure-moves the fields back into local bindings, so the rest of do_test is unchanged. The macros and inline MPP dispatch still live where they were. This commit is a setup-relocation refactor; nothing else moves.
  • Convert process_msg/process_events to Harness methods — replaces the four message-processing macros with concrete methods on Harness. Extracts the per-event helpers. Updates dispatch arms to call harness.process_msg_event(...) etc. The _noret naming is dropped here — methods return () directly.
  • Introduce Mpp{Direct,Hop}Channels enums — adds the two enums and the methods that take them. Replaces inline MPP dispatch arms with the enum-based calls.

Verified: after the third split commit lands, the file is byte-identical to the original. No semantic change; just three reviewable slices.

In-commit changes (same commit boundaries as original, modified content)

Extract chanmon harness node lifecycle: introduce INITIAL_HTLC_CLEAR_FEERATE constant

The original commit initializes last_htlc_clear_fee with the magic literal 253. The same value is used in FuzzEstimator { ret_val: AtomicU32::new(253) }, but the two are conceptually paired (the manager's perceived initial fee should match what the estimator returns). A named constant INITIAL_HTLC_CLEAR_FEERATE: u32 = 253 makes the relationship visible and the commit's new site self-documenting. Only the commit's new site is updated; the pre-existing FuzzEstimator literal is left alone since it's outside this commit's scope.

Extract chanmon harness node operations: store chan_type on HarnessNode

The original commit's bump_fee_estimate(&mut self, chan_type: ChanType) takes chan_type as a parameter, which means dispatch arms have to pass harness.chan_type (or equivalent) every time. Since chan_type is a run-wide constant — set once from the config byte and never changed — storing it as a field on HarnessNode lets bump_fee_estimate(&mut self) read self.chan_type directly. The dispatch arms simplify to harness.nodes[i].bump_fee_estimate() with no extra argument. Small commit-local refactor; the field follows the existing pattern of other run-wide values stored on the node.

Extract chanmon harness peer links: don't add EventQueues::clear_link here

The original commit introduces clear_link as a method on EventQueues (queue-only operations) but does so inside the peer-links commit because clear_link takes &PeerLink. That places a queue-encapsulation method in a commit nominally about peer-link state. The restructured version inlines the queue-clear logic directly in the peer-link methods that need it (matching (node_a, node_b) against (0,1)/(1,2) and clearing the relevant queue pair). clear_link itself is reintroduced later as a separate consolidation commit (see Move clear_link onto EventQueues below). This separates "introduce PeerLink" from "give EventQueues its own clear_link method."

Extract chanmon harness payment tracker: three small fixes

Three corrections inside the payment-tracker commit:

  • Drop the _noret wrappers. The original commit adds send_noret, send_hop_noret, send_mpp_hop_noret — pure pass-through methods that discard return values. send returns bool (used in end-of-test asserts) and the wrappers exist solely to discard it at match-arm callers. The wrappers can be removed if the few match-arm callers wrap the call with { payments.send(...); } instead. Naming-leak removed; three methods deleted.
  • Rename PaymentTracker.nodesper_node. The tracker's per-node bookkeeping field is named nodes, which collides with the &[HarnessNode; 3] parameter passed to send methods. Inside method bodies, self.nodes[source_idx].pending sits next to nodes[source_idx] — two different things spelled the same. Renaming the field to per_node removes the collision.
  • Restore panic!("Unhandled event: {:?}", event). The original commit drops the formatter from this catch-all panic (becomes panic!("Unhandled event")). For a fuzz harness that's the worst place to lose the variant name in failure output. Restored.

New commits at the tip (consolidations)

Introduce HarnessContext

Run-wide fuzz inputs (out, router, chan_type) are threaded through HarnessNode::new, HarnessNode::reload, and stored separately on Harness. Bundling them into a HarnessContext<'a, Out> type stored on Harness collapses three separate parameters into one (&context) at every call site that consumes them. HarnessNode<'a> is unchanged — HarnessContext is passed as a method parameter, not stored on the node, so the Out generic stays at the method level rather than propagating onto the struct.

Move clear_link onto EventQueues

With EventQueues and PeerLink both defined, clear_link can land where it belongs: as a method on EventQueues (since it's a pure queue operation that the link only identifies which queues to clear). Replaces the inline match-on-(node_a, node_b) blocks (added in the modified peer-links commit) with single-line queues.clear_link(self) calls. Net effect: every commit's diff stays scoped to its own type — the EventQueues commit doesn't introduce PeerLink, the PeerLink commit doesn't introduce queue methods, and a small tip commit consolidates clear_link onto its rightful home once both types exist.

fee_estimator: Arc<FuzzEstimator>,
wallet: TestWalletSource,
persistence_style: ChannelMonitorUpdateStatus,
serialized_manager: Vec<u8>,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm... does serialized_manager belong here? It seems more a property of the test loop rather than of HarnessNode.

Comment on lines +955 to +958
persistence_style: ChannelMonitorUpdateStatus,
serialized_manager: Vec<u8>,
height: u32,
last_htlc_clear_fee: u32,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe also for these other fields? Also, it seems persistence_style is only applied once the node is reloaded, though that seems to predate this PR. I wonder if that was intentional.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: No status

Development

Successfully merging this pull request may close these issues.

4 participants