instantout: harden instant out error handling#1143
Conversation
The instant out client indexes peer nonces and signatures by reservation index while signing and finalizing MuSig transactions. A server response with too few peer values can panic loopd before an error is returned. Validate the peer-provided slice counts against the reservation inputs before indexing them.
New instant outs were added to the active swap map before the start action generated the swap preimage and hash. That stores the FSM under the zero hash, so later lookups by the real hash fail and concurrent swaps can overwrite the same zero-hash entry. Insert the active FSM only after initialization reaches the expected state and the real hash is available.
The payment polling action read from the payment status and error channels without checking whether they were closed. A closed error channel yielded a nil error, and context cancellation also passed nil into HandleError, allowing the observer's abort-on-error path to return nil for a failed swap. Convert closed channels, nil channel errors, and cancellation into explicit errors before failing the FSM.
The error cleanup path derived unlock and cancel contexts from the caller context. When the caller context was already canceled, cleanup could fail immediately; the async server cancel also captured a context that was canceled by the parent function's defer. Use independent timeout contexts for local reservation unlocks and the server cancel RPC.
Summary of ChangesHello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request introduces several hardening fixes to the instant out implementation. The changes focus on improving the robustness of the finite state machine (FSM) by addressing edge cases in error handling, validating critical data structures before indexing, and ensuring that resource cleanup tasks are not prematurely aborted by canceled contexts. Highlights
New Features🧠 You can now enable Memory (public preview) to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console. Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request enhances the robustness of the instantout package by adding channel closure checks, validating MuSig2 session and signature lengths, and refactoring mutex locking in the manager. It also ensures that cleanup operations like unlocking reservations use a fresh background context to avoid failures when the parent context is canceled. Review feedback suggested renaming a shadowed context variable in a goroutine to improve code clarity and maintain consistency with other parts of the file.
| ctx, cancel := context.WithTimeout( | ||
| context.Background(), time.Second*30, | ||
| ) | ||
| defer cancel() | ||
| _, cancelErr := f.cfg.InstantOutClient.CancelInstantSwap( | ||
| ctx, &swapserverrpc.CancelInstantSwapRequest{ |
There was a problem hiding this comment.
For consistency with the context handling for UnlockReservation a few lines above (which uses unlockCtx), it would be clearer to use a distinct name for this new context instead of shadowing the parent function's ctx variable. This improves readability and reduces the chance of accidentally using the outer, possibly canceled, context in the future.
| ctx, cancel := context.WithTimeout( | |
| context.Background(), time.Second*30, | |
| ) | |
| defer cancel() | |
| _, cancelErr := f.cfg.InstantOutClient.CancelInstantSwap( | |
| ctx, &swapserverrpc.CancelInstantSwapRequest{ | |
| cancelCtx, cancel := context.WithTimeout( | |
| context.Background(), time.Second*30, | |
| ) | |
| defer cancel() | |
| _, cancelErr := f.cfg.InstantOutClient.CancelInstantSwap( | |
| cancelCtx, &swapserverrpc.CancelInstantSwapRequest{ |
|
Superseded by #1144. |
Summary
This PR applies small hardening fixes for the instant out implementation:
Why
The previous logic could panic on malformed peer MuSig data, keep active swaps under the zero hash, return nil for failed payment stream paths, or skip cleanup when the caller context was already canceled.
Validation
go test ./instantout/...