Skip to content

chainreg: use getnetworkinfo to count outbound peers (bitcoind backend)#10686

Open
starius wants to merge 5 commits intolightningnetwork:masterfrom
starius:fix-slow-peerinfo-healthcheck4a
Open

chainreg: use getnetworkinfo to count outbound peers (bitcoind backend)#10686
starius wants to merge 5 commits intolightningnetwork:masterfrom
starius:fix-slow-peerinfo-healthcheck4a

Conversation

@starius
Copy link
Copy Markdown
Collaborator

@starius starius commented Mar 28, 2026

Change Description

Use getnetworkinfo.connections_out for bitcoind outbound peer checks instead of getpeerinfo. This keeps the isolation-safety signal while avoiding heavier per-peer work. This helper is bitcoind-specific, btcd does not currently implement it.

Also made two unrelated cleanups to use structured rpcclient methods instead of RawRequest in the same file.

See #10641 This should partially address it, switching to lighter calls for health-checks. CC @jogc

Fixes #10568

Steps to Test

Verification done in signet:

  1. lncli getinfo works and shows correct block_height.
  2. grep -nE "Health check: chain backend failed after|SRVR: Health check|timed out after" ~/.lnd/logs/bitcoin/signet/lnd.log returns nothing
  3. grep -nE "insufficient number of connected outbound peers|Skipping chain backend peer check" ~/.lnd/logs/bitcoin/signet/lnd.log returns nothing

Also tested loop out and loop in on that node.

Pull Request Checklist

Testing

  • Your PR passes all CI checks.
  • Tests covering the positive and negative (error paths) are included.
  • Bug fixes contain tests triggering the bug to prevent regressions.

Code Style and Documentation

📝 Please see our Contribution Guidelines for further guidance.

@gemini-code-assist
Copy link
Copy Markdown

Summary of Changes

Hello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request improves the efficiency of chain backend health checks when using bitcoind by utilizing a lighter RPC call for monitoring outbound peer connections. Additionally, it refactors existing code to leverage structured RPC methods, reducing reliance on raw JSON parsing.

Highlights

  • Optimized Peer Health Checks: Switched from using 'getpeerinfo' to 'getnetworkinfo.connections_out' for bitcoind outbound peer checks to reduce resource usage.
  • Code Cleanup: Replaced several instances of 'RawRequest' with structured rpcclient methods for improved type safety and maintainability.

🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console.

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request refactors the retrieval of bitcoind network information and ZMQ notifications by replacing raw RPC requests and manual JSON unmarshaling with typed methods from the rpcclient. It also introduces a bitcoind-specific outbound peer check and extracts the peer count warning logic into a shared helper function. I have no feedback to provide.

@jogc
Copy link
Copy Markdown

jogc commented Mar 28, 2026

I don't really know anything about the behavior of getnetworkinfo so I can't say.

Here is the latency data I do have if it's of any use (frequency + seconds):

getnetworkinfo

    3         0.00056 -    0.00065
    5         0.00065 -    0.00074
    0         0.00074 -    0.00083
    0         0.00083 -    0.00092
    0         0.00092 -    0.00101
    1         0.00101 -     0.0011
    1          0.0011 -    0.00119
    0         0.00119 -    0.00128
    3         0.00128 -    0.00137
    0         0.00137 -    0.00146
    1         0.00146 -    0.00154
    0         0.00154 -    0.00163
    1         0.00163 -    0.00172
    0         0.00172 -    0.00181
    0         0.00181 -     0.0019
    0          0.0019 -    0.00199
    0         0.00199 -    0.00208
    0         0.00208 -    0.00217
    0         0.00217 -    0.00226
    1         0.00226 -    0.00235
getpeerinfo

19251         0.00088 -   20.21323
   68        20.21323 -   40.42558
   30        40.42558 -   60.63793
   15        60.63793 -   80.85027
   13        80.85027 -  101.06262
   11       101.06262 -  121.27497
    5       121.27497 -  141.48732
    3       141.48732 -  161.69967
    2       161.69967 -  181.91202
    2       181.91202 -  202.12436
    2       202.12436 -  222.33671
    2       222.33671 -  242.54906
    1       242.54906 -  262.76141
    2       262.76141 -  282.97376
    2       282.97376 -   303.1861
    0        303.1861 -  323.39845
    0       323.39845 -   343.6108
    3        343.6108 -  363.82315
    0       363.82315 -   384.0355
    1        384.0355 -  404.24784
uptime

16109         0.00056 -    0.00984
   12         0.00984 -    0.01912
    4         0.01912 -     0.0284
    0          0.0284 -    0.03768
    0         0.03768 -    0.04696
    0         0.04696 -    0.05624
    0         0.05624 -    0.06552
    0         0.06552 -     0.0748
    0          0.0748 -    0.08408
    0         0.08408 -    0.09336
    0         0.09336 -    0.10264
    0         0.10264 -    0.11192
    0         0.11192 -     0.1212
    0          0.1212 -    0.13048
    0         0.13048 -    0.13976
    0         0.13976 -    0.14904
    0         0.14904 -    0.15832
    0         0.15832 -     0.1676
    0          0.1676 -    0.17688
    1         0.17688 -    0.18616

@jogc
Copy link
Copy Markdown

jogc commented Mar 28, 2026

By the way since you are looking at chainreg/chainregistry.go you might want to check out #6483 as well

@starius
Copy link
Copy Markdown
Collaborator Author

starius commented Mar 28, 2026

@jogc Thanks for sharing the latency data! If I read it correctly, getnetworkinfo is much faster than getpeerinfo, especially in its tail. So the transition makes sense.

By the way since you are looking at chainreg/chainregistry.go you might want to check out #6483 as well

Can you elaborate on that, please? That PR replaced an error with a warning. I just refactored the checking code there, to use a higher level method of rpcclient instead of RawRequest. Do you think that code can be further improved?

@starius starius force-pushed the fix-slow-peerinfo-healthcheck4a branch from d8b3ec9 to bccfaea Compare March 29, 2026 03:23
@jogc
Copy link
Copy Markdown

jogc commented Mar 29, 2026

@jogc Thanks for sharing the latency data! If I read it correctly, getnetworkinfo is much faster than getpeerinfo, especially in its tail. So the transition makes sense.

No. There are not enough requests recorded to say anything regarding getnetworkinfo. I assume you have some other reason to believe getnetworkinfo is faster than getpeerinfo.

By the way since you are looking at chainreg/chainregistry.go you might want to check out #6483 as well

Can you elaborate on that, please? That PR replaced an error with a warning. I just refactored the checking code there, to use a higher level method of rpcclient instead of RawRequest. Do you think that code can be further improved?

Sorry I meant #10568

starius added 5 commits March 31, 2026 16:50
Simplify the code. Use rpcclient's typed GetNetworkInfo call in
getBitcoindHealthCheckCmd instead of a raw JSON-RPC request and
manual JSON decoding.
Simplify the code. Use rpcclient's GetZmqNotifications method instead of a raw
request and manual unmarshalling when validating bitcoind ZMQ subscriptions.
The typed result already parses notification addresses, so the extra per-entry
URL parsing is removed.
Use getnetworkinfo.connections_out for bitcoind outbound peer checks instead
of getpeerinfo. This keeps the isolation-safety signal while avoiding heavier
per-peer work.

This helper is bitcoind-specific, btcd does not currently implement it.
The previous warning text ("unable to subscribe to zmq ... events") suggested
that lnd failed to create the ZMQ connection, when in reality it only means the
configured port differs from what bitcoind reports via getzmqnotifications.
Reword both messages to say "port mismatch" and tell to verify the port.

Fixes lightningnetwork#10568
Document that bitcoind outbound peer health checks now use
getnetworkinfo.connections_out instead of getpeerinfo.

Also mention that texts of zmq port mismatch warnings were fixed.
@starius starius force-pushed the fix-slow-peerinfo-healthcheck4a branch from bccfaea to 4bf9810 Compare March 31, 2026 21:50
@starius
Copy link
Copy Markdown
Collaborator Author

starius commented Mar 31, 2026

@jogc Thanks for sharing the latency data! If I read it correctly, getnetworkinfo is much faster than getpeerinfo, especially in its tail. So the transition makes sense.

No. There are not enough requests recorded to say anything regarding getnetworkinfo. I assume you have some other reason to believe getnetworkinfo is faster than getpeerinfo.

I'm judging from Bitcoin Core code. Latency of getpeerinfo is O(number of peers), while getnetworkinfo is constant. getnetworkinfo is significantly faster under high load.

Here is AI provided table comparing latencies of these two calls based on recent bitcoin core version (0831173c0171de33f95b96324db4041c4799b163).

Aspect getnetworkinfo getpeerinfo
Complexity O(1) O(N) where N = peers
cs_main hold 1×, brief work inside N× acquisitions (once per peer)
m_nodes_mutex 3× for counting only 1× held for full iteration + per-node locks
Per-peer locks None ~9 locks × N peers
Data volume ~20 scalar fields ~40 fields × N peers
Worst-case under contention Waits once for cs_main Waits N times for cs_main + blocks connection mgmt via m_nodes_mutex
Typical latency Sub-millisecond Scales linearly with peer count; seconds under contention

Sorry I meant #10568

I fixed the issue with warnings' texts in another commit.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[bug]: Misleading warning message when lnd encounters mismatched zmq ports

2 participants