Skip to content

ci: reduce nproc multipliers#264

Merged
ryanofsky merged 1 commit intobitcoin-core:masterfrom
Sjors:2026/03/chill-ci
Mar 26, 2026
Merged

ci: reduce nproc multipliers#264
ryanofsky merged 1 commit intobitcoin-core:masterfrom
Sjors:2026/03/chill-ci

Conversation

@Sjors Sjors marked this pull request as ready for review March 26, 2026 10:57
@DrahtBot
Copy link
Copy Markdown

DrahtBot commented Mar 26, 2026

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Reviews

See the guideline for information on the review process.

Type Reviewers
ACK ryanofsky

If your review is incorrectly listed, please copy-paste <!--meta-tag:bot-skip--> into the comment that the bot should ignore.

@Sjors
Copy link
Copy Markdown
Member Author

Sjors commented Mar 26, 2026

This seems more stable, given that it ran for an hour without failing. But now it takes too long, so the total number of jobs needs to be reduced (we can add a long runner on master later).

Pushed again: https://github.com/bitcoin-core/libmultiprocess/compare/31a4b41ac12d42f9b2112165e175695963844ddb..cb9586c9e8ad65ca962e97030761749bc6196b9e

@Sjors Sjors force-pushed the 2026/03/chill-ci branch from f0d07e7 to cb9586c Compare March 26, 2026 12:03
@ryanofsky
Copy link
Copy Markdown
Collaborator

This seems more stable, given that it ran for an hour without failing. But now it takes too long, so the total number of jobs needs to be reduced (we can add a long runner on master later).

It seems like there is a big disparity in the time it takes to run the interface_ipc.py test, which is 1-3 sec, and the time it takes to interface_ipc_mining.py test, which is around 2 minutes. Ideally, we would not need to reduce the number of times the first test is run just because the second test is slow.

The disparity could also explain the reason why the resource exhaustion doesn't happen right away, because if the second test uses more resources than the first test, the runner will start off executing using equal numbers of both tests and the resource limit won't be hit. But as instances of the first test quickly finish, more copies of the second test will be running and resource usage will gradually increase.

I think a good solution to these problems might be simplify the run_ipc_functional_tests function to just run test_runner.py twice and use a higher number of runs and higher job count for the cheap test and a lower number of runs and lower job count for the expensive test. We could replace functional_test_runs and nproc_multiplier with separate variables for each test to make this configurable, or hardcode values into the function, or use different multiplier variables

@Sjors
Copy link
Copy Markdown
Member Author

Sjors commented Mar 26, 2026

I reduced the number of runs further.

Ideally, we would not need to reduce the number of times the first test is run just because the second test is slow.

I don't think these tests "deserve" equal run time. The mining interface is far more likely to have bugs, so the fact that uses proportionally more runtime seems fine to me.

We could replace functional_test_runs and nproc_multiplier with separate variables for each test to make this configurable

For this PR I'd rather not increase complexity. In general I tried to write this in a way that if Bitcoin Core adds new IPC functional tests, or splits the big mining test, they get picked up, but then the number of variables would have to keep increasing.

@Sjors Sjors force-pushed the 2026/03/chill-ci branch from cb9586c to b47bcec Compare March 26, 2026 12:40
@Sjors Sjors force-pushed the 2026/03/chill-ci branch from b47bcec to 3360233 Compare March 26, 2026 12:41
@ryanofsky
Copy link
Copy Markdown
Collaborator

ryanofsky commented Mar 26, 2026

I don't think these tests "deserve" equal run time.

The current approach seems ok, but I think if the goal of the repeating the tests is to detect race conditions, the faster tests will be more effective at triggering races so it makes sense to run them more times. Similarly, if one test uses less resources than the other, it could make sense to use a higher number of jobs for it so resource usage is maximized.

So I do think running just test runner twice as suggested could help us get more utility out of these tests and not add much complexity, though I'd agree it would not be an ideal solution.

Probably more ideally the test runner would accept a --repeat-for option that automatically repeats tests for a number of seconds or minutes, continuously adding tests to the queue and roughly giving equal runtime to each test so slow/expensive tests do not dominate the queue. (A simple approach might be to add new test to the queue each time an existing test completes, choosing the test that has the least total runtime at that point.)

Copy link
Copy Markdown
Collaborator

@ryanofsky ryanofsky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code review ACK 3360233. Thanks for the fix!

Happy to merge as-is or you can let me know if you want to make more tweaks

@Sjors
Copy link
Copy Markdown
Member Author

Sjors commented Mar 26, 2026

The "run functional test" phases are still slightly longer than I would like, but it's good enough. Assuming TSan passes, go ahead and merge.

@Sjors
Copy link
Copy Markdown
Member Author

Sjors commented Mar 26, 2026

the test runner would accept a --repeat-for option that automatically repeats tests for a number of seconds or minutes

Yeah that would be much better.

@ryanofsky ryanofsky merged commit e606fd8 into bitcoin-core:master Mar 26, 2026
13 checks passed
@Sjors Sjors deleted the 2026/03/chill-ci branch March 26, 2026 13:20
ryanofsky added a commit to ryanofsky/bitcoin that referenced this pull request Mar 26, 2026
…bf18

ff0eed1bf18 refactor: Use loop variable in type-context.h
ff1d8ba172a refactor: Move type-context.h getParams() call closer to use
1dbc59a4aa3 race fix: m_on_cancel called after request finishes
1643d05ba07 test: m_on_cancel called after request finishes
f5509a31fcc race fix: getParams() called after request cancel
4a60c39f24a test: getParams() called after request cancel
f11ec29ed20 race fix: worker thread destroyed before it is initialized
a1d643348f4 test: worker thread destroyed before it is initialized
e606fd84a8c Merge bitcoin-core/libmultiprocess#264: ci: reduce nproc multipliers
336023382c4 ci: reduce nproc multipliers
b090beb9651 Merge bitcoin-core/libmultiprocess#256: ci: cache gnu32 nix store
be8622816da ci: cache gnu32 nix store
975270b619c Merge bitcoin-core/libmultiprocess#263: ci: bump timeout factor to 40
09f10e5a598 ci: bump timeout factor to 40
db8f76ad290 Merge bitcoin-core/libmultiprocess#253: ci: run some Bitcoin Core CI jobs
55a9b557b19 ci: set Bitcoin Core CI test repetition
fb0fc84d556 ci: add TSan job with instrumented libc++
0f29c38725b ci: add Bitcoin Core IPC tests (ASan + macOS)
3f64320315d Merge bitcoin-core/libmultiprocess#262: ci: enable clang-tidy in macOS job, use nullptr
cd9f8bdc9f0 Merge bitcoin-core/libmultiprocess#258: log: add socket connected info message and demote destroy logs to debug
b5d6258a42f Merge bitcoin-core/libmultiprocess#255: fix: use unsigned char cast and sizeof in LogEscape escape sequence
d94688e2c32 Merge bitcoin-core/libmultiprocess#251: Improved CustomBuildField for std::optional in IPC/libmultiprocess
a9499fad755 mp: use nullptr with pthread_threadid_np
f499e37850f ci: enable clang-tidy in macOS job
98f1352159d log: add socket connected info message and demote destroy logs to debug
554a481ea73 fix: use unsigned char cast and sizeof in LogEscape escape sequence
1977b9f3f65 Use std::forward in CustomBuildField for std::optional to allow move semantics, resolves FIXME
22bec918c97 Merge bitcoin-core/libmultiprocess#247: type-map: Work around LLVM 22 "out of bounds index" error
8a5e3ae6ed2 Merge bitcoin-core/libmultiprocess#242: proxy-types: add CustomHasField hook to map Cap'n Proto values to null C++ values
e8d35246918 Merge bitcoin-core/libmultiprocess#246: doc: Bump version 8 > 9
97d877053b6 proxy-types: add CustomHasField hook for nullable decode paths
8c2f10252c9 refactor: add missing includes to mp/type-data.h
b1638aceb40 doc: Bump version 8 > 9
f61af487217 type-map: Work around LLVM 22 "out of bounds index" error

git-subtree-dir: src/ipc/libmultiprocess
git-subtree-split: ff0eed1bf183627c89007e71d631f039bd61bfb5
ryanofsky added a commit to ryanofsky/bitcoin that referenced this pull request Mar 27, 2026
…bf18

ff0eed1bf18 refactor: Use loop variable in type-context.h
ff1d8ba172a refactor: Move type-context.h getParams() call closer to use
1dbc59a4aa3 race fix: m_on_cancel called after request finishes
1643d05ba07 test: m_on_cancel called after request finishes
f5509a31fcc race fix: getParams() called after request cancel
4a60c39f24a test: getParams() called after request cancel
f11ec29ed20 race fix: worker thread destroyed before it is initialized
a1d643348f4 test: worker thread destroyed before it is initialized
e606fd84a8c Merge bitcoin-core/libmultiprocess#264: ci: reduce nproc multipliers
336023382c4 ci: reduce nproc multipliers
b090beb9651 Merge bitcoin-core/libmultiprocess#256: ci: cache gnu32 nix store
be8622816da ci: cache gnu32 nix store
975270b619c Merge bitcoin-core/libmultiprocess#263: ci: bump timeout factor to 40
09f10e5a598 ci: bump timeout factor to 40
db8f76ad290 Merge bitcoin-core/libmultiprocess#253: ci: run some Bitcoin Core CI jobs
55a9b557b19 ci: set Bitcoin Core CI test repetition
fb0fc84d556 ci: add TSan job with instrumented libc++
0f29c38725b ci: add Bitcoin Core IPC tests (ASan + macOS)
3f64320315d Merge bitcoin-core/libmultiprocess#262: ci: enable clang-tidy in macOS job, use nullptr
cd9f8bdc9f0 Merge bitcoin-core/libmultiprocess#258: log: add socket connected info message and demote destroy logs to debug
b5d6258a42f Merge bitcoin-core/libmultiprocess#255: fix: use unsigned char cast and sizeof in LogEscape escape sequence
d94688e2c32 Merge bitcoin-core/libmultiprocess#251: Improved CustomBuildField for std::optional in IPC/libmultiprocess
a9499fad755 mp: use nullptr with pthread_threadid_np
f499e37850f ci: enable clang-tidy in macOS job
98f1352159d log: add socket connected info message and demote destroy logs to debug
554a481ea73 fix: use unsigned char cast and sizeof in LogEscape escape sequence
1977b9f3f65 Use std::forward in CustomBuildField for std::optional to allow move semantics, resolves FIXME
22bec918c97 Merge bitcoin-core/libmultiprocess#247: type-map: Work around LLVM 22 "out of bounds index" error
8a5e3ae6ed2 Merge bitcoin-core/libmultiprocess#242: proxy-types: add CustomHasField hook to map Cap'n Proto values to null C++ values
e8d35246918 Merge bitcoin-core/libmultiprocess#246: doc: Bump version 8 > 9
97d877053b6 proxy-types: add CustomHasField hook for nullable decode paths
8c2f10252c9 refactor: add missing includes to mp/type-data.h
b1638aceb40 doc: Bump version 8 > 9
f61af487217 type-map: Work around LLVM 22 "out of bounds index" error

git-subtree-dir: src/ipc/libmultiprocess
git-subtree-split: ff0eed1bf183627c89007e71d631f039bd61bfb5
ryanofsky added a commit to ryanofsky/bitcoin that referenced this pull request Mar 27, 2026
…da8f

70f632bda8f Merge bitcoin-core/libmultiprocess#265: ci: set LC_ALL in shell scripts
8e8e564259a Merge bitcoin-core/libmultiprocess#249: fixes for race conditions on disconnects
05d34cc2ec3 ci: set LC_ALL in shell scripts
e606fd84a8c Merge bitcoin-core/libmultiprocess#264: ci: reduce nproc multipliers
ff0eed1bf18 refactor: Use loop variable in type-context.h
ff1d8ba172a refactor: Move type-context.h getParams() call closer to use
1dbc59a4aa3 race fix: m_on_cancel called after request finishes
1643d05ba07 test: m_on_cancel called after request finishes
f5509a31fcc race fix: getParams() called after request cancel
4a60c39f24a test: getParams() called after request cancel
f11ec29ed20 race fix: worker thread destroyed before it is initialized
a1d643348f4 test: worker thread destroyed before it is initialized
336023382c4 ci: reduce nproc multipliers
b090beb9651 Merge bitcoin-core/libmultiprocess#256: ci: cache gnu32 nix store
be8622816da ci: cache gnu32 nix store
975270b619c Merge bitcoin-core/libmultiprocess#263: ci: bump timeout factor to 40
09f10e5a598 ci: bump timeout factor to 40
db8f76ad290 Merge bitcoin-core/libmultiprocess#253: ci: run some Bitcoin Core CI jobs
55a9b557b19 ci: set Bitcoin Core CI test repetition
fb0fc84d556 ci: add TSan job with instrumented libc++
0f29c38725b ci: add Bitcoin Core IPC tests (ASan + macOS)
3f64320315d Merge bitcoin-core/libmultiprocess#262: ci: enable clang-tidy in macOS job, use nullptr
cd9f8bdc9f0 Merge bitcoin-core/libmultiprocess#258: log: add socket connected info message and demote destroy logs to debug
b5d6258a42f Merge bitcoin-core/libmultiprocess#255: fix: use unsigned char cast and sizeof in LogEscape escape sequence
d94688e2c32 Merge bitcoin-core/libmultiprocess#251: Improved CustomBuildField for std::optional in IPC/libmultiprocess
a9499fad755 mp: use nullptr with pthread_threadid_np
f499e37850f ci: enable clang-tidy in macOS job
98f1352159d log: add socket connected info message and demote destroy logs to debug
554a481ea73 fix: use unsigned char cast and sizeof in LogEscape escape sequence
1977b9f3f65 Use std::forward in CustomBuildField for std::optional to allow move semantics, resolves FIXME
22bec918c97 Merge bitcoin-core/libmultiprocess#247: type-map: Work around LLVM 22 "out of bounds index" error
8a5e3ae6ed2 Merge bitcoin-core/libmultiprocess#242: proxy-types: add CustomHasField hook to map Cap'n Proto values to null C++ values
e8d35246918 Merge bitcoin-core/libmultiprocess#246: doc: Bump version 8 > 9
97d877053b6 proxy-types: add CustomHasField hook for nullable decode paths
8c2f10252c9 refactor: add missing includes to mp/type-data.h
b1638aceb40 doc: Bump version 8 > 9
f61af487217 type-map: Work around LLVM 22 "out of bounds index" error

git-subtree-dir: src/ipc/libmultiprocess
git-subtree-split: 70f632bda8f80449b6240f98da768206a535a04e
ryanofsky added a commit to ryanofsky/bitcoin that referenced this pull request Mar 30, 2026
…da8f

70f632bda8f Merge bitcoin-core/libmultiprocess#265: ci: set LC_ALL in shell scripts
8e8e564259a Merge bitcoin-core/libmultiprocess#249: fixes for race conditions on disconnects
05d34cc2ec3 ci: set LC_ALL in shell scripts
e606fd84a8c Merge bitcoin-core/libmultiprocess#264: ci: reduce nproc multipliers
ff0eed1bf18 refactor: Use loop variable in type-context.h
ff1d8ba172a refactor: Move type-context.h getParams() call closer to use
1dbc59a4aa3 race fix: m_on_cancel called after request finishes
1643d05ba07 test: m_on_cancel called after request finishes
f5509a31fcc race fix: getParams() called after request cancel
4a60c39f24a test: getParams() called after request cancel
f11ec29ed20 race fix: worker thread destroyed before it is initialized
a1d643348f4 test: worker thread destroyed before it is initialized
336023382c4 ci: reduce nproc multipliers
b090beb9651 Merge bitcoin-core/libmultiprocess#256: ci: cache gnu32 nix store
be8622816da ci: cache gnu32 nix store
975270b619c Merge bitcoin-core/libmultiprocess#263: ci: bump timeout factor to 40
09f10e5a598 ci: bump timeout factor to 40
db8f76ad290 Merge bitcoin-core/libmultiprocess#253: ci: run some Bitcoin Core CI jobs
55a9b557b19 ci: set Bitcoin Core CI test repetition
fb0fc84d556 ci: add TSan job with instrumented libc++
0f29c38725b ci: add Bitcoin Core IPC tests (ASan + macOS)
3f64320315d Merge bitcoin-core/libmultiprocess#262: ci: enable clang-tidy in macOS job, use nullptr
cd9f8bdc9f0 Merge bitcoin-core/libmultiprocess#258: log: add socket connected info message and demote destroy logs to debug
b5d6258a42f Merge bitcoin-core/libmultiprocess#255: fix: use unsigned char cast and sizeof in LogEscape escape sequence
d94688e2c32 Merge bitcoin-core/libmultiprocess#251: Improved CustomBuildField for std::optional in IPC/libmultiprocess
a9499fad755 mp: use nullptr with pthread_threadid_np
f499e37850f ci: enable clang-tidy in macOS job
98f1352159d log: add socket connected info message and demote destroy logs to debug
554a481ea73 fix: use unsigned char cast and sizeof in LogEscape escape sequence
1977b9f3f65 Use std::forward in CustomBuildField for std::optional to allow move semantics, resolves FIXME
22bec918c97 Merge bitcoin-core/libmultiprocess#247: type-map: Work around LLVM 22 "out of bounds index" error
8a5e3ae6ed2 Merge bitcoin-core/libmultiprocess#242: proxy-types: add CustomHasField hook to map Cap'n Proto values to null C++ values
e8d35246918 Merge bitcoin-core/libmultiprocess#246: doc: Bump version 8 > 9
97d877053b6 proxy-types: add CustomHasField hook for nullable decode paths
8c2f10252c9 refactor: add missing includes to mp/type-data.h
b1638aceb40 doc: Bump version 8 > 9
f61af487217 type-map: Work around LLVM 22 "out of bounds index" error

git-subtree-dir: src/ipc/libmultiprocess
git-subtree-split: 70f632bda8f80449b6240f98da768206a535a04e
fanquake added a commit to bitcoin/bitcoin that referenced this pull request Mar 30, 2026
…n disconnects

2478a15 Squashed 'src/ipc/libmultiprocess/' changes from 1868a84451f..70f632bda8f (Ryan Ofsky)

Pull request description:

  Includes:

  - bitcoin-core/libmultiprocess#246
  - bitcoin-core/libmultiprocess#242
  - bitcoin-core/libmultiprocess#247
  - bitcoin-core/libmultiprocess#251
  - bitcoin-core/libmultiprocess#255
  - bitcoin-core/libmultiprocess#258
  - bitcoin-core/libmultiprocess#262
  - bitcoin-core/libmultiprocess#253
  - bitcoin-core/libmultiprocess#263
  - bitcoin-core/libmultiprocess#256
  - bitcoin-core/libmultiprocess#264
  - bitcoin-core/libmultiprocess#249
  - bitcoin-core/libmultiprocess#265

  The main change is bitcoin-core/libmultiprocess#249 which fixes 3 intermittent race conditions detected in bitcoin core CI and antithesis: #34711/#34756, #34777, and #34782.

  The changes can be verified by running `test/lint/git-subtree-check.sh src/ipc/libmultiprocess` as described in [developer notes](https://github.com/bitcoin/bitcoin/blob/master/doc/developer-notes.md#subtrees) and [lint instructions](https://github.com/bitcoin/bitcoin/tree/master/test/lint#git-subtree-checksh)

ACKs for top commit:
  Sjors:
    ACK 613a548
  ismaelsadeeq:
    ACK 613a548

Tree-SHA512: d99eebc8b4f45b3c3099298167362cf5e7f3e9e622eef9f17af56388ee5207d77a04b915b2a5a894493e0395aeda70111216f2da0d2a6553f4f6396b3d31a744
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants