ci: run some Bitcoin Core CI jobs #253
Conversation
|
@ryanofsky can you expand the whitelist with (that's what's causing the new workflows to get ignored, see annotations here: https://github.com/bitcoin-core/libmultiprocess/actions/runs/22963039475) |
|
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers. ReviewsSee the guideline for information on the review process.
If your review is incorrectly listed, please copy-paste |
|
Thanks, the jobs run now, I'll debug them later. |
|
|
||
| - name: Run IPC and miner unit tests | ||
| run: | | ||
| ctest --test-dir build -R "ipc|miner_tests" --output-on-failure --timeout 480 |
There was a problem hiding this comment.
same. Also could use the longer --tests-regex instead of -R to be more self-explanatory?
| sudo apt-get install --no-install-recommends -y ${{ matrix.packages }} | ||
| sudo update-alternatives --install /usr/bin/clang++ clang++ "/usr/bin/clang++-${LLVM_VERSION}" 100 | ||
| sudo update-alternatives --install /usr/bin/clang clang "/usr/bin/clang-${LLVM_VERSION}" 100 | ||
| sudo update-alternatives --install /usr/bin/llvm-symbolizer llvm-symbolizer "/usr/bin/llvm-symbolizer-${LLVM_VERSION}" 100 |
There was a problem hiding this comment.
I wonder if we can split the upstream install script into smaller re-useable bash or python scripts, to be used here?
There was a problem hiding this comment.
I'm not sure about trying to expose bash scripts as a sort of stable CI API, which would essentially give us an undocumented and untested consumer. Inevitably, changes in our repo will break things, and we'll have another loop of having to re-refactor/change, based on requirements here, that nobody in our repository is accounting for when making changes there.
There was a problem hiding this comment.
In general we'll need to wait and see if these Bitcoin Core jobs are maintainable here over the long run. If it becomes more hassle that just opening a draft PR on bitcoin/bitcoin then we might just drop them.
a1e0fb1 to
cb8a9e8
Compare
|
All green. Pushed again to wire up ccache. |
Well, then this is obviously wrong, because it fails to find any of the countless bugs that still exist. It would be good to find at least one of them. Otherwise, this seems pointless? |
I think it makes more sense to try to find the bugs without the new tests. The goal here isn't to catch bugs in newly added deterministic tests, but to catch intermittent bugs in the vanilla (pre-existing) subtree. |
|
I added a third commit for tweaking the frequencies. All three jobs now run the unit and functional tests 200 times. |
|
Test run times:
Unit tests are still running, from previous experience the functional tests took about 1 second each. I'd like to keep the job runtimes under 15 minutes:
Assuming 3 minutes for setup, 4 minutes for unit tests and 4 minutes for functional tests, I'll tweak the numbers. (also forgot to install ccache on macOS) |
|
nice. Let's hope one of them fails 😅 |
|
ok, none failed :( |
|
I'm lowering the unit tests a bit. Also noticed I forgot to run the |
| run: | | ||
| for _ in $(seq 1 "${TSAN_FUNCTIONAL_TEST_RUNS}"); do | ||
| LD_LIBRARY_PATH="depends/${{ steps.host.outputs.host }}/lib" \ | ||
| build/test/functional/test_runner.py --filter='^interface_ipc' --jobs "${PARALLEL}" --timeout-factor=10 |
There was a problem hiding this comment.
there are only two tests, so the --jobs PARALLEL won't make any difference, also this is running sequentially over TSAN_FUNCTIONAL_TEST_RUNS
Maybe the GHA machines just won't trigger the failures here?
Has anyone reproduced any of the failures locally, when running the tests?
There was a problem hiding this comment.
Yeah, I got confused there. You can't use --filter to run tests multiple times, you have to pass the names in multiple times. The latest push should fix this.
(well, the next push...)
Agreed, done. There's a few one-liners that might not worthy of a function (e.g. |
ad75b14 test: scale IPC mining wait timeouts by timeout_factor (Enoch Azariah) e7a918b test: verify IPC error handling for invalid coinbase (Enoch Azariah) 63684d6 test: move make_mining_ctx to ipc_util.py (Enoch Azariah) 4ada575 test: verify createNewBlock wakes promptly when tip advances (Enoch Azariah) Pull request description: This is a follow-up to implement a couple of test improvements discussed in recent IPC PRs and issues. - adds a test to `interface_ipc_mining.py` to verify that `createNewBlock` wakes up immediately when the tip advances, rather than waiting for the cooldown timer to expire (#34184 (comment)). - moves `make_mining_ctx` into `ipc_util.py` so it can be reused across the IPC tests instead of duplicating the setup code (#34422 (comment)). - adds a test case to verify that providing an invalid coinbase to `submitSolution` returns a remote exception instead of crashing the node, closing the loop on the issue reported in #33341. - scales IPC wait timeouts using the test suite's `timeout_factor` to prevent spurious failures in heavily loaded CI environments, capping extended waits to avoid test runner hangs (bitcoin-core/libmultiprocess#253 (comment)). Closes #33341. ACKs for top commit: Sjors: ACK ad75b14 achow101: ACK ad75b14 sedited: ACK ad75b14 Tree-SHA512: 812aa64c03657761f06707e6a15b8b435ab7c75717a6748a919fcbcae317128e18403b0c1bddd4cdad877d286e69db52389633e4012faaa656acc01939091719
The IPC mining tests (interface_ipc_mining.py) currently use hardcoded timeouts (e.g., 1000ms, 60000ms) for operations like waitTipChanged and waiting for block templates. In heavily loaded CI environments, such as those running sanitizers with high parallelism, these hardcoded timeouts can be too short, leading to spurious test failures and brittleness. This commit multiplies these timeout variables by the test suite's global `self.options.timeout_factor`. This ensures that the IPC wait conditions scale appropriately when the test suite is run with a higher timeout factor, making the tests robust against slow execution environments. Addresses CI brittleness observed in bitcoin-core/libmultiprocess#253. Github-Pull: bitcoin#34727 Rebased-From: ad75b14
ryanofsky
left a comment
There was a problem hiding this comment.
Light code review ACK 55a9b55. I looked at all the code here and confirmed it should be safe and the build steps make sense. But did not look closely enough to follow all the internal logic and be very sure there aren't bugs.
Thanks for factoring out the command lines into bash functions! IMO the jobs are much easier to understand now.
I probably would have gone further and moved configuration of the jobs as well to bash in /ci/configs/. And instead of defining lower-level bash hooks like add_llvm_apt_repository, install_apt_packages, install_llvm_alternatives I would have defined higher level-hooks like install_deps. But the current approach seems reasonable too.
As a followup, it might be good if bitcoin core CI could follow the model of this PR and move its low-level command lines into libraries like ci_helpers.sh and bitcoin_core_ci.sh. If it did this we might be able to delete this code entirely from the libmultiprocess repository and just call the bitcoin core definitions.
Alternately, maybe the approach marco originally suggested in #250 (comment) which invokes bitcoin core's github actions could work. I'm not sure if there was a specific reason this PR did not take that approach.
Bitcoin Core already does this. You can call (let's say)
I think there is no hook for |
|
So it sounds like this PR were modified to call bitcoin core's test_run_all.sh or call its github actions, it might lose the ability to:
(1) and (2) do seem pretty useful and not having (3) seems like it could be a major drawback if every libmultiprocess change broke the depends cache and required depends rebuilds, but unclear if that would happen. In any case, it sound like it might be easier to modify bitcoin core's ci to support these things than I initially though. |
|
One advantage of moving towards the Bitcoin Core approach of doing ~everything with bash scripts, is that it's more CI agnostic. |
I think they are all supported already:
Not sure if there is a hook for |
…bf18 ff0eed1bf18 refactor: Use loop variable in type-context.h ff1d8ba172a refactor: Move type-context.h getParams() call closer to use 1dbc59a4aa3 race fix: m_on_cancel called after request finishes 1643d05ba07 test: m_on_cancel called after request finishes f5509a31fcc race fix: getParams() called after request cancel 4a60c39f24a test: getParams() called after request cancel f11ec29ed20 race fix: worker thread destroyed before it is initialized a1d643348f4 test: worker thread destroyed before it is initialized e606fd84a8c Merge bitcoin-core/libmultiprocess#264: ci: reduce nproc multipliers 336023382c4 ci: reduce nproc multipliers b090beb9651 Merge bitcoin-core/libmultiprocess#256: ci: cache gnu32 nix store be8622816da ci: cache gnu32 nix store 975270b619c Merge bitcoin-core/libmultiprocess#263: ci: bump timeout factor to 40 09f10e5a598 ci: bump timeout factor to 40 db8f76ad290 Merge bitcoin-core/libmultiprocess#253: ci: run some Bitcoin Core CI jobs 55a9b557b19 ci: set Bitcoin Core CI test repetition fb0fc84d556 ci: add TSan job with instrumented libc++ 0f29c38725b ci: add Bitcoin Core IPC tests (ASan + macOS) 3f64320315d Merge bitcoin-core/libmultiprocess#262: ci: enable clang-tidy in macOS job, use nullptr cd9f8bdc9f0 Merge bitcoin-core/libmultiprocess#258: log: add socket connected info message and demote destroy logs to debug b5d6258a42f Merge bitcoin-core/libmultiprocess#255: fix: use unsigned char cast and sizeof in LogEscape escape sequence d94688e2c32 Merge bitcoin-core/libmultiprocess#251: Improved CustomBuildField for std::optional in IPC/libmultiprocess a9499fad755 mp: use nullptr with pthread_threadid_np f499e37850f ci: enable clang-tidy in macOS job 98f1352159d log: add socket connected info message and demote destroy logs to debug 554a481ea73 fix: use unsigned char cast and sizeof in LogEscape escape sequence 1977b9f3f65 Use std::forward in CustomBuildField for std::optional to allow move semantics, resolves FIXME 22bec918c97 Merge bitcoin-core/libmultiprocess#247: type-map: Work around LLVM 22 "out of bounds index" error 8a5e3ae6ed2 Merge bitcoin-core/libmultiprocess#242: proxy-types: add CustomHasField hook to map Cap'n Proto values to null C++ values e8d35246918 Merge bitcoin-core/libmultiprocess#246: doc: Bump version 8 > 9 97d877053b6 proxy-types: add CustomHasField hook for nullable decode paths 8c2f10252c9 refactor: add missing includes to mp/type-data.h b1638aceb40 doc: Bump version 8 > 9 f61af487217 type-map: Work around LLVM 22 "out of bounds index" error git-subtree-dir: src/ipc/libmultiprocess git-subtree-split: ff0eed1bf183627c89007e71d631f039bd61bfb5
|
Hopefully this is the last CI headache to resolve, but the bitcoin core linter wants LC_ALL to be set in bash scripts added here: https://github.com/bitcoin/bitcoin/actions/runs/23598079789/job/68720184763?pr=34804 |
|
Fixed in #265. |
…bf18 ff0eed1bf18 refactor: Use loop variable in type-context.h ff1d8ba172a refactor: Move type-context.h getParams() call closer to use 1dbc59a4aa3 race fix: m_on_cancel called after request finishes 1643d05ba07 test: m_on_cancel called after request finishes f5509a31fcc race fix: getParams() called after request cancel 4a60c39f24a test: getParams() called after request cancel f11ec29ed20 race fix: worker thread destroyed before it is initialized a1d643348f4 test: worker thread destroyed before it is initialized e606fd84a8c Merge bitcoin-core/libmultiprocess#264: ci: reduce nproc multipliers 336023382c4 ci: reduce nproc multipliers b090beb9651 Merge bitcoin-core/libmultiprocess#256: ci: cache gnu32 nix store be8622816da ci: cache gnu32 nix store 975270b619c Merge bitcoin-core/libmultiprocess#263: ci: bump timeout factor to 40 09f10e5a598 ci: bump timeout factor to 40 db8f76ad290 Merge bitcoin-core/libmultiprocess#253: ci: run some Bitcoin Core CI jobs 55a9b557b19 ci: set Bitcoin Core CI test repetition fb0fc84d556 ci: add TSan job with instrumented libc++ 0f29c38725b ci: add Bitcoin Core IPC tests (ASan + macOS) 3f64320315d Merge bitcoin-core/libmultiprocess#262: ci: enable clang-tidy in macOS job, use nullptr cd9f8bdc9f0 Merge bitcoin-core/libmultiprocess#258: log: add socket connected info message and demote destroy logs to debug b5d6258a42f Merge bitcoin-core/libmultiprocess#255: fix: use unsigned char cast and sizeof in LogEscape escape sequence d94688e2c32 Merge bitcoin-core/libmultiprocess#251: Improved CustomBuildField for std::optional in IPC/libmultiprocess a9499fad755 mp: use nullptr with pthread_threadid_np f499e37850f ci: enable clang-tidy in macOS job 98f1352159d log: add socket connected info message and demote destroy logs to debug 554a481ea73 fix: use unsigned char cast and sizeof in LogEscape escape sequence 1977b9f3f65 Use std::forward in CustomBuildField for std::optional to allow move semantics, resolves FIXME 22bec918c97 Merge bitcoin-core/libmultiprocess#247: type-map: Work around LLVM 22 "out of bounds index" error 8a5e3ae6ed2 Merge bitcoin-core/libmultiprocess#242: proxy-types: add CustomHasField hook to map Cap'n Proto values to null C++ values e8d35246918 Merge bitcoin-core/libmultiprocess#246: doc: Bump version 8 > 9 97d877053b6 proxy-types: add CustomHasField hook for nullable decode paths 8c2f10252c9 refactor: add missing includes to mp/type-data.h b1638aceb40 doc: Bump version 8 > 9 f61af487217 type-map: Work around LLVM 22 "out of bounds index" error git-subtree-dir: src/ipc/libmultiprocess git-subtree-split: ff0eed1bf183627c89007e71d631f039bd61bfb5
…da8f 70f632bda8f Merge bitcoin-core/libmultiprocess#265: ci: set LC_ALL in shell scripts 8e8e564259a Merge bitcoin-core/libmultiprocess#249: fixes for race conditions on disconnects 05d34cc2ec3 ci: set LC_ALL in shell scripts e606fd84a8c Merge bitcoin-core/libmultiprocess#264: ci: reduce nproc multipliers ff0eed1bf18 refactor: Use loop variable in type-context.h ff1d8ba172a refactor: Move type-context.h getParams() call closer to use 1dbc59a4aa3 race fix: m_on_cancel called after request finishes 1643d05ba07 test: m_on_cancel called after request finishes f5509a31fcc race fix: getParams() called after request cancel 4a60c39f24a test: getParams() called after request cancel f11ec29ed20 race fix: worker thread destroyed before it is initialized a1d643348f4 test: worker thread destroyed before it is initialized 336023382c4 ci: reduce nproc multipliers b090beb9651 Merge bitcoin-core/libmultiprocess#256: ci: cache gnu32 nix store be8622816da ci: cache gnu32 nix store 975270b619c Merge bitcoin-core/libmultiprocess#263: ci: bump timeout factor to 40 09f10e5a598 ci: bump timeout factor to 40 db8f76ad290 Merge bitcoin-core/libmultiprocess#253: ci: run some Bitcoin Core CI jobs 55a9b557b19 ci: set Bitcoin Core CI test repetition fb0fc84d556 ci: add TSan job with instrumented libc++ 0f29c38725b ci: add Bitcoin Core IPC tests (ASan + macOS) 3f64320315d Merge bitcoin-core/libmultiprocess#262: ci: enable clang-tidy in macOS job, use nullptr cd9f8bdc9f0 Merge bitcoin-core/libmultiprocess#258: log: add socket connected info message and demote destroy logs to debug b5d6258a42f Merge bitcoin-core/libmultiprocess#255: fix: use unsigned char cast and sizeof in LogEscape escape sequence d94688e2c32 Merge bitcoin-core/libmultiprocess#251: Improved CustomBuildField for std::optional in IPC/libmultiprocess a9499fad755 mp: use nullptr with pthread_threadid_np f499e37850f ci: enable clang-tidy in macOS job 98f1352159d log: add socket connected info message and demote destroy logs to debug 554a481ea73 fix: use unsigned char cast and sizeof in LogEscape escape sequence 1977b9f3f65 Use std::forward in CustomBuildField for std::optional to allow move semantics, resolves FIXME 22bec918c97 Merge bitcoin-core/libmultiprocess#247: type-map: Work around LLVM 22 "out of bounds index" error 8a5e3ae6ed2 Merge bitcoin-core/libmultiprocess#242: proxy-types: add CustomHasField hook to map Cap'n Proto values to null C++ values e8d35246918 Merge bitcoin-core/libmultiprocess#246: doc: Bump version 8 > 9 97d877053b6 proxy-types: add CustomHasField hook for nullable decode paths 8c2f10252c9 refactor: add missing includes to mp/type-data.h b1638aceb40 doc: Bump version 8 > 9 f61af487217 type-map: Work around LLVM 22 "out of bounds index" error git-subtree-dir: src/ipc/libmultiprocess git-subtree-split: 70f632bda8f80449b6240f98da768206a535a04e
…da8f 70f632bda8f Merge bitcoin-core/libmultiprocess#265: ci: set LC_ALL in shell scripts 8e8e564259a Merge bitcoin-core/libmultiprocess#249: fixes for race conditions on disconnects 05d34cc2ec3 ci: set LC_ALL in shell scripts e606fd84a8c Merge bitcoin-core/libmultiprocess#264: ci: reduce nproc multipliers ff0eed1bf18 refactor: Use loop variable in type-context.h ff1d8ba172a refactor: Move type-context.h getParams() call closer to use 1dbc59a4aa3 race fix: m_on_cancel called after request finishes 1643d05ba07 test: m_on_cancel called after request finishes f5509a31fcc race fix: getParams() called after request cancel 4a60c39f24a test: getParams() called after request cancel f11ec29ed20 race fix: worker thread destroyed before it is initialized a1d643348f4 test: worker thread destroyed before it is initialized 336023382c4 ci: reduce nproc multipliers b090beb9651 Merge bitcoin-core/libmultiprocess#256: ci: cache gnu32 nix store be8622816da ci: cache gnu32 nix store 975270b619c Merge bitcoin-core/libmultiprocess#263: ci: bump timeout factor to 40 09f10e5a598 ci: bump timeout factor to 40 db8f76ad290 Merge bitcoin-core/libmultiprocess#253: ci: run some Bitcoin Core CI jobs 55a9b557b19 ci: set Bitcoin Core CI test repetition fb0fc84d556 ci: add TSan job with instrumented libc++ 0f29c38725b ci: add Bitcoin Core IPC tests (ASan + macOS) 3f64320315d Merge bitcoin-core/libmultiprocess#262: ci: enable clang-tidy in macOS job, use nullptr cd9f8bdc9f0 Merge bitcoin-core/libmultiprocess#258: log: add socket connected info message and demote destroy logs to debug b5d6258a42f Merge bitcoin-core/libmultiprocess#255: fix: use unsigned char cast and sizeof in LogEscape escape sequence d94688e2c32 Merge bitcoin-core/libmultiprocess#251: Improved CustomBuildField for std::optional in IPC/libmultiprocess a9499fad755 mp: use nullptr with pthread_threadid_np f499e37850f ci: enable clang-tidy in macOS job 98f1352159d log: add socket connected info message and demote destroy logs to debug 554a481ea73 fix: use unsigned char cast and sizeof in LogEscape escape sequence 1977b9f3f65 Use std::forward in CustomBuildField for std::optional to allow move semantics, resolves FIXME 22bec918c97 Merge bitcoin-core/libmultiprocess#247: type-map: Work around LLVM 22 "out of bounds index" error 8a5e3ae6ed2 Merge bitcoin-core/libmultiprocess#242: proxy-types: add CustomHasField hook to map Cap'n Proto values to null C++ values e8d35246918 Merge bitcoin-core/libmultiprocess#246: doc: Bump version 8 > 9 97d877053b6 proxy-types: add CustomHasField hook for nullable decode paths 8c2f10252c9 refactor: add missing includes to mp/type-data.h b1638aceb40 doc: Bump version 8 > 9 f61af487217 type-map: Work around LLVM 22 "out of bounds index" error git-subtree-dir: src/ipc/libmultiprocess git-subtree-split: 70f632bda8f80449b6240f98da768206a535a04e
…n disconnects 2478a15 Squashed 'src/ipc/libmultiprocess/' changes from 1868a84451f..70f632bda8f (Ryan Ofsky) Pull request description: Includes: - bitcoin-core/libmultiprocess#246 - bitcoin-core/libmultiprocess#242 - bitcoin-core/libmultiprocess#247 - bitcoin-core/libmultiprocess#251 - bitcoin-core/libmultiprocess#255 - bitcoin-core/libmultiprocess#258 - bitcoin-core/libmultiprocess#262 - bitcoin-core/libmultiprocess#253 - bitcoin-core/libmultiprocess#263 - bitcoin-core/libmultiprocess#256 - bitcoin-core/libmultiprocess#264 - bitcoin-core/libmultiprocess#249 - bitcoin-core/libmultiprocess#265 The main change is bitcoin-core/libmultiprocess#249 which fixes 3 intermittent race conditions detected in bitcoin core CI and antithesis: #34711/#34756, #34777, and #34782. The changes can be verified by running `test/lint/git-subtree-check.sh src/ipc/libmultiprocess` as described in [developer notes](https://github.com/bitcoin/bitcoin/blob/master/doc/developer-notes.md#subtrees) and [lint instructions](https://github.com/bitcoin/bitcoin/tree/master/test/lint#git-subtree-checksh) ACKs for top commit: Sjors: ACK 613a548 ismaelsadeeq: ACK 613a548 Tree-SHA512: d99eebc8b4f45b3c3099298167362cf5e7f3e9e622eef9f17af56388ee5207d77a04b915b2a5a894493e0395aeda70111216f2da0d2a6553f4f6396b3d31a744
Slightly more elaborate version of #252 with ccache and depends caching.