Optimize CI for wolfProvider #400
Open
aidangarske wants to merge 9 commits into
aidangarske
added a commit
to aidangarske/wolfProvider
that referenced
this pull request
May 25, 2026
…ew fix)
Was: every workflow pulled ghcr.io/wolfssl/wolfprovider-test-deps:bookworm, which doesn't exist until upstream master runs the publish workflow. Bootstrap chicken-and-egg.
Now: publish-test-deps-image.yml fires on any branch push (and PRs) and pushes to ghcr.io/<repo-owner>/wolfprovider-test-deps:bookworm. Consumer workflows read from the PR head's owner when on a PR, else the running repo's owner.
Result: a fork PR publishes to the fork's ghcr namespace and pulls from it; master pushes publish to the org's ghcr namespace and pull from it.
Also fixes copilot review feedback from wolfSSL#400 (review)
- Phase B log filename renames broke check-workflow-result.sh's hardcoded log paths (curl-test.log, openvpn-test.log, sssd-test.log, net-snmp-test.log, nginx-test.log, openssh-test.log, tcpdump-test.log, liboauth2-test.log, stunnel-test.log) plus in-step greps in cjose, libcryptsetup, libfido2, libhashkit2, libtss2, opensc, python3-ntp, qt5network5, tnftp, tpm2-tools. Reverted log names back to <app>-test.log; second mode overwrites first.
- libtss2.yml: fix `if $(grep -q ...)` (invalid shell -- command substitution of grep used as the if condition expanded to an empty command). Use `if grep -q ...; then`.
- opensc.yml: fix `TEST_RESULT=$(((grep ...) && echo 0 || echo 1))` (arithmetic expansion `$(( ))` can't contain shell commands). Hoist to a check_opensc_log() function called from both modes.
- stunnel.yml: `grep -c "failed: 0"` prints a match count of 1 on success, but check-workflow-result.sh expects TEST_RESULT==0 for pass. Use `if grep -q ...; then TEST_RESULT=0; else TEST_RESULT=1; fi`. Also mirror tests/logs/results.log to stunnel-test.log so the force-fail check finds the expected file.
- hostap.yml: drop continue-on-error from the normal-mode test step. With it set, the step's exit code was swallowed and normal-mode test failures didn't fail the job.
One-time setup: after this lands, the owner of each fork that opens a PR has to make their ghcr.io/<owner>/wolfprovider-test-deps package public (GitHub UI: Packages -> Package settings -> Change visibility). GitHub's Actions runners can only pull public packages from another namespace.
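The grep-based pass/fail fixes described in this commit can be sketched roughly as follows. This is a minimal illustration, not the workflow's actual code: the temporary log file, its contents, and the `check_log` helper name are assumptions made up for the example. It shows `grep -q` driving the `if` directly, and why `grep -c` is unsuitable as a pass/fail value when 0 means pass.

```shell
#!/bin/sh
# Hypothetical log contents standing in for a real <app>-test.log.
LOG=$(mktemp)
printf 'tests run: 12\nfailed: 0\n' > "$LOG"

# check_log is an illustrative helper (not from the PR): it uses grep's
# exit status directly as the condition instead of wrapping it in $(...).
check_log() {
    if grep -q "failed: 0" "$1"; then
        return 0
    else
        return 1
    fi
}

# Map the helper's status onto the 0 = pass convention.
if check_log "$LOG"; then TEST_RESULT=0; else TEST_RESULT=1; fi

# grep -c prints the number of matching lines, so a passing log yields 1,
# which would read as a failure if assigned to TEST_RESULT.
MATCH_COUNT=$(grep -c "failed: 0" "$LOG")

echo "TEST_RESULT=$TEST_RESULT MATCH_COUNT=$MATCH_COUNT"
rm -f "$LOG"
```

Keeping the check in one function means both test modes share a single definition of "pass", which is the motivation the commit gives for hoisting the opensc check.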
aidangarske
added a commit
to aidangarske/wolfProvider
that referenced
this pull request
May 25, 2026
…vate)
Earlier commits tried to make fork CI work by:
- having publish-test-deps-image.yml push to a per-owner ghcr namespace
(ghcr.io/<owner>/wolfprovider-test-deps)
- having consumer workflows pull from the PR head's owner
- auto-PATCHing the test-deps package to visibility=public
- dropping the `github.repository == 'wolfSSL/wolfProvider'` guard on
the wolfprov-debs ORAS pull in build-wolfprovider.yml
That path only works if the packages can be public, which they can't
(some of the .debs contain commercially-licensed bits). Revert to the
canonical-only behavior:
publish-test-deps-image.yml
- fires only on push to master/main (was '**')
- guards the publish on github.repository == 'wolfSSL/wolfProvider'
- drops the per-owner namespace; always pushes to
ghcr.io/wolfssl/wolfprovider-test-deps
- removes the Mark-package-public step
build-wolfprovider.yml
- restores the github.repository == 'wolfSSL/wolfProvider' guard on
the Login, Download .debs, and Download WIC steps
39 consumer workflows
- container.image reverted from the per-owner expression back to the
literal ghcr.io/wolfssl/wolfprovider-test-deps:bookworm
Practical effect: PR CI and nightly only run on the canonical repo
(or once PR wolfSSL#400 merges, on wolfSSL/wolfProvider's runners). Fork
pushes will skip the wolfprov-deb pull and any container-using job
will fail loud at the image pull -- which is the right signal: those
runs need to happen on the canonical repo.
aidangarske
added a commit
to aidangarske/wolfProvider
that referenced
this pull request
May 25, 2026
…idation) Add pull_request trigger to nightly-osp.yml so PR wolfSSL#400's reviewers can see the dispatcher actually fan all 41 reusable workflows out and the notify job hit Slack. Marked temporary in the file header -- revert this trigger before merging if you don't want the full nightly job set firing on every PR. (For everyday CI, scheduled + workflow_dispatch is the intended shape.) Note: PR runs from forks will still hit the private-package issue for the wolfprov-debs pull (the wolfSSL/wolfProvider repo guard short-circuits the ORAS step on non-canonical repos). The plumbing itself -- dispatch, discover-versions, notify, Slack -- runs regardless and is what this PR-trigger lets you verify end-to-end.
aidangarske
added a commit
to aidangarske/wolfProvider
that referenced
this pull request
May 25, 2026
Adds aidangarske/wolfProvider to the publish workflow's repository allowlist so PR wolfSSL#400's working branch can bootstrap a test-deps image on the fork's ghcr namespace. Pushed image lands at ghcr.io/aidangarske/wolfprovider-test-deps:bookworm. Also adds 'ci-draft-pause' to the branches list (alongside master/ main) so a push to that branch triggers the workflow without needing a separate workflow_dispatch. Consumer workflows continue to pull from ghcr.io/wolfssl/... so this fork-side push is purely for the fork owner to verify the build/push pipeline works end to end before PR merges. After merge, the canonical wolfSSL/wolfProvider master push will publish the authoritative image and consumers will find it. Note: the 'ci-draft-pause' branch entry is TEMPORARY for PR wolfSSL#400. Drop it (and remove aidangarske from the allowlist if desired) once the PR merges.
dgarske
pushed a commit
that referenced
this pull request
May 26, 2026
) Bootstrap PR: introduces the test-deps container image that PR #400's nightly OSP workflows consume.
This is a minimal subset of PR #400 intended to merge first, so the publish workflow fires once on master and the test-deps image lands at ghcr.io/wolfssl/wolfprovider-test-deps:bookworm before the rest of PR #400 merges. Without this, PR #400's OSP container jobs all fail with "manifest unknown" because the image they pull doesn't exist anywhere yet.
Two files only:
- docker/wolfprovider-test-deps/Dockerfile: Single Debian-bookworm image with every apt dep that the OSP integration tests used to install at job time. One apt-get update at build time, zero at job time -- eliminates Debian mirror flake.
- .github/workflows/publish-test-deps-image.yml: Builds the Dockerfile and pushes to ghcr.io/wolfssl/wolfprovider-test-deps:bookworm on push to master/main (path-filtered to docker/wolfprovider-test-deps/**) or workflow_dispatch. Guarded with github.repository == 'wolfSSL/wolfProvider' so forks don't try to push to wolfSSL's namespace.
The OSP workflows themselves, the discover-versions resolver, the ASan/UBSan workflow, and all the matrix/force-fail consolidation land via PR #400 once this is in place.
dgarske
added a commit
that referenced
this pull request
May 26, 2026
ci: bootstrap test-deps Docker image (prep for PR #400)
aidangarske
added a commit
to aidangarske/wolfProvider
that referenced
this pull request
May 26, 2026
PR wolfSSL#402 published ghcr.io/wolfssl/wolfprovider-test-deps:bookworm. This empty commit bumps the head SHA so PR wolfSSL#400's checks rerun against the now-existing image.
5ce6df6 to
91f2549
Compare
82d537b to
e5226fb
Compare
wolfSSL-Fenrir-bot
left a comment
Fenrir Automated Review — PR #400
Scan targets checked: wolfprovider-bugs, wolfprovider-src
No new issues found in the changed files. ✅
Member
Author
Jenkins retest this please
aidangarske
added a commit
to aidangarske/wolfProvider
that referenced
this pull request
May 27, 2026
…n run The Smoke Test workflow ran on PR wolfSSL#400 head commit and concluded as startup_failure with 0 jobs. That's GH Actions failing to validate the workflow before any container spawns. Compared against every other workflow that calls _discover-versions.yml (simple, cmdline, multi-compiler, fips-ready, sanitizers, seed-src), smoke-test.yml is the only one with a workflow-level 'permissions: contents: read' block. The reusable _discover-versions.yml job declares 'permissions: { contents: read, packages: read }' for its oras login ghcr.io step. Workflow-level permissions clamp every job including reusable workflows, so the discover_versions job ended up with strictly fewer permissions than it declared, which trips startup validation. Grant packages:read at the workflow level so the reusable workflow's declared permissions can be satisfied. Keep the explicit block instead of removing it - the other working workflows just rely on the repo default token, but smoke-test.yml should stay explicit since it's the gate everything else waits on.
* Orchestrate the OSP suite via a single Nightly OSP workflow (.github/workflows/nightly-osp.yml) that fans out every per-app workflow (bind9, cjose, curl, debian-package, git-ssh-dr, grpc, hostap, iperf, krb5, libcryptsetup, libeac3, libfido2, libhashkit2, libnice, liboauth2, librelp, libssh2, libtss2, libwebsockets, net-snmp, nginx, openldap, opensc, openssh, openvpn, pam-pkcs11, ppp, python3-ntp, qt5network5, rsync, socat, sscep, sssd, stunnel, systemd, tcpdump, tnftp, tpm2-tools, x11vnc, xmlsec) plus the openssl-version sweep and the static-analysis suite, then aggregates results to Slack.
* Resolve wolfSSL and OpenSSL versions dynamically per nightly run via .github/workflows/_discover-versions.yml so the matrix reflects what actually ships on ghcr.io and what's latest upstream rather than what was hand-bumped here.
* Switch OSP test jobs to the test-deps image ghcr.io/wolfssl/wolfprovider-test-deps:bookworm with all deps pre-installed (built by .github/workflows/publish-test-deps-image.yml from docker/wolfprovider-test-deps/Dockerfile).
* Drop the openssl-3.0.20 -> 3.5.4 source build from the OSP path; OSP suites now use the bookworm system OpenSSL (which is the wolfprov-replace-default .deb on ghcr).
* Add a dedicated Sanitizers workflow that builds wolfssl + wolfprov with -fsanitize=address,undefined (one job) and -fsanitize=thread (separate job -- ASan and TSan can't coexist in one binary), then runs the cmd-tests + wolfprov unit tests under each. Cache openssl-source/install across runs so source-build skips when refs match. WOLFPROV_SKIP_TEST=1 lets the build step skip the internal make test (which needed LD_PRELOAD=libasan and segfaulted dpkg/grep in the build path) and run unit tests as a separate step instead. ASAN_OPTIONS=detect_odr_violation=0 suppresses a known false positive from the provider's static ASN.1 table being linked into both libwolfprov.so and the test binary. For TSan, the unit-test step skips LD_PRELOAD entirely -- libtsan is wired in via DT_NEEDED on the TSan-built test binary, and preloading it into make crashes the non-TSan host process.
* Convert .github/workflows/static-analysis.yml (cppcheck, clang scan-build, Facebook Infer) from a standalone 2 AM cron to workflow_call so it runs in the nightly-osp fan-out alongside the OSP integrations. Single nightly cadence, single Slack summary.
* Smoke test gate (.github/workflows/smoke-test.yml) runs on every push/PR including drafts; other PR-time workflows wait for it via .github/actions/wait-for-smoke.
* PR mode runs smoke + simple + cmd-tests + multi-compiler + fips-ready + codespell + sanitizers. The full OSP matrix and the heavy static analyzers only run nightly / on workflow_dispatch.
* Bump every per-app OSP workflow timeout-minutes to >= 60 so flaky long-tail tests don't trip the previous 15/20/30 minute caps.
* Document the full CI structure in .github/README.md -- three tiers (PR/push, nightly, reusable), per-OSP inventory with the wolfprov surface each one exercises, the WOLFPROV_FORCE_FAIL XOR sanity check, the OSP workflow template, and a failure -> log-section cheat sheet.
* Fix a real ASan global-buffer-overflow caught by the new sanitizer job: src/wp_aes_aead.c was using XMEMCMP(params->key, X, sizeof(X)) to compare a NUL-terminated provider parameter name against a string literal, which overreads the caller's buffer when their key is shorter than the constant (e.g. "tlsivinv" vs "tlsivfixed"). Switch to XSTRCMP for the five AEAD parameter key checks.

This pairs with wolfssl/osp PR wolfSSL#340 which provides the 5.9.1 FIPS patches the per-app workflows reference. Once that merges these workflows will be green end-to-end.
These workflows were apt-get update'ing the host runner on every job, which is slow and intermittently hangs (e.g. the clang-14 build in run 26527356013 timed out after 20m on apt-get). All the packages they install are already in the test-deps container. Add 'container: ghcr.io/wolfssl/wolfprovider-test-deps:bookworm' to each job and drop the apt-get step:
- sanitizers.yml: both ASan+UBSan and TSan jobs - the install set (build-essential autoconf automake libtool pkg-config git curl wget patch m4 gettext) is already baked in.
- static-analysis.yml: cppcheck, scan-build, and infer jobs. cppcheck, clang, clang-tools, and build deps are already baked. Add opam to the image so the infer job can drop its apt step too. Infer itself (~100MB tarball) is still downloaded at job time to keep the image small.
- libnice.yml: drop the redundant apt step entirely - the workflow was already running in the container; build-essential, pkg-config, meson, ninja-build, libglib2.0-dev, libgstreamer1.0-dev, and gstreamer1.0-plugins-base-apps are all in the image. Add the one missing piece (libunwind-dev) to the Dockerfile.
Dockerfile delta: add opam (infer dep) and libunwind-dev (libnice dep). Image rebuilds on push via publish-test-deps-image.yml.
multi-compiler.yml is not converted in this commit. Its matrix needs gcc-9, gcc-10, gcc-13, gcc-14, and clang-12 which are not available in Debian Bookworm; that workflow needs either a separate ubuntu-base container or a matrix reduction.
PR-time multi-compiler was apt-get update'ing the runner before installing gcc-X/clang-X. When the runner's apt cache was stale this hung past the 20m job timeout (e.g. clang-14 in run 26527356013) and cancelled the whole compiler-matrix run. Restrict the PR-time matrix to compilers that ship in the test-deps container (Debian Bookworm: gcc-11, gcc-12, clang-13, clang-14, clang-15) and run inside the container, so the apt-get step goes away entirely. Six matrix entries cover the common compiler bases plus one entry pinned to wolfssl v5.8.0-stable for back-compat. Dropped from PR-time vs prior matrix: gcc-9, gcc-10, gcc-13, gcc-14, clang-12 (not in Bookworm or its backports). To restore that coverage at nightly cadence, add nightly-multi-compiler.yml which runs the FULL original matrix on GitHub-hosted Ubuntu runners (22.04 + latest). The verify-or-install step skips apt-get when the compiler is already on PATH, so most matrix entries don't apt-get at all and the slow path only fires when a runner image change drops a pre-installed compiler. Wired into nightly-osp.yml as 'multi-compiler:' with the matching needs: entry on the Slack notify job.
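The verify-or-install behaviour described above might look something like the sketch below. The `ensure_compiler` helper and the exact apt-get command line are assumptions for illustration; the real step in nightly-multi-compiler.yml may differ. To stay safe to run anywhere, the slow path only prints what it would do rather than installing anything.

```shell
#!/bin/sh
# Sketch of a verify-or-install step. Nothing is installed here: the slow
# path only prints the command it would run (that command is an assumption).

# ensure_compiler NAME: hypothetical helper, not taken from the workflow.
ensure_compiler() {
    cc_name="$1"
    if command -v "$cc_name" >/dev/null 2>&1; then
        # Fast path: the compiler is already on PATH, so apt-get is skipped.
        echo "skip: $cc_name already on PATH"
    else
        # Slow path: only reached when the runner image lacks the compiler.
        echo "would run: apt-get update && apt-get install -y $cc_name"
    fi
}

# 'sh' stands in for a pre-installed compiler; the second name is
# deliberately nonexistent to exercise the slow path.
ensure_compiler sh
ensure_compiler no-such-compiler-for-demo
```

Checking PATH first keeps the common case free of network access, so a stale apt cache can only affect entries whose compiler is genuinely missing.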
resolve-ref.sh pipes curl into jq; image was missing it, multi-compiler hit 'jq: command not found' in run 26529715823.
The test-deps container's default shell is dash (Debian's /bin/sh), not bash. 'source' is a bash builtin - dash has no such command, so the sanitizers step exits 127 with 'source: not found' before the actual test runs (job 78146645114). Add 'shell: bash' to the three steps in sanitizers.yml that source scripts/env-setup (ASan job test + cmd-tests, TSan job test). Other PR-time workflows that use 'source scripts/env-setup' (cmdline, fips-ready) run on the host ubuntu-22.04 runner where bash is the default, so they don't need this fix.
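For context, the failure comes from 'source' not being part of POSIX sh. The commit's fix is to set 'shell: bash' on the affected steps; a possible alternative, shown here purely as an illustration and not what the PR does, is the POSIX '.' command, which both dash and bash accept. The temporary file and the `DEMO_ENV_LOADED` variable below are hypothetical stand-ins for scripts/env-setup.

```shell
#!/bin/sh
# Stand-in for scripts/env-setup; the variable name is hypothetical.
ENV_FILE=$(mktemp)
echo 'DEMO_ENV_LOADED=yes' > "$ENV_FILE"

# '.' is the POSIX spelling of bash's 'source'. It reads the file into the
# current shell, so variables set there remain visible afterwards.
. "$ENV_FILE"

echo "DEMO_ENV_LOADED=$DEMO_ENV_LOADED"
rm -f "$ENV_FILE"
```

Setting 'shell: bash' remains the safer choice if the sourced script itself relies on bash-only syntax, since '.' only changes how the file is loaded, not which shell interprets it.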
1dc06fe to
154312b
Compare
…d-14-dev publish-test-deps-image run 26537414820 failed: libunwind-14-dev : Conflicts: libunwind-dev clang pulls libunwind-14-dev in transitively. If libnice actually needs libunwind it can use libunwind-14-dev, not the unversioned package.
OpenSSL/wolfSSL build errors get logged to scripts/build-release.log by build-wolfprovider.sh, not test-suite.log. Without dumping the build log the workflow just shows 'Build OpenSSL master ... ERROR.' with no detail. Match the sanitizers.yml log-dump pattern.
multi-compiler matrix asks for gcc-11, gcc-12, clang-13, clang-14, clang-15 as explicit binary names. The image only had unversioned 'gcc' (= gcc-12) and 'clang' (= clang-14), so make hit 'gcc-11: not found' for any other matrix entry (run 26538567521 job 78173898262).
Description
Related PRs need to go in first, in this order, and then this one.