Skip to content

feat(ibmcloud): add GitHub Actions runner support for IBM Power and IBM Z#831

Open
deekay2310 wants to merge 17 commits into
redhat-developer:mainfrom
deekay2310:ibmcloud-gh-runners
Open

feat(ibmcloud): add GitHub Actions runner support for IBM Power and IBM Z#831
deekay2310 wants to merge 17 commits into
redhat-developer:mainfrom
deekay2310:ibmcloud-gh-runners

Conversation

@deekay2310

Copy link
Copy Markdown
Contributor

Integrate the existing pkg/integrations/github framework into the IBM Cloud providers so that mapt ibmcloud ibm-power create and mapt ibmcloud ibm-z create can provision VMs that auto-register as ephemeral GitHub Actions self-hosted runners.

Since no official runner binaries exist for ppc64le/s390x, arch-specific setup scripts clone action-runner-image-pz and build the runner from source on the target VM. A new --ghactions-runner-image-repo flag controls which repo is cloned (defaults to deekay2310 fork until the RHEL script merges to github.com/IBM/action-runner-image-pz).

@coderabbitai

coderabbitai Bot commented Jun 9, 2026

Copy link
Copy Markdown

Review Change Stack

📝 Walkthrough

Summary by CodeRabbit

  • New Features
    • Added support for ppc64le and s390x GitHub Actions runner architectures on IBM Power and IBM Z systems.
    • Added --ghactions-runner-image-repo flag to configure custom runner image repositories.
    • Automatic GitHub Actions runner registration token generation when using personal access tokens.
    • Improved IBM Power provisioning with automatic system type selection.
    • Added s390x smoke test workflow.

Walkthrough

Adds GitHub Actions self-hosted runner provisioning for IBM Cloud PowerVS (ppc64le) and Z (s390x) architectures. New arch constants, CLI flags, a GitHub API token generator, architecture-specific Bash setup scripts, cloud-init template extensions, a PowerVS system-pool discovery helper with retry, and an s390x smoke-test workflow are introduced. Runner naming is changed to use project name.

Changes

GitHub Actions Runner Multiarch Support on IBM Cloud

Layer / File(s) Summary
Type definitions and data contracts
pkg/integrations/github/types.go, pkg/integrations/integrations.go
Adds Ppc64le and S390x Arch constants; adds RunnerImageRepo field to GithubRunnerArgs and UserDataValues structs.
CLI parameters, token generation, and architecture mapping
cmd/mapt/cmd/params/params.go, pkg/integrations/github/api.go
Adds --ghactions-runner-image-repo flag with HTTPS validation; implements GenerateRegistrationToken to call the GitHub API using GITHUB_TOKEN when no manual token is supplied; extends GithubRunnerArgs() to auto-generate tokens, validate inputs, and set RunnerImageRepo; maps ppc64le/s390x to their GitHub Actions arch values.
Architecture-specific runner provisioning scripts
pkg/integrations/github/snippet-linux-ppc64le.sh, pkg/integrations/github/snippet-linux-s390x.sh, pkg/integrations/github/ghrunner.go
Adds embedded Bash scripts to provision RHEL 9 (ppc64le, with SSHD watchdog and COS log upload) and Ubuntu 22.04 (s390x) self-hosted runners; introduces archSnippets map and updates GetSetupScriptTemplate() to select architecture-specific snippets on Linux.
IBM PowerVS system pool discovery
pkg/provider/ibmcloud/data/pisystemtypes.go
Introduces GetAvailableSystemTypes, exported types SystemTypeRequirements/SystemTypeResult, and DefaultSystemTypePriority; queries IBM PI datacenter capabilities and returns an ordered, filtered list of candidate system types.
IBM Power ppc64le command and provider integration
cmd/mapt/cmd/ibmcloud/hosts/ibm-power.go, pkg/provider/ibmcloud/action/ibm-power/ibm-power.go, pkg/provider/ibmcloud/action/ibm-power/cloud-config
Forces Ppc64le arch on runner args in the create command; reworks New() to iterate discovered system types with capacity-error retry and partial-stack cleanup; extends piUserData signature and cloud-config template to wire the GitHub runner script and COS credentials.
IBM Power user-data tests
pkg/provider/ibmcloud/action/ibm-power/ibm-power_test.go
Updates three piUserData test calls to pass the new empty ghRunnerScript argument.
IBM Z s390x command and provider integration
cmd/mapt/cmd/ibmcloud/hosts/ibm-z.go, pkg/provider/ibmcloud/action/ibm-z/ibm-z.go, pkg/provider/ibmcloud/action/ibm-z/cloud-config
Forces S390x arch on runner args in the create command; extends buildUserDataInput and izUserData to render and thread the GitHub runner script; adds conditional write_files and runcmd blocks to the cloud-config template.
IBM Z user-data tests
pkg/provider/ibmcloud/action/ibm-z/ibm-z_test.go
Updates three izUserData test calls to pass the new empty ghRunnerScript third argument.
Runner naming fix and s390x smoke-test workflow
pkg/manager/context/context.go, .github/workflows/smoke-test-s390x.yaml
Changes GitHub runner Name from run ID to project name in manageIntegration; adds a workflow_dispatch smoke-test workflow that reports s390x runner architecture and OS identity.

Sequence Diagram(s)

sequenceDiagram
  participant CLI as mapt ibm-power/ibm-z create
  participant params as params.GithubRunnerArgs()
  participant api as github.GenerateRegistrationToken
  participant ghAPI as api.github.com
  participant provider as ibmpower.New / ibmz.New
  participant icdata as GetAvailableSystemTypes
  participant deploy as deploy()
  participant snippet as GetSetupScriptTemplate(archSnippets)
  participant cloudInit as cloud-config template

  CLI->>CLI: set ghRunnerArgs.Arch = Ppc64le | S390x
  CLI->>params: GithubRunnerArgs()
  alt --ghactions-runner-token absent, GITHUB_TOKEN present
    params->>api: GenerateRegistrationToken(pat, repoURL)
    api->>ghAPI: POST /repos/{owner}/{repo}/actions/runners/registration-token
    ghAPI-->>api: 201 {token}
    api-->>params: token
  end
  params-->>CLI: GithubRunnerArgs{Arch, RunnerImageRepo, Token, ...}
  CLI->>provider: New(ContextArgs{GHRunnerArgs})
  provider->>icdata: GetAvailableSystemTypes(preferred)
  icdata-->>provider: SystemTypeResult{Types}
  loop each sysType
    provider->>deploy: deploy(sysType)
    deploy->>snippet: GetSetupScriptTemplate() → arch-specific script
    snippet-->>deploy: ppc64le or s390x Bash snippet
    deploy->>cloudInit: render piUserData/izUserData with ghRunnerScript
    cloudInit-->>deploy: base64 cloud-init user-data
    alt capacity error
      deploy-->>provider: error
      provider->>provider: destroy partial stack, retry next sysType
    end
  end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 16.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly summarizes the main change: integrating GitHub Actions runner support for IBM Power and IBM Z architectures in the IBM Cloud providers.
Description check ✅ Passed The description is directly related to the changeset, explaining the integration of GitHub Actions runners, architecture-specific setup scripts, and the new CLI flag for controlling the runner image repository.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (1)
pkg/provider/ibmcloud/action/ibm-z/ibm-z_test.go (1)

40-55: 💤 Low value

Consider adding a test case for GitHub runner provisioning.

The existing tests verify GitLab runner and otelcol integration but don't cover the new GitHub Actions runner path. A test case calling izUserData(nil, "", " #!/bin/bash\n echo gh") and verifying install-ghrunner.sh appears in the decoded output would increase confidence in the template rendering.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pkg/provider/ibmcloud/action/ibm-z/ibm-z_test.go` around lines 40 - 55, Add a
new unit test to cover the GitHub Actions runner path: create a test (e.g.,
TestIzUserData_githubRunner) that calls izUserData(nil, "", "      #!/bin/bash\n
echo gh") and fails if izUserData returns an error; decode the output with
decodeIzOutput(t, out) and assert the resulting cfg contains
"install-ghrunner.sh" (and optionally that it includes any expected
runcmd/write_files lines). Use the same helpers and style as
TestIzUserData_noRunner so the new test lives alongside it and verifies the
template renders the GitHub runner installer.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@pkg/integrations/github/snippet-linux-ppc64le.sh`:
- Around line 4-7: The RunnerImageRepo value risks executing arbitrary code
because it's embedded into snippet-linux-ppc64le.sh and snippet-linux-s390x.sh
(git clone {{ .RunnerImageRepo }} then bash -c '. scripts/vm.sh ...'); update
cmd/mapt/cmd/params/params.go and pkg/integrations/github/ghrunner.go to enforce
a safe trust model: either restrict the --ghactions-runner-image-repo flag to
admin-only (check caller permissions where flags are parsed/used in params.go),
add validation/allowlist logic in ghrunner.go to validate RunnerImageRepo
against a configured set of allowed hostnames/URLs before embedding into the
snippet, and if neither is possible add clear documentation and runtime warning
logs wherever RunnerImageRepo is accepted (and sanitize inputs to prevent
local-path/ssh/git-protocol abuses); reference the symbols RunnerImageRepo,
cmd/mapt/cmd/params/params.go, pkg/integrations/github/ghrunner.go, and the two
snippet files when making the change.

In `@pkg/integrations/github/snippet-linux-s390x.sh`:
- Around line 4-7: The template embeds an untrusted RunnerImageRepo directly
into shell commands (git clone {{ .RunnerImageRepo }} ... and bash -c '.
scripts/vm.sh ...'), which allows arbitrary repo URLs to be executed; fix by
enforcing validation or admin-only restriction where RunnerImageRepo is set:
implement a validateRunnerImageRepo(url) check for HTTPS scheme, host ==
"github.com" (or other approved host), and an allowlist of specific owner/repo
patterns (reject raw strings, file://, ssh, or IP hosts), call this validation
where the --ghactions-runner-image-repo flag is parsed and refuse/exit on
invalid values, and then continue to render RunnerImageRepo into
pkg/integrations/github/snippet-linux-s390x.sh only after validation (also
consider quoting the variable in the git clone command and running git clone
--depth=1 to limit exposure); alternatively restrict the flag to trusted
administrators and document the requirement for approved repositories.

---

Nitpick comments:
In `@pkg/provider/ibmcloud/action/ibm-z/ibm-z_test.go`:
- Around line 40-55: Add a new unit test to cover the GitHub Actions runner
path: create a test (e.g., TestIzUserData_githubRunner) that calls
izUserData(nil, "", "      #!/bin/bash\n      echo gh") and fails if izUserData
returns an error; decode the output with decodeIzOutput(t, out) and assert the
resulting cfg contains "install-ghrunner.sh" (and optionally that it includes
any expected runcmd/write_files lines). Use the same helpers and style as
TestIzUserData_noRunner so the new test lives alongside it and verifies the
template renders the GitHub runner installer.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Enterprise

Run ID: 0708cbed-1d28-4c34-b06b-76c3e58bcb05

📥 Commits

Reviewing files that changed from the base of the PR and between a2f7a23 and c9a4b05.

📒 Files selected for processing (14)
  • cmd/mapt/cmd/ibmcloud/hosts/ibm-power.go
  • cmd/mapt/cmd/ibmcloud/hosts/ibm-z.go
  • cmd/mapt/cmd/params/params.go
  • pkg/integrations/github/ghrunner.go
  • pkg/integrations/github/snippet-linux-ppc64le.sh
  • pkg/integrations/github/snippet-linux-s390x.sh
  • pkg/integrations/github/types.go
  • pkg/integrations/integrations.go
  • pkg/provider/ibmcloud/action/ibm-power/cloud-config
  • pkg/provider/ibmcloud/action/ibm-power/ibm-power.go
  • pkg/provider/ibmcloud/action/ibm-power/ibm-power_test.go
  • pkg/provider/ibmcloud/action/ibm-z/cloud-config
  • pkg/provider/ibmcloud/action/ibm-z/ibm-z.go
  • pkg/provider/ibmcloud/action/ibm-z/ibm-z_test.go

Comment thread pkg/integrations/github/snippet-linux-ppc64le.sh Outdated
Comment thread pkg/integrations/github/snippet-linux-s390x.sh Outdated
@adrianriobo adrianriobo changed the title feat(ibmcloud): add GitHub Actions runner support for IBM Power and IBM Z WIP: feat(ibmcloud): add GitHub Actions runner support for IBM Power and IBM Z Jun 9, 2026
deekay2310 and others added 12 commits June 9, 2026 15:54
- Quote the URL in snippet git clone commands to prevent shell injection
- Add --depth=1 to limit clone exposure and speed up provisioning
- Validate that only HTTPS URLs are accepted for the runner image repo

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The upstream configure-limits.sh appends duplicate pam_limits.so entries
to system-auth and password-auth, causing sshd to drop connections before
sending its banner. Deduplicate PAM entries and restart sshd after build.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…snippet

Adds a background monitor that logs sshd status every 30s during the
runner build to identify what breaks SSH. After build completion, dumps
full sshd diagnostics (config test, journal, host key perms, crypto
policies, PAM config) and attempts repair.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…nostics

The upstream configure-system.sh runs chmod -R 777 /usr/share which
makes /usr/share/empty.sshd (sshd's privilege separation directory)
world-writable. sshd refuses to start when this directory is not owned
by root or is world-writable. Fix by restoring 755 after the build.

Also adds sshd watchdog logging with COS upload so diagnostics are
accessible even when SSH is broken. COS credentials are passed through
cloud-config template variables.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
On RHEL 9/ppc64le, dotnet installs to /usr/lib64/dotnet via dnf, not
/opt/dotnet. The GH runner is self-contained (uses ./bin/Runner.Listener)
and does not need DOTNET_ROOT. The chown on /opt/dotnet caused cloud-init
to fail after a successful build.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@deekay2310 deekay2310 changed the title WIP: feat(ibmcloud): add GitHub Actions runner support for IBM Power and IBM Z feat(ibmcloud): add GitHub Actions runner support for IBM Power and IBM Z Jun 12, 2026

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🧹 Nitpick comments (2)
pkg/integrations/github/snippet-linux-ppc64le.sh (1)

22-23: ⚡ Quick win

|| true silently masks all build failures, not just flaky tests.

The comment states "Allow build to continue past flaky upstream test failures", but || true suppresses all exit codes from the build, including genuine errors like missing dependencies, syntax errors, or configuration failures. This makes it impossible to distinguish between expected flaky failures and unexpected breakage.

The check at lines 94-97 catches the most critical failure (missing runner binary), but other build issues will go unnoticed until runtime.

💡 Suggested improvement

Option 1: Capture and log the exit code

-bash -c '. scripts/vm.sh rhel 9 minimal --skip-snap-lxd' || true
+bash -c '. scripts/vm.sh rhel 9 minimal --skip-snap-lxd' || {
+    BUILD_EXIT=$?
+    echo "WARNING: vm.sh exited with code $BUILD_EXIT (continuing due to known flaky tests)" >&2
+}

Option 2: Check for specific known-flaky exit codes if possible

-bash -c '. scripts/vm.sh rhel 9 minimal --skip-snap-lxd' || true
+bash -c '. scripts/vm.sh rhel 9 minimal --skip-snap-lxd' || {
+    BUILD_EXIT=$?
+    # Known flaky test exit codes
+    if [[ $BUILD_EXIT -ne 1 && $BUILD_EXIT -ne 2 ]]; then
+        echo "ERROR: vm.sh failed with unexpected exit code $BUILD_EXIT" >&2
+        exit $BUILD_EXIT
+    fi
+    echo "WARNING: vm.sh failed with known-flaky exit code $BUILD_EXIT (continuing)" >&2
+}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pkg/integrations/github/snippet-linux-ppc64le.sh` around lines 22 - 23, The
bash command executing scripts/vm.sh with the unconditional || true operator is
suppressing all exit codes, including genuine failures. Remove the blanket ||
true from the end of the bash command and instead implement targeted error
handling that either captures and logs the exit code before allowing the build
to continue, or specifically checks for known flaky test exit codes while
failing on actual build errors. This ensures that only expected flaky failures
are ignored while real issues like missing dependencies or syntax errors are
still caught and reported.
pkg/integrations/github/snippet-linux-s390x.sh (1)

9-10: ⚡ Quick win

|| true silently masks all build failures, not just flaky tests.

The comment states "Allow build to continue past flaky upstream test failures", but || true suppresses all exit codes from the build, including genuine errors. This makes it impossible to distinguish between expected flaky failures and real configuration or dependency issues.

The check at lines 12-15 catches the most critical failure (missing runner binary), but other build problems will go undetected.

💡 Suggested improvement

Option 1: Capture and log the exit code

-bash -c '. scripts/vm.sh ubuntu 22.04 minimal --skip-snap-lxd' || true
+bash -c '. scripts/vm.sh ubuntu 22.04 minimal --skip-snap-lxd' || {
+    BUILD_EXIT=$?
+    echo "WARNING: vm.sh exited with code $BUILD_EXIT (continuing due to known flaky tests)" >&2
+}

Option 2: Check for specific known-flaky exit codes if possible

-bash -c '. scripts/vm.sh ubuntu 22.04 minimal --skip-snap-lxd' || true
+bash -c '. scripts/vm.sh ubuntu 22.04 minimal --skip-snap-lxd' || {
+    BUILD_EXIT=$?
+    # Known flaky test exit codes
+    if [[ $BUILD_EXIT -ne 1 && $BUILD_EXIT -ne 2 ]]; then
+        echo "ERROR: vm.sh failed with unexpected exit code $BUILD_EXIT" >&2
+        exit $BUILD_EXIT
+    fi
+    echo "WARNING: vm.sh failed with known-flaky exit code $BUILD_EXIT (continuing)" >&2
+}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pkg/integrations/github/snippet-linux-s390x.sh` around lines 9 - 10, The `||
true` operator at the end of the bash command that executes `scripts/vm.sh
ubuntu 22.04 minimal --skip-snap-lxd` unconditionally suppresses all exit codes,
masking both flaky test failures and genuine errors alike. Instead of using `||
true`, capture the exit code from the bash command execution and conditionally
handle it by either logging the exit code for debugging purposes or checking if
it matches specific known-flaky exit codes before allowing the build to
continue, ensuring that real configuration or dependency failures are not
silently masked.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@cmd/mapt/cmd/params/params.go`:
- Around line 293-333: The repoURL validation in the GithubRunnerArgs function
only checks if repoURL is empty when the token is being auto-generated (when
token is empty), but repoURL is required for GitHub Actions runner setup
regardless of whether the token is manually provided or auto-generated. Add an
additional validation check to ensure repoURL is not empty that applies to both
the manual token and auto-generated token paths. This validation should occur
after confirming that at least one of token or pat is provided, ensuring that
users cannot bypass the repoURL requirement by providing a manual token via the
ghactions-runner-token flag.
- Around line 335-340: The validateRunnerImageRepo function currently only
validates that the URL uses HTTPS, which is insufficient since a malicious HTTPS
URL can still be provided and executed as code. Either implement a domain
allowlist approach to restrict repository sources to known trusted domains, or
if an allowlist is too restrictive, add documentation to the flag description
warning that this repository will be cloned and executed with elevated
privileges, and add a runtime warning when the imageRepo flag is set to alert
users about the security implications of using untrusted repositories.

In `@pkg/integrations/github/api.go`:
- Around line 41-44: The code currently uses http.DefaultClient without a
timeout, which can cause indefinite hangs if the GitHub API is unresponsive.
Replace the http.DefaultClient.Do(req) call with a custom http.Client that has a
Timeout field set to an appropriate duration value. Create this client with a
reasonable timeout for GitHub API calls and use it instead of the default
client. Ensure the "time" package is imported at the top of the file.

In `@pkg/provider/ibmcloud/data/pisystempools.go`:
- Around line 112-117: The poolAvailableMemory function is inconsistent with
poolHasCapacity in how it checks capacity sources. While poolHasCapacity uses OR
logic to check three capacity sources (MaxCoresAvailable, MaxMemoryAvailable,
and MaxAvailable), poolAvailableMemory only checks MaxAvailable.Memory. Update
poolAvailableMemory to check all three capacity sources in the same manner as
poolHasCapacity, falling back appropriately through MaxCoresAvailable and
MaxMemoryAvailable when MaxAvailable is unavailable, to ensure accurate headroom
calculation and consistent pool prioritization.

---

Nitpick comments:
In `@pkg/integrations/github/snippet-linux-ppc64le.sh`:
- Around line 22-23: The bash command executing scripts/vm.sh with the
unconditional || true operator is suppressing all exit codes, including genuine
failures. Remove the blanket || true from the end of the bash command and
instead implement targeted error handling that either captures and logs the exit
code before allowing the build to continue, or specifically checks for known
flaky test exit codes while failing on actual build errors. This ensures that
only expected flaky failures are ignored while real issues like missing
dependencies or syntax errors are still caught and reported.

In `@pkg/integrations/github/snippet-linux-s390x.sh`:
- Around line 9-10: The `|| true` operator at the end of the bash command that
executes `scripts/vm.sh ubuntu 22.04 minimal --skip-snap-lxd` unconditionally
suppresses all exit codes, masking both flaky test failures and genuine errors
alike. Instead of using `|| true`, capture the exit code from the bash command
execution and conditionally handle it by either logging the exit code for
debugging purposes or checking if it matches specific known-flaky exit codes
before allowing the build to continue, ensuring that real configuration or
dependency failures are not silently masked.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Enterprise

Run ID: 84f5a3ce-a823-4bd7-aef3-cf2cab08c1c8

📥 Commits

Reviewing files that changed from the base of the PR and between c9a4b05 and 9f02ad6.

📒 Files selected for processing (8)
  • .github/workflows/smoke-test-s390x.yaml
  • cmd/mapt/cmd/params/params.go
  • pkg/integrations/github/api.go
  • pkg/integrations/github/snippet-linux-ppc64le.sh
  • pkg/integrations/github/snippet-linux-s390x.sh
  • pkg/provider/ibmcloud/action/ibm-power/cloud-config
  • pkg/provider/ibmcloud/action/ibm-power/ibm-power.go
  • pkg/provider/ibmcloud/data/pisystempools.go
✅ Files skipped from review due to trivial changes (1)
  • .github/workflows/smoke-test-s390x.yaml
🚧 Files skipped from review as they are similar to previous changes (2)
  • pkg/provider/ibmcloud/action/ibm-power/cloud-config
  • pkg/provider/ibmcloud/action/ibm-power/ibm-power.go

Comment thread cmd/mapt/cmd/params/params.go
Comment thread cmd/mapt/cmd/params/params.go
Comment thread pkg/integrations/github/api.go Outdated
Comment thread pkg/provider/ibmcloud/data/pisystempools.go Outdated
deekay2310 and others added 2 commits June 17, 2026 16:05
The s390x runner snippet referenced /opt/dotnet in chown and environment
setup, but the runner build extracts everything into /opt/runner-cache.
This caused cloud-init to fail with "chown: cannot access '/opt/dotnet'"
under set -euo pipefail, preventing runner registration even though the
binary built successfully.

Mirrors the same fix already applied to the ppc64le snippet in de8fc1f.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@pkg/manager/context/context.go`:
- Line 186: The change to use ProjectName instead of RunID for runner naming
requires all setup scripts to safely handle name reuse across multiple runs of
the same project. Review the four registration scripts: snippet-linux.sh,
snippet-darwin.sh, snippet-linux-ppc64le.sh, and snippet-linux-s390x.sh. Ensure
that either all scripts include the --replace flag in their config.sh calls, or
verify that the --ephemeral flag in the architecture-specific scripts
(snippet-linux-ppc64le.sh and snippet-linux-s390x.sh) is sufficient to prevent
registration failures due to naming conflicts. Add the --replace flag to any
script that lacks it, or add documentation confirming that --ephemeral alone
handles the collision prevention.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Enterprise

Run ID: a3e72b5e-269c-4353-9f7d-7f163c23c5a9

📥 Commits

Reviewing files that changed from the base of the PR and between 9f02ad6 and 28e55f7.

📒 Files selected for processing (5)
  • cmd/mapt/cmd/params/params.go
  • pkg/integrations/github/api.go
  • pkg/integrations/github/snippet-linux-s390x.sh
  • pkg/manager/context/context.go
  • pkg/provider/ibmcloud/data/pisystempools.go
🚧 Files skipped from review as they are similar to previous changes (3)
  • pkg/integrations/github/api.go
  • cmd/mapt/cmd/params/params.go
  • pkg/provider/ibmcloud/data/pisystempools.go

func manageIntegration(c *Context, ca *ContextArgs) error {
if ca.GHRunnerArgs != nil {
ca.GHRunnerArgs.Name = c.RunID()
ca.GHRunnerArgs.Name = c.ProjectName()

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "== Locate runner setup scripts and registration command usage =="
fd -i 'snippet-linux-.*\.sh' pkg/integrations/github --exec sed -n '1,220p' {}

echo
echo "== Find config.sh invocations and name/replace flags =="
rg -n -C3 --type=sh 'config\.sh|--name|--replace|ephemeral' pkg/integrations/github

echo
echo "== Confirm how GHRunnerArgs.Name flows into rendered user-data =="
rg -n -C3 --type=go 'GHRunnerArgs\.Name|UserDataValues\{|Name:' pkg/manager/context pkg/integrations/github pkg/provider/ibmcloud

Repository: redhat-developer/mapt

Length of output: 31595


Add --replace flag to all runner registration scripts or verify ephemeral behavior prevents naming conflicts.

Line 186 changes GitHub runner naming from run-scoped (RunID) to project-scoped (ProjectName). The older setup scripts (snippet-linux.sh, snippet-darwin.sh) use config.sh --replace to handle reuse, but the newer architecture-specific scripts (snippet-linux-ppc64le.sh, snippet-linux-s390x.sh) use only --ephemeral without --replace. Repeated provisioning for the same project will fail registration unless all registration paths safely handle name collisions.

Setup script inconsistency

snippet-linux.sh and snippet-darwin.sh:

./config.sh ... --name "{{ .Name }}" ... --replace ...

snippet-linux-ppc64le.sh and snippet-linux-s390x.sh:

./config.sh ... --ephemeral ... --name "{{ .Name }}" ...

(no --replace)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pkg/manager/context/context.go` at line 186, The change to use ProjectName
instead of RunID for runner naming requires all setup scripts to safely handle
name reuse across multiple runs of the same project. Review the four
registration scripts: snippet-linux.sh, snippet-darwin.sh,
snippet-linux-ppc64le.sh, and snippet-linux-s390x.sh. Ensure that either all
scripts include the --replace flag in their config.sh calls, or verify that the
--ephemeral flag in the architecture-specific scripts (snippet-linux-ppc64le.sh
and snippet-linux-s390x.sh) is sufficient to prevent registration failures due
to naming conflicts. Add the --replace flag to any script that lacks it, or add
documentation confirming that --ephemeral alone handles the collision
prevention.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@pkg/provider/ibmcloud/action/ibm-power/ibm-power.go`:
- Around line 180-184: The error handling for the destroyForRetry call is too
lenient and allows retries to continue even when partial stack cleanup fails,
risking dirty state. Modify the error handling in this block so that if
destroyForRetry returns an error, the function only continues to the next
sysType iteration if the error specifically indicates "no stack exists"; for any
other error, the function should return that error and stop retrying rather than
proceeding with potentially leftover resources.

In `@pkg/provider/ibmcloud/data/pisystemtypes.go`:
- Around line 65-78: The filterAndPrioritize function currently filters system
types to only include those present in both the priority list and the supported
list, which causes it to fail when a zone supports types not in
DefaultSystemTypePriority. After building the filtered list from the priority
list iteration, append any remaining supported system types that were not
included in the priority list before checking if filtered is empty. This ensures
that valid API-supported types are preserved as a fallback, maintaining the
preferred ordering while still accepting all valid candidates.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Enterprise

Run ID: 8428a96e-9095-4043-95ce-333a60cee504

📥 Commits

Reviewing files that changed from the base of the PR and between 28e55f7 and 107ab7f.

📒 Files selected for processing (2)
  • pkg/provider/ibmcloud/action/ibm-power/ibm-power.go
  • pkg/provider/ibmcloud/data/pisystemtypes.go

Comment on lines +180 to +184
if i < len(sysTypes.Types)-1 {
logging.Infof("destroying partial stack before retry...")
if dErr := destroyForRetry(mCtx); dErr != nil {
logging.Warnf("failed to destroy partial stack: %v", dErr)
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🗄️ Data Integrity & Integration | 🟠 Major | ⚡ Quick win

Do not retry after failed partial-stack cleanup.

The next attempt reuses the same stack name/backend with a different sysType; if destroyForRetry failed, the retry can operate on dirty state or leftover resources. Stop here unless the error is positively known to mean “no stack exists.”

Proposed fix
 		if i < len(sysTypes.Types)-1 {
 			logging.Infof("destroying partial stack before retry...")
 			if dErr := destroyForRetry(mCtx); dErr != nil {
-				logging.Warnf("failed to destroy partial stack: %v", dErr)
+				return fmt.Errorf("failed to destroy partial stack before retry: %w", dErr)
 			}
 		}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if i < len(sysTypes.Types)-1 {
logging.Infof("destroying partial stack before retry...")
if dErr := destroyForRetry(mCtx); dErr != nil {
logging.Warnf("failed to destroy partial stack: %v", dErr)
}
if i < len(sysTypes.Types)-1 {
logging.Infof("destroying partial stack before retry...")
if dErr := destroyForRetry(mCtx); dErr != nil {
return fmt.Errorf("failed to destroy partial stack before retry: %w", dErr)
}
}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pkg/provider/ibmcloud/action/ibm-power/ibm-power.go` around lines 180 - 184,
The error handling for the destroyForRetry call is too lenient and allows
retries to continue even when partial stack cleanup fails, risking dirty state.
Modify the error handling in this block so that if destroyForRetry returns an
error, the function only continues to the next sysType iteration if the error
specifically indicates "no stack exists"; for any other error, the function
should return that error and stop retrying rather than proceeding with
potentially leftover resources.

Comment on lines +65 to +78
priority := buildPriorityList(preferred)

var filtered []string
for _, t := range priority {
if slices.Contains(supported, t) {
filtered = append(filtered, t)
}
}

if len(filtered) == 0 {
return nil, fmt.Errorf(
"no system types from priority list %v are supported in zone (supported: %v)",
priority, supported)
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🩺 Stability & Availability | 🟠 Major | ⚡ Quick win

Keep API-supported system types as a fallback.

filterAndPrioritize fails when a zone only supports system types not listed in DefaultSystemTypePriority, even though the API returned valid candidates. Preserve the preferred/default ordering, then append remaining supported types before deciding there is nothing to try.

Proposed fix
 	priority := buildPriorityList(preferred)
 
 	var filtered []string
+	seen := map[string]struct{}{}
 	for _, t := range priority {
 		if slices.Contains(supported, t) {
 			filtered = append(filtered, t)
+			seen[t] = struct{}{}
+		}
+	}
+	for _, t := range supported {
+		if _, ok := seen[t]; ok {
+			continue
 		}
+		filtered = append(filtered, t)
 	}
 
 	if len(filtered) == 0 {
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pkg/provider/ibmcloud/data/pisystemtypes.go` around lines 65 - 78, The
filterAndPrioritize function currently filters system types to only include
those present in both the priority list and the supported list, which causes it
to fail when a zone supports types not in DefaultSystemTypePriority. After
building the filtered list from the priority list iteration, append any
remaining supported system types that were not included in the priority list
before checking if filtered is empty. This ensures that valid API-supported
types are preserved as a fallback, maintaining the preferred ordering while
still accepting all valid candidates.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant