Skip to content

MCO-2275: Migrate OCB test cases from openshift-tests-private#6080

Open
ptalgulk01 wants to merge 1 commit into
openshift:mainfrom
ptalgulk01:migrate-mco-ocb
Open

MCO-2275: Migrate OCB test cases from openshift-tests-private#6080
ptalgulk01 wants to merge 1 commit into
openshift:mainfrom
ptalgulk01:migrate-mco-ocb

Conversation

@ptalgulk01
Copy link
Copy Markdown
Contributor

@ptalgulk01 ptalgulk01 commented May 21, 2026

  • Add tests: 77497, 77498, 77576, 77977, 78001, 78196, 79137, 79172, 82536, 83136, 83137, 83139, 83755, 85843, 85980, 87176, 88801
  • Add supporting functions: checkUpdatedLists, SkipIfCompactOrSNO, GetAllApplicableExtensionsToMCPOrFail, AssertAllNonJobPodsToBeReadyWithPollerParams
  • Add MCP methods: SetMaxUnavailable, RemoveMaxUnavailable, GetSortedUpdatedNodes
  • Add MOSC methods: SetRenderedImagePushspec, GetRenderedImagePushspec, IsUsingInternalRegistry
  • Preserve function ordering to match otp3 source

Summary by CodeRabbit

  • Tests
    • Expanded test coverage for machine config pool updates, node rollouts, and image/boot scenarios, including OCB rebuild and image-deletion flows.
    • Added helpers for controlling max-unavailable behavior, detecting registry/use of internal registry, and deriving applicable extensions/packages.
    • Added utilities to skip Compact/SNO tests and to assert non-Job pod readiness with polling.

@openshift-merge-bot
Copy link
Copy Markdown
Contributor

Pipeline controller notification
This repo is configured to use the pipeline controller. Second-stage tests will be triggered either automatically or after lgtm label is added, depending on the repository configuration. The pipeline controller will automatically detect which contexts are required and will utilize /test Prow commands to trigger the second stage.

For optional jobs, comment /test ? to see a list of all defined jobs. To trigger manually all jobs from second stage use /pipeline required command.

This repository is configured in: LGTM mode

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label May 21, 2026
@openshift-ci-robot
Copy link
Copy Markdown
Contributor

openshift-ci-robot commented May 21, 2026

@ptalgulk01: This pull request references MCO-2275 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "5.0.0" version, but no target version was set.

Details

In response to this:

  • Add tests: 77497, 77498, 77576, 77977, 78001, 78196, 79137, 79172, 82536, 83136, 83137, 83139, 83755, 85843, 85980, 87176, 88801
  • Add supporting functions: checkUpdatedLists, SkipIfCompactOrSNO, GetAllApplicableExtensionsToMCPOrFail, AssertAllNonJobPodsToBeReadyWithPollerParams
  • Add MCP methods: SetMaxUnavailable, RemoveMaxUnavailable, GetSortedUpdatedNodes
  • Add MOSC methods: SetRenderedImagePushspec, GetRenderedImagePushspec, IsUsingInternalRegistry
  • Preserve function ordering to match otp3 source

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 21, 2026

Walkthrough

This PR adds test infrastructure for extended-priv MCO tests: MachineConfigPool helpers (maxUnavailable mutation, node update polling, extension filtering), MachineOSConfig helpers (rendered image push spec and registry detection), general test utilities (cluster skipping, node list validation, pod readiness polling), and a large expansion of the OCB disruptive test suite with supporting helpers.

Changes

MCO extended-priv test infrastructure

Layer / File(s) Summary
MachineConfigPool test helpers
test/extended-priv/machineconfigpool.go
SetMaxUnavailable and RemoveMaxUnavailable patch/remove spec.maxUnavailable. GetSortedUpdatedNodes polls until pool nodes transition to updating with optional maxUnavailable enforcement and degraded-MCP aborts. GetAllApplicableExtensionsToMCPOrFail computes FIPS/ARM64 state, determines effective osImageStream, filters AllExtenstions, and returns extensions and aggregated packages.
MachineOSConfig test helpers
test/extended-priv/machineosconfig.go
SetRenderedImagePushspec and GetRenderedImagePushspec manage spec.renderedImagePushSpec. IsUsingInternalRegistry checks status.currentImagePullSpec for the internal registry URL.
MCO utilities
test/extended-priv/mco.go
checkUpdatedLists compares two []*Node lists by equal length and chunked membership (order-insensitive within chunks).
OCB disruptive test suite expansion
test/extended-priv/mco_ocb.go
Adds imports (os, path/filepath, sync, time), numerous new Ginkgo It tests for MOSC/MOSB/OCB scenarios (pull secret inheritance, panic/entitlement cases, maxUnavailable ordering, build/rebuild triggers, OCL boot validation), and unexported helpers for build detection, image deletion (imagestream or skopeo), and post-deletion rebuild verification.
General-purpose test utilities
test/extended-priv/util.go, test/extended-priv/util/pods.go
SkipIfCompactOrSNO skips tests on Compact or SNO clusters. AssertAllNonJobPodsToBeReadyWithPollerParams polls pod readiness excluding Job pods.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Suggested labels

lgtm, approved, verified

Suggested reviewers

  • pablintino
  • dkhater-redhat
🚥 Pre-merge checks | ✅ 7 | ❌ 5

❌ Failed checks (5 warnings)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 43.75% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
Test Structure And Quality ⚠️ Warning Test 83138 tests three unrelated scenarios (wrong pullSecret, pushSecret, builder type) violating single responsibility. 9 assertions lack meaningful failure messages. Split test 83138 into separate tests for each validation scenario. Add failure messages to assertions at lines 609, 730, 1133, 1784, 1802, 1809, 1827, 1848, 1870.
Microshift Test Compatibility ⚠️ Warning Tests 88801 and 85843 use MachineSet API (machine.openshift.io) which is unavailable on MicroShift, but lack [Skipped:MicroShift] or [apigroup:machine.openshift.io] protection tags. Add [apigroup:machine.openshift.io] tag to test names, or add [Skipped:MicroShift] label, or wrap with exutil.IsMicroShiftCluster() check with g.Skip() to exclude these tests from MicroShift CI.
Single Node Openshift (Sno) Test Compatibility ⚠️ Warning Tests 77497, 77498, 79172, 83136, 85980 use CreateCustomMCP requiring worker pool nodes, which don't exist on SNO, causing test failures. Add SkipIfCompactOrSNO(oc.AsAdmin()) to unprotected tests: 77497, 77498, 79172, 83136, 85980. Review tests 77576, 77977, 78001, 78196, 83755 for SNO compatibility.
Ipv6 And Disconnected Network Test Compatibility ⚠️ Warning Tests at lines 31, 62, 115, 154, 176, 224, 451, 503, 574, 749 lack [Skipped:Disconnected] but require external registry access (quay.io) that will fail in disconnected environments. Add [Skipped:Disconnected] to test names or replace hardcoded quay.io references in Containerfiles. Tests calling CreateMachineOSConfigUsingExternalOrInternalRegistry should be marked [Skipped:Disconnected].
✅ Passed checks (7 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly identifies the main objective: migrating OCB test cases from openshift-tests-private, which aligns with the changeset that adds test cases, helper functions, and methods to support OCB testing.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Stable And Deterministic Test Names ✅ Passed All 22 test titles in mco_ocb.go are static, deterministic strings with no dynamic values like pod names, timestamps, UUIDs, or interpolation.
Topology-Aware Scheduling Compatibility ✅ Passed All test additions are topology-aware: tests assuming HA topology use SkipIfCompactOrSNO(), no hardcoded scheduling constraints found, and dynamic maxUnavailable handling respects node counts.
Ote Binary Stdout Contract ✅ Passed All code uses logext logger writing to GinkgoWriter, not stdout. No fmt.Print/klog/log/os.Stdout writes found in process-level code.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@openshift-ci
Copy link
Copy Markdown
Contributor

openshift-ci Bot commented May 21, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: ptalgulk01
Once this PR has been reviewed and has the lgtm label, please assign umohnani8 for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@test/extended-priv/machineosconfig.go`:
- Around line 333-335: The SetRenderedImagePushspec function currently
concatenates the raw rips string into a JSON patch body which breaks when rips
contains quotes or backslashes; update this by JSON-escaping the value before
embedding (e.g., use json.Marshal or construct the patch as a struct/slice and
json.Marshal it) and pass the marshaled patch string or bytes into
MachineOSConfig.Patch instead of raw concatenation; reference the
SetRenderedImagePushspec function and the rips variable and ensure the final
patch contains a valid JSON string value for "/spec/renderedImagePushSpec".

In `@test/extended-priv/mco_ocb.go`:
- Around line 418-429: The goroutine closure captures the loop variable mosc
which gets reassigned each iteration, causing races; to fix, ensure the
goroutine captures a stable reference by passing mosc as a parameter or copying
it to a new local before launching the goroutine (e.g., change the launch to go
func(moscCopy /*type optional*/){ defer g.GinkgoRecover(); defer wg.Done();
ValidateSuccessfulMOSC(moscCopy, nil) }(mosc)); update the closure around
ValidateSuccessfulMOSC, keeping wg.Add(1) and g.GinkgoRecover() as-is.

In `@test/extended-priv/mco.go`:
- Around line 320-328: The loop in function checkUpdatedLists uses "for i := 0;
i < len(l); i += step" which will hang or behave incorrectly when step <= 0; add
a guard at the start of checkUpdatedLists validating that step is > 0 (return
false and log an error via logger.Errorf if not) so the loop cannot run with
zero or negative increments, ensuring safe iteration over l and r.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: a4781355-f7a0-4a91-b6e6-d142ebe844b8

📥 Commits

Reviewing files that changed from the base of the PR and between 9bd4e4f and 56ad443.

📒 Files selected for processing (6)
  • test/extended-priv/machineconfigpool.go
  • test/extended-priv/machineosconfig.go
  • test/extended-priv/mco.go
  • test/extended-priv/mco_ocb.go
  • test/extended-priv/util.go
  • test/extended-priv/util/pods.go

Comment thread test/extended-priv/machineosconfig.go
Comment on lines +418 to +429
mosc, err := CreateMachineOSConfigUsingExternalOrInternalRegistry(oc.AsAdmin(), MachineConfigNamespace, moscName, infraMcp.GetName(), nil)
defer mosc.CleanupAndDelete()
o.Expect(err).NotTo(o.HaveOccurred(), "Error creating the MachineOSConfig resource")
moscList = append(moscList, mosc)

wg.Add(1)
go func() {
defer g.GinkgoRecover()
defer wg.Done()

ValidateSuccessfulMOSC(mosc, nil)
}()
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot May 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify goroutines in loops that capture outer variables directly.
rg -n -C3 'for .*range .*{|\bgo func\(' test/extended-priv/mco_ocb.go

Repository: openshift/machine-config-operator

Length of output: 202


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Show the exact lines around 418-429
nl -ba test/extended-priv/mco_ocb.go | sed -n '400,460p'

# Also find the nearest 'for' loop and subsequent goroutines in that region
rg -n --context 3 'go func\(' test/extended-priv/mco_ocb.go | head -n 50

Repository: openshift/machine-config-operator

Length of output: 120


🏁 Script executed:

#!/bin/bash
set -euo pipefail

nl -ba test/extended-priv/mco_ocb.go | sed -n '380,460p'

echo '--- goroutines in file (first 50 matches) ---'
rg -n --context 3 'go func\(' test/extended-priv/mco_ocb.go | head -n 50

echo '--- potential loop variable closures around MOSC validation ---'
rg -n --context 5 'ValidateSuccessfulMOSC\(mosc' test/extended-priv/mco_ocb.go

Repository: openshift/machine-config-operator

Length of output: 120


🏁 Script executed:

#!/bin/bash
set -euo pipefail

nl -ba test/extended-priv/mco_ocb.go | sed -n '380,460p'

echo '--- goroutines in file (first 50 matches) ---'
rg -n --context 3 'go func\(' test/extended-priv/mco_ocb.go | head -n 50

echo '--- potential loop variable closures around MOSC validation ---'
rg -n --context 5 'ValidateSuccessfulMOSC\(mosc' test/extended-priv/mco_ocb.go

Repository: openshift/machine-config-operator

Length of output: 120


🏁 Script executed:

#!/bin/bash
set -euo pipefail

nl -ba test/extended-priv/mco_ocb.go | sed -n '380,460p'

echo '--- goroutines in file (first 50 matches) ---'
rg -n --context 3 'go func\(' test/extended-priv/mco_ocb.go | head -n 50

echo '--- validate successful mosc goroutine blocks ---'
rg -n --context 6 'ValidateSuccessfulMOSC\(' test/extended-priv/mco_ocb.go

Repository: openshift/machine-config-operator

Length of output: 120


🏁 Script executed:

#!/bin/bash
set -euo pipefail

nl -ba test/extended-priv/mco_ocb.go | sed -n '380,460p'

echo '--- goroutines in file (first 50 matches) ---'
rg -n --context 3 'go func\(' test/extended-priv/mco_ocb.go | head -n 50

echo '--- validate successful mosc goroutine blocks ---'
rg -n --context 6 'ValidateSuccessfulMOSC\(' test/extended-priv/mco_ocb.go

Repository: openshift/machine-config-operator

Length of output: 120


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Print the relevant region without `nl` (not available in this sandbox).
echo "=== File excerpt: lines 400-450 ==="
sed -n '400,450p' test/extended-priv/mco_ocb.go

echo
echo "=== Line numbers for ValidateSuccessfulMOSC occurrences ==="
rg -n 'ValidateSuccessfulMOSC\(' test/extended-priv/mco_ocb.go

echo
echo "=== Line numbers for go func() blocks near ValidateSuccessfulMOSC ==="
rg -n --context 2 'go func\(\)' test/extended-priv/mco_ocb.go

echo
echo "=== Exact snippet around 'ValidateSuccessfulMOSC(mosc' if present ==="
rg -n --context 6 'ValidateSuccessfulMOSC\(mosc' test/extended-priv/mco_ocb.go || true

Repository: openshift/machine-config-operator

Length of output: 9708


Fix goroutine closure capturing mosc in MOSC validation loop

In test/extended-priv/mco_ocb.go, the goroutine calls ValidateSuccessfulMOSC(mosc, nil) while mosc is reassigned each loop iteration, so goroutines may validate the wrong MOSC. Pass mosc into the goroutine as a parameter.

Proposed fix
-			wg.Add(1)
-			go func() {
+			moscForValidation := mosc
+			wg.Add(1)
+			go func(m *MachineOSConfig) {
 				defer g.GinkgoRecover()
 				defer wg.Done()
 
-				ValidateSuccessfulMOSC(mosc, nil)
-			}()
+				ValidateSuccessfulMOSC(m, nil)
+			}(moscForValidation)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
mosc, err := CreateMachineOSConfigUsingExternalOrInternalRegistry(oc.AsAdmin(), MachineConfigNamespace, moscName, infraMcp.GetName(), nil)
defer mosc.CleanupAndDelete()
o.Expect(err).NotTo(o.HaveOccurred(), "Error creating the MachineOSConfig resource")
moscList = append(moscList, mosc)
wg.Add(1)
go func() {
defer g.GinkgoRecover()
defer wg.Done()
ValidateSuccessfulMOSC(mosc, nil)
}()
mosc, err := CreateMachineOSConfigUsingExternalOrInternalRegistry(oc.AsAdmin(), MachineConfigNamespace, moscName, infraMcp.GetName(), nil)
defer mosc.CleanupAndDelete()
o.Expect(err).NotTo(o.HaveOccurred(), "Error creating the MachineOSConfig resource")
moscList = append(moscList, mosc)
moscForValidation := mosc
wg.Add(1)
go func(m *MachineOSConfig) {
defer g.GinkgoRecover()
defer wg.Done()
ValidateSuccessfulMOSC(m, nil)
}(moscForValidation)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@test/extended-priv/mco_ocb.go` around lines 418 - 429, The goroutine closure
captures the loop variable mosc which gets reassigned each iteration, causing
races; to fix, ensure the goroutine captures a stable reference by passing mosc
as a parameter or copying it to a new local before launching the goroutine
(e.g., change the launch to go func(moscCopy /*type optional*/){ defer
g.GinkgoRecover(); defer wg.Done(); ValidateSuccessfulMOSC(moscCopy, nil)
}(mosc)); update the closure around ValidateSuccessfulMOSC, keeping wg.Add(1)
and g.GinkgoRecover() as-is.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wont this change the functionality or usage?

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ptalgulk01 — Good catch questioning this. The fix would not change functionality, but more importantly, the original code is actually correct — my review comment was a false positive.

Here's why:

The mosc variable is declared inside the loop body with :=:

for _, infraMcp := range mcpList {
    moscName := infraMcp.GetName()
    mosc, err := CreateMachineOSConfigUsingExternalOrInternalRegistry(...)  // <-- new variable each iteration
    ...
    go func() {
        ValidateSuccessfulMOSC(mosc, nil)   // captures THIS iteration's mosc
    }()
}

The classic closure-over-loop-variable bug only applies when a goroutine captures a variable declared in the for clause itself (e.g., for _, v := range slice capturing v). In that case, all goroutines share the same v slot that gets mutated each iteration.

Since mosc is a fresh := declaration on every loop iteration, each goroutine already captures its own distinct mosc pointer — no race condition exists. The proposed fix (passing mosc as a parameter) produces identical behavior and is unnecessary.

You can safely keep the code as-is. I'll mark this comment as resolved.


🧠 Learnings used
Learnt from: HarshwardhanPatil07
Repo: openshift/machine-config-operator PR: 6034
File: test/extended-priv/mco_security.go:1011-1015
Timestamp: 2026-05-13T07:41:05.710Z
Learning: In the extended-priv Go/Ginkgo test suite, it’s safe to use synchronous `o.Expect` assertions on node state immediately after calling `mc.create()`.

Reason: `MachineConfig.create()` internally waits for the MachineConfigPool to complete (`mcp.waitForComplete()`) before returning unless `mc.skipWaitForMcp` is explicitly set to `true`. It also adjusts wait timeouts when extensions are present (`HasExtensionsSafe()`) or when kernel type changes (`GetKernelTypeSafe()`).

So, for node-state checks placed directly after `mc.create()` (and with `skipWaitForMcp` not enabled), you should not require `o.Eventually` for correctness. If `mc.skipWaitForMcp` is set to `true`, then `o.Eventually` may be needed.

Learnt from: ptalgulk01
Repo: openshift/machine-config-operator PR: 6081
File: test/extended-priv/mco_layering.go:932-1078
Timestamp: 2026-05-25T15:37:54.265Z
Learning: In this repository, for migration PRs that port existing tests from openshift-tests-private into this codebase (e.g., OS layering tests landing under test/extended-priv), keep the PR strictly focused on migrating the test logic. Do not add new conditional guards, skip logic, or supporting infrastructure/checks (for example, MicroShift-related skip checks) as part of the migration—those should be handled in separate follow-up PRs.

Comment thread test/extended-priv/mco.go
Comment on lines +320 to +328
func checkUpdatedLists(l, r []*Node, step int) bool {
if len(l) != len(r) {
logger.Errorf("Compared lists have different size")
return false
}

indexStart := 0
for i := 0; i < len(l); i += step {
indexEnd := i + step
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot May 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Guard against step <= 0 before iterating.

At Line 327, i += step will never progress when step is 0 (or goes backwards if negative), causing a hang.

Proposed fix
 func checkUpdatedLists(l, r []*Node, step int) bool {
+	if step <= 0 {
+		logger.Errorf("step must be > 0, got %d", step)
+		return false
+	}
 	if len(l) != len(r) {
 		logger.Errorf("Compared lists have different size")
 		return false
 	}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
func checkUpdatedLists(l, r []*Node, step int) bool {
if len(l) != len(r) {
logger.Errorf("Compared lists have different size")
return false
}
indexStart := 0
for i := 0; i < len(l); i += step {
indexEnd := i + step
func checkUpdatedLists(l, r []*Node, step int) bool {
if step <= 0 {
logger.Errorf("step must be > 0, got %d", step)
return false
}
if len(l) != len(r) {
logger.Errorf("Compared lists have different size")
return false
}
indexStart := 0
for i := 0; i < len(l); i += step {
indexEnd := i + step
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@test/extended-priv/mco.go` around lines 320 - 328, The loop in function
checkUpdatedLists uses "for i := 0; i < len(l); i += step" which will hang or
behave incorrectly when step <= 0; add a guard at the start of checkUpdatedLists
validating that step is > 0 (return false and log an error via logger.Errorf if
not) so the loop cannot run with zero or negative increments, ensuring safe
iteration over l and r.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This fix is not required and we dont need to check the step count here

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ptalgulk01, understood — acknowledged and moving on.


✏️ Learnings added
Learnt from: ptalgulk01
Repo: openshift/machine-config-operator PR: 6080
File: test/extended-priv/mco.go:320-328
Timestamp: 2026-05-25T17:06:17.684Z
Learning: In `test/extended-priv/mco.go`, the `checkUpdatedLists(l, r []*Node, step int)` function does not need a `step <= 0` guard. The `step` parameter always comes from `maxUnavailable` (an MCP configuration value), which is guaranteed to be positive by the time it reaches this function. The maintainer has explicitly confirmed this guard is not required.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.


🧠 Learnings used
Learnt from: HarshwardhanPatil07
Repo: openshift/machine-config-operator PR: 6034
File: test/extended-priv/mco_security.go:1011-1015
Timestamp: 2026-05-13T07:41:05.710Z
Learning: In the extended-priv Go/Ginkgo test suite, it’s safe to use synchronous `o.Expect` assertions on node state immediately after calling `mc.create()`.

Reason: `MachineConfig.create()` internally waits for the MachineConfigPool to complete (`mcp.waitForComplete()`) before returning unless `mc.skipWaitForMcp` is explicitly set to `true`. It also adjusts wait timeouts when extensions are present (`HasExtensionsSafe()`) or when kernel type changes (`GetKernelTypeSafe()`).

So, for node-state checks placed directly after `mc.create()` (and with `skipWaitForMcp` not enabled), you should not require `o.Eventually` for correctness. If `mc.skipWaitForMcp` is set to `true`, then `o.Eventually` may be needed.

Learnt from: ptalgulk01
Repo: openshift/machine-config-operator PR: 6081
File: test/extended-priv/mco_layering.go:932-1078
Timestamp: 2026-05-25T15:37:54.265Z
Learning: In this repository, for migration PRs that port existing tests from openshift-tests-private into this codebase (e.g., OS layering tests landing under test/extended-priv), keep the PR strictly focused on migrating the test logic. Do not add new conditional guards, skip logic, or supporting infrastructure/checks (for example, MicroShift-related skip checks) as part of the migration—those should be handled in separate follow-up PRs.

- Add tests: 77497, 77498, 77576, 77977, 78001, 78196, 79137, 79172, 82536, 83136, 83137, 83139, 83755, 85843, 85980, 87176, 88801
- Add supporting functions: checkUpdatedLists, SkipIfCompactOrSNO, GetAllApplicableExtensionsToMCPOrFail, AssertAllNonJobPodsToBeReadyWithPollerParams
- Add MCP methods: SetMaxUnavailable, RemoveMaxUnavailable, GetSortedUpdatedNodes
- Add MOSC methods: SetRenderedImagePushspec, GetRenderedImagePushspec, IsUsingInternalRegistry
- Preserve function ordering to match otp3 source
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (2)
test/extended-priv/machineconfigpool.go (1)

564-564: 💤 Low value

Typo in comment: "thatn" should be "than"

-		// Check that there aren't more thatn maxUpdatingNodes updating at the same time
+		// Check that there aren't more than maxUpdatingNodes updating at the same time
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@test/extended-priv/machineconfigpool.go` at line 564, Fix the typo in the
comment that currently reads "// Check that there aren't more thatn
maxUpdatingNodes updating at the same time" by changing "thatn" to "than" so it
reads "// Check that there aren't more than maxUpdatingNodes updating at the
same time"; this comment is associated with the maxUpdatingNodes check in the
machineconfigpool test logic.
test/extended-priv/mco_ocb.go (1)

13-16: 💤 Low value

Imports are positioned after other imports, breaking standard grouping

The new imports (os, path/filepath, sync, time) are placed after the package imports block, which breaks the conventional import grouping (stdlib first, then external packages).

 import (
 	"fmt"
+	"os"
+	"path/filepath"
 	"regexp"
 	"slices"
 	"strings"
+	"sync"
+	"time"

 	g "github.com/onsi/ginkgo/v2"
 	o "github.com/onsi/gomega"
 	exutil "github.com/openshift/machine-config-operator/test/extended-priv/util"
 	logger "github.com/openshift/machine-config-operator/test/extended-priv/util/logext"
-	"os"
-	"path/filepath"
-	"sync"
-	"time"
 )
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@test/extended-priv/mco_ocb.go` around lines 13 - 16, The import statements
for os, path/filepath, sync, and time are placed after the package-import block;
move these standard-library imports into the main import block and group imports
in canonical order (standard library first, then external packages) so the
import list in mco_ocb.go contains os, path/filepath, sync, time together with
other stdlib imports at the top of the import block.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@test/extended-priv/mco_ocb.go`:
- Around line 1104-1183: The test incorrectly assumes MachineSet.GetNodes()
returns nodes sorted by creation time; update the test to explicitly select the
most-recent node instead of using the last slice element: after calling
existingMS.GetNodes() and duplicateMS.GetNodes() (used in existingNode /
duplicateMSNode), iterate the returned nodes and pick the one with the latest
creation timestamp (use the node object's creation timestamp accessor) or update
MachineSet.GetNodes() implementation to sort by creationTimestamp before
returning; replace usages that index the last element with this explicit
selection to ensure the newest node is validated regardless of GetNodes()
ordering.

---

Nitpick comments:
In `@test/extended-priv/machineconfigpool.go`:
- Line 564: Fix the typo in the comment that currently reads "// Check that
there aren't more thatn maxUpdatingNodes updating at the same time" by changing
"thatn" to "than" so it reads "// Check that there aren't more than
maxUpdatingNodes updating at the same time"; this comment is associated with the
maxUpdatingNodes check in the machineconfigpool test logic.

In `@test/extended-priv/mco_ocb.go`:
- Around line 13-16: The import statements for os, path/filepath, sync, and time
are placed after the package-import block; move these standard-library imports
into the main import block and group imports in canonical order (standard
library first, then external packages) so the import list in mco_ocb.go contains
os, path/filepath, sync, time together with other stdlib imports at the top of
the import block.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: a5285284-556e-4ba9-b618-0863cd2fe7e4

📥 Commits

Reviewing files that changed from the base of the PR and between 56ad443 and a4634f5.

📒 Files selected for processing (6)
  • test/extended-priv/machineconfigpool.go
  • test/extended-priv/machineosconfig.go
  • test/extended-priv/mco.go
  • test/extended-priv/mco_ocb.go
  • test/extended-priv/util.go
  • test/extended-priv/util/pods.go

Comment on lines +1104 to +1183
func ValidateNewNodesBootDirectlyWithOCLImage(oc *exutil.CLI, mosc *MachineOSConfig, mcp *MachineConfigPool) {
// First validate MOSC build and application
ValidateSuccessfulMOSC(mosc, nil)

exutil.By("Get the OCL image")
oclImage := OrFail[string](mosc.GetStatusCurrentImagePullSpec())
logger.Infof("OCL image: %s\n", oclImage)

// Detect if using internal registry - internal registry always requires 2 reboots
isInternalRegistry := OrFail[bool](mosc.IsUsingInternalRegistry())

exutil.By("Check able to scale the node from existing Machineset")
msl, err := NewMachineSetList(oc.AsAdmin(), MachineAPINamespace).GetAll()
o.Expect(err).NotTo(o.HaveOccurred(), "Get machinesets failed")
o.Expect(msl).ShouldNot(o.BeEmpty(), "Machineset list is empty")
existingMS := msl[0]

o.Expect(existingMS.AddToScale(1)).NotTo(o.HaveOccurred())

defer func() {
exutil.By("Scale down the node from existing Machineset")
existingMS.AddToScale(-1)
mcp.waitForComplete()
}()

exutil.By("Create duplicate machineset and scale new node")
machineset := OrFail[*MachineSet](GetScalableMachineSet(oc.AsAdmin()))
duplicateMSName := machineset.GetName() + "-ocl"
duplicateMS, err := machineset.Duplicate(duplicateMSName)
o.Expect(err).NotTo(o.HaveOccurred())

defer func() {
if duplicateMS.Exists() {
logger.Infof("Deleting duplicate machineset %s and scale down the node", duplicateMS.GetName())
o.Expect(duplicateMS.ScaleTo(0)).To(o.Succeed())
o.Expect(duplicateMS.WaitUntilReady("10m")).To(o.Succeed())
o.Expect(duplicateMS.Delete()).To(o.Succeed())
}
}()
o.Expect(duplicateMS.ScaleTo(1)).To(o.Succeed())

exutil.By("Wait for both existing new node added and for duplicate new node added to get ready")

o.Eventually(func(gm o.Gomega) {
gm.Expect(existingMS.GetIsReady()).To(o.BeTrue(), "MachineSet %s is not ready", existingMS.GetName())
gm.Expect(duplicateMS.GetIsReady()).To(o.BeTrue(), "MachineSet %s is not ready", duplicateMS.GetName())
}, "20m", "2m").Should(o.Succeed(), "MachineSets are not ready")

// For internal registry, wait for MCP to complete as it always requires 2 reboots
if isInternalRegistry {
logger.Infof("Internal registry detected, waiting for MCP to complete (requires 2 reboots)")
mcp.waitForComplete()
}
logger.Infof("OK!\n")

exutil.By("Get the new node created by the machine set")

existingMSNodes, nErr := existingMS.GetNodes()
o.Expect(nErr).NotTo(o.HaveOccurred(), "Error getting the nodes created by MachineSet %s", existingMS.GetName())
// Get the most recently created node (last in the sorted list)
existingNode := existingMSNodes[len(existingMSNodes)-1]
logger.Infof("Existing node: %s\n", existingNode.GetName())
logger.Infof("OK!\n")

duplicateMSNodes, nErr := duplicateMS.GetNodes()
o.Expect(nErr).NotTo(o.HaveOccurred(), "Error getting the nodes created by MachineSet %s", duplicateMS.GetName())
// Get the most recently created node (last in the sorted list)
duplicateMSNode := duplicateMSNodes[len(duplicateMSNodes)-1]
logger.Infof("Duplicate node: %s\n", duplicateMSNode.GetName())
logger.Infof("OK!\n")

getNewNodeRebootValueForOCL(duplicateMSNode, oclImage)
getNewNodeRebootValueForOCL(existingNode, oclImage)

exutil.By("Wait for MCP to complete before scaling down nodes")
// This wait is necessary to prevent MCP from becoming temporarily degraded when using external registry
// (internal registry already waits via waitForComplete earlier in the test)
mcp.waitForComplete()
logger.Infof("OK!\n")
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Verify assumption that GetNodes returns nodes sorted by creation time

At lines 1164 and 1171, the code assumes that GetNodes() returns nodes sorted by creation timestamp so that the last element is the most recently created node. If this assumption is incorrect, the test may validate the wrong node.

#!/bin/bash
# Verify that MachineSet.GetNodes() returns nodes sorted by creation time
# Check the GetNodes implementation in machineset.go

ast-grep --pattern 'func ($_ *MachineSet) GetNodes() $_ {
  $$$
}'

# Also check if there's any sorting applied
rg -n -C5 'GetNodes\(\)' test/extended-priv/machineset.go
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@test/extended-priv/mco_ocb.go` around lines 1104 - 1183, The test incorrectly
assumes MachineSet.GetNodes() returns nodes sorted by creation time; update the
test to explicitly select the most-recent node instead of using the last slice
element: after calling existingMS.GetNodes() and duplicateMS.GetNodes() (used in
existingNode / duplicateMSNode), iterate the returned nodes and pick the one
with the latest creation timestamp (use the node object's creation timestamp
accessor) or update MachineSet.GetNodes() implementation to sort by
creationTimestamp before returning; replace usages that index the last element
with this explicit selection to ensure the newest node is validated regardless
of GetNodes() ordering.

@openshift-ci
Copy link
Copy Markdown
Contributor

openshift-ci Bot commented May 25, 2026

@ptalgulk01: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/verify a4634f5 link true /test verify
ci/prow/unit a4634f5 link true /test unit

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

jira/valid-reference Indicates that this PR references a valid Jira ticket of any type.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants