Skip to content

[MCO-1997] Add test for osImageStream#5881

Open
ptalgulk01 wants to merge 1 commit into
openshift:mainfrom
ptalgulk01:add-osimagestream-tests
Open

[MCO-1997] Add test for osImageStream#5881
ptalgulk01 wants to merge 1 commit into
openshift:mainfrom
ptalgulk01:add-osimagestream-tests

Conversation

@ptalgulk01
Copy link
Copy Markdown
Contributor

@ptalgulk01 ptalgulk01 commented Apr 24, 2026

Test Cases Added

OCP-88122: Validate osImageStream inheritance for custom MCPs
Validates dynamic inheritance behavior in custom MachineConfigPools:

  • Case 1: Custom MCPs inherit worker pool's default osImageStream (rhel-9)
  • Case 2: Dynamic inheritance - infra MCP created before worker osImageStream change inherits the new value (rhel-10)
  • Case 3: Custom MCPs can override with explicit osImageStream values

OCP-88708: Verify osstream logs for boot image controller
Validates boot image controller logging for unsupported osImageStream values:

  • Verifies controller logs appropriate warning messages
  • Changes to supported stream and verifies logs stop
  • Tests CPMS integration with managed boot images

OCP-88203: Verify MOSB triggered when osImageStream is patched
Validates MachineOSBuild resource creation when osImageStream changes:

  • Patches custom MCP osImageStream to rhel-10
  • Verifies MOSB resource is created and completes successfully
  • Validates node uses correct osImage from new stream

OCP-88365: Verify MCP degraded when osstream and osImageURL MC applied
Validates error handling for conflicting osstream and osImageURL configurations:

  • Applies MachineConfig with both osstream and osImageURL fields
  • Verifies MCP enters degraded state with appropriate error message
  • Tests recovery after removing conflicting configuration

Implementation Details

  • Uses custom-machine-config-pool.yaml template for MCPs without osImageStream field (dynamic inheritance)
  • Adds AddNodeToPool() helper method to simplify node addition to custom pools
  • Uses NewController().GetLogs() for boot image controller log verification
  • Validates osImage on nodes matches expected stream values

Summary by CodeRabbit

  • Tests
    • Added disruptive tests for OS image stream behavior: inheritance across pools, image-mode updates triggering builds, conflict detection/recovery, and controller logging scenarios (including CPMS cases).
  • Tests / Helpers
    • Added shared helper to add/label nodes and wait for pool readiness, a MachineSet label helper for OS stream tests, and a utility to skip SNO/Compact clusters.

@openshift-merge-bot
Copy link
Copy Markdown
Contributor

Pipeline controller notification
This repo is configured to use the pipeline controller. Second-stage tests will be triggered either automatically or after lgtm label is added, depending on the repository configuration. The pipeline controller will automatically detect which contexts are required and will utilize /test Prow commands to trigger the second stage.

For optional jobs, comment /test ? to see a list of all defined jobs. To trigger manually all jobs from second stage use /pipeline required command.

This repository is configured in: LGTM mode

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Apr 24, 2026

Caution

Review failed

Failed to post review comments

Walkthrough

Adds a reusable helper to label nodes into custom MachineConfigPools and wait for MCP status; updates MCP creation helpers to optionally omit spec.osImageStream (so pools can inherit worker osImageStream); adds a MachineSet osstream label helper, a test skip helper for Compact/SNO clusters, and five disruptive Ginkgo tests validating osImageStream behaviors and controller logging.

Changes

MachineConfigPool helpers, MachineSet helper, and osImageStream tests

Layer / File(s) Summary
Core helper: label nodes & MCP readiness
test/extended-priv/machineconfigpool.go
Adds exported AddNodesToMachineConfigPool(oc *exutil.CLI, mcpName string, nodes []*Node) error which labels nodes with the MCP role label, waits for status.machineCount to match expected total, then waits for status.conditions[type=Updated].status==True.
MCP creation & centralized node-labeling
test/extended-priv/machineconfigpool.go
CreateCustomMCP delegates to CreateCustomMCPWithStream with an empty osstream so spec.osImageStream isn't set; CreateCustomMCPWithStreamByNodes selects templates only when osstream is non-empty and delegates node labeling/waits to AddNodesToMachineConfigPool; architecture-specific MCP creation now passes empty osstream.
MachineSet helper: set osstream label
test/extended-priv/machineset.go
Adds MachineSet.AddOSStreamLabel(streamName string) error which patches metadata.labels["machineconfiguration.openshift.io/osstream"] to the provided value.
Test skip helper
test/extended-priv/util.go
Adds exported SkipIfCompactOrSNO(oc *exutil.CLI) that calls IsCompactOrSNOCluster(oc) and invokes g.Skip with a fixed message when true.
Disruptive Ginkgo tests for osImageStream and logging
test/extended-priv/mco_osimagestream.go
Adds five disruptive tests under OCPFeatureGate:OSStreams validating: custom MCP osImageStream inheritance and per-pool behavior (Polarion 88122); MOSB triggering on MCP osImageStream patch in image-mode (88203); degradation on osImageURL/osImageStream conflict and recovery (88365); boot-image-controller logs for unsupported osstream and recovery when switched (88708); and controller logging behavior when CPMS-managed boot images are enabled (88995).

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

Suggested labels

lgtm, approved, jira/valid-reference, verified

Suggested reviewers

  • umohnani8
  • pablintino
🚥 Pre-merge checks | ✅ 9 | ❌ 3

❌ Failed checks (3 warnings)

Check name Status Explanation Resolution
Test Structure And Quality ⚠️ Warning Test 88122 violates single responsibility by testing three scenarios (inheritance, dynamic inheritance, override) in one It block with multiple sequential assertions. Split test 88122 into separate tests for each scenario (inheritance, dynamic inheritance, override). This aligns with Ginkgo best practices where each test validates a single behavior.
Single Node Openshift (Sno) Test Compatibility ⚠️ Warning Test PolarionID:88995 assumes worker MachineSets exist without SNO guard; will fail on SNO clusters where only master pool exists. Add SkipIfCompactOrSNO(oc.AsAdmin()) at start of test 88995, or change to get all MachineSets from any namespace and handle empty list gracefully.
Ipv6 And Disconnected Network Test Compatibility ⚠️ Warning Test [PolarionID:88203] calls CreateMachineOSConfigUsingExternalOrInternalRegistry which falls back to quay.io/mcoqe/layering, requiring external connectivity. Missing [Skipped:Disconnected] tag. Add [Skipped:Disconnected] tag to test [PolarionID:88203] to skip in disconnected environments, or ensure graceful handling of external registry fallbacks.
✅ Passed checks (9 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title '[MCO-1997] Add test for osImageStream' directly aligns with the main objective of this PR, which is to add comprehensive end-to-end tests for osImageStream behavior across multiple test cases and scenarios.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Stable And Deterministic Test Names ✅ Passed All 5 new Ginkgo test cases in mco_osimagestream.go have stable, deterministic test names with no dynamic content (no fmt.Sprintf, variables, pod/node names, timestamps, UUIDs, or IP addresses).
Microshift Test Compatibility ✅ Passed All new tests properly tag machine.openshift.io usage with [apigroup:machine.openshift.io], allowing MicroShift CI to automatically skip incompatible tests.
Topology-Aware Scheduling Compatibility ✅ Passed This PR adds osImageStream E2E tests in test/extended-priv. Multi-node tests use SkipIfCompactOrSNO(). No operator manifests, no scheduling constraints added.
Ote Binary Stdout Contract ✅ Passed No OTE Binary Stdout Contract violations found. All logging within test blocks; new helpers properly encapsulated; no process-level stdout writes.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🧹 Nitpick comments (2)
test/extended-priv/mco_osimagestream.go (2)

317-319: Redundant waitForComplete after AddNodeToPool.

AddNodeToPool already calls WaitImmediateForUpdatedStatus internally, so this explicit waitForComplete adds only additional wait time without additional verification. If the extra wait is intentional to guard against the TDZ-style race flagged in AddNodeToPool, prefer hardening the helper instead of layering waits at each call site.

♻️ Proposed trim
 		exutil.By("Add a node to infra pool")
 		err = infraMcp.AddNodeToPool(mcp.GetSortedNodesOrFail()[0])
 		o.Expect(err).NotTo(o.HaveOccurred(), "Error adding node to infra pool")
 		logger.Infof("OK!\n")
 
-		exutil.By("Wait for infra pool to complete update with the new node")
-		infraMcp.waitForComplete()
-		logger.Infof("OK!\n")
-
 		exutil.By("Verify node in infra pool inherited rhel-10 from worker pool")
 		validateOsImageStreamInPool(infraMcp, osis, OSImageStreamRHEL10)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/extended-priv/mco_osimagestream.go` around lines 317 - 319, The call
site uses infraMcp.waitForComplete() immediately after AddNodeToPool, which is
redundant because AddNodeToPool already invokes WaitImmediateForUpdatedStatus;
remove the extra infraMcp.waitForComplete() call (or the line invoking
logger.Infof("OK!\n") should remain but without the redundant wait) and instead,
if guarding against the TDZ-style race is required, harden
AddNodeToPool/WaitImmediateForUpdatedStatus itself (update AddNodeToPool to
perform any additional verification or retry logic around
WaitImmediateForUpdatedStatus) so callers like this test (using infraMcp and
AddNodeToPool) do not need to layer waits.

469-481: Tighten test robustness against controller log message changes.

The test hardcodes "machineset tc-%s-dup has unsupported stream" at line 475, while the actual controller log is "machineset %s has unsupported stream: %v, skipping boot image update" (per pkg/controller/bootimage/ms_helpers.go:126). Using ContainSubstring at lines 505/518 masks this coupling—the test currently passes because its substring is contained in the actual log, but if the controller reformats the message (capitalization, phrasing, additional fields), the test becomes brittle and may silently break.

For consistency with line 541, which already uses the looser "has unsupported stream" substring, consider updating lines 475/505/518 to match on a stable substring anchor (e.g., duplicateMsName + " has unsupported stream" or just "has unsupported stream") rather than the full hardcoded message.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/extended-priv/mco_osimagestream.go` around lines 469 - 481, Replace the
overly specific expected log string in the test and assertions with a stable
substring anchor: update the variable unsupportedStreamLogMsg (and any
assertions referencing it) to use a looser match such as duplicateMsName + " has
unsupported stream" or simply "has unsupported stream", and ensure the
ContainSubstring checks against controller log output (where NewController,
duplicateMsName and unsupportedStreamLogMsg are used) use that substring to
avoid brittle exact-format coupling with ms_helpers.go log text.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@test/extended-priv/machineconfigpool.go`:
- Around line 1552-1577: In AddNodeToPool, avoid the race where a pre-labeled
node inflates currentNodes and causes WaitForMachineCount to hang: first call
node.GetLabel("node-role.kubernetes.io/<pool>") (use mcp.GetName() to build the
key) and only call node.AddLabel(...) if the label is absent; after successfully
adding the label (or if you detect a fresh addition), wait for the MCP to see
the new machine with mcp.WaitForMachineCount(expectedCount, ...), then instead
of immediately calling WaitImmediateForUpdatedStatus, poll the MCP for
Updating==True (or assert updatedMachineCount == machineCount) as a handshake
before asserting Updated==True (so replace or precede the
WaitImmediateForUpdatedStatus call with a short WaitForUpdating/condition check
to ensure reconciliation started).

In `@test/extended-priv/mco_osimagestream.go`:
- Around line 456-466: After asserting the MCP is degraded, remove the
conflicting MachineConfig (call the same removal used in cleanup, e.g. invoke
mc.DeleteWithWait() or DeleteCustomMCP()) and then explicitly assert recovery by
polling the customMcp: use o.Eventually(customMcp, "5m",
"20s").Should(HaveConditionField("RenderDegraded", "status", FalseString)) and
also check Updated condition true (e.g. HaveConditionField("Updated", "status",
TrueString)) to ensure the MCP transitions back to healthy; place these
assertions after deleting the MC and before relying on deferred cleanup to catch
regressions.
- Around line 517-521: The test currently uses Eventually(controller.GetLogs,
"2m", "10s").ShouldNot(o.ContainSubstring(unsupportedStreamLogMsg)) which can
pass immediately because controller.IgnoreLogsBeforeNowOrFail() clears logs;
change this to Consistently(controller.GetLogs, "2m",
"10s").ShouldNot(o.ContainSubstring(unsupportedStreamLogMsg)) so the assertion
verifies the unsupportedStreamLogMsg never appears for the entire duration;
update the call site where controller.GetLogs and unsupportedStreamLogMsg are
referenced and ensure the same polling interval arguments are used with
Consistently instead of Eventually.
- Around line 483-492: The code selects the first MachineSet from existingMsList
and doesn't handle delete errors; update the selection to filter only worker
MachineSets by calling
ByLabel("machine.openshift.io/cluster-api-machine-role=worker") on the
NewMachineSetList(oc.AsAdmin(), MachineAPINamespace) result before using
existingMsList[0], and change the deferred cleanup to call duplicateMs.Delete()
and log any returned error (e.g., check the error returned by
duplicateMs.Delete() and logger.Errorf with duplicateMsName and the error)
instead of discarding it.

---

Nitpick comments:
In `@test/extended-priv/mco_osimagestream.go`:
- Around line 317-319: The call site uses infraMcp.waitForComplete() immediately
after AddNodeToPool, which is redundant because AddNodeToPool already invokes
WaitImmediateForUpdatedStatus; remove the extra infraMcp.waitForComplete() call
(or the line invoking logger.Infof("OK!\n") should remain but without the
redundant wait) and instead, if guarding against the TDZ-style race is required,
harden AddNodeToPool/WaitImmediateForUpdatedStatus itself (update AddNodeToPool
to perform any additional verification or retry logic around
WaitImmediateForUpdatedStatus) so callers like this test (using infraMcp and
AddNodeToPool) do not need to layer waits.
- Around line 469-481: Replace the overly specific expected log string in the
test and assertions with a stable substring anchor: update the variable
unsupportedStreamLogMsg (and any assertions referencing it) to use a looser
match such as duplicateMsName + " has unsupported stream" or simply "has
unsupported stream", and ensure the ContainSubstring checks against controller
log output (where NewController, duplicateMsName and unsupportedStreamLogMsg are
used) use that substring to avoid brittle exact-format coupling with
ms_helpers.go log text.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: d3499c12-6325-4124-af9e-d28cce7052b6

📥 Commits

Reviewing files that changed from the base of the PR and between 5f1b7ad and 043492b.

📒 Files selected for processing (2)
  • test/extended-priv/machineconfigpool.go
  • test/extended-priv/mco_osimagestream.go

Comment thread test/extended-priv/machineconfigpool.go Outdated
Comment on lines +1552 to +1577
// AddNodeToPool adds a node to the MCP and waits for the MCP to recognize it
func (mcp *MachineConfigPool) AddNodeToPool(node *Node) error {
currentNodes, err := mcp.GetNodes()
if err != nil {
return fmt.Errorf("error getting current nodes from %s: %w", mcp.GetName(), err)
}
expectedCount := len(currentNodes) + 1

err = node.AddLabel(fmt.Sprintf("node-role.kubernetes.io/%s", mcp.GetName()), "")
if err != nil {
return fmt.Errorf("error labeling node %s to add it to pool %s: %w", node.GetName(), mcp.GetName(), err)
}
logger.Infof("Node %s added to pool %s", node.GetName(), mcp.GetName())

err = mcp.WaitForMachineCount(expectedCount, 5*time.Minute)
if err != nil {
return fmt.Errorf("the %s MCP is not reporting the expected machine count: %w", mcp.GetName(), err)
}

err = mcp.WaitImmediateForUpdatedStatus()
if err != nil {
return fmt.Errorf("the %s MCP is not updated: %w", mcp.GetName(), err)
}

return nil
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
ast-grep --pattern $'func (n *Node) HasLabel($_, $_) $_'
rg -nP --type=go -C2 '\bfunc \(.*Node\) (Has|Get)Label\b'

Repository: openshift/machine-config-operator

Length of output: 59


🏁 Script executed:

# Find the Node type definition and its methods
fd -e go test/extended-priv/
head -100 test/extended-priv/machineconfigpool.go

# Search for Node type definition
rg -A 20 "^type Node struct" test/extended-priv/

# Search for any label-related methods on Node
rg "func.*Node.*Label" test/extended-priv/

# Check what methods Node actually has
rg "func \(.*\*?Node\)" test/extended-priv/ | head -30

Repository: openshift/machine-config-operator

Length of output: 7984


🏁 Script executed:

# Also search for where AddLabel is implemented/used elsewhere
rg -B 5 -A 10 "AddLabel" test/extended-priv/ | head -80

Repository: openshift/machine-config-operator

Length of output: 5884


🏁 Script executed:

# Check how WaitImmediateForUpdatedStatus is implemented
rg -A 15 "func.*WaitImmediateForUpdatedStatus" test/extended-priv/

# Check WaitForMachineCount implementation  
rg -A 15 "func.*WaitForMachineCount" test/extended-priv/

# Check GetNodes to understand what it returns
rg -A 10 "func.*GetNodes\b" test/extended-priv/

Repository: openshift/machine-config-operator

Length of output: 5386


Consider the race condition with immediate polling and pre-existing node membership.

Two minor concerns with this helper:

  1. Pre-existing membership timeout: If the node already has the node-role.kubernetes.io/<pool> label, it will already be included in currentNodes and the expectation of len(currentNodes) + 1 machines will never be satisfied, causing WaitForMachineCount to time out. A check using GetLabel() before labeling would avoid this unnecessary wait.

  2. Race with WaitImmediateForUpdatedStatus: The immediate polling (immediate=true) means WaitImmediateForUpdatedStatus can observe the stale Updated=True status from a prior quiesced state, returning success before the MCP controller has actually begun reconciling the newly-added node. While WaitForMachineCount helps bridge this, consider explicitly waiting for Updating=True first as a handshake before checking for Updated=True, or use updatedMachineCount == machineCount as a synchronization signal.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/extended-priv/machineconfigpool.go` around lines 1552 - 1577, In
AddNodeToPool, avoid the race where a pre-labeled node inflates currentNodes and
causes WaitForMachineCount to hang: first call
node.GetLabel("node-role.kubernetes.io/<pool>") (use mcp.GetName() to build the
key) and only call node.AddLabel(...) if the label is absent; after successfully
adding the label (or if you detect a fresh addition), wait for the MCP to see
the new machine with mcp.WaitForMachineCount(expectedCount, ...), then instead
of immediately calling WaitImmediateForUpdatedStatus, poll the MCP for
Updating==True (or assert updatedMachineCount == machineCount) as a handshake
before asserting Updated==True (so replace or precede the
WaitImmediateForUpdatedStatus call with a short WaitForUpdating/condition check
to ensure reconciliation started).

Comment thread test/extended-priv/mco_osimagestream.go
Comment thread test/extended-priv/mco_osimagestream.go Outdated
Comment on lines +483 to +492
exutil.By(fmt.Sprintf("Create duplicate MachineSet %s from existing worker MachineSet", duplicateMsName))
existingMsList := NewMachineSetList(oc.AsAdmin(), MachineAPINamespace).GetAllOrFail()
o.Expect(len(existingMsList)).To(o.BeNumerically(">", 0), "At least one MachineSet should exist")
duplicateMs, err = existingMsList[0].Duplicate(duplicateMsName)
o.Expect(err).NotTo(o.HaveOccurred(), "Error duplicating MachineSet")
defer func() {
logger.Infof("Cleaning up duplicate MachineSet %s", duplicateMsName)
duplicateMs.Delete()
}()
logger.Infof("OK!\n")
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Find how MachineSet role is inspected elsewhere
rg -nP --type=go -C2 '(cluster-api-machine-role|machine\.openshift\.io/cluster-api-machine-role)'
ast-grep --pattern $'func ($_ *MachineSet) $_() $$$'

Repository: openshift/machine-config-operator

Length of output: 6496


🏁 Script executed:

#!/bin/bash
# Search for MachineSet.Get() usages to understand field path format
rg -nP --type=go 'ms\.Get\(' -A2 -B2
# Search for worker role references
rg -nP --type=go '(worker|cluster-api-machine-role)' --glob='*test*' | head -30

Repository: openshift/machine-config-operator

Length of output: 11298


🏁 Script executed:

#!/bin/bash
# Find MachineSet.Duplicate() and Delete() methods
rg -nP --type=go 'func \(.*MachineSet.*\) (Duplicate|Delete)' -A3
# Search for existing worker filter patterns or helper functions
rg -nP --type=go 'NewMachineSetList|GetAllOrFail|ByLabel' -A2 -B2 | head -50

Repository: openshift/machine-config-operator

Length of output: 3954


🏁 Script executed:

#!/bin/bash
# Find the Delete() method for MachineSet
rg -nP --type=go 'func \(.*MachineSet.*\) Delete' -A3
# Search for ByLabel method on MachineSetList
rg -nP --type=go 'func.*MachineSetList.*ByLabel' -A3
# Check CloneResource to understand Duplicate's implementation
rg -nP --type=go 'func CloneResource' -A10

Repository: openshift/machine-config-operator

Length of output: 852


🏁 Script executed:

#!/bin/bash
# Find Delete method more broadly
rg -nP --type=go 'Delete.*\(\)' test/extended-priv/machineset.go -B2 -A5
# Also search in resource.go for Delete method
rg -nP --type=go 'func \(.*\) Delete' test/extended-priv/resource.go -B2 -A5
# Look for MachineSetList ByLabel
rg -nP --type=go '\.ByLabel' test/extended-priv/machineset.go -B2 -A2
# Check if Delete is called elsewhere in the codebase
rg -nP --type=go 'duplicateMs\.Delete\(\)|\.Delete\(\)' test/extended-priv/ -B1 -A1

Repository: openshift/machine-config-operator

Length of output: 16087


🏁 Script executed:

#!/bin/bash
# Find ByLabel implementation on ResourceList or MachineSetList
rg -nP --type=go 'func.*ByLabel' test/extended-priv/ -A3
# Check what labels are used to identify worker MachineSets
rg -nP --type=go 'worker.*MachineSet|MachineSet.*worker' test/extended-priv/ -B2 -A2
# Look for role-based filtering in MachineSet selection
rg -nP --type=go 'cluster-api-machine-role' test/extended-priv/ -B2 -A2

Repository: openshift/machine-config-operator

Length of output: 2623


🏁 Script executed:

#!/bin/bash
# Check if there are other instances of filtering worker MachineSets
rg -nP --type=go 'MachineSetList.*GetAll|NewMachineSetList' test/extended-priv/ -A5 | head -40
# Look for worker pool constants
rg -nP --type=go 'MachineConfigPoolWorker|MachineAPINamespace' test/extended-priv/const.go -B2 -A2

Repository: openshift/machine-config-operator

Length of output: 3399


Filter MachineSet selection to worker role and handle the cleanup deletion error.

Two issues in this block:

  1. existingMsList[0] picks the first MachineSet in API order, not necessarily a worker. Clusters with infra or other MachineSets will cause subsequent assertions tied to osstream labels to fail or produce misleading results. The comment itself says "from existing worker MachineSet"—filter explicitly using:

    existingMsList.ByLabel("machine.openshift.io/cluster-api-machine-role=worker")
  2. The deferred duplicateMs.Delete() silently discards errors. If deletion fails, the duplicate MachineSet and its provisioned machines leak into subsequent tests. Log the error:

    if derr := duplicateMs.Delete(); derr != nil {
        logger.Errorf("Failed to delete duplicate MachineSet %s: %v", duplicateMsName, derr)
    }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/extended-priv/mco_osimagestream.go` around lines 483 - 492, The code
selects the first MachineSet from existingMsList and doesn't handle delete
errors; update the selection to filter only worker MachineSets by calling
ByLabel("machine.openshift.io/cluster-api-machine-role=worker") on the
NewMachineSetList(oc.AsAdmin(), MachineAPINamespace) result before using
existingMsList[0], and change the deferred cleanup to call duplicateMs.Delete()
and log any returned error (e.g., check the error returned by
duplicateMs.Delete() and logger.Errorf with duplicateMsName and the error)
instead of discarding it.

Comment thread test/extended-priv/mco_osimagestream.go
Comment thread test/extended-priv/machineconfigpool.go Outdated
}

// AddNodeToPool adds a node to the MCP and waits for the MCP to recognize it
func (mcp *MachineConfigPool) AddNodeToPool(node *Node) error {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can't add nodes to a master pool, for example. So we can't define this function as a general method in a MCP.

It would be better if we write it as an independent function accepting a MCP as parameter, and not as a method in the MCP struct.

This code is very similar to the one in CreateCustomMCPWithStreamByNodes

We can make the function to add several nodes, not only one, to the poo and reuse it in CreateCustomMCPWithStreamByNodes avoiding duplicated code.

Comment thread test/extended-priv/mco_osimagestream.go Outdated
Comment on lines +240 to +241
g.Skip("The cluster is SNO/Compact. This test cannot be executed in SNO/Compact clusters")
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's use the SkipIfCompactOrSNO function that is used in the private repo. It seems it hasn't been migrated yet, but it should be migrated anyway.

// SkipIfCompactOrSNO skips the test case if the cluster is a compact or SNO cluster
func SkipIfCompactOrSNO(oc *exutil.CLI) {
        if IsCompactOrSNOCluster(oc) {
                g.Skip("The test is not supported in Compact or SNO clusters")
        } 
}

Comment thread test/extended-priv/mco_osimagestream.go Outdated
Comment on lines +348 to +350
if IsCompactOrSNOCluster(oc.AsAdmin()) {
g.Skip("The cluster is SNO/Compact. This test cannot be executed in SNO/Compact clusters")
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same, let's migrate and use the SkipIfCompactOrSNO function

Comment thread test/extended-priv/mco_osimagestream.go Outdated
Comment on lines +393 to +399
o.Eventually(func() string {
currentMOSB, err := mosc.GetCurrentMachineOSBuild()
if err != nil {
return ""
}
return currentMOSB.GetName()
}, "5m", "20s").ShouldNot(o.Equal(initialMOSBName), "A new MachineOSBuild should be triggered after osImageStream change")
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is cleaner if we don't swallow the error silently

  o.Eventually(func() (string, error) {                                                                                                                                                                            
      mosb, err := mosc.GetCurrentMachineOSBuild()                                                                                                                                                                 
      if err != nil {                                                                                                                                                                                              
          return "", err
      }                                                                                                                                                                                                            
      return mosb.GetName(), nil
  }, "5m", "20s").ShouldNot(o.Equal(initialMOSBName),
      "A new MachineOSBuild should be triggered after osImageStream change")

Comment thread test/extended-priv/mco_osimagestream.go Outdated
Comment on lines +408 to +410
if IsCompactOrSNOCluster(oc.AsAdmin()) {
g.Skip("The cluster is SNO/Compact. This test cannot be executed in SNO/Compact clusters")
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same let's use the SkipIfCompactOrSNO function

Comment thread test/extended-priv/mco_osimagestream.go Outdated
logger.Infof("OK!\n")

exutil.By(fmt.Sprintf("Set osstream label to %s on MachineSet", OSImageStreamRHEL10))
err = duplicateMs.Patch("merge", fmt.Sprintf(`{"metadata":{"labels":{"machineconfiguration.openshift.io/osstream":"%s"}}}`, OSImageStreamRHEL10))
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need a method for this, so that it can be reused in future tests.

We can use AdLabel instead. But I think it would be better to use even a more specific method so that we don't have to care about the name of the label.

duplicateMs.AddOSStreamLabel(...)

Then this function can call "AddLabel".

Comment thread test/extended-priv/mco_osimagestream.go Outdated
// Reset log checkpoint before changing to supported stream to verify the new behavior
exutil.By("Change osstream label to rhel-9 and verify no unsupported stream log is triggered")
controller.IgnoreLogsBeforeNowOrFail()
err = duplicateMs.Patch("merge", fmt.Sprintf(`{"metadata":{"labels":{"machineconfiguration.openshift.io/osstream":"%s"}}}`, OSImageStreamRHEL9))
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above.

Comment thread test/extended-priv/mco_osimagestream.go Outdated
Comment on lines +523 to +534
exutil.By("Enable CPMS for managed boot images")
cpms = NewControlPlaneMachineSet(oc.AsAdmin(), MachineAPINamespace, ControlPlaneMachineSetName)
o.Expect(cpms.Exists()).To(o.BeTrue(), "ControlPlaneMachineSet should exist for this test")

machineConfig = GetMachineConfiguration(oc.AsAdmin())
defer machineConfig.RemoveManagedBootImagesConfig()

o.Expect(
machineConfig.SetAllManagedBootImagesConfig(ControlPlaneMachineSetResource),
).To(o.Succeed(), "Error enabling CPMS for managed boot images")
logger.Infof("OK!\n")

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CPMS and Machinesets are two different functionalities.

We should use a different test case for the CPMS

Comment thread test/extended-priv/mco_osimagestream.go Outdated
Comment on lines +300 to +302
err = NewMCOTemplate(oc.AsAdmin(), "custom-machine-config-pool.yaml").Create(
"-p", fmt.Sprintf("NAME=%s", infraName),
)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Call CreateCustomMCP


exutil.By("Create custom MCP with rhel-9 configured")
defer DeleteCustomMCP(oc.AsAdmin(), mcp1Name)
mcp1, err := CreateCustomMCP(oc.AsAdmin(), mcp1Name, 1)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CreateCustomMCPWithStream(oc.AsAdmin(), mcp2Name, OSImageStreamRHEL9, 1)

// No osstream field is set - MCP will inherit from worker pool automatically
// Nodes will be taken from the worker pool
// The Custom pool will be created d using the same osstream configured in the worker pool
func CreateCustomMCP(oc *exutil.CLI, name string, numNodes int) (*MachineConfigPool, error) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The original CreateCustomMCP was like this

func CreateCustomMCP(oc *exutil.CLI, name string, numNodes int) (*MachineConfigPool, error) {
        var (
                wMcp = NewMachineConfigPool(oc.AsAdmin(), MachineConfigPoolWorker)
        )

        workerNodes, err := wMcp.GetNodes()
        if err != nil {
                return NewMachineConfigPool(oc, name), err
        }

        if numNodes > len(workerNodes) {
                return NewMachineConfigPool(oc, name), fmt.Errorf("A %d nodes custom pool cannot be created because there are only %d nodes in the %s pool",
                        numNodes, len(workerNodes), wMcp.GetName())
        }

        return CreateCustomMCPByNodes(oc, name, workerNodes[0:numNodes])
}

Now instead of CreateCustomMCPByNodes we have CreateCustomMCPWithStreamByNodes.

Let's CreateCustomMCPWithStreamByNodes decide which template to use depending on if the stream is empty or it has a value.

"Controller should not log unsupported stream message after changing to supported stream")
logger.Infof("Controller correctly stopped logging unsupported stream for %s", duplicateMsName)
logger.Infof("OK!\n")
})
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will the CPMS check be done in another test case? Is this new test case created in polarion?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OCP-88995 added this TC

@ptalgulk01 ptalgulk01 force-pushed the add-osimagestream-tests branch 2 times, most recently from fade1cc to 81f5ab9 Compare May 8, 2026 12:27
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@test/extended-priv/machineconfigpool.go`:
- Around line 1229-1263: CreateCustomMCP currently returns immediately after
NewMCOTemplate(...).Create when numNodes == 0, so the MCP may not be reconciled;
update CreateCustomMCP to wait for the MCP to reach the reconciled/Updated=True
state after creating the resource by invoking the MachineConfigPool wait helper
(e.g., call customMcp.WaitForComplete() or the existing wait method on the
MachineConfigPool returned by NewMachineConfigPool) and return any error from
that wait before returning the customMcp; keep the existing behavior when
numNodes > 0 (where subsequent AddNodesToMachineConfigPool covers
reconciliation).

In `@test/extended-priv/mco_osimagestream.go`:
- Around line 531-532: The test uses
Eventually(controller.GetLogs...).ShouldNot(ContainSubstring(unsupportedStreamLogMsg))
which can pass trivially because controller.IgnoreLogsBeforeNowOrFail() clears
the cursor; replace the Eventually assertion with Consistently to assert absence
over time: call Consistently(controller.GetLogs, "2m",
"10s").ShouldNot(o.ContainSubstring(unsupportedStreamLogMsg)) (keep the existing
controller.IgnoreLogsBeforeNowOrFail() and the same duration/interval values) so
the test actually verifies the message never appears during the 2-minute window.
- Around line 524-534: Filter the MachineSet list returned by
NewMachineSetList(oc.AsAdmin(), MachineAPINamespace).GetAllOrFail() to pick a
worker MachineSet (match label
"machine.openshift.io/cluster-api-machine-role=worker") instead of using
existingMsList[0], and after calling
targetMs.AddOSStreamLabel(OSImageStreamRHEL9) register a cleanup to remove that
label (e.g., defer calling the corresponding remove/unset method on targetMs or
using targetMs.DeleteLabel/Update to clear
machineconfiguration.openshift.io/osstream) so the label does not persist after
the test; reference NewMachineSetList, GetAllOrFail, existingMsList, targetMs,
and AddOSStreamLabel to locate the changes.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: a29a1dde-57c9-4240-8748-597180a46e0f

📥 Commits

Reviewing files that changed from the base of the PR and between 043492b and 81f5ab9.

📒 Files selected for processing (4)
  • test/extended-priv/machineconfigpool.go
  • test/extended-priv/machineset.go
  • test/extended-priv/mco_osimagestream.go
  • test/extended-priv/util.go

Comment thread test/extended-priv/machineconfigpool.go
Comment thread test/extended-priv/mco_osimagestream.go
Comment thread test/extended-priv/mco_osimagestream.go Outdated
@ptalgulk01 ptalgulk01 force-pushed the add-osimagestream-tests branch from 81f5ab9 to 707a9aa Compare May 12, 2026 08:46
@ptalgulk01 ptalgulk01 force-pushed the add-osimagestream-tests branch from 707a9aa to 01678a9 Compare May 25, 2026 15:30
Comment thread test/extended-priv/machineconfigpool.go Outdated
// No nodes to add; wait for the MCP to reach Updated=True before returning
if err := customMcp.WaitImmediateForUpdatedStatus(); err != nil {
return customMcp, fmt.Errorf("the %s MCP did not reach Updated=True after creation: %w", name, err)
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's avoid duplicated code by just calling CreateCustomMCPWithStream with an empty osstream

This function called CreateCustomMCPWithStreamByNodes, was it a problem?

Comment thread test/extended-priv/machineconfigpool.go Outdated
Comment on lines +1256 to +1268
// Choose template based on whether osstream is provided
if osstream == "" {
// Use basic template without osImageStream
err = NewMCOTemplate(oc, "custom-machine-config-pool.yaml").Create(
"-p", fmt.Sprintf("NAME=%s", name),
)
} else {
// Use template with osImageStream
err = NewMCOTemplate(oc, "custom-machine-config-pool-osimagestream.yaml").Create(
"-p", fmt.Sprintf("NAME=%s", name),
"-p", fmt.Sprintf("OSIMAGESTREAM=%s", osstream),
)
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is already done in CreateCustomMCPWithStreamByNodes, right?

We should delay the template until the very end

logger.Infof("OK!\n")

exutil.By("Validate the new MachineOSBuild succeeds and changes are applied")
ValidateSuccessfulMOSC(mosc, nil)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't we check that the rhel10 OS is used?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agree, thankyou for pointing out

Comment thread test/extended-priv/mco_osimagestream.go Outdated
)

exutil.By("Create custom pool")
customMcp, err := CreateCustomMCP(oc.AsAdmin(), mcpName, 1)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe if we use this function this test can be run in SNO and HA clusters

GetCompactCompatibleOrCustomPool

Comment thread test/extended-priv/mco_osimagestream.go Outdated
)

exutil.By("Enable CPMS for managed boot images")
defer machineConfig.RemoveManagedBootImagesConfig()
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should better restart the original configuraiton instead of removing what we added.

Comment thread test/extended-priv/mco_osimagestream.go Outdated
})

// AI-assisted: Test case to verify boot image controller logs for unsupported osstream
g.It("[PolarionID:88708][OTP] Verify osstream logs triggered for the boot image controller [Disruptive] [apigroup:machineconfiguration.openshift.io] [apigroup:machine.openshift.io]", g.Label("Platform:aws"), func() {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need g.Label("Platform:aws")? Is this test case only supported on AWS?

If so we should add the skip function too.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I checked it is not needed to have specific to AWS, I think during automation claude might have added it.

Comment thread test/extended-priv/mco_osimagestream.go Outdated
controller.IgnoreLogsBeforeNowOrFail()
o.Expect(targetMs.AddOSStreamLabel(OSImageStreamRHEL9)).To(o.Succeed(), "Error setting osstream label on MachineSet")
defer func() {
if err := targetMs.RemoveLabel("machineconfiguration.openshift.io/osstream"); err != nil {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is better to restore the initial content. Maybe the MS had this same label with a different value.

@ptalgulk01 ptalgulk01 force-pushed the add-osimagestream-tests branch from 01678a9 to 53e3e41 Compare May 26, 2026 10:55
Comment thread test/extended-priv/mco_osimagestream.go Outdated
exutil.By("Enable CPMS for managed boot images")
initialManagedBootImages, err := machineConfig.Get(`{.spec.managedBootImages}`)
o.Expect(err).NotTo(o.HaveOccurred(), "Error getting initial managedBootImages config")
defer machineConfig.Patch("merge", fmt.Sprintf(`{"spec":{"managedBootImages":%s}}`, initialManagedBootImages))
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's easier if we do

defer machineConfiguration.SetSpec(machineConfiguration.GetSpecOrFail())

Add OCP-88122 test case to validate that custom MachineConfigPools
dynamically inherit osImageStream from worker pool when not explicitly
configured.

Test validates three scenarios:
1. Custom MCPs inherit worker pool's default osImageStream (rhel-9)
2. Custom MCP created before worker pool osImageStream change inherits
   the new value (rhel-10) dynamically
3. Custom MCPs can override with explicit osImageStream values

Also adds AddNodeToPool helper method to simplify adding nodes to
custom pools during testing.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
@ptalgulk01 ptalgulk01 force-pushed the add-osimagestream-tests branch from 53e3e41 to d7bd2b1 Compare May 26, 2026 11:54
@sergiordlr
Copy link
Copy Markdown
Contributor

/lgtm

@openshift-ci openshift-ci Bot added the lgtm Indicates that a PR is ready to be merged. label May 26, 2026
@openshift-merge-bot
Copy link
Copy Markdown
Contributor

Scheduling tests matching the pipeline_run_if_changed or not excluded by pipeline_skip_if_only_changed parameters:
/test e2e-aws-ovn
/test e2e-aws-ovn-upgrade
/test e2e-gcp-op-ocl-part1
/test e2e-gcp-op-ocl-part2
/test e2e-gcp-op-part1
/test e2e-gcp-op-part2
/test e2e-gcp-op-single-node
/test e2e-hypershift

@openshift-ci
Copy link
Copy Markdown
Contributor

openshift-ci Bot commented May 26, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: ptalgulk01, sergiordlr
Once this PR has been reviewed and has the lgtm label, please assign dkhater-redhat for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci
Copy link
Copy Markdown
Contributor

openshift-ci Bot commented May 26, 2026

@ptalgulk01: all tests passed!

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

lgtm Indicates that a PR is ready to be merged.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants