[MCO-1997] Add test for osImageStream#5881
Conversation
|
Pipeline controller notification For optional jobs, comment This repository is configured in: LGTM mode |
|
Caution Review failedFailed to post review comments WalkthroughAdds a reusable helper to label nodes into custom MachineConfigPools and wait for MCP status; updates MCP creation helpers to optionally omit spec.osImageStream (so pools can inherit worker osImageStream); adds a MachineSet osstream label helper, a test skip helper for Compact/SNO clusters, and five disruptive Ginkgo tests validating osImageStream behaviors and controller logging. ChangesMachineConfigPool helpers, MachineSet helper, and osImageStream tests
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related PRs
Suggested labels
Suggested reviewers
🚥 Pre-merge checks | ✅ 9 | ❌ 3❌ Failed checks (3 warnings)
✅ Passed checks (9 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 4
🧹 Nitpick comments (2)
test/extended-priv/mco_osimagestream.go (2)
317-319: RedundantwaitForCompleteafterAddNodeToPool.
AddNodeToPoolalready callsWaitImmediateForUpdatedStatusinternally, so this explicitwaitForCompleteadds only additional wait time without additional verification. If the extra wait is intentional to guard against the TDZ-style race flagged inAddNodeToPool, prefer hardening the helper instead of layering waits at each call site.♻️ Proposed trim
exutil.By("Add a node to infra pool") err = infraMcp.AddNodeToPool(mcp.GetSortedNodesOrFail()[0]) o.Expect(err).NotTo(o.HaveOccurred(), "Error adding node to infra pool") logger.Infof("OK!\n") - exutil.By("Wait for infra pool to complete update with the new node") - infraMcp.waitForComplete() - logger.Infof("OK!\n") - exutil.By("Verify node in infra pool inherited rhel-10 from worker pool") validateOsImageStreamInPool(infraMcp, osis, OSImageStreamRHEL10)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/extended-priv/mco_osimagestream.go` around lines 317 - 319, The call site uses infraMcp.waitForComplete() immediately after AddNodeToPool, which is redundant because AddNodeToPool already invokes WaitImmediateForUpdatedStatus; remove the extra infraMcp.waitForComplete() call (or the line invoking logger.Infof("OK!\n") should remain but without the redundant wait) and instead, if guarding against the TDZ-style race is required, harden AddNodeToPool/WaitImmediateForUpdatedStatus itself (update AddNodeToPool to perform any additional verification or retry logic around WaitImmediateForUpdatedStatus) so callers like this test (using infraMcp and AddNodeToPool) do not need to layer waits.
469-481: Tighten test robustness against controller log message changes.The test hardcodes
"machineset tc-%s-dup has unsupported stream"at line 475, while the actual controller log is"machineset %s has unsupported stream: %v, skipping boot image update"(perpkg/controller/bootimage/ms_helpers.go:126). UsingContainSubstringat lines 505/518 masks this coupling—the test currently passes because its substring is contained in the actual log, but if the controller reformats the message (capitalization, phrasing, additional fields), the test becomes brittle and may silently break.For consistency with line 541, which already uses the looser
"has unsupported stream"substring, consider updating lines 475/505/518 to match on a stable substring anchor (e.g.,duplicateMsName + " has unsupported stream"or just"has unsupported stream") rather than the full hardcoded message.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/extended-priv/mco_osimagestream.go` around lines 469 - 481, Replace the overly specific expected log string in the test and assertions with a stable substring anchor: update the variable unsupportedStreamLogMsg (and any assertions referencing it) to use a looser match such as duplicateMsName + " has unsupported stream" or simply "has unsupported stream", and ensure the ContainSubstring checks against controller log output (where NewController, duplicateMsName and unsupportedStreamLogMsg are used) use that substring to avoid brittle exact-format coupling with ms_helpers.go log text.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@test/extended-priv/machineconfigpool.go`:
- Around line 1552-1577: In AddNodeToPool, avoid the race where a pre-labeled
node inflates currentNodes and causes WaitForMachineCount to hang: first call
node.GetLabel("node-role.kubernetes.io/<pool>") (use mcp.GetName() to build the
key) and only call node.AddLabel(...) if the label is absent; after successfully
adding the label (or if you detect a fresh addition), wait for the MCP to see
the new machine with mcp.WaitForMachineCount(expectedCount, ...), then instead
of immediately calling WaitImmediateForUpdatedStatus, poll the MCP for
Updating==True (or assert updatedMachineCount == machineCount) as a handshake
before asserting Updated==True (so replace or precede the
WaitImmediateForUpdatedStatus call with a short WaitForUpdating/condition check
to ensure reconciliation started).
In `@test/extended-priv/mco_osimagestream.go`:
- Around line 456-466: After asserting the MCP is degraded, remove the
conflicting MachineConfig (call the same removal used in cleanup, e.g. invoke
mc.DeleteWithWait() or DeleteCustomMCP()) and then explicitly assert recovery by
polling the customMcp: use o.Eventually(customMcp, "5m",
"20s").Should(HaveConditionField("RenderDegraded", "status", FalseString)) and
also check Updated condition true (e.g. HaveConditionField("Updated", "status",
TrueString)) to ensure the MCP transitions back to healthy; place these
assertions after deleting the MC and before relying on deferred cleanup to catch
regressions.
- Around line 517-521: The test currently uses Eventually(controller.GetLogs,
"2m", "10s").ShouldNot(o.ContainSubstring(unsupportedStreamLogMsg)) which can
pass immediately because controller.IgnoreLogsBeforeNowOrFail() clears logs;
change this to Consistently(controller.GetLogs, "2m",
"10s").ShouldNot(o.ContainSubstring(unsupportedStreamLogMsg)) so the assertion
verifies the unsupportedStreamLogMsg never appears for the entire duration;
update the call site where controller.GetLogs and unsupportedStreamLogMsg are
referenced and ensure the same polling interval arguments are used with
Consistently instead of Eventually.
- Around line 483-492: The code selects the first MachineSet from existingMsList
and doesn't handle delete errors; update the selection to filter only worker
MachineSets by calling
ByLabel("machine.openshift.io/cluster-api-machine-role=worker") on the
NewMachineSetList(oc.AsAdmin(), MachineAPINamespace) result before using
existingMsList[0], and change the deferred cleanup to call duplicateMs.Delete()
and log any returned error (e.g., check the error returned by
duplicateMs.Delete() and logger.Errorf with duplicateMsName and the error)
instead of discarding it.
---
Nitpick comments:
In `@test/extended-priv/mco_osimagestream.go`:
- Around line 317-319: The call site uses infraMcp.waitForComplete() immediately
after AddNodeToPool, which is redundant because AddNodeToPool already invokes
WaitImmediateForUpdatedStatus; remove the extra infraMcp.waitForComplete() call
(or the line invoking logger.Infof("OK!\n") should remain but without the
redundant wait) and instead, if guarding against the TDZ-style race is required,
harden AddNodeToPool/WaitImmediateForUpdatedStatus itself (update AddNodeToPool
to perform any additional verification or retry logic around
WaitImmediateForUpdatedStatus) so callers like this test (using infraMcp and
AddNodeToPool) do not need to layer waits.
- Around line 469-481: Replace the overly specific expected log string in the
test and assertions with a stable substring anchor: update the variable
unsupportedStreamLogMsg (and any assertions referencing it) to use a looser
match such as duplicateMsName + " has unsupported stream" or simply "has
unsupported stream", and ensure the ContainSubstring checks against controller
log output (where NewController, duplicateMsName and unsupportedStreamLogMsg are
used) use that substring to avoid brittle exact-format coupling with
ms_helpers.go log text.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: d3499c12-6325-4124-af9e-d28cce7052b6
📒 Files selected for processing (2)
test/extended-priv/machineconfigpool.gotest/extended-priv/mco_osimagestream.go
| // AddNodeToPool adds a node to the MCP and waits for the MCP to recognize it | ||
| func (mcp *MachineConfigPool) AddNodeToPool(node *Node) error { | ||
| currentNodes, err := mcp.GetNodes() | ||
| if err != nil { | ||
| return fmt.Errorf("error getting current nodes from %s: %w", mcp.GetName(), err) | ||
| } | ||
| expectedCount := len(currentNodes) + 1 | ||
|
|
||
| err = node.AddLabel(fmt.Sprintf("node-role.kubernetes.io/%s", mcp.GetName()), "") | ||
| if err != nil { | ||
| return fmt.Errorf("error labeling node %s to add it to pool %s: %w", node.GetName(), mcp.GetName(), err) | ||
| } | ||
| logger.Infof("Node %s added to pool %s", node.GetName(), mcp.GetName()) | ||
|
|
||
| err = mcp.WaitForMachineCount(expectedCount, 5*time.Minute) | ||
| if err != nil { | ||
| return fmt.Errorf("the %s MCP is not reporting the expected machine count: %w", mcp.GetName(), err) | ||
| } | ||
|
|
||
| err = mcp.WaitImmediateForUpdatedStatus() | ||
| if err != nil { | ||
| return fmt.Errorf("the %s MCP is not updated: %w", mcp.GetName(), err) | ||
| } | ||
|
|
||
| return nil | ||
| } |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
ast-grep --pattern $'func (n *Node) HasLabel($_, $_) $_'
rg -nP --type=go -C2 '\bfunc \(.*Node\) (Has|Get)Label\b'Repository: openshift/machine-config-operator
Length of output: 59
🏁 Script executed:
# Find the Node type definition and its methods
fd -e go test/extended-priv/
head -100 test/extended-priv/machineconfigpool.go
# Search for Node type definition
rg -A 20 "^type Node struct" test/extended-priv/
# Search for any label-related methods on Node
rg "func.*Node.*Label" test/extended-priv/
# Check what methods Node actually has
rg "func \(.*\*?Node\)" test/extended-priv/ | head -30Repository: openshift/machine-config-operator
Length of output: 7984
🏁 Script executed:
# Also search for where AddLabel is implemented/used elsewhere
rg -B 5 -A 10 "AddLabel" test/extended-priv/ | head -80Repository: openshift/machine-config-operator
Length of output: 5884
🏁 Script executed:
# Check how WaitImmediateForUpdatedStatus is implemented
rg -A 15 "func.*WaitImmediateForUpdatedStatus" test/extended-priv/
# Check WaitForMachineCount implementation
rg -A 15 "func.*WaitForMachineCount" test/extended-priv/
# Check GetNodes to understand what it returns
rg -A 10 "func.*GetNodes\b" test/extended-priv/Repository: openshift/machine-config-operator
Length of output: 5386
Consider the race condition with immediate polling and pre-existing node membership.
Two minor concerns with this helper:
-
Pre-existing membership timeout: If the node already has the
node-role.kubernetes.io/<pool>label, it will already be included incurrentNodesand the expectation oflen(currentNodes) + 1machines will never be satisfied, causingWaitForMachineCountto time out. A check usingGetLabel()before labeling would avoid this unnecessary wait. -
Race with
WaitImmediateForUpdatedStatus: The immediate polling (immediate=true) meansWaitImmediateForUpdatedStatuscan observe the staleUpdated=Truestatus from a prior quiesced state, returning success before the MCP controller has actually begun reconciling the newly-added node. WhileWaitForMachineCounthelps bridge this, consider explicitly waiting forUpdating=Truefirst as a handshake before checking forUpdated=True, or useupdatedMachineCount == machineCountas a synchronization signal.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@test/extended-priv/machineconfigpool.go` around lines 1552 - 1577, In
AddNodeToPool, avoid the race where a pre-labeled node inflates currentNodes and
causes WaitForMachineCount to hang: first call
node.GetLabel("node-role.kubernetes.io/<pool>") (use mcp.GetName() to build the
key) and only call node.AddLabel(...) if the label is absent; after successfully
adding the label (or if you detect a fresh addition), wait for the MCP to see
the new machine with mcp.WaitForMachineCount(expectedCount, ...), then instead
of immediately calling WaitImmediateForUpdatedStatus, poll the MCP for
Updating==True (or assert updatedMachineCount == machineCount) as a handshake
before asserting Updated==True (so replace or precede the
WaitImmediateForUpdatedStatus call with a short WaitForUpdating/condition check
to ensure reconciliation started).
| exutil.By(fmt.Sprintf("Create duplicate MachineSet %s from existing worker MachineSet", duplicateMsName)) | ||
| existingMsList := NewMachineSetList(oc.AsAdmin(), MachineAPINamespace).GetAllOrFail() | ||
| o.Expect(len(existingMsList)).To(o.BeNumerically(">", 0), "At least one MachineSet should exist") | ||
| duplicateMs, err = existingMsList[0].Duplicate(duplicateMsName) | ||
| o.Expect(err).NotTo(o.HaveOccurred(), "Error duplicating MachineSet") | ||
| defer func() { | ||
| logger.Infof("Cleaning up duplicate MachineSet %s", duplicateMsName) | ||
| duplicateMs.Delete() | ||
| }() | ||
| logger.Infof("OK!\n") |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Find how MachineSet role is inspected elsewhere
rg -nP --type=go -C2 '(cluster-api-machine-role|machine\.openshift\.io/cluster-api-machine-role)'
ast-grep --pattern $'func ($_ *MachineSet) $_() $$$'Repository: openshift/machine-config-operator
Length of output: 6496
🏁 Script executed:
#!/bin/bash
# Search for MachineSet.Get() usages to understand field path format
rg -nP --type=go 'ms\.Get\(' -A2 -B2
# Search for worker role references
rg -nP --type=go '(worker|cluster-api-machine-role)' --glob='*test*' | head -30Repository: openshift/machine-config-operator
Length of output: 11298
🏁 Script executed:
#!/bin/bash
# Find MachineSet.Duplicate() and Delete() methods
rg -nP --type=go 'func \(.*MachineSet.*\) (Duplicate|Delete)' -A3
# Search for existing worker filter patterns or helper functions
rg -nP --type=go 'NewMachineSetList|GetAllOrFail|ByLabel' -A2 -B2 | head -50Repository: openshift/machine-config-operator
Length of output: 3954
🏁 Script executed:
#!/bin/bash
# Find the Delete() method for MachineSet
rg -nP --type=go 'func \(.*MachineSet.*\) Delete' -A3
# Search for ByLabel method on MachineSetList
rg -nP --type=go 'func.*MachineSetList.*ByLabel' -A3
# Check CloneResource to understand Duplicate's implementation
rg -nP --type=go 'func CloneResource' -A10Repository: openshift/machine-config-operator
Length of output: 852
🏁 Script executed:
#!/bin/bash
# Find Delete method more broadly
rg -nP --type=go 'Delete.*\(\)' test/extended-priv/machineset.go -B2 -A5
# Also search in resource.go for Delete method
rg -nP --type=go 'func \(.*\) Delete' test/extended-priv/resource.go -B2 -A5
# Look for MachineSetList ByLabel
rg -nP --type=go '\.ByLabel' test/extended-priv/machineset.go -B2 -A2
# Check if Delete is called elsewhere in the codebase
rg -nP --type=go 'duplicateMs\.Delete\(\)|\.Delete\(\)' test/extended-priv/ -B1 -A1Repository: openshift/machine-config-operator
Length of output: 16087
🏁 Script executed:
#!/bin/bash
# Find ByLabel implementation on ResourceList or MachineSetList
rg -nP --type=go 'func.*ByLabel' test/extended-priv/ -A3
# Check what labels are used to identify worker MachineSets
rg -nP --type=go 'worker.*MachineSet|MachineSet.*worker' test/extended-priv/ -B2 -A2
# Look for role-based filtering in MachineSet selection
rg -nP --type=go 'cluster-api-machine-role' test/extended-priv/ -B2 -A2Repository: openshift/machine-config-operator
Length of output: 2623
🏁 Script executed:
#!/bin/bash
# Check if there are other instances of filtering worker MachineSets
rg -nP --type=go 'MachineSetList.*GetAll|NewMachineSetList' test/extended-priv/ -A5 | head -40
# Look for worker pool constants
rg -nP --type=go 'MachineConfigPoolWorker|MachineAPINamespace' test/extended-priv/const.go -B2 -A2Repository: openshift/machine-config-operator
Length of output: 3399
Filter MachineSet selection to worker role and handle the cleanup deletion error.
Two issues in this block:
-
existingMsList[0]picks the first MachineSet in API order, not necessarily a worker. Clusters with infra or other MachineSets will cause subsequent assertions tied to osstream labels to fail or produce misleading results. The comment itself says "from existing worker MachineSet"—filter explicitly using:existingMsList.ByLabel("machine.openshift.io/cluster-api-machine-role=worker")
-
The deferred
duplicateMs.Delete()silently discards errors. If deletion fails, the duplicate MachineSet and its provisioned machines leak into subsequent tests. Log the error:if derr := duplicateMs.Delete(); derr != nil { logger.Errorf("Failed to delete duplicate MachineSet %s: %v", duplicateMsName, derr) }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@test/extended-priv/mco_osimagestream.go` around lines 483 - 492, The code
selects the first MachineSet from existingMsList and doesn't handle delete
errors; update the selection to filter only worker MachineSets by calling
ByLabel("machine.openshift.io/cluster-api-machine-role=worker") on the
NewMachineSetList(oc.AsAdmin(), MachineAPINamespace) result before using
existingMsList[0], and change the deferred cleanup to call duplicateMs.Delete()
and log any returned error (e.g., check the error returned by
duplicateMs.Delete() and logger.Errorf with duplicateMsName and the error)
instead of discarding it.
| } | ||
|
|
||
| // AddNodeToPool adds a node to the MCP and waits for the MCP to recognize it | ||
| func (mcp *MachineConfigPool) AddNodeToPool(node *Node) error { |
There was a problem hiding this comment.
We can't add nodes to a master pool, for example. So we can't define this function as a general method in a MCP.
It would be better if we write it as an independent function accepting a MCP as parameter, and not as a method in the MCP struct.
This code is very similar to the one in CreateCustomMCPWithStreamByNodes
We can make the function to add several nodes, not only one, to the poo and reuse it in CreateCustomMCPWithStreamByNodes avoiding duplicated code.
| g.Skip("The cluster is SNO/Compact. This test cannot be executed in SNO/Compact clusters") | ||
| } |
There was a problem hiding this comment.
Let's use the SkipIfCompactOrSNO function that is used in the private repo. It seems it hasn't been migrated yet, but it should be migrated anyway.
// SkipIfCompactOrSNO skips the test case if the cluster is a compact or SNO cluster
func SkipIfCompactOrSNO(oc *exutil.CLI) {
if IsCompactOrSNOCluster(oc) {
g.Skip("The test is not supported in Compact or SNO clusters")
}
}
| if IsCompactOrSNOCluster(oc.AsAdmin()) { | ||
| g.Skip("The cluster is SNO/Compact. This test cannot be executed in SNO/Compact clusters") | ||
| } |
There was a problem hiding this comment.
Same, let's migrate and use the SkipIfCompactOrSNO function
| o.Eventually(func() string { | ||
| currentMOSB, err := mosc.GetCurrentMachineOSBuild() | ||
| if err != nil { | ||
| return "" | ||
| } | ||
| return currentMOSB.GetName() | ||
| }, "5m", "20s").ShouldNot(o.Equal(initialMOSBName), "A new MachineOSBuild should be triggered after osImageStream change") |
There was a problem hiding this comment.
It is cleaner if we don't swallow the error silently
o.Eventually(func() (string, error) {
mosb, err := mosc.GetCurrentMachineOSBuild()
if err != nil {
return "", err
}
return mosb.GetName(), nil
}, "5m", "20s").ShouldNot(o.Equal(initialMOSBName),
"A new MachineOSBuild should be triggered after osImageStream change")
| if IsCompactOrSNOCluster(oc.AsAdmin()) { | ||
| g.Skip("The cluster is SNO/Compact. This test cannot be executed in SNO/Compact clusters") | ||
| } |
There was a problem hiding this comment.
Same let's use the SkipIfCompactOrSNO function
| logger.Infof("OK!\n") | ||
|
|
||
| exutil.By(fmt.Sprintf("Set osstream label to %s on MachineSet", OSImageStreamRHEL10)) | ||
| err = duplicateMs.Patch("merge", fmt.Sprintf(`{"metadata":{"labels":{"machineconfiguration.openshift.io/osstream":"%s"}}}`, OSImageStreamRHEL10)) |
There was a problem hiding this comment.
We need a method for this, so that it can be reused in future tests.
We can use AdLabel instead. But I think it would be better to use even a more specific method so that we don't have to care about the name of the label.
duplicateMs.AddOSStreamLabel(...)
Then this function can call "AddLabel".
| // Reset log checkpoint before changing to supported stream to verify the new behavior | ||
| exutil.By("Change osstream label to rhel-9 and verify no unsupported stream log is triggered") | ||
| controller.IgnoreLogsBeforeNowOrFail() | ||
| err = duplicateMs.Patch("merge", fmt.Sprintf(`{"metadata":{"labels":{"machineconfiguration.openshift.io/osstream":"%s"}}}`, OSImageStreamRHEL9)) |
| exutil.By("Enable CPMS for managed boot images") | ||
| cpms = NewControlPlaneMachineSet(oc.AsAdmin(), MachineAPINamespace, ControlPlaneMachineSetName) | ||
| o.Expect(cpms.Exists()).To(o.BeTrue(), "ControlPlaneMachineSet should exist for this test") | ||
|
|
||
| machineConfig = GetMachineConfiguration(oc.AsAdmin()) | ||
| defer machineConfig.RemoveManagedBootImagesConfig() | ||
|
|
||
| o.Expect( | ||
| machineConfig.SetAllManagedBootImagesConfig(ControlPlaneMachineSetResource), | ||
| ).To(o.Succeed(), "Error enabling CPMS for managed boot images") | ||
| logger.Infof("OK!\n") | ||
|
|
There was a problem hiding this comment.
CPMS and Machinesets are two different functionalities.
We should use a different test case for the CPMS
| err = NewMCOTemplate(oc.AsAdmin(), "custom-machine-config-pool.yaml").Create( | ||
| "-p", fmt.Sprintf("NAME=%s", infraName), | ||
| ) |
|
|
||
| exutil.By("Create custom MCP with rhel-9 configured") | ||
| defer DeleteCustomMCP(oc.AsAdmin(), mcp1Name) | ||
| mcp1, err := CreateCustomMCP(oc.AsAdmin(), mcp1Name, 1) |
There was a problem hiding this comment.
CreateCustomMCPWithStream(oc.AsAdmin(), mcp2Name, OSImageStreamRHEL9, 1)
043492b to
e91a679
Compare
| // No osstream field is set - MCP will inherit from worker pool automatically | ||
| // Nodes will be taken from the worker pool | ||
| // The Custom pool will be created d using the same osstream configured in the worker pool | ||
| func CreateCustomMCP(oc *exutil.CLI, name string, numNodes int) (*MachineConfigPool, error) { |
There was a problem hiding this comment.
The original CreateCustomMCP was like this
func CreateCustomMCP(oc *exutil.CLI, name string, numNodes int) (*MachineConfigPool, error) {
var (
wMcp = NewMachineConfigPool(oc.AsAdmin(), MachineConfigPoolWorker)
)
workerNodes, err := wMcp.GetNodes()
if err != nil {
return NewMachineConfigPool(oc, name), err
}
if numNodes > len(workerNodes) {
return NewMachineConfigPool(oc, name), fmt.Errorf("A %d nodes custom pool cannot be created because there are only %d nodes in the %s pool",
numNodes, len(workerNodes), wMcp.GetName())
}
return CreateCustomMCPByNodes(oc, name, workerNodes[0:numNodes])
}
Now instead of CreateCustomMCPByNodes we have CreateCustomMCPWithStreamByNodes.
Let's CreateCustomMCPWithStreamByNodes decide which template to use depending on if the stream is empty or it has a value.
| "Controller should not log unsupported stream message after changing to supported stream") | ||
| logger.Infof("Controller correctly stopped logging unsupported stream for %s", duplicateMsName) | ||
| logger.Infof("OK!\n") | ||
| }) |
There was a problem hiding this comment.
Will the CPMS check be done in another test case? Is this new test case created in polarion?
There was a problem hiding this comment.
OCP-88995 added this TC
fade1cc to
81f5ab9
Compare
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@test/extended-priv/machineconfigpool.go`:
- Around line 1229-1263: CreateCustomMCP currently returns immediately after
NewMCOTemplate(...).Create when numNodes == 0, so the MCP may not be reconciled;
update CreateCustomMCP to wait for the MCP to reach the reconciled/Updated=True
state after creating the resource by invoking the MachineConfigPool wait helper
(e.g., call customMcp.WaitForComplete() or the existing wait method on the
MachineConfigPool returned by NewMachineConfigPool) and return any error from
that wait before returning the customMcp; keep the existing behavior when
numNodes > 0 (where subsequent AddNodesToMachineConfigPool covers
reconciliation).
In `@test/extended-priv/mco_osimagestream.go`:
- Around line 531-532: The test uses
Eventually(controller.GetLogs...).ShouldNot(ContainSubstring(unsupportedStreamLogMsg))
which can pass trivially because controller.IgnoreLogsBeforeNowOrFail() clears
the cursor; replace the Eventually assertion with Consistently to assert absence
over time: call Consistently(controller.GetLogs, "2m",
"10s").ShouldNot(o.ContainSubstring(unsupportedStreamLogMsg)) (keep the existing
controller.IgnoreLogsBeforeNowOrFail() and the same duration/interval values) so
the test actually verifies the message never appears during the 2-minute window.
- Around line 524-534: Filter the MachineSet list returned by
NewMachineSetList(oc.AsAdmin(), MachineAPINamespace).GetAllOrFail() to pick a
worker MachineSet (match label
"machine.openshift.io/cluster-api-machine-role=worker") instead of using
existingMsList[0], and after calling
targetMs.AddOSStreamLabel(OSImageStreamRHEL9) register a cleanup to remove that
label (e.g., defer calling the corresponding remove/unset method on targetMs or
using targetMs.DeleteLabel/Update to clear
machineconfiguration.openshift.io/osstream) so the label does not persist after
the test; reference NewMachineSetList, GetAllOrFail, existingMsList, targetMs,
and AddOSStreamLabel to locate the changes.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: a29a1dde-57c9-4240-8748-597180a46e0f
📒 Files selected for processing (4)
test/extended-priv/machineconfigpool.gotest/extended-priv/machineset.gotest/extended-priv/mco_osimagestream.gotest/extended-priv/util.go
81f5ab9 to
707a9aa
Compare
707a9aa to
01678a9
Compare
| // No nodes to add; wait for the MCP to reach Updated=True before returning | ||
| if err := customMcp.WaitImmediateForUpdatedStatus(); err != nil { | ||
| return customMcp, fmt.Errorf("the %s MCP did not reach Updated=True after creation: %w", name, err) | ||
| } |
There was a problem hiding this comment.
Let's avoid duplicated code by just calling CreateCustomMCPWithStream with an empty osstream
This function called CreateCustomMCPWithStreamByNodes, was it a problem?
| // Choose template based on whether osstream is provided | ||
| if osstream == "" { | ||
| // Use basic template without osImageStream | ||
| err = NewMCOTemplate(oc, "custom-machine-config-pool.yaml").Create( | ||
| "-p", fmt.Sprintf("NAME=%s", name), | ||
| ) | ||
| } else { | ||
| // Use template with osImageStream | ||
| err = NewMCOTemplate(oc, "custom-machine-config-pool-osimagestream.yaml").Create( | ||
| "-p", fmt.Sprintf("NAME=%s", name), | ||
| "-p", fmt.Sprintf("OSIMAGESTREAM=%s", osstream), | ||
| ) | ||
| } |
There was a problem hiding this comment.
This is already done in CreateCustomMCPWithStreamByNodes, right?
We should delay the template until the very end
| logger.Infof("OK!\n") | ||
|
|
||
| exutil.By("Validate the new MachineOSBuild succeeds and changes are applied") | ||
| ValidateSuccessfulMOSC(mosc, nil) |
There was a problem hiding this comment.
Shouldn't we check that the rhel10 OS is used?
There was a problem hiding this comment.
agree, thankyou for pointing out
| ) | ||
|
|
||
| exutil.By("Create custom pool") | ||
| customMcp, err := CreateCustomMCP(oc.AsAdmin(), mcpName, 1) |
There was a problem hiding this comment.
Maybe if we use this function this test can be run in SNO and HA clusters
GetCompactCompatibleOrCustomPool
| ) | ||
|
|
||
| exutil.By("Enable CPMS for managed boot images") | ||
| defer machineConfig.RemoveManagedBootImagesConfig() |
There was a problem hiding this comment.
We should better restart the original configuraiton instead of removing what we added.
| }) | ||
|
|
||
| // AI-assisted: Test case to verify boot image controller logs for unsupported osstream | ||
| g.It("[PolarionID:88708][OTP] Verify osstream logs triggered for the boot image controller [Disruptive] [apigroup:machineconfiguration.openshift.io] [apigroup:machine.openshift.io]", g.Label("Platform:aws"), func() { |
There was a problem hiding this comment.
Do we need g.Label("Platform:aws")? Is this test case only supported on AWS?
If so we should add the skip function too.
There was a problem hiding this comment.
I checked it is not needed to have specific to AWS, I think during automation claude might have added it.
| controller.IgnoreLogsBeforeNowOrFail() | ||
| o.Expect(targetMs.AddOSStreamLabel(OSImageStreamRHEL9)).To(o.Succeed(), "Error setting osstream label on MachineSet") | ||
| defer func() { | ||
| if err := targetMs.RemoveLabel("machineconfiguration.openshift.io/osstream"); err != nil { |
There was a problem hiding this comment.
It is better to restore the initial content. Maybe the MS had this same label with a different value.
01678a9 to
53e3e41
Compare
| exutil.By("Enable CPMS for managed boot images") | ||
| initialManagedBootImages, err := machineConfig.Get(`{.spec.managedBootImages}`) | ||
| o.Expect(err).NotTo(o.HaveOccurred(), "Error getting initial managedBootImages config") | ||
| defer machineConfig.Patch("merge", fmt.Sprintf(`{"spec":{"managedBootImages":%s}}`, initialManagedBootImages)) |
There was a problem hiding this comment.
It's easier if we do
defer machineConfiguration.SetSpec(machineConfiguration.GetSpecOrFail())
Add OCP-88122 test case to validate that custom MachineConfigPools dynamically inherit osImageStream from worker pool when not explicitly configured. Test validates three scenarios: 1. Custom MCPs inherit worker pool's default osImageStream (rhel-9) 2. Custom MCP created before worker pool osImageStream change inherits the new value (rhel-10) dynamically 3. Custom MCPs can override with explicit osImageStream values Also adds AddNodeToPool helper method to simplify adding nodes to custom pools during testing. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
53e3e41 to
d7bd2b1
Compare
|
/lgtm |
|
Scheduling tests matching the |
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: ptalgulk01, sergiordlr The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
|
@ptalgulk01: all tests passed! Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
Test Cases Added
OCP-88122: Validate osImageStream inheritance for custom MCPs
Validates dynamic inheritance behavior in custom MachineConfigPools:
OCP-88708: Verify osstream logs for boot image controller
Validates boot image controller logging for unsupported osImageStream values:
OCP-88203: Verify MOSB triggered when osImageStream is patched
Validates MachineOSBuild resource creation when osImageStream changes:
OCP-88365: Verify MCP degraded when osstream and osImageURL MC applied
Validates error handling for conflicting osstream and osImageURL configurations:
Implementation Details
custom-machine-config-pool.yamltemplate for MCPs without osImageStream field (dynamic inheritance)AddNodeToPool()helper method to simplify node addition to custom poolsNewController().GetLogs()for boot image controller log verificationSummary by CodeRabbit