OCPBUGS-86803: e2e: refactor multi-NUMA skip conditions in functests#1528
OCPBUGS-86803: e2e: refactor multi-NUMA skip conditions in functests#1528oblau wants to merge 1 commit into
Conversation
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Repository: openshift/coderabbit/.coderabbit.yaml Review profile: CHILL Plan: Enterprise Run ID: 📒 Files selected for processing (2)
💤 Files with no reviewable changes (1)
🚧 Files skipped from review as they are similar to previous changes (1)
WalkthroughUpdate e2e test precondition gating: LLC-aware CPU pinning tests stop skipping when nodes report fewer than two NUMA nodes; offline-CPU tests adjust NUMA/core-based skips (one test now skips only on empty topology or first NUMA node with <20 cores; another removes the <2-NUMA skip; a third adds clarifying comment). ChangesPrerequisite Gating Updates
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes 🚥 Pre-merge checks | ✅ 14 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (14 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Warning There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure. 🔧 golangci-lint (2.12.2)level=error msg="Running error: context loading failed: failed to load packages: failed to load packages: failed to load with go/packages: err: exit status 1: stderr: go: inconsistent vendoring in :\n\tgithub.com/RHsyseng/operator-utils@v1.4.13: is explicitly required in go.mod, but not marked as explicit in vendor/modules.txt\n\tgithub.com/coreos/go-systemd@v0.0.0-20191104093116-d3cd4ed1dbcf: is explicitly required in go.mod, but not marked as explicit in vendor/modules.txt\n\tgithub.com/coreos/ignition@v0.35.0: is explicitly required in go.mod, but not marked as explicit in vendor/modules.txt\n\tgithub.com/coreos/ignition/v2@v2.26.0: is explicitly required in go.mod, but not marked as explicit in vendor/modules.txt\n\tgithub.com/docker/go-units@v0.5.0: is explicitly required in go.mod, but not marked as explicit in vendor/modules.txt\n\tgithub.com/go-logr/stdr@v1.2.2: is explicitly required in go.mod, but not marked as explicit in vendor/modules.txt\n\tgithub.com/google/go-cmp@v0.7.0 ... [truncated 19340 characters] ... is replaced in go.mod, but not marked as replaced in vendor/modules.txt\n\tk8s.io/legacy-cloud-providers: is replaced in go.mod, but not marked as replaced in vendor/modules.txt\n\tk8s.io/metrics: is replaced in go.mod, but not marked as replaced in vendor/modules.txt\n\tk8s.io/mount-utils: is replaced in go.mod, but not marked as replaced in vendor/modules.txt\n\tk8s.io/pod-security-admission: is replaced in go.mod, but not marked as replaced in vendor/modules.txt\n\tk8s.io/sample-apiserver: is replaced in go.mod, but not marked as replaced in vendor/modules.txt\n\tgithub.com/onsi/ginkgo/v2: is replaced in go.mod, but not marked as replaced in vendor/modules.txt\n\n\tTo ignore the vendor directory, use -mod=readonly or -mod=mod.\n\tTo sync the vendor directory, run:\n\t\tgo mod vendor\n" Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: oblau The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
|
Actionable comments posted: 0 |
This is not a proxy for VM detection but rather detection of Baremetal |
|
@oblau: This pull request references Jira Issue OCPBUGS-86803, which is valid. The bug has been moved to the POST state. 3 validation(s) were run on this bug
The bug has been updated to refer to the pull request using the external bug tracker. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
/jira refresh |
|
@oblau: This pull request references Jira Issue OCPBUGS-86803, which is valid. 3 validation(s) were run on this bug
DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
/verified by oblau |
|
@oblau: The DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
/verified by oblau |
|
@oblau: This PR has been marked as verified by DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
Tal-or
left a comment
There was a problem hiding this comment.
I'm fine with the change in general if it has a benefit,
but please always explain in the PR+commit the rational for a change.
Right now, it might be clear why we went with a specific change, but as time goes by we eventually forget why we did what we did.
Hence, it become crucial to understand and document the reasons behind the decisions we're making now, for the sake of our future selves :)
| Skip("This test needs systems with at least 20 cores per socket") | ||
| onlineCPUs, err := nodes.GetOnlineCPUsSet(context.TODO(), &workerRTNodes[0]) | ||
| Expect(err).ToNot(HaveOccurred()) | ||
| if onlineCPUs.Size() < 24 { |
There was a problem hiding this comment.
Those numbers are seems completely random for reviewers, it would be better to add a comment and/or constant that explain the reason for this specific number.
There was a problem hiding this comment.
Reverted the cpu changes.
Now the scope of the pr is clear.
Thanks for the constructive criticism
| numaTopology := copyNumaCoreSiblings(numaCoreSiblings) | ||
| onlineCPUs, err := nodes.GetOnlineCPUsSet(context.TODO(), &workerRTNodes[0]) | ||
| Expect(err).ToNot(HaveOccurred()) | ||
| if onlineCPUs.Size() < 60 { |
454feb2 to
b299ffe
Compare
|
@oblau: This pull request references Jira Issue OCPBUGS-86803, which is valid. 3 validation(s) were run on this bug
DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
were used for checking if vm or bm Signed-off-by: oblau <oblau@redhat.com>
b299ffe to
3474a34
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In
`@test/e2e/performanceprofile/functests/2_performance_update/updating_profile.go`:
- Around line 780-781: The current precondition only checks numaCoreSiblings
length and may still yield a single contiguous offlinedSet; update the test to
ensure the generated offlined CPU selection actually contains multiple disjoint
ranges before proceeding: after constructing offlinedSet (the variable that
builds the offline CPU list) assert that its string form contains multiple
ranges (e.g., contains a comma or multiple '-' segments) or alternatively
strengthen the topology guard to require at least two disjoint NUMA/core ranges;
if the check fails, call Skip with a clear message so the test only runs when
offlinedSet will produce multiple ranges.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 784b4316-9523-4b23-8ea4-cacca66f4b12
📒 Files selected for processing (2)
test/e2e/performanceprofile/functests/13_llc/llc.gotest/e2e/performanceprofile/functests/2_performance_update/updating_profile.go
💤 Files with no reviewable changes (1)
- test/e2e/performanceprofile/functests/13_llc/llc.go
| if len(numaCoreSiblings) == 0 || len(numaCoreSiblings[0]) < 20 { | ||
| Skip("This test needs systems with at least 20 cores per socket") |
There was a problem hiding this comment.
The new skip no longer guarantees this test actually uses multiple offline ranges.
Line 780 only checks that NUMA node 0 has 20 cores, but the constructed offlinedSet can still collapse to a single contiguous range on single-socket or contiguous CPU-ID layouts. That means this test may pass without covering the “multiple ranges” case its name is asserting. Please either keep a topology precondition that guarantees disjoint ranges or add an explicit assertion that the generated offline CPU string contains multiple ranges before updating the profile.
Suggested guard
offlinedSet := performancev2.CPUSet(offlined.String())
+ Expect(strings.Count(string(offlinedSet), ",")).To(BeNumerically(">", 0),
+ "test setup must generate multiple offline CPU ranges")
profile, err := profiles.GetByNodeLabels(testutils.NodeSelectorLabels)
Expect(err).ToNot(HaveOccurred())🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In
`@test/e2e/performanceprofile/functests/2_performance_update/updating_profile.go`
around lines 780 - 781, The current precondition only checks numaCoreSiblings
length and may still yield a single contiguous offlinedSet; update the test to
ensure the generated offlined CPU selection actually contains multiple disjoint
ranges before proceeding: after constructing offlinedSet (the variable that
builds the offline CPU list) assert that its string form contains multiple
ranges (e.g., contains a comma or multiple '-' segments) or alternatively
strengthen the topology guard to require at least two disjoint NUMA/core ranges;
if the check fails, call Skip with a clear message so the test only runs when
offlinedSet will produce multiple ranges.
|
/verified by oblau |
|
@oblau: This PR has been marked as verified by DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
Tests used "skip if <2 NUMA nodes" as a heuristic to detect bare-metal.
Changes:
llc.go:
updating_profile.go:
Summary by CodeRabbit