Skip to content

OCPBUGS-86299: skip upstream LB tests on dual-stack and patch NLB service IPFamilyPolicy#466

Draft
mtulio wants to merge 1 commit into
openshift:mainfrom
mtulio:OCPBUGS-86299
Draft

OCPBUGS-86299: skip upstream LB tests on dual-stack and patch NLB service IPFamilyPolicy#466
mtulio wants to merge 1 commit into
openshift:mainfrom
mtulio:OCPBUGS-86299

Conversation

@mtulio
Copy link
Copy Markdown
Contributor

@mtulio mtulio commented May 21, 2026

Summary

Upstream cloud-provider-aws load balancer tests do not support dual-stack clusters — they create single-stack NLB services that fail when the cluster network requires dual-stack IPFamilyPolicy. This PR detects dual-stack configuration from the CCM cloud-config and adapts the test suite accordingly.

Changes

  • Exclude upstream LB tests on dual-stack clusters: Detect dual-stack at startup by reading the ipFamilies key from the cloud-config ConfigMap. When the cluster is dual-stack, upstream [cloud-provider-aws-e2e] loadbalancer tests are excluded from the spec selector. When detection fails, upstream LB tests are also excluded (fail-closed) to avoid false positives.

  • Patch NLB services for dual-stack: The downstream createServiceNLB helper now sets IPFamilyPolicy=RequireDualStack on NLB services when dual-stack is detected, so the AWSServiceLBNetworkSecurityGroup tests create services matching the cluster's network configuration.

  • Add IsDualStack helper: New common.IsDualStack() function parses the ipFamilies config key from the cloud-config ConfigMap and returns true when both IPv4 and IPv6 are present.

Dependencies

Blocked by #464 — this PR builds on top of the HyperShift e2e fixes which introduce GetCloudConfig(), IsConfigPresentCloudConfig(), and the topology-aware cloud-config retrieval required by the dual-stack detection.

Test plan

  • Verify upstream LB tests are excluded on dual-stack clusters (list tests should not contain [cloud-provider-aws-e2e] loadbalancer entries)
  • Verify upstream LB tests are included on single-stack clusters
  • Verify AWSServiceLBNetworkSecurityGroup tests create dual-stack NLB services on dual-stack clusters
  • Verify no behavior change on single-stack standalone and HyperShift clusters

🤖 Generated with Claude Code


Summary by CodeRabbit

  • New Features

    • Detects dual-stack clusters and selectively runs appropriate load balancer tests.
  • Bug Fixes & Improvements

    • Tests now use shared helpers to read cloud configuration and more reliably detect NLB security group mode.
    • Improved cloud-config parsing and dual-stack detection to reduce flaky load balancer tests.
  • Refactor

    • Consolidated cloud-config and cluster-check helpers to reduce duplication.
  • Chores

    • Promoted a dependency to a direct requirement.

@openshift-ci-robot openshift-ci-robot added jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. labels May 21, 2026
@openshift-ci-robot
Copy link
Copy Markdown

@mtulio: This pull request references Jira Issue OCPBUGS-86299, which is valid. The bug has been moved to the POST state.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (5.0.0) matches configured target version for branch (5.0.0)
  • bug is in the state ASSIGNED, which is one of the valid states (NEW, ASSIGNED, POST)

The bug has been updated to refer to the pull request using the external bug tracker.

Details

In response to this:

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 21, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review

Walkthrough

Adds shared CCM cloud-config helpers and kube/OpenShift client builders, detects dual-stack at test startup to conditionally filter specs, and refactors AWS NLB tests to use helpers and set Service.IPFamilyPolicy for dual-stack clusters.

Changes

NLB test infrastructure refactoring for cloud-config and dual-stack support

Layer / File(s) Summary
Cloud-config and client helpers
openshift-tests/ccm-aws-tests/e2e/common/helper.go
Adds GetKubeClient(ctx) / GetOcClient(ctx) and GetCloudConfig, IsConfigPresentCloudConfig, IsNLBSecurityGroupModeManaged, and IsDualStack for parsing CCM cloud-config.
Test init and conditional spec selection
openshift-tests/ccm-aws-tests/main.go
Imports common, loads cloud-config at startup, sets package dual-stack flags via common.IsDualStack, and excludes upstream AWS loadbalancer specs when dual-stack is detected.
NLB tests updated to use shared helpers
openshift-tests/ccm-aws-tests/e2e/aws/loadbalancer.go
Imports common, removes local cloud-config constants, refactors NLBSecurityGroupMode test to use common.GetCloudConfig/IsNLBSecurityGroupModeManaged, and updates createServiceNLB to detect dual-stack and set IPFamilyPolicy = RequireDualStack before Service creation.
Dependency promotion
openshift-tests/ccm-aws-tests/go.mod
Promotes github.com/openshift/api from indirect to direct requirement.

Sequence Diagram

sequenceDiagram
  participant main as main (test runner)
  participant common as openshift-tests/ccm-aws-tests/e2e/common
  participant kube as Kubernetes API (ConfigMap/Service)
  participant tests as NLB test code

  main->>common: GetCloudConfig(ctx)
  common->>kube: GET ConfigMap openshift-cloud-controller-manager/cloud-conf
  kube-->>common: ConfigMap
  common->>common: IsDualStack(cm)
  common-->>main: isDualStack bool
  main->>tests: build spec selectors (exclude upstream loadbalancer when isDualStack)
  tests->>common: GetCloudConfig(ctx) / IsNLBSecurityGroupModeManaged(cm)
  tests->>common: IsDualStack(cm)
  tests->>kube: Services(...).Create(...) with IPFamilyPolicy=RequireDualStack (if dual-stack)
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~22 minutes

Suggested labels

lgtm

Suggested reviewers

  • racheljpg
  • theobarberbany
🚥 Pre-merge checks | ✅ 9 | ❌ 3

❌ Failed checks (3 warnings)

Check name Status Explanation Resolution
Test Structure And Quality ⚠️ Warning Test quality issues: 5 typos ("creatomg" vs "creating") in By() statements; IsDualStack lacks case normalization for IPv4/IPv6, causing silent failures. Fix typos in By() statements (lines 111, 158, 246, 357, 440) and apply strings.ToLower() in IsDualStack switch for case-insensitive IP family matching.
Microshift Test Compatibility ⚠️ Warning Tests reference unavailable openshift-cloud-controller-manager namespace without MicroShift protection mechanisms (apigroup tag, skip label, or runtime checks). Add [apigroup:operator.openshift.io] tag to test name or add exutil.IsMicroShiftCluster() guard since openshift-cloud-controller-manager namespace does not exist on MicroShift.
Single Node Openshift (Sno) Test Compatibility ⚠️ Warning Six new Ginkgo e2e tests in loadbalancer.go lack SNO compatibility guards ([Skipped:SingleReplicaTopology] or runtime topology checks). Add [Skipped:SingleReplicaTopology] label to test names or guard with exutil.IsSingleNode() check if tests require multi-node clusters.
✅ Passed checks (9 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title directly references the ticket (OCPBUGS-86299) and clearly summarizes the two main changes: skipping upstream LB tests on dual-stack and patching NLB service IPFamilyPolicy.
Docstring Coverage ✅ Passed Docstring coverage is 88.24% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Stable And Deterministic Test Names ✅ Passed All test names use static constants and strings. No dynamic values, pod names, timestamps, UUIDs, IPs, or node names in any test definitions.
Topology-Aware Scheduling Compatibility ✅ Passed PR modifies only E2E test code (openshift-tests/ccm-aws-tests/), not operator/controller code or deployment manifests. No topology-incompatible scheduling constraints are introduced.
Ote Binary Stdout Contract ✅ Passed All logging in process-level code (main.go) uses either logrus (which defaults to stderr) or framework.Logf (which uses ginkgo.GinkgoWriter). No stdout writes detected.
Ipv6 And Disconnected Network Test Compatibility ✅ Passed No new Ginkgo tests are added. Changes only modify existing tests and add helper functions without IPv4 assumptions or external connectivity.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Comment @coderabbitai help to get the list of available commands and usage tips.

@openshift-ci openshift-ci Bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label May 21, 2026
@openshift-ci
Copy link
Copy Markdown
Contributor

openshift-ci Bot commented May 21, 2026

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@openshift-ci
Copy link
Copy Markdown
Contributor

openshift-ci Bot commented May 21, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign mdbooth for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci-robot
Copy link
Copy Markdown

@mtulio: This pull request references Jira Issue OCPBUGS-86299, which is valid.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (5.0.0) matches configured target version for branch (5.0.0)
  • bug is in the state POST, which is one of the valid states (NEW, ASSIGNED, POST)
Details

In response to this:

Summary by CodeRabbit

  • Documentation

  • Updated AWS Cloud Controller Manager e2e test guide with prerequisites, build instructions, and test execution examples for standalone and HyperShift clusters.

  • New Features

  • Added dual-stack configuration detection and support for Network Load Balancers.

  • Added test-skipping capability for management-cluster-dependent tests in HyperShift environments.

  • Bug Fixes

  • Improved AWS region resolution with fallback to cluster Infrastructure resource.

  • Fixed regional endpoint handling for AWS client connectivity from management clusters.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@mtulio mtulio changed the title OCPBUGS-86299: e2e/ccm-aws-ote: evaluate DualStack scenarios OCPBUGS-86299: skip upstream LB tests on dual-stack and patch NLB service IPFamilyPolicy May 21, 2026
@openshift-ci-robot
Copy link
Copy Markdown

@mtulio: This pull request references Jira Issue OCPBUGS-86299, which is valid.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (5.0.0) matches configured target version for branch (5.0.0)
  • bug is in the state POST, which is one of the valid states (NEW, ASSIGNED, POST)

The bug has been updated to refer to the pull request using the external bug tracker.

Details

In response to this:

Summary

Upstream cloud-provider-aws load balancer tests do not support dual-stack clusters — they create single-stack NLB services that fail when the cluster network requires dual-stack IPFamilyPolicy. This PR detects dual-stack configuration from the CCM cloud-config and adapts the test suite accordingly.

Changes

  • Exclude upstream LB tests on dual-stack clusters: Detect dual-stack at startup by reading the ipFamilies key from the cloud-config ConfigMap. When the cluster is dual-stack, upstream [cloud-provider-aws-e2e] loadbalancer tests are excluded from the spec selector. When detection fails, upstream LB tests are also excluded (fail-closed) to avoid false positives.

  • Patch NLB services for dual-stack: The downstream createServiceNLB helper now sets IPFamilyPolicy=RequireDualStack on NLB services when dual-stack is detected, so the AWSServiceLBNetworkSecurityGroup tests create services matching the cluster's network configuration.

  • Add IsDualStack helper: New common.IsDualStack() function parses the ipFamilies config key from the cloud-config ConfigMap and returns true when both IPv4 and IPv6 are present.

Dependencies

Blocked by #464 — this PR builds on top of the HyperShift e2e fixes which introduce GetCloudConfig(), IsConfigPresentCloudConfig(), and the topology-aware cloud-config retrieval required by the dual-stack detection.

Test plan

  • Verify upstream LB tests are excluded on dual-stack clusters (list tests should not contain [cloud-provider-aws-e2e] loadbalancer entries)
  • Verify upstream LB tests are included on single-stack clusters
  • Verify AWSServiceLBNetworkSecurityGroup tests create dual-stack NLB services on dual-stack clusters
  • Verify no behavior change on single-stack standalone and HyperShift clusters

🤖 Generated with Claude Code


Summary by CodeRabbit

  • Documentation

  • Updated AWS Cloud Controller Manager e2e test guide with prerequisites, build instructions, and test execution examples for standalone and HyperShift clusters.

  • New Features

  • Added dual-stack configuration detection and support for Network Load Balancers.

  • Added test-skipping capability for management-cluster-dependent tests in HyperShift environments.

  • Bug Fixes

  • Improved AWS region resolution with fallback to cluster Infrastructure resource.

  • Fixed regional endpoint handling for AWS client connectivity from management clusters.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@mtulio
Copy link
Copy Markdown
Contributor Author

mtulio commented May 21, 2026

/payload-job periodic-ci-openshift-release-main-nightly-4.22-e2e-aws-ovn-installer-dualstack-ipv6-primary-techpreview

@openshift-ci
Copy link
Copy Markdown
Contributor

openshift-ci Bot commented May 21, 2026

@mtulio: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

  • periodic-ci-openshift-release-main-nightly-4.22-e2e-aws-ovn-installer-dualstack-ipv6-primary-techpreview

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/dfe82140-5551-11f1-976e-fc5230192bd3-0

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🧹 Nitpick comments (1)
openshift-tests/ccm-aws-tests/main.go (1)

50-65: 💤 Low value

Consider using a context with timeout for dual-stack detection.

The detection uses context.TODO() which has no timeout. If the cluster is unreachable or slow, this could block indefinitely during test initialization. A bounded context would make startup behavior more predictable.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@openshift-tests/ccm-aws-tests/main.go` around lines 50 - 65, Replace the
unbounded context.TODO() used in the dual-stack detection call chain with a
bounded context: create a context with timeout (e.g.
context.WithTimeout(context.Background(), 30*time.Second)), defer its cancel,
and pass that context into common.GetCloudConfig instead of context.TODO();
ensure error handling/log messages for the LoadConfig -> kclientset.NewForConfig
-> common.GetCloudConfig -> common.IsDualStack sequence still log the error
(including timeout) and that isDualStackCluster and dualStackDetectionReady are
only set when detection completes successfully, and keep the existing
log.Debugf("Dual-stack cluster detected: %v", isDualStackCluster) after
successful detection.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@docs/dev/e2e-ote-ccm-aws.md`:
- Around line 166-168: The quoted test name in the HyperShift run example passed
to the run-test command is missing a closing quote around Managed; update the
test string used in the
./openshift-tests/bin/cloud-controller-manager-aws-tests-ext run-test invocation
(the long bracketed test selector starting with
"[cloud-provider-aws-e2e-openshift] ... with 'Managed value in cloud-config") to
close the single quote around Managed (e.g., "with 'Managed' value in
cloud-config ...") so the quoted test name is well-formed.

In `@openshift-tests/ccm-aws-tests/e2e/common/helper.go`:
- Around line 161-164: The error returned when fetching the ConfigMap via
mgmtClient.CoreV1().ConfigMaps(hcpNamespace).Get(...) drops the underlying err;
update the return to include/wrap the original error (e.g., append ": %w" or
include err with %v) so the failure from Get is preserved and actionable; change
the fmt.Errorf call that currently formats "failed to get HCP cloud-config
ConfigMap %s/%s" to include the err and wrap it using the original err variable
(referencing hcpNamespace and hcpCloudConfigName to locate the site).
- Around line 151-159: The error returns for BuildConfigFromFlags and
NewForConfig discard the underlying error; update the error wrapping in the
blocks where restConfig is built (clientcmd.BuildConfigFromFlags with
mgmtKubeconfig) and where mgmtClient is created (clientset.NewForConfig) to
include the original err (e.g. use fmt.Errorf with %w or include %v) so callers
see the underlying failure details; locate the two error branches around
restConfig and mgmtClient and wrap or append err to the returned fmt.Errorf
messages.

---

Nitpick comments:
In `@openshift-tests/ccm-aws-tests/main.go`:
- Around line 50-65: Replace the unbounded context.TODO() used in the dual-stack
detection call chain with a bounded context: create a context with timeout (e.g.
context.WithTimeout(context.Background(), 30*time.Second)), defer its cancel,
and pass that context into common.GetCloudConfig instead of context.TODO();
ensure error handling/log messages for the LoadConfig -> kclientset.NewForConfig
-> common.GetCloudConfig -> common.IsDualStack sequence still log the error
(including timeout) and that isDualStackCluster and dualStackDetectionReady are
only set when detection completes successfully, and keep the existing
log.Debugf("Dual-stack cluster detected: %v", isDualStackCluster) after
successful detection.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: ec60e4a6-62e6-4798-a902-08de766c83c7

📥 Commits

Reviewing files that changed from the base of the PR and between 7f6aa93 and 1d39590.

📒 Files selected for processing (7)
  • docs/dev/e2e-ote-ccm-aws.md
  • docs/dev/ote-ccm-aws.md
  • openshift-tests/ccm-aws-tests/e2e/aws/helper.go
  • openshift-tests/ccm-aws-tests/e2e/aws/loadbalancer.go
  • openshift-tests/ccm-aws-tests/e2e/common/helper.go
  • openshift-tests/ccm-aws-tests/go.mod
  • openshift-tests/ccm-aws-tests/main.go
💤 Files with no reviewable changes (1)
  • docs/dev/ote-ccm-aws.md

Comment thread docs/dev/e2e-ote-ccm-aws.md Outdated
Comment on lines +166 to +168
./openshift-tests/bin/cloud-controller-manager-aws-tests-ext run-test \
"[cloud-provider-aws-e2e-openshift] loadbalancer NLB [OCPFeatureGate:AWSServiceLBNetworkSecurityGroup] should have NLBSecurityGroupMode with 'Managed value in cloud-config [Suite:openshift/conformance/parallel]"
```
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Fix malformed quoted test name in the HyperShift run example.

The example test string appears to miss a closing quote around Managed (with 'Managed value...), which can cause failed copy/paste execution for run-test.

Suggested doc fix
 ./openshift-tests/bin/cloud-controller-manager-aws-tests-ext run-test \
-  "[cloud-provider-aws-e2e-openshift] loadbalancer NLB [OCPFeatureGate:AWSServiceLBNetworkSecurityGroup] should have NLBSecurityGroupMode with 'Managed value in cloud-config [Suite:openshift/conformance/parallel]"
+  "[cloud-provider-aws-e2e-openshift] loadbalancer NLB [OCPFeatureGate:AWSServiceLBNetworkSecurityGroup] should have NLBSecurityGroupMode with 'Managed' value in cloud-config [Suite:openshift/conformance/parallel]"
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
./openshift-tests/bin/cloud-controller-manager-aws-tests-ext run-test \
"[cloud-provider-aws-e2e-openshift] loadbalancer NLB [OCPFeatureGate:AWSServiceLBNetworkSecurityGroup] should have NLBSecurityGroupMode with 'Managed value in cloud-config [Suite:openshift/conformance/parallel]"
```
./openshift-tests/bin/cloud-controller-manager-aws-tests-ext run-test \
"[cloud-provider-aws-e2e-openshift] loadbalancer NLB [OCPFeatureGate:AWSServiceLBNetworkSecurityGroup] should have NLBSecurityGroupMode with 'Managed' value in cloud-config [Suite:openshift/conformance/parallel]"
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/dev/e2e-ote-ccm-aws.md` around lines 166 - 168, The quoted test name in
the HyperShift run example passed to the run-test command is missing a closing
quote around Managed; update the test string used in the
./openshift-tests/bin/cloud-controller-manager-aws-tests-ext run-test invocation
(the long bracketed test selector starting with
"[cloud-provider-aws-e2e-openshift] ... with 'Managed value in cloud-config") to
close the single quote around Managed (e.g., "with 'Managed' value in
cloud-config ...") so the quoted test name is well-formed.

Comment thread openshift-tests/ccm-aws-tests/e2e/common/helper.go Outdated
Comment thread openshift-tests/ccm-aws-tests/e2e/common/helper.go Outdated
@mtulio
Copy link
Copy Markdown
Contributor Author

mtulio commented May 26, 2026

Fixed the dualstack check from config, collecting data sets from CI again after running successfully locally in a dualstack cluster:

$ oc get cm -n openshift-cloud-controller-manager cloud-conf -o yaml | grep NodeIPFamilies
    NodeIPFamilies                                  = ipv6
    NodeIPFamilies                                  = ipv4

$ run_test(){
while IFS= read -r t; do
     echo "=== Running: $t";
     $BIN run-test "$t" < /dev/null;
 done < <($BIN list tests | grep -v I0526 | jq -r '.[].name' | grep "AWSServiceLBNetworkSecurityGroup"); }
$ run_test | tee -a e2e-ote.log

$ grep -E "(name\"\:|\"result\")" e2e-ote.log
grep: warning: stray \ before :
    "name": "[cloud-provider-aws-e2e-openshift] loadbalancer NLB [OCPFeatureGate:AWSServiceLBNetworkSecurityGroup] should have NLBSecurityGroupMode with 'Managed value in cloud-config [Suite:openshift/conformance/parallel]",
    "result": "passed",
    "name": "[cloud-provider-aws-e2e-openshift] loadbalancer NLB [OCPFeatureGate:AWSServiceLBNetworkSecurityGroup] should create NLB service with security group attached [Suite:openshift/conformance/parallel]",
    "result": "passed",
    "name": "[cloud-provider-aws-e2e-openshift] loadbalancer NLB [OCPFeatureGate:AWSServiceLBNetworkSecurityGroup] should have security groups attached to default ingress controller NLB [Suite:openshift/conformance/parallel]",
    "result": "passed",
    "name": "[cloud-provider-aws-e2e-openshift] loadbalancer NLB [OCPFeatureGate:AWSServiceLBNetworkSecurityGroup] should update security group rules when service is updated [Suite:openshift/conformance/parallel]",
    "result": "passed",
    "name": "[cloud-provider-aws-e2e-openshift] loadbalancer NLB [OCPFeatureGate:AWSServiceLBNetworkSecurityGroup] should cleanup security groups when service is deleted [Suite:openshift/conformance/parallel]",
    "result": "passed",
    "name": "[cloud-provider-aws-e2e-openshift] loadbalancer NLB [OCPFeatureGate:AWSServiceLBNetworkSecurityGroup] should have correct security group rules for service ports [Suite:openshift/conformance/parallel]",
    "result": "passed",

/payload-job periodic-ci-openshift-release-main-nightly-4.22-e2e-aws-ovn-installer-dualstack-ipv6-primary-techpreview

@openshift-ci
Copy link
Copy Markdown
Contributor

openshift-ci Bot commented May 26, 2026

@mtulio: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

  • periodic-ci-openshift-release-main-nightly-4.22-e2e-aws-ovn-installer-dualstack-ipv6-primary-techpreview

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/5d7de3c0-5916-11f1-825a-985e881b83e7-0

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@openshift-tests/ccm-aws-tests/e2e/common/helper.go`:
- Around line 263-268: The switch that checks the IP family string (switch v {
case "ipv6": ... case "ipv4": ... }) assigns hasIPv4 and hasIPv6 backwards;
change the case bodies so case "ipv6" sets hasIPv6 = true and case "ipv4" sets
hasIPv4 = true (leave the variable names hasIPv4 and hasIPv6 and the switch on v
unchanged).
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: c3489038-5d9a-40b5-89d2-5a570b967712

📥 Commits

Reviewing files that changed from the base of the PR and between 1d39590 and 48d1685.

📒 Files selected for processing (4)
  • openshift-tests/ccm-aws-tests/e2e/aws/helper.go
  • openshift-tests/ccm-aws-tests/e2e/aws/loadbalancer.go
  • openshift-tests/ccm-aws-tests/e2e/common/helper.go
  • openshift-tests/ccm-aws-tests/main.go

Comment thread openshift-tests/ccm-aws-tests/e2e/common/helper.go Outdated
Comment thread openshift-tests/ccm-aws-tests/main.go Outdated
@openshift-ci
Copy link
Copy Markdown
Contributor

openshift-ci Bot commented May 26, 2026

@tthvo: This PR was included in a payload test run from openshift/origin#31217
trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

  • periodic-ci-openshift-release-main-nightly-5.0-e2e-aws-ovn-installer-dualstack-ipv6-primary-techpreview

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/82f96140-593b-11f1-9177-54846f2a0e77-0

@openshift-ci
Copy link
Copy Markdown
Contributor

openshift-ci Bot commented May 26, 2026

@tthvo: This PR was included in a payload test run from openshift/origin#31217
trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

  • periodic-ci-openshift-release-main-nightly-5.0-e2e-aws-ovn-installer-dualstack-ipv4-primary-techpreview

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/8afa2d70-593b-11f1-94c3-bb46dcb523ca-0

@tthvo
Copy link
Copy Markdown
Member

tthvo commented May 26, 2026

/cc @sadasu @nrb @tthvo

@openshift-ci openshift-ci Bot requested review from nrb, sadasu and tthvo May 26, 2026 19:46
mtulio added a commit to mtulio/cluster-cloud-controller-manager-operator that referenced this pull request May 26, 2026
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
openshift-tests/ccm-aws-tests/main.go (1)

53-65: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Honor the fail-closed path when dual-stack detection is unavailable.

Because dualStackDetectionReady is never consulted after Line 63, any error in Lines 53-60 leaves isDualStackCluster == false, and Line 90 still appends the upstream loadbalancer selector. That means detection failures will run the unsupported upstream LB suite instead of skipping it.

Suggested fix
-	if isDualStackCluster {
-		framework.Logf("Dual-stack environment detected, skipping test name that contains '[cloud-provider-aws-e2e] loadbalancer'")
-	} else {
+	if !dualStackDetectionReady {
+		framework.Logf("Dual-stack detection was not available, skipping test name that contains '[cloud-provider-aws-e2e] loadbalancer'")
+	} else if isDualStackCluster {
+		framework.Logf("Dual-stack environment detected, skipping test name that contains '[cloud-provider-aws-e2e] loadbalancer'")
+	} else {
 		specSelectors = append(specSelectors, extensiontests.NameContains("[cloud-provider-aws-e2e] loadbalancer"))
 	}

Also applies to: 87-90

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@openshift-tests/ccm-aws-tests/main.go` around lines 53 - 65, The dual-stack
detection block (using framework.LoadConfig, kclientset.NewForConfig,
common.GetCloudConfig, common.IsDualStack) currently leaves isDualStackCluster
false on errors but still allows later code to behave as if detection succeeded;
change the logic so that dualStackDetectionReady is only set true when detection
completes successfully and ensure downstream code that appends the upstream
loadbalancer selector checks dualStackDetectionReady before using
isDualStackCluster (i.e., do not append the upstream LB selector unless
dualStackDetectionReady == true and isDualStackCluster == true); apply the same
check in the other occurrence mentioned (the append block around upstream LB
handling) so detection failures skip the unsupported upstream LB suite rather
than defaulting to non-dual-stack behavior.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@openshift-tests/ccm-aws-tests/main.go`:
- Around line 53-65: The dual-stack detection block (using framework.LoadConfig,
kclientset.NewForConfig, common.GetCloudConfig, common.IsDualStack) currently
leaves isDualStackCluster false on errors but still allows later code to behave
as if detection succeeded; change the logic so that dualStackDetectionReady is
only set true when detection completes successfully and ensure downstream code
that appends the upstream loadbalancer selector checks dualStackDetectionReady
before using isDualStackCluster (i.e., do not append the upstream LB selector
unless dualStackDetectionReady == true and isDualStackCluster == true); apply
the same check in the other occurrence mentioned (the append block around
upstream LB handling) so detection failures skip the unsupported upstream LB
suite rather than defaulting to non-dual-stack behavior.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 1ebae86b-b6e7-485c-a091-059d5ca08572

📥 Commits

Reviewing files that changed from the base of the PR and between 48d1685 and 3061a53.

📒 Files selected for processing (3)
  • openshift-tests/ccm-aws-tests/e2e/aws/loadbalancer.go
  • openshift-tests/ccm-aws-tests/e2e/common/helper.go
  • openshift-tests/ccm-aws-tests/main.go
💤 Files with no reviewable changes (1)
  • openshift-tests/ccm-aws-tests/e2e/common/helper.go

mtulio added a commit to mtulio/cluster-cloud-controller-manager-operator that referenced this pull request May 26, 2026
@mtulio mtulio force-pushed the OCPBUGS-86299 branch 3 times, most recently from 4fe49ef to ad1461f Compare May 26, 2026 21:36
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@openshift-tests/ccm-aws-tests/main.go`:
- Around line 53-57: The code ignores errors from common.IsDualStack and thus
can “fail-open”; change the detection flow so that when common.GetCloudConfig or
common.IsDualStack returns an error you set dualStackDetectionReady = false (and
do not set isDualStackCluster), logging the error, and only treat dual-stack as
present if dualStackDetectionReady == true && isDualStackCluster == true; update
the later conditional that currently checks only isDualStackCluster to require
dualStackDetectionReady as well so upstream LB specs are included only when
detection succeeded.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 68360838-876c-4b19-b1fb-c1aeaa10e6fe

📥 Commits

Reviewing files that changed from the base of the PR and between 3061a53 and 4fe49ef.

📒 Files selected for processing (4)
  • openshift-tests/ccm-aws-tests/e2e/aws/loadbalancer.go
  • openshift-tests/ccm-aws-tests/e2e/common/helper.go
  • openshift-tests/ccm-aws-tests/go.mod
  • openshift-tests/ccm-aws-tests/main.go

Comment thread openshift-tests/ccm-aws-tests/main.go
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (1)
openshift-tests/ccm-aws-tests/main.go (1)

53-57: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Fail-closed dual-stack gating is still bypassed on detection errors.

Line 56 discards common.IsDualStack errors and Line 81 gates only on isDualStackCluster. This can still include upstream LB specs when detection fails.

Suggested fix
-	if cm, err := common.GetCloudConfig(context.TODO(), nil); err != nil {
-		log.Debugf("failed to get cloud-config for dual-stack detection: %v", err)
-	} else {
-		isDualStackCluster, _ = common.IsDualStack(cm)
-		dualStackDetectionReady = true
-		log.Debugf("Dual-stack cluster detected: %v", isDualStackCluster)
-	}
+	if cm, err := common.GetCloudConfig(context.TODO(), nil); err != nil {
+		log.Warnf("failed to get cloud-config for dual-stack detection: %v", err)
+	} else if dual, err := common.IsDualStack(cm); err != nil {
+		log.Warnf("failed to detect dual-stack from cloud-config: %v", err)
+	} else {
+		isDualStackCluster = dual
+		dualStackDetectionReady = true
+		log.Debugf("Dual-stack cluster detected: %v", isDualStackCluster)
+	}
@@
-	if isDualStackCluster {
+	if !dualStackDetectionReady || isDualStackCluster {
 		framework.Logf("Dual-stack environment detected, skipping test name that contains '[cloud-provider-aws-e2e] loadbalancer'")
 	} else {
 		specSelectors = append(specSelectors, extensiontests.NameContains("[cloud-provider-aws-e2e] loadbalancer"))
 	}

Also applies to: 81-85

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@openshift-tests/ccm-aws-tests/main.go` around lines 53 - 57, If cloud-config
or dual-stack detection returns an error, fail-closed: treat the cluster as
dual-stack and mark detection as ready so downstream gating prevents unsafe
upstream LB specs. Concretely, in the block that calls
common.GetCloudConfig(...) and common.IsDualStack(...), do not discard errors
from common.IsDualStack; if IsDualStack returns an error, set isDualStackCluster
= true and dualStackDetectionReady = true (otherwise set isDualStackCluster to
the returned value and dualStackDetectionReady = true only on success). Also
update the later gating logic that uses
isDualStackCluster/dualStackDetectionReady to rely on dualStackDetectionReady to
decide readiness and to treat any detection-error case as dual-stack based on
the variables above.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@openshift-tests/ccm-aws-tests/e2e/common/helper.go`:
- Around line 190-198: The IsDualStack detection loop currently matches values
from the slice `values` case-sensitively and can miss "IPv4"/"IPv6" inputs;
before the switch in the loop that sets `hasIPv4` and `hasIPv6`, normalize each
`v` (e.g., strings.ToLower(v) or equivalent) and switch on the normalized value
so the cases "ipv4" and "ipv6" reliably match; update the loop that iterates
`for _, v := range values { ... switch v { ... } }` to switch on the lowercased
value.

---

Duplicate comments:
In `@openshift-tests/ccm-aws-tests/main.go`:
- Around line 53-57: If cloud-config or dual-stack detection returns an error,
fail-closed: treat the cluster as dual-stack and mark detection as ready so
downstream gating prevents unsafe upstream LB specs. Concretely, in the block
that calls common.GetCloudConfig(...) and common.IsDualStack(...), do not
discard errors from common.IsDualStack; if IsDualStack returns an error, set
isDualStackCluster = true and dualStackDetectionReady = true (otherwise set
isDualStackCluster to the returned value and dualStackDetectionReady = true only
on success). Also update the later gating logic that uses
isDualStackCluster/dualStackDetectionReady to rely on dualStackDetectionReady to
decide readiness and to treat any detection-error case as dual-stack based on
the variables above.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 5d4bb127-87d1-4e4f-a8bb-afd1f7772f12

📥 Commits

Reviewing files that changed from the base of the PR and between 4fe49ef and ad1461f.

📒 Files selected for processing (4)
  • openshift-tests/ccm-aws-tests/e2e/aws/loadbalancer.go
  • openshift-tests/ccm-aws-tests/e2e/common/helper.go
  • openshift-tests/ccm-aws-tests/go.mod
  • openshift-tests/ccm-aws-tests/main.go

Comment thread openshift-tests/ccm-aws-tests/e2e/common/helper.go
@mtulio
Copy link
Copy Markdown
Contributor Author

mtulio commented May 26, 2026

/test all

OCPBUGS-86299:

Upstream cloud-provider-aws load balancer tests do not support dual-stack
clusters yet. Detect dual-stack configuration from the cloud-config
(`NodeIPFamilies` key) at startup and exclude upstream LB tests when the
cluster is dual-stack. When detection fails, upstream LB tests are also
excluded to avoid false positives (fail-closed).

For the downstream AWSServiceLBNetworkSecurityGroup tests, patch the
NLB service with IPFamilyPolicy=RequireDualStack when dual-stack is
detected so the service matches the cluster's network configuration.
@openshift-ci
Copy link
Copy Markdown
Contributor

openshift-ci Bot commented May 27, 2026

@mtulio: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/vendor ad1461f link true /test vendor

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@mtulio
Copy link
Copy Markdown
Contributor Author

mtulio commented May 27, 2026

Overall changes isolated from #464, and PRs can move in parallel - without dependencies.

AFAICT ipFamilyPolicy with value RequiredDualStack when ipFamilies has IPv4 first in the list won't be affected by this change as we are detecting the presence of DualStack config, driven by NodeIPFamilies entries, letting CCM set those defaults, so this patch only care about ipFamilyPolicy preventing being set as SingleStack when IPv6 are first in the list (primary IPv6). Checking this statement with the job:

/test all
/payload-job periodic-ci-openshift-release-main-nightly-5.0-e2e-aws-ovn-installer-dualstack-ipv4-primary-techpreview

@openshift-ci
Copy link
Copy Markdown
Contributor

openshift-ci Bot commented May 27, 2026

@mtulio: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

  • periodic-ci-openshift-release-main-nightly-5.0-e2e-aws-ovn-installer-dualstack-ipv4-primary-techpreview

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/b7ea5e90-5972-11f1-9f2b-ec634464a62f-0

@mtulio
Copy link
Copy Markdown
Contributor Author

mtulio commented May 27, 2026

Triggering primary IPv6 as well:

/payload-job periodic-ci-openshift-release-main-nightly-5.0-e2e-aws-ovn-installer-dualstack-ipv6-primary-techpreview

@openshift-ci
Copy link
Copy Markdown
Contributor

openshift-ci Bot commented May 27, 2026

@mtulio: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

  • periodic-ci-openshift-release-main-nightly-5.0-e2e-aws-ovn-installer-dualstack-ipv6-primary-techpreview

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/11d2e120-5973-11f1-81d8-7538de235758-0

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants