OCPBUGS-81675: pkg/cvo/egress: trust user-ca-bundle in HyperShift mode#1370
OCPBUGS-81675: pkg/cvo/egress: trust user-ca-bundle in HyperShift mode#1370dpateriya wants to merge 4 commits intoopenshift:mainfrom
Conversation
In a HyperShift hosted cluster, HostedCluster.spec.additionalTrustBundle is synced by HCCO into openshift-config/user-ca-bundle, but that ConfigMap is never merged into openshift-config-managed/trusted-ca-bundle (which is the only source getTLSConfig() currently reads). As a result, when spec.updateService points to an update service whose TLS certificate is signed by an internal CA provided exclusively via additionalTrustBundle, CVO's Cincinnati client fails with: x509: certificate signed by unknown authority Fix: when running with --hypershift, also read openshift-config/user-ca-bundle via the already-wired cmConfigLister and merge it into the cert pool alongside trusted-ca-bundle. The new code path is strictly gated on optr.hypershift, so non-HyperShift clusters are completely unaffected. Add TestGetTLSConfig covering all combinations of bundle presence and the hypershift flag. Made-with: Cursor
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Repository YAML (base), Organization UI (inherited) Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (1)
WalkthroughRewrites Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
🧹 Nitpick comments (2)
pkg/cvo/egress.go (1)
109-113: Consider documenting the intentional omission ofMinVersion.Static analysis flags the missing
MinVersionfield. For a client connecting to Cincinnati/OSUS update services, Go's default (TLS 1.2) is typically acceptable and avoids breaking compatibility with services that don't support TLS 1.3. However, a brief comment explaining this choice would preempt future questions.📝 Optional: Add clarifying comment
if !found { return nil, nil } + // MinVersion intentionally omitted; Go defaults to TLS 1.2 for clients, + // which is acceptable for Cincinnati/OSUS connectivity and avoids + // breaking compatibility with update services that only support TLS 1.2. return &tls.Config{RootCAs: certPool}, nil🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/cvo/egress.go` around lines 109 - 113, Add a brief explanatory comment next to the creation of the tls.Config (the &tls.Config{RootCAs: certPool} return in pkg/cvo/egress.go) stating that MinVersion is intentionally omitted so the client uses Go's default TLS version (currently 1.2) to maintain compatibility with Cincinnati/OSUS update services; include a note that this choice is deliberate to avoid forcing TLS 1.3 and to preempt future questions from static analysis.pkg/cvo/egress_test.go (1)
166-219: Consider adding error-path test cases.The current tests cover happy paths and "not found" scenarios. Adding cases where the listers return non-NotFound errors would validate the error propagation paths in
getTLSConfig()(lines 81-82 and 98-99 in egress.go).🧪 Example error test case to add
{ name: "managed lister error propagates", hypershift: false, // Set up managedLister with Err field to simulate a non-NotFound error wantError: true, },This would require adding a
wantError boolfield to the test struct and adjusting the test logic to expect and verify the error.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/cvo/egress_test.go` around lines 166 - 219, Add tests for error paths by extending the test table to include a wantError bool and cases that simulate non-NotFound errors from the listers (set the Err field on the cmConfigLister used for managedLister and/or configLister). In the t.Run loop, when tt.wantError is true assert that optr.getTLSConfig() returns a non-nil error (and skip further nil checks), otherwise keep the existing assertions; reference the cmConfigLister used to populate Operator.cmConfigManagedLister and Operator.cmConfigLister and the getTLSConfig() call to locate where to make changes.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@pkg/cvo/egress_test.go`:
- Around line 166-219: Add tests for error paths by extending the test table to
include a wantError bool and cases that simulate non-NotFound errors from the
listers (set the Err field on the cmConfigLister used for managedLister and/or
configLister). In the t.Run loop, when tt.wantError is true assert that
optr.getTLSConfig() returns a non-nil error (and skip further nil checks),
otherwise keep the existing assertions; reference the cmConfigLister used to
populate Operator.cmConfigManagedLister and Operator.cmConfigLister and the
getTLSConfig() call to locate where to make changes.
In `@pkg/cvo/egress.go`:
- Around line 109-113: Add a brief explanatory comment next to the creation of
the tls.Config (the &tls.Config{RootCAs: certPool} return in pkg/cvo/egress.go)
stating that MinVersion is intentionally omitted so the client uses Go's default
TLS version (currently 1.2) to maintain compatibility with Cincinnati/OSUS
update services; include a note that this choice is deliberate to avoid forcing
TLS 1.3 and to preempt future questions from static analysis.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Organization UI (inherited)
Review profile: CHILL
Plan: Pro
Run ID: 8cd764c4-898f-4073-9843-37ae10219b5a
📒 Files selected for processing (2)
pkg/cvo/egress.gopkg/cvo/egress_test.go
|
@dpateriya: This pull request references Jira Issue OCPBUGS-81675, which is invalid:
Comment The bug has been updated to refer to the pull request using the external bug tracker. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
/jira refresh |
|
@dpateriya: This pull request references Jira Issue OCPBUGS-81675, which is valid. The bug has been moved to the POST state. 3 validation(s) were run on this bug
DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
/lgtm |
|
@dpateriya: you cannot LGTM your own PR. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: dpateriya The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
- Add comment explaining why tls.Config.MinVersion is intentionally omitted (Go defaults to TLS 1.2 for clients, acceptable for Cincinnati/OSUS and avoids breaking services not yet on TLS 1.3) - Add error-path test cases for non-NotFound lister errors from both cmConfigManagedLister and cmConfigLister to verify error propagation Made-with: Cursor
|
@PratikMahajan can you please review this PR? |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@pkg/cvo/egress.go`:
- Around line 75-77: The current code creates an empty certPool via
x509.NewCertPool() (certPool) and later assigns it to TLSClientConfig.RootCAs,
which replaces system roots; instead, initialize certPool from system roots
first (use x509.SystemCertPool() and fall back to a new pool if nil), then
append the user/managed bundles to that pool before assigning it to
TLSClientConfig.RootCAs so system trust is preserved; update the code paths that
call x509.NewCertPool() and the assignment to TLSClientConfig.RootCAs to use
this behavior.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Organization UI (inherited)
Review profile: CHILL
Plan: Pro
Run ID: 2cfe5e62-7ca5-4a0a-a36a-61d4ce5246ef
📒 Files selected for processing (2)
pkg/cvo/egress.gopkg/cvo/egress_test.go
🚧 Files skipped from review as they are similar to previous changes (1)
- pkg/cvo/egress_test.go
Made-with: Cursor
|
@dpateriya: This pull request references Jira Issue OCPBUGS-81675, which is valid. 3 validation(s) were run on this bug
DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
Starting from x509.NewCertPool() causes RootCAs to replace system trust entirely when custom bundles are present. Seed from x509.SystemCertPool() instead so that custom CAs are additive — system CA trust is preserved and public-CA endpoints (e.g. a Cincinnati service with a Let's Encrypt cert) continue to work alongside internal-CA endpoints. If SystemCertPool is unavailable on the current platform, fall back to an empty pool to preserve the previous behaviour rather than hard-failing. Made-with: Cursor
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
|
/retest-required |
|
/test e2e-agnostic-ovn |
|
/retest-required |
|
@dpateriya: The following tests failed, say
Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
Problem
In a HyperShift hosted cluster,
HostedCluster.spec.additionalTrustBundleis synced by the Hosted Cluster Config Operator (HCCO) intoopenshift-config/user-ca-bundle. However, that ConfigMap is not automatically merged intoopenshift-config-managed/trusted-ca-bundle— which is the only sourcegetTLSConfig()currently reads.As a result, when
spec.updateServicepoints to an update service whose TLS certificate is signed by an internal CA provided exclusively viaadditionalTrustBundle, the CVO Cincinnati client fails with:This affects customers running an internal Cincinnati-compatible update service (e.g. OpenShift Update Service / OSUS) behind a wildcard certificate signed by an internal CA.
Fix
When running with
--hypershift, also readopenshift-config/user-ca-bundle(via the already-wiredcmConfigLister) and merge its contents into the cert pool alongsidetrusted-ca-bundle.The new code path is strictly gated on
optr.hypershift, so standard OCP clusters are completely unaffected.Both listers (
cmConfigListerforopenshift-configandcmConfigManagedListerforopenshift-config-managed) are already wired incvo.New()— no new informers or flags are required.Changes
pkg/cvo/egress.go: extendgetTLSConfig()to mergeopenshift-config/user-ca-bundleinto the cert pool when--hypershiftis set.pkg/cvo/egress_test.go: newTestGetTLSConfigwith 8 table-driven cases covering all combinations of bundle presence and thehypershiftflag.Testing
All 8 cases pass:
trusted-ca-bundleonly → CA trusteduser-ca-bundlepresent buthypershift=false→ ignoredhypershift=false→ onlytrusted-ca-bundleusedtrusted-ca-bundleonly → CA trusteduser-ca-bundleonly → CA trustedSummary by CodeRabbit
New Features
Bug Fixes
Tests