no-jira:test/monitoring: increase load balancer readiness timeout to 20 minutes#30994
no-jira:test/monitoring: increase load balancer readiness timeout to 20 minutes#30994tthvo wants to merge 2 commits intoopenshift:mainfrom
Conversation
The service-type-load-balancer-availability monitor test was timing out during preparation when waiting for the load balancer to become reachable. DNS SOA records for AWS ELB in EUSC region show a 900s (15-minute) retry interval, which can cause the load balancer hostname resolution to take longer than the previous 10-minute timeout. Increase the timeout from 10 to 20 minutes to accommodate slow DNS propagation in regions with high TTL values.
|
Pipeline controller notification For optional jobs, comment This repository is configured in: automatic mode |
|
@tthvo: This pull request explicitly references no jira issue. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
WalkthroughIncreased the initial HTTP reachability check timeout from 10 to 20 minutes in the load-balancer disruption test; added Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: tthvo The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
|
Scheduling required tests: |
The service-load-balancer-with-pdb disruption test was experiencing DNS resolution failures when connecting to the load balancer during test execution. AWS ELB DNS zones have an SOA TTL of 900 seconds, which means DNS propagation can take up to 15 minutes for newly created load balancers. This change adds a WithTimeout() method to BackendSampler and sets a 120-second timeout for the service load balancer disruption samplers, allowing sufficient time for DNS retries during the propagation period. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@pkg/monitor/backenddisruption/disruption_backend_sampler.go`:
- Around line 235-237: Validate the timeout in BackendSampler.WithTimeout and
refuse non-positive values: check if timeout <= 0 and if so do not set b.timeout
(return b unchanged) so callers cannot accidentally disable timeouts; otherwise
set b.timeout = &timeout and return b. This ensures BackendSampler.WithTimeout
enforces a positive time.Duration (use the existing BackendSampler and
WithTimeout identifiers and the b.timeout field).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 304a4f0e-faeb-43e6-99fa-921457c4fef5
📒 Files selected for processing (2)
pkg/monitor/backenddisruption/disruption_backend_sampler.gopkg/monitortests/network/disruptionserviceloadbalancer/monitortest.go
✅ Files skipped from review due to trivial changes (1)
- pkg/monitortests/network/disruptionserviceloadbalancer/monitortest.go
|
Scheduling required tests: |
|
/test e2e-aws-ovn-fips |
|
@tthvo: The following test failed, say
Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
The service-type-load-balancer-availability monitor test was timing out during preparation when waiting for the load balancer to become reachable.
DNS SOA records for AWS ELB in EUSC region show a 900s (15-minute) retry interval, which can cause the load balancer hostname resolution to take longer than the previous 10-minute timeout.
Increase the timeout from 10 to 20 minutes to accommodate slow DNS propagation in regions with high TTL values.