[Debug] Add wait step to e2e-aws-mno-qe-integration-tests for OCP 4.22#78844
mmakwana30 wants to merge 1 commit into openshift:main
This adds a wait step to enable debugging of test failures in OCP 4.22. The wait step pauses the workflow before tests run, allowing QE to:
- SSH into the test environment
- Inspect system state and logs
- Debug configuration issues
- Investigate test failures
OCP Version: 4.22
Job: e2e-aws-mno-qe-integration-tests
Timeout: 8 hours
Walkthrough
A CI configuration update adds an 8-hour timeout environment variable and inserts a wait step before the existing test step in the e2e-aws-mno-qe-integration-tests job for the LVM operator.
Changes: CI Test Job Configuration
Estimated code review effort: 1 (Trivial), ~2 minutes
Pre-merge checks: 12 passed
[APPROVALNOTIFIER] This PR is NOT APPROVED
This pull-request has been approved by: mmakwana30
[REHEARSALNOTIFIER]
Prior to this PR being merged, you will need to either run and acknowledge or opt to skip these rehearsals.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@ci-operator/config/openshift/lvm-operator/openshift-lvm-operator-main.yaml`:
- Around line 346-349: The 8-hour hardcoded wait (TIMEOUT: +8 hours) is causing
all weekly runs to idle; change the job so that the "+8 hours" wait is only
applied for a debug variant or when a debug flag is set. Modify the job config
around TIMEOUT and the test block that references the "wait" ref/as
"lvms-mno-integration-test" to add a gated variant (e.g., create a separate
debug job or wrap the TIMEOUT modification in a conditional like debug=true) so
the regular weekly job keeps the normal timeout while the debug variant retains
TIMEOUT: +8 hours and the test ref: wait entry.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Central YAML (inherited)
Review profile: CHILL
Plan: Enterprise
Run ID: 905f5a4e-dc25-4e9c-935c-75e717bde965
📒 Files selected for processing (1)
ci-operator/config/openshift/lvm-operator/openshift-lvm-operator-main.yaml
TIMEOUT: +8 hours
test:
- ref: wait
- as: lvms-mno-integration-test
Make the 8-hour wait debug-only instead of always-on in the weekly job.
With the current change, every scheduled weekly run idles for 8 hours before tests, which can significantly delay signal and tie up AWS/lease capacity; it may also push total runtime beyond job timeout budgets once install + test time are included. Please move this into a dedicated debug variant (or otherwise gate it) so the regular periodic keeps its normal runtime.
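One way to read this suggestion is a separate debug-only test entry next to the regular periodic. The fragment below is a minimal sketch under assumptions, not the author's change: the `-debug` job name is hypothetical, the `steps`/`env` nesting and indentation are assumed from the usual ci-operator layout rather than taken from the diff, and everything the PR does not show (cluster profile, workflow, the body of the test step, scheduling) is left out. It has not been validated against ci-operator.

```yaml
# Hypothetical debug variant; the regular weekly job stays as it was.
tests:
- as: e2e-aws-mno-qe-integration-tests-debug  # assumed name, not from the PR
  steps:
    env:
      TIMEOUT: +8 hours   # from the PR: how long the wait step pauses
    test:
    - ref: wait           # from the PR: pause before the tests start
    - as: lvms-mno-integration-test
      # Remaining fields of this step and of the job (cluster profile,
      # workflow, schedule) would be copied from the existing job; omitted here.
```

With a split like this, only the debug entry carries `TIMEOUT: +8 hours` and `ref: wait`, so the scheduled job would not idle before its tests.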
This is just for debugging something and we have no plans of merging this code in.
@kasturinarra, understood — thanks for the clarification! Since this is a temporary debug PR that won't be merged, the concern doesn't apply. Happy debugging! 🙂
@mmakwana30: all tests passed!
/pj-rehearse periodic-ci-openshift-lvm-operator-main-e2e-aws-mno-qe-integration-tests
@mmakwana30: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.
/pj-rehearse periodic-ci-openshift-lvm-operator-main-e2e-aws-mno-qe-integration-tests
@kasturinarra: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.
/pj-rehearse periodic-ci-openshift-lvm-operator-main-e2e-aws-mno-qe-integration-tests
@kasturinarra: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.
/pj-rehearse periodic-ci-openshift-lvm-operator-main-e2e-aws-mno-qe-integration-tests
@kasturinarra: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.