-
Notifications
You must be signed in to change notification settings - Fork 501
Fix second-pass queue cleanup when workload no longer needs retry #8431
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Fix second-pass queue cleanup when workload no longer needs retry #8431
Conversation
✅ Deploy Preview for kubernetes-sigs-kueue canceled.
|
|
Welcome @skools-here! |
|
Hi @skools-here. Thanks for your PR. I'm waiting for a github.com member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
|
/ok-to-test |
|
/retest-required |
gabesaba
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the fix! I just have a small nitpick comment
/approve
pkg/cache/queue/manager.go
Outdated
| }) | ||
| return true | ||
| } | ||
| if !workload.NeedsSecondPass(w) && iteration > 0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: else-if, and add a short comment to document this branch?
| if !workload.NeedsSecondPass(w) && iteration > 0 { | |
| else if iteration > 0 { | |
| // some comment, including link to #8357 |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: gabesaba, skools-here The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
And please update the release-note |
|
Hey @gabesaba I fixed the changes and added release notes also kindly review! |
| // QueueSecondPassIfNeeded queues for the second pass of scheduling with exponential | ||
| // delay. | ||
| func (m *Manager) QueueSecondPassIfNeeded(ctx context.Context, w *kueue.Workload, iteration int) bool { | ||
| log := ctrl.LoggerFrom(ctx) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you please add some unit test for that changes in, probably in TestQueueSecondPassIfNeeded?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! I’ll add unit coverage for this case and update the PR.
|
/test pull-kueue-test-e2e-main-1-35 |
|
The e2e job |
I think so too /retest |
pkg/cache/queue/manager_test.go
Outdated
| if name == "workload no longer needs second pass after first iteration" { | ||
| manager.QueueSecondPassIfNeeded( | ||
| ctx, | ||
| baseWorkloadNotNeedingSecondPass.DeepCopy(), | ||
| 1, | ||
| ) | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is an anti-pattern - having a special case in the shared test logic for one of the test cases.
Additionally, are we testing in this case that the new logic works? I don't see anything which changes whether the 2nd pass is needed, nor checks if it is removed from the queue.
We might need another test fixture to test this. Let me know if you need help (though I'm OOO monday)
|
@skools-here: The following test failed, say
Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
What type of PR is this?
/kind bug
What this PR does / why we need it:
Fixes cleanup of the second-pass scheduling queue by ensuring workloads are removed once they no longer require a retry, while preserving the initial backoff behavior.
Which issue(s) this PR fixes:
Fixes #8357
Special notes for your reviewer:
Cleanup is gated on retry iteration to avoid prematurely canceling the initial backoff window.
Does this PR introduce a user-facing change?