Scheduler Improvement: Allow merging new actions to existing plan #7702
Open
jujokini wants to merge 2 commits into
…arding

Previously, if a placement plan was already APPLYING when a new scheduling cycle produced results, the new plan was silently discarded. This caused VMs scheduled in subsequent cycles to wait until the running plan fully completed before being deployed.

This change implements plan merging for placement plans (cid == -1):
- When a new plan arrives and one is already APPLYING, merge_actions() is called instead of returning early
- merge_actions() first prunes terminal actions (DONE/ERROR/TIMEOUT) from the running plan to prevent unbounded growth and unblock check_completed()
- New actions for VMs not already in the plan are appended with IDs starting above the current maximum to avoid colliding with in-flight APPLYING action IDs stored in VM history records
- execute_plans() is called immediately after merging so newly appended actions are dispatched without waiting for the next timer tick

DRS cluster optimization plans (cid >= 0) retain the existing replace behaviour as concurrent optimizer runs would produce conflicting results.

Also removes two stale TODO comments in SchedulerManager.cc that referred to this missing guard.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Signed-off-by: Jukka Jokiniva <jukka.jokiniva@qt.io>
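The merge steps listed in the commit message above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the `Action` and `Plan` structs, their fields, and the `merge_actions()` signature are hypothetical simplifications, and the choice to compute the maximum ID after pruning is an assumption.

```cpp
#include <algorithm>
#include <set>
#include <vector>

// Hypothetical, simplified types; the real OpenNebula classes differ.
enum class ActionState { READY, APPLYING, DONE, ERROR, TIMEOUT };

struct Action {
    int id;
    int vm_id;
    int host_id;
    ActionState state;
};

struct Plan {
    int cid;                      // -1 marks a placement plan
    std::vector<Action> actions;
};

static bool is_terminal(ActionState s) {
    return s == ActionState::DONE || s == ActionState::ERROR ||
           s == ActionState::TIMEOUT;
}

// Merges `incoming` actions into the running plan and returns how many
// actions were appended.
int merge_actions(Plan& running, const std::vector<Action>& incoming) {
    auto& acts = running.actions;

    // 1. Prune terminal actions so the plan does not grow without bound.
    acts.erase(std::remove_if(acts.begin(), acts.end(),
                              [](const Action& a) { return is_terminal(a.state); }),
               acts.end());

    // 2. Record which VMs are already planned and the highest ID still in use
    //    (assumption: the maximum is taken over the remaining actions).
    std::set<int> planned_vms;
    int max_id = -1;
    for (const auto& a : acts) {
        planned_vms.insert(a.vm_id);
        max_id = std::max(max_id, a.id);
    }

    // 3. Append actions for VMs not yet in the plan, with fresh IDs above
    //    the current maximum so they cannot collide with in-flight ones.
    int appended = 0;
    for (const auto& in : incoming) {
        if (!planned_vms.insert(in.vm_id).second) {
            continue;             // VM already has an action in the plan
        }
        Action a = in;
        a.id = ++max_id;
        a.state = ActionState::READY;
        acts.push_back(a);
        ++appended;
    }
    return appended;
}
```

In this sketch a caller would invoke `merge_actions()` only for plans with `cid == -1` and then trigger dispatch right away, mirroring the immediate `execute_plans()` call described above.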
Previously, execute_plan() stopped dispatching as soon as the first READY action hit either the per-host or per-cluster action limit. This meant VMs targeting uncongested hosts were skipped until the next cycle.

Replace get_next_action() with get_ready_actions() which collects all READY action pointers upfront. The dispatch loop now breaks only on the cluster cap (hard ceiling) and continues past saturated hosts, allowing VMs assigned to other hosts to be started in the same cycle.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Signed-off-by: Jukka Jokiniva <jukka.jokiniva@qt.io>
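The dispatch behaviour described in this commit message could look something like the sketch below. The types, the `dispatch_ready()` helper, and the `max_cluster` / `max_host` parameters are hypothetical names chosen for illustration; only `get_ready_actions()` is named in the commit, and its real signature may differ.

```cpp
#include <map>
#include <vector>

// Hypothetical, simplified types; the real OpenNebula classes differ.
enum class ActionState { READY, APPLYING, DONE };

struct Action {
    int id;
    int vm_id;
    int host_id;
    ActionState state;
};

// Collects pointers to all READY actions upfront.
std::vector<Action*> get_ready_actions(std::vector<Action>& actions) {
    std::vector<Action*> ready;
    for (auto& a : actions) {
        if (a.state == ActionState::READY) {
            ready.push_back(&a);
        }
    }
    return ready;
}

// Dispatches READY actions subject to a per-cluster and a per-host limit
// and returns the number of actions started in this pass.
int dispatch_ready(std::vector<Action>& actions, int max_cluster, int max_host) {
    std::map<int, int> host_load;  // APPLYING actions per host
    int cluster_load = 0;          // APPLYING actions in the whole cluster

    for (const auto& a : actions) {
        if (a.state == ActionState::APPLYING) {
            ++host_load[a.host_id];
            ++cluster_load;
        }
    }

    int dispatched = 0;
    for (Action* a : get_ready_actions(actions)) {
        if (cluster_load >= max_cluster) {
            break;                 // hard ceiling: nothing more can start
        }
        if (host_load[a->host_id] >= max_host) {
            continue;              // host saturated: try the next action
        }
        a->state = ActionState::APPLYING;
        ++host_load[a->host_id];
        ++cluster_load;
        ++dispatched;
    }
    return dispatched;
}
```

The key difference from the previous behaviour is the `continue` on a saturated host: an action bound for a busy host no longer prevents later actions for idle hosts from starting in the same cycle.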
Description
We had a problem with OpenNebula 7.0.1 and our custom drivers. VMs stay in the boot phase for a long time because of the VM image download. While VMs remain in boot status, they block scheduling, because the scheduler waits for one plan to finish before starting a new one. The fix is to allow merging new scheduling actions into the existing plan.
There is also a related bug report #7639
Branches to which this PR applies