[slice] Evict if Slice is DEGRADED (and not tolerated)#1123
[slice] Evict if Slice is DEGRADED (and not tolerated)#1123pajakd wants to merge 8 commits intoAI-Hypercomputer:slice-mainfrom
Conversation
6c45282 to
73057ad
Compare
slice/internal/core/core.go
Outdated
| if expr.Operator == corev1.NodeSelectorOpIn { | ||
| valueAllowedInExpr := false | ||
| for _, val := range expr.Values { | ||
| if val == value { | ||
| valueAllowedInExpr = true | ||
| break | ||
| } | ||
| } | ||
| if !valueAllowedInExpr { | ||
| termAllowsValue = false | ||
| } | ||
| } |
There was a problem hiding this comment.
we should also handle Operator: corev1.NodeSelectorOpNotIn, Values: ["Degraded"]
- isn't there some predefined method for checking this?
There was a problem hiding this comment.
Also, all the existing functions (like this one https://github.com/kubernetes/component-helpers/blob/master/scheduling/corev1/nodeaffinity/nodeaffinity.go#L321-L323C31) require node as argument. We actually use the existing func in Kueue https://github.com/kubernetes-sigs/kueue/blob/63d93ada52f1a9fa62deea59b65683fc9b6d99ea/pkg/cache/scheduler/tas_flavor_snapshot.go#L1567-L1569 but we don't have nodes in our context, we want to check if a given pods set allows DEGRADED slices or not. So, I think applying the existing one would actually introduce more complexity.
slice/internal/core/core.go
Outdated
| termAllowsValue := true | ||
| for _, expr := range term.MatchExpressions { | ||
| if expr.Key == key { | ||
| if expr.Operator == corev1.NodeSelectorOpIn { |
There was a problem hiding this comment.
This logic only evaluates corev1.NodeSelectorOpIn. If a user explicitly specifies NotIn: [Degraded], this function will skip evaluating that expression, and it will
erroneously return true (meaning Degraded is allowed). We should add support for corev1.NodeSelectorOpNotIn to ensure inverse conditions are respected.
| return false | ||
| } | ||
| for _, term := range spec.Affinity.NodeAffinity.RequiredDuringSchedulingIgnoredDuringExecution.NodeSelectorTerms { | ||
| if len(term.MatchExpressions) > 0 { |
There was a problem hiding this comment.
The health of the cube is exposed only via label (cloud.google.com/gke-tpu-partition-4x4x4-state).
Since MatchFields is used to match against some fields of the nodes (not labels) its seem to be irrelevant in this context. So I'm just completely ignoring it.
(see eg https://github.com/kubernetes/kubernetes/blob/16c5a6be07bb2eab06b6e7732eb1cb759a7fe61c/pkg/apis/core/types.go#L3365-L3370) Let me know if I'm wrong.
|
lgtm |
Description
Evict the workload, when a slice enters DEGRADED state, while the workload requested only HEALTHY slices.
Issue
Testing