[Feature Request] Support per-version metrics label templating to close loop on KEDA per-version scaling #355

@carlydf

Is your feature request related to a problem? Please describe.

After #286 closes, Worker Controller users will be able to use the KEDA temporal trigger for per-version scaling based on approximate backlog count. However, scaling workers on backlog count (queue size) alone can cause premature scale-down when the queue size is zero at steady state. Task scheduling latency is lowest when the backlog is zero, so zero backlog is a common target. To avoid scaling down while workers are fully in use, users need to combine backlog count with other metrics that indicate worker utilization. One effective signal for this is the worker slot utilization metric, which is emitted locally by each worker.

HPA users can template the temporal_worker_build_id and temporal_worker_deployment_name tags into the HPA metrics query via the matchLabels field, which is compatible with any metrics provider that exposes the Kubernetes External Metrics API. The controller detects spec.metrics[*].external.metric.selector.matchLabels in any k8s resource that has that field and auto-injects the relevant values.
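To make the existing behavior concrete, here is a minimal Go sketch of what matchLabels injection could look like. It is not the controller's actual code: the `VersionInfo` type, the `InjectMatchLabels` function, and the choice to overwrite any existing values for the three keys are all assumptions made for illustration. Only the label names and the `<ns>_<wd-name>` format come from this issue, and reading `<ns>` as the Kubernetes namespace is also an assumption.

```go
package main

import "fmt"

// VersionInfo is a hypothetical holder for the per-version values
// that the issue says are auto-injected into matchLabels.
type VersionInfo struct {
	K8sNamespace         string // assumed to be the <ns> part of <ns>_<wd-name>
	WorkerDeploymentName string // the <wd-name> part
	BuildID              string
	TemporalNamespace    string
}

// InjectMatchLabels returns a copy of matchLabels with the three
// version labels set. Overwriting existing values for these keys is
// an assumption of this sketch, not documented controller behavior.
func InjectMatchLabels(matchLabels map[string]string, v VersionInfo) map[string]string {
	out := make(map[string]string, len(matchLabels)+3)
	for k, val := range matchLabels {
		out[k] = val
	}
	out["temporal_worker_deployment_name"] = v.K8sNamespace + "_" + v.WorkerDeploymentName
	out["temporal_worker_build_id"] = v.BuildID
	out["temporal_namespace"] = v.TemporalNamespace
	return out
}

func main() {
	// Example values are made up for illustration.
	labels := InjectMatchLabels(
		map[string]string{"service": "orders"},
		VersionInfo{
			K8sNamespace:         "prod",
			WorkerDeploymentName: "orders-worker",
			BuildID:              "v42",
			TemporalNamespace:    "default",
		},
	)
	fmt.Println(labels)
}
```

Because the injection targets a fixed field path in the resource spec, it only helps metric sources that filter through matchLabels.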

Worker Controller users can therefore scale on backlog + slot utilization using HPA, but not with KEDA.

KEDA consumes metrics from various metrics providers using the "trigger" they define, such as:

The currently supported matchLabels injection pattern does not support per-version metrics filtering in KEDA. As a result, scaling versioned workers on arbitrary cluster metrics such as slot utilization is not supported with KEDA, which leaves KEDA compatibility with the Worker Controller incomplete.

Describe the solution you'd like

A templating format that lets the above triggers (and other metrics triggers) filter metrics by version. It could be restricted to the same variables already auto-injected into spec.metrics[*].external.metric.selector.matchLabels ({temporal_worker_deployment_name: <ns>_<wd-name>, temporal_worker_build_id: <buildID>, temporal_namespace: <temporal-ns>}).
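One possible shape for such a format, sketched in Go with the standard `text/template` package. This is only an illustration of the idea, not a proposed final design: the `VersionVars` type, the `RenderTriggerMetadata` function, the `{{ .Field }}` placeholder syntax, and the metric name in the example query are all hypothetical. The sketch renders every string value in a trigger's metadata map against the three per-version variables and rejects templates that reference anything else.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// VersionVars is a hypothetical set of template variables, limited to
// the values the controller already injects into matchLabels.
type VersionVars struct {
	TemporalWorkerDeploymentName string
	TemporalWorkerBuildID        string
	TemporalNamespace            string
}

// RenderTriggerMetadata renders each metadata value as a text/template.
// A reference to a field that VersionVars lacks fails at execution time,
// which keeps the variable set restricted.
func RenderTriggerMetadata(metadata map[string]string, vars VersionVars) (map[string]string, error) {
	out := make(map[string]string, len(metadata))
	for key, raw := range metadata {
		tmpl, err := template.New(key).Parse(raw)
		if err != nil {
			return nil, fmt.Errorf("parse metadata %q: %w", key, err)
		}
		var sb strings.Builder
		if err := tmpl.Execute(&sb, vars); err != nil {
			return nil, fmt.Errorf("render metadata %q: %w", key, err)
		}
		out[key] = sb.String()
	}
	return out, nil
}

func main() {
	// Metadata for a Prometheus-style trigger. The metric name and all
	// values below are illustrative assumptions.
	metadata := map[string]string{
		"query": `avg(worker_slots_used{` +
			`temporal_worker_deployment_name="{{ .TemporalWorkerDeploymentName }}",` +
			`temporal_worker_build_id="{{ .TemporalWorkerBuildID }}"})`,
		"threshold": "0.8",
	}
	rendered, err := RenderTriggerMetadata(metadata, VersionVars{
		TemporalWorkerDeploymentName: "prod_orders-worker",
		TemporalWorkerBuildID:        "v42",
		TemporalNamespace:            "default",
	})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(rendered["query"])
}
```

Rendering plain strings rather than a fixed field path would let the same mechanism apply to any trigger whose filter lives in free-form metadata, at the cost of needing an escape rule for values that legitimately contain `{{`.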

Additional context

Labels: enhancement (New feature or request)
