fix(pod-scaler): defer CPU limit strip until after authoritative decrease by deepsm007 · Pull Request #5265 · openshift/ci-tools

deepsm007 · 2026-06-22T17:29:48Z

DPTP-2938 moved authoritative decrease to limits-only and runs it after reconcileLimits, which removes CPU limits but keeps memory limits. CPU authoritative dry-run/apply never fired because the limit was already gone; memory worked fine on the same path.

/cc @hector-vido

pod-scaler: Fix CPU limit removal timing to enable authoritative decrease

This fix corrects the order of operations in the pod-scaler's admission webhook to ensure CPU limits can be properly decreased based on measured usage in authoritative mode.

The problem: A previous change moved the authoritative decrease functionality (which reduces CPU and memory limits based on Prometheus-measured usage) to run after reconcileLimits. However, reconcileLimits was unconditionally removing CPU limits while preserving memory limits. This meant that when the authoritative CPU decrease logic executed, the CPU limit had already been stripped from the container, preventing any CPU limit reduction from being applied. Memory authoritative decrease continued to work since memory limits were preserved.

The solution:

reconcileLimits now only enforces the 200% memory limit threshold and no longer removes CPU limits
CPU limits are now explicitly deleted after applyAuthoritativeLimitDecrease completes, but only when:
- mutateResourceLimits is enabled (limit mutation is active)
- authoritativeCPU is false (CPU authoritative decrease is disabled)

This ensures both CPU and memory limits remain available for the authoritative decrease logic to operate on before being stripped. CI operators who have enabled authoritative CPU decrease will now see CPU limits properly reduced according to measured usage data.

Files changed:

cmd/pod-scaler/admission.go: Modified reconcileLimits to skip CPU limit removal; added explicit CPU limit deletion after applyAuthoritativeLimitDecrease when appropriate
cmd/pod-scaler/admission_test.go: Updated test expectations to reflect that CPU limits are no longer removed by reconcileLimits

…ease

openshift-merge-bot · 2026-06-22T17:29:51Z

Pipeline controller notification
This repo is configured to use the pipeline controller. Second-stage tests will be triggered either automatically or after lgtm label is added, depending on the repository configuration. The pipeline controller will automatically detect which contexts are required and will utilize /test Prow commands to trigger the second stage.

For optional jobs, comment /test ? to see a list of all defined jobs. To trigger manually all jobs from second stage use /pipeline required command.

This repository is configured in: automatic mode

coderabbitai · 2026-06-22T17:30:06Z

📝 Walkthrough

Walkthrough

reconcileLimits no longer deletes CPU limits; it now only enforces the memory-limit floor (≥200% of memory request). CPU limit deletion is moved into mutatePodResources as a conditional, nil-guarded block that fires only when mutateResourceLimits is true and authoritativeCPU is false. The TestReconcileLimits CPU case is updated to assert the CPU limit is preserved.

Changes

CPU Limit Deletion Relocation

Layer / File(s)	Summary
CPU limit cleanup: reconcileLimits, mutatePodResources, and test `cmd/pod-scaler/admission.go`, `cmd/pod-scaler/admission_test.go`	`reconcileLimits` drops the unconditional `ResourceCPU` limit deletion and only checks the memory 200% floor. `mutatePodResources` adds a nil-guarded conditional that deletes `ResourceCPU` from container limits when `mutateResourceLimits` is true and `authoritativeCPU` is false. `TestReconcileLimits` updated to assert CPU limit value `200` is retained rather than removed.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

🚥 Pre-merge checks | ✅ 15 | ❌ 2

❌ Failed checks (2 warnings)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 66.67% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.
Test Coverage For New Features	⚠️ Warning	The PR fixes a bug where CPU limits were removed before applyAuthoritativeLimitDecrease could process them. While TestReconcileLimits was updated to verify CPU limits persist through reconcileLimit...	Add a test case in TestMutatePodResources with authoritativeCPU=true and mutateResourceLimits=true that verifies CPU limits are processed by applyAuthoritativeLimitDecrease instead of being deleted before it runs.

✅ Passed checks (15 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title accurately describes the main change: deferring CPU limit removal until after the authoritative decrease operation completes, which is the core fix in this PR.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Go Error Handling	✅ Passed	The PR's changes to admission.go follow proper Go error handling patterns: (1) The new CPU limit deletion code (lines 402-406) correctly checks for nil before dereferencing Limits; (2) All fmt.Erro...
Stable And Deterministic Test Names	✅ Passed	The test file uses standard Go table-driven tests, not Ginkgo tests. All 59 test case names are static string literals with no dynamic content, timestamps, UUIDs, or environment-dependent values.
Test Structure And Quality	✅ Passed	Custom check requires review of Ginkgo test code, but the modified tests use standard Go testing package (testing.T), not Ginkgo. Check is not applicable to this PR.
Microshift Test Compatibility	✅ Passed	No new Ginkgo e2e tests added; PR only modifies standard Go unit tests in admission_test.go. Check not applicable.
Single Node Openshift (Sno) Test Compatibility	✅ Passed	PR contains only unit tests in cmd/pod-scaler/, not Ginkgo e2e tests. SNO compatibility check applies only to new e2e tests and is not applicable here.
Topology-Aware Scheduling Compatibility	✅ Passed	This PR modifies only pod resource limit mutation logic in an admission webhook (reconcileLimits and applyAuthoritativeLimitDecrease). It does not introduce deployment manifests, controllers, or an...
Ote Binary Stdout Contract	✅ Passed	No process-level stdout writes introduced. Changes are in webhook handler functions using logrus (stderr by default), with no fmt.Print, fmt.Println, or klog stdout violations.
Ipv6 And Disconnected Network Test Compatibility	✅ Passed	PR contains only unit test changes to cmd/pod-scaler using standard Go testing framework, not Ginkgo e2e tests. Check is not applicable.
No-Weak-Crypto	✅ Passed	Changes are to pod resource scaling logic with no cryptographic operations; no weak crypto patterns detected in modified files.
Container-Privileges	✅ Passed	PR contains no container privilege escalation or unsafe security configurations. Changes are limited to resource limit mutation logic in admission.go, with no modifications to security contexts, ca...
No-Sensitive-Data-In-Logs	✅ Passed	No logging statements that expose sensitive data (passwords, tokens, API keys, PII, etc.) were added. The code changes introduce CPU limit deletion logic without any logging.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

hector-vido · 2026-06-22T17:30:35Z

/lgtm

openshift-ci · 2026-06-22T17:30:53Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: deepsm007, hector-vido

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details

Needs approval from an approver in each of these files:

~~OWNERS~~ [deepsm007,hector-vido]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

coderabbitai

🧹 Nitpick comments (1)

cmd/pod-scaler/admission.go (1)
402-406: 🧹 Nitpick | 🔵 Trivial | ⚡ Quick win

Add or point to coverage for the relocated CPU cleanup branch.

This production path now owns the key sequencing contract: authoritative CPU decrease/dry-run must see the configured CPU limit before it is stripped for non-authoritative CPU mode. Please cover at least mutateResourceLimits=true, authoritativeCPU=false and an authoritative CPU/dry-run case. As per coding guidelines, “The author should check and ensure the presubmits of the PR run successfully” and reviewers should assess “Tests” and production impact.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cmd/pod-scaler/admission.go` around lines 402 - 406, The relocated CPU
cleanup branch in the mutateResourceLimits and authoritativeCPU conditions is
missing test coverage for a critical sequencing contract: authoritative CPU
decrease/dry-run operations must observe the configured CPU limit before it gets
stripped in non-authoritative CPU mode. Add test cases covering the scenario
where mutateResourceLimits is true and authoritativeCPU is false (the delete
containers[i].Resources.Limits corev1.ResourceCPU code path), and add a test for
the authoritative CPU with dry-run scenario. These tests should verify that the
CPU limit is visible to authoritative operations before being removed for
non-authoritative CPU mode, ensuring the production sequencing contract is
properly validated.
Source: Coding guidelines

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@cmd/pod-scaler/admission.go`:
- Around line 402-406: The relocated CPU cleanup branch in the
mutateResourceLimits and authoritativeCPU conditions is missing test coverage
for a critical sequencing contract: authoritative CPU decrease/dry-run
operations must observe the configured CPU limit before it gets stripped in
non-authoritative CPU mode. Add test cases covering the scenario where
mutateResourceLimits is true and authoritativeCPU is false (the delete
containers[i].Resources.Limits corev1.ResourceCPU code path), and add a test for
the authoritative CPU with dry-run scenario. These tests should verify that the
CPU limit is visible to authoritative operations before being removed for
non-authoritative CPU mode, ensuring the production sequencing contract is
properly validated.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: 95737a9a-0777-49e9-9fbf-f4090ca13d29

📥 Commits

Reviewing files that changed from the base of the PR and between 2bfca56 and cb34ba2.

📒 Files selected for processing (2)

cmd/pod-scaler/admission.go
cmd/pod-scaler/admission_test.go

🔗 Linked repositories identified

CodeRabbit considers these linked repositories for cross-repo context during reviews:

openshift/release (manual)
openshift/ci-docs (manual)
openshift/release-controller (manual)
openshift/ci-chat-bot (manual)

openshift-merge-bot · 2026-06-22T19:53:55Z

/retest-required

Remaining retests: 0 against base HEAD 2bfca56 and 2 for PR HEAD cb34ba2 in total

deepsm007 · 2026-06-22T21:00:32Z

/test e2e integration

openshift-merge-bot · 2026-06-23T01:51:50Z

/retest-required

Remaining retests: 0 against base HEAD 7f8c694 and 1 for PR HEAD cb34ba2 in total

openshift-merge-bot · 2026-06-23T10:51:21Z

/retest-required

Remaining retests: 0 against base HEAD e05bba9 and 0 for PR HEAD cb34ba2 in total

openshift-merge-bot · 2026-06-23T13:52:28Z

/hold

Revision cb34ba2 was retested 3 times: holding

deepsm007 · 2026-06-23T14:51:59Z

/retest

deepsm007 · 2026-06-23T15:47:34Z

/unhold

deepsm007 · 2026-06-23T16:19:01Z

/override ci/prow/e2e
/override ci/prow/integration

openshift-merge-bot · 2026-06-23T16:21:31Z

Tests from second stage were triggered manually. Pipeline can be controlled only manually, until HEAD changes. Use command to trigger second stage.

openshift-ci · 2026-06-23T16:21:35Z

@deepsm007: Overrode contexts on behalf of deepsm007: ci/prow/e2e, ci/prow/integration

Details

In response to this:

/override ci/prow/e2e
/override ci/prow/integration

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

openshift-merge-bot · 2026-06-23T16:52:17Z

/retest-required

Remaining retests: 0 against base HEAD a1a4aa3 and 2 for PR HEAD cb34ba2 in total

deepsm007 · 2026-06-23T17:41:14Z

/retest-required

deepsm007 · 2026-06-23T18:30:37Z

/override ci/prow/integration

openshift-ci · 2026-06-23T18:32:21Z

@deepsm007: Overrode contexts on behalf of deepsm007: ci/prow/integration

Details

In response to this:

/override ci/prow/integration

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

openshift-merge-bot · 2026-06-23T20:49:09Z

Tests from second stage were triggered manually. Pipeline can be controlled only manually, until HEAD changes. Use command to trigger second stage.

openshift-merge-bot · 2026-06-24T10:06:43Z

/retest-required

Remaining retests: 0 against base HEAD 50714d1 and 1 for PR HEAD cb34ba2 in total

deepsm007 · 2026-06-24T12:24:26Z

/override ci/prow/integration

openshift-ci · 2026-06-24T12:28:32Z

@deepsm007: Overrode contexts on behalf of deepsm007: ci/prow/integration

Details

In response to this:

/override ci/prow/integration

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

deepsm007 · 2026-06-24T14:00:51Z

/override ci/prow/images

openshift-ci · 2026-06-24T14:05:29Z

@deepsm007: Overrode contexts on behalf of deepsm007: ci/prow/images

Details

In response to this:

/override ci/prow/images

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

openshift-ci · 2026-06-24T14:05:34Z

@deepsm007: all tests passed!

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

fix(pod-scaler): defer CPU limit strip until after authoritative decr…

cb34ba2

…ease

openshift-ci Bot requested a review from hector-vido June 22, 2026 17:29

openshift-ci Bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jun 22, 2026

openshift-ci Bot assigned hector-vido Jun 22, 2026

openshift-ci Bot added the lgtm Indicates that a PR is ready to be merged. label Jun 22, 2026

coderabbitai Bot reviewed Jun 22, 2026

View reviewed changes

openshift-ci Bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jun 23, 2026

openshift-ci Bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jun 23, 2026

openshift-merge-bot Bot merged commit f3d07fd into openshift:main Jun 24, 2026
17 checks passed

Conversation

deepsm007 commented Jun 22, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

pod-scaler: Fix CPU limit removal timing to enable authoritative decrease

Uh oh!

openshift-merge-bot Bot commented Jun 22, 2026

Uh oh!

coderabbitai Bot commented Jun 22, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

❌ Failed checks (2 warnings)

Uh oh!

hector-vido commented Jun 22, 2026

Uh oh!

openshift-ci Bot commented Jun 22, 2026

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

openshift-merge-bot Bot commented Jun 22, 2026

Uh oh!

deepsm007 commented Jun 22, 2026

Uh oh!

openshift-merge-bot Bot commented Jun 23, 2026

Uh oh!

openshift-merge-bot Bot commented Jun 23, 2026

Uh oh!

openshift-merge-bot Bot commented Jun 23, 2026

Uh oh!

deepsm007 commented Jun 23, 2026

Uh oh!

deepsm007 commented Jun 23, 2026

Uh oh!

deepsm007 commented Jun 23, 2026

Uh oh!

openshift-merge-bot Bot commented Jun 23, 2026

Uh oh!

openshift-ci Bot commented Jun 23, 2026

Uh oh!

openshift-merge-bot Bot commented Jun 23, 2026

Uh oh!

deepsm007 commented Jun 23, 2026

Uh oh!

deepsm007 commented Jun 23, 2026

Uh oh!

openshift-ci Bot commented Jun 23, 2026

Uh oh!

openshift-merge-bot Bot commented Jun 23, 2026

Uh oh!

openshift-merge-bot Bot commented Jun 24, 2026

Uh oh!

deepsm007 commented Jun 24, 2026

Uh oh!

openshift-ci Bot commented Jun 24, 2026

Uh oh!

deepsm007 commented Jun 24, 2026

Uh oh!

openshift-ci Bot commented Jun 24, 2026

Uh oh!

openshift-ci Bot commented Jun 24, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

deepsm007 commented Jun 22, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Jun 22, 2026 •

edited

Loading