
KREP-006: Propagation Control#861

Open
ellistarn wants to merge 2 commits into kubernetes-sigs:main from ellistarn:prop-krep

Conversation

@ellistarn
Contributor

KREP-006 introduces `propagateWhen`, a per-resource mechanism to conditionally gate mutation as
changes propagate through the graph. `propagateWhen` and `readyWhen` are complementary, bookending
when mutation for a node in the graph can start and when it is considered complete.

@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Nov 24, 2025
@ellistarn ellistarn marked this pull request as draft November 24, 2025 22:32
@k8s-ci-robot k8s-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Nov 24, 2025
@ellistarn ellistarn force-pushed the prop-krep branch 2 times, most recently from ec0eda0 to 06c4cc8 Compare November 24, 2025 22:48
@ellistarn ellistarn marked this pull request as ready for review November 25, 2025 16:28
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Nov 25, 2025
Mechanically, supporting concurrent mutations will require new machinery in KRO. We defer the exact
details of this discussion to the implementation phase, due to the magnitude of the change.

Directionally, we could introduce a new `ResourceGraphRevision` CRD for each unique set of inputs to
Member

Resource graph revision will become absolutely critical for the long-term stability of KRO and its graph reconciliation.

BUT I believe the graph revision should be split into its own KREP, separate from the propagation policy. We need a stable implementation plan first and foremost. I love the ideas, but considering KRO's state right now, we need to start thinking about how to get this in without causing significant breaks.

IMHO, both ApplySets and Static Type Eval have caused too many regressions because we didn't focus enough on test plans.

@a-hilaly a-hilaly added this to the 0.10 milestone Nov 26, 2025
@jamal

jamal commented Dec 3, 2025

I wanted to +1 this, it would be fantastic to have this! I'm currently iterating on how to solve a problem I have where propagation of RGD changes can cause some impact to developers. To explain my use case a bit, I'm using kro to manage ephemeral development environments (somewhat similar to what Tilt, DevSpace or Skaffold will let you do). This is targeting game developers/designers who don't have kubectl or work with infrastructure/backend at all, so I wanted to avoid requiring installing things like kubectl, or even giving them cluster access. The RGD deploys a set of services and their dependencies so that a developer can have an isolated environment to work on.

But one of the issues I'm running into is that the developer may be actively working and have state/configuration on the service that gets lost when the pod gets replaced. I'm iterating on options, from trying to persist that state (which would add a ton of complexity) to just controlling when the instance can be updated.

Anyhow, having a way to control propagation would make that a lot simpler to solve. I'm still debating what the control would be: either time-based (to try to do things outside normal working hours) or manually managed by the developer using the CLI tool.

Thank you! Definitely looking forward to seeing how this evolves.

Contributor

@barney-s barney-s left a comment

Thanks for the proposal.

defined within. For example, an organization that has used KRO to unify application deployment with
an Application CRD risks cluster-wide impact from a bad change to the ResourceGraphDefinition. A
ResourceGraphDefinition that loops over a collection of zones to deploy a set of zonal Deployments
risks regional impact from a bad change in the deployment's configuration.
Contributor

Just to clarify, are the proposed controls scoped to a single instance, or do they apply across instances of an RGD?

Contributor Author

Both.

// Returns true when updated items grow exponentially: 1, 2, 4, 8, 16...
// An item is considered updated when its generation annotation matches the graph revision generation
exponentiallyUpdated(collection, each) =
size(collection.filter(i, i.metadata.annotations['kro.run/generation'] == string(schema.metadata.generation))) >=
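The quoted expression is truncated here; as a rough Python sketch of the intended exponential gating (batches of 1, 2, 4, 8, ...). This is illustrative only, since the KREP expresses this in CEL, and the function name and window math are assumptions:

```python
def exponentially_ready(index, updated_count):
    """An item at position `index` may update once the allowed window,
    which doubles each time progress fills it (1, 2, 4, 8, ...),
    covers that index."""
    allowed = 1
    while allowed <= updated_count:
        allowed *= 2  # window doubles as updated items accumulate
    return index < allowed
```

With nothing updated yet only the first item may proceed; once it reports updated, the next one unlocks, then two more, then four, and so on.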
Contributor

Performance consideration: collection.filter(...) iterates over the entire collection.
If propagateWhen is evaluated for every resource in the collection during a reconciliation loop, this logic becomes O(N^2).
For large collections (e.g. thousands of resources), this could be a performance bottleneck.
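For illustration, one way an implementation could avoid the quadratic cost is to count updated items once per reconcile and reuse that count for every per-item check. This is a hypothetical Python sketch, not actual KRO code:

```python
def gate_linear(updated_flags, batch=3):
    """Count updated items once per reconcile (O(N)) instead of
    re-filtering the collection for every item (O(N^2))."""
    updated = sum(updated_flags)
    threshold = (updated // batch + 1) * batch
    # Each per-item check is then a constant-time index comparison.
    return [i for i in range(len(updated_flags)) if i < threshold]
```

The per-item CEL evaluation would then amount to an index comparison against a precomputed threshold.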

Contributor Author

This was just an example; we'll optimize this in the implementation.

// ... existing fields ...

// PropagateWhen defines CEL expressions that allow the object to be mutated when true
PropagateWhen []string `json:"propagateWhen,omitempty"`
Contributor

Please clarify the semantics of the []string slice.
Are these CEL expressions evaluated as a logical AND (all must be true) or logical OR (at least one must be true)?

Contributor Author

@ellistarn ellistarn Jan 28, 2026

It's identical to the ReadyWhen semantics. All must pass (AND)
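Conceptually, something like this Python sketch, with expressions modeled as callables for illustration (not KRO's actual CEL machinery):

```python
def propagation_allowed(propagate_when, env):
    """All propagateWhen expressions must evaluate true (logical AND),
    matching readyWhen semantics. An empty list gates nothing, since
    all() over an empty iterable is True."""
    return all(expr(env) for expr in propagate_when)
```

So `propagateWhen: []` trivially allows propagation, and a single false expression blocks it.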

// ... existing fields ...

// PropagateWhen defines CEL expressions that allow the object to be mutated when true
PropagateWhen []string `json:"propagateWhen,omitempty"`
Contributor

Consider adding a FailurePolicy field.
If a CEL expression fails to evaluate (e.g., division by zero, missing field, type error), should the propagation be allowed (FailOpen) or blocked (FailClosed)?

Contributor Author

Let's consider this in future scope. Fail closed is appropriate, I think.
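A fail-closed evaluation would look something like this sketch (hypothetical wrapper; as noted, the actual policy is deferred to future scope):

```python
def evaluate_gate(expr, env):
    """Fail closed: any evaluation error blocks propagation rather
    than letting a misconfigured expression wave changes through."""
    try:
        return bool(expr(env))
    except Exception:
        return False  # e.g. missing field or type error -> blocked
```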

resources defined in my ResourceGraphDefinitions when the inputs to the graph change. As an
administrator, I want to limit the rate of change to the instances of a ResourceGraphDefinition
when the definition itself changes.
2. **Time Controls**: As an administrator, I want to prevent changes from happening outside of
Contributor

For "Time Controls", will the CEL environment provide a safe and deterministic way to access the current timestamp (e.g., time.now())?
This is often restricted in standard CEL environments to ensure determinism.

Contributor Author

Agreed -- punting time to an external KREP. You can always drive it with a cronjob and externalRef.


Probably.

Changes can be made to the inputs of the graph while other changes are still propagating through.
Contributor

If we make a change to an RGD, existing propagations would see the latest RGD. How can we have multiple views of the RGD across multiple propagations?

Changes can be made to the inputs of the graph while other changes are still propagating through.
This is similar to Kubernetes deployments, which can be mutated mid-rollout. A common use case for
this is Rollback, described above. A mutation is made to the graph, and then the inverse of the
mutation is made before it completes. Overlapping propagations can be more complicated, with up to
Contributor

Are propagations stateful? If so, where is the state stored?

Contributor Author

Not currently, but I believe we need to make them so. Without this, we can only support a single propagation at a time. I am deferring this work to a future KREP (ResourceGraphRevisions). I think this work stands on its own, though.

```
Graph: A → B, C, D (collection with linearlyUpdated)

T1: Mutation T2: Mutation Propagates T3: Rollback Starts T4: Rollback Propagates
```
Contributor

Effectively indicating there is only one view or propagation at a given time?

Comment on lines +170 to +172
increasingly stale. One clear example of this is when using KRO to model software release pipelines.
Given an RGD for `SoftwareReleaseEnvironment` and a pipeline that deploys this environment to many
stages and regions, each of which are dependent on each other, it may take O(days) to propagate a
Contributor

Again, a reason why this may not be the right abstraction for such use cases. We may need to define what is in scope and what is not.

Mechanically, supporting concurrent mutations will require new machinery in KRO. We defer the exact
details of this discussion to the implementation phase, due to the magnitude of the change.

Directionally, we could introduce a new `ResourceGraphRevision` CRD for each unique set of inputs to
@chrisdoherty4
Contributor

I like the approach.

KREP-0003 discusses decorators and how generally they should be considered singletons; however it doesn't enforce the constraint and acknowledges there could be unknown use cases where a schema can be provided and result in non-conflicting changes between decorator instances. Is it worth considering how propagation control works in that context or at least making it an explicit non-goal?

@ellistarn
Contributor Author

KREP-0003 discusses decorators and how generally they should be considered singletons; however it doesn't enforce the constraint and acknowledges there could be unknown use cases where a schema can be provided and result in non-conflicting changes between decorator instances. Is it worth considering how propagation control works in that context or at least making it an explicit non-goal?

I owe this community a doc on "singletons" or "Schemaless Graphs".

@ellistarn ellistarn force-pushed the prop-krep branch 4 times, most recently from 73714ee to 58f6e04 Compare January 28, 2026 17:38
Introduces propagateWhen, a per-resource mechanism to conditionally gate
mutation as changes propagate through the graph.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
…earlyReady

The new names better describe the intent - "is this item ready to propagate?"
rather than describing the mechanism (checking update counts).

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Comment on lines +72 to +73
pod.ready() // true if readyWhen conditions are satisfied
pod.updated() // true if updated to the current graph generation
Contributor Author

I wonder if these should be collapsed. Both the linearlyReady and exponentiallyReady functions below use updated().

Contributor Author

I.e., what if ready() returned false when the resource was itself ready but the generation has moved on? Thoughts on modeling this, @a-hilaly?


// exponentiallyReady(item, collection) -> bool
// Item can proceed when exponential batch (1, 2, 4, 8...) is reached
exponentiallyReady(pod, pods)
@cheeseandcereal cheeseandcereal Mar 10, 2026

Should the exponential factor be exposed as an optional parameter instead of always being forced to 2.0?

## Proposed API and Behavior

1. Add `propagateWhen` to `ResourceGraphDefinitionSpec` to control propagation to graph instances
2. Add `propagateWhen` to `Resource` to control propagation to a resource in a graph instance
@k1ranpk k1ranpk Mar 10, 2026

Is it possible for the propagation to be stuck in a cycle between the different controls expressed across multiple resources and RGDs?


1. **Rate Controls**: As an administrator, I want to limit the rate of change to collections of
resources defined in my ResourceGraphDefinitions when the inputs to the graph change. As an
administrator, I want to limit the rate of change to the instances of a ResourceGraphDefinition
Contributor

What is the method for overriding? Would it be updating the RGD to no longer have these propagation controls?

Contributor Author

If you specify `propagateWhen: []`, then there are no gates.

└─ InstanceManaged - Instance finalizers and labels are properly set
```

Above, we assert that readiness and propagation are separate concepts, and thus we introduce a
Contributor

This may tie into resource graph revisions, but what would be the way for a user to know whether an instance has had the RGD propagated to it?

Is this check all-or-nothing, or is there a way to have a per-instance understanding of which changes have been applied?

Contributor Author

I was imagining an annotation on the resource that shows a graph revision, with KRO managing those revisions internally.

## Summary

KREP-006 introduces `propagateWhen`, a per-resource mechanism to conditionally gate mutation as
changes propagate through the graph. Both `propagateWhen` and `readyWhen` are complementary and

How will propagateWhen interact with includeWhen? If an RGD update changes an includeWhen condition to remove or add a node from a graph, is the node removal controlled by propagateWhen?

Contributor Author

My understanding is that includeWhen removes the node from the graph, so it would not be counted in the propagation math. If it resolves to true, it is included in the graph and is taken into account. Though I suppose order matters here -- if it poofs into existence, we probably want to start with it, depending on the order.

This may be a problem to solve with resource graph revisions

indexOf(pod, pods) < (pods.filter(p, p.updated()).size() / 3 + 1) * 3

// exponentiallyReady(item, collection) -> bool
// Item can proceed when exponential batch (1, 2, 4, 8...) is reached
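For reference, the linear formula quoted above translates to roughly this Python sketch (illustrative rendering of the CEL, with the batch size of 3 taken from the expression):

```python
def linearly_ready(index, updated, batch=3):
    """Mirrors: indexOf(pod, pods) < (updated.size() / 3 + 1) * 3.
    Items proceed in fixed-size batches: with nothing updated,
    indices 0-2 are allowed; once 3 are updated, 0-5; and so on."""
    return index < (updated // batch + 1) * batch
```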

How are instances batched for update? Are they chosen randomly per RGD generation, or is the ordering fixed between generations?

resources:
- id: pods
forEach:
- pod: ${ schema.spec.pods }
Contributor Author

This needs a revisit. Cannot reference `schema.spec.pods`.

Probably.

Changes can be made to the inputs of the graph while other changes are still propagating through.
This is similar to Kubernetes deployments, which can be mutated mid-rollout. A common use case for

Deployments use an intermediate object (ReplicaSets) between Deployments and Pods. Does KRO need a similar construct to allow overlapping propagations?

Contributor Author

Yes! ResourceGraphRevision

Comment on lines +41 to +42
1. Add `propagateWhen` to `ResourceGraphDefinitionSpec` to control propagation to graph instances
2. Add `propagateWhen` to `Resource` to control propagation to a resource in a graph instance
Contributor

Can we define the CEL context propagateWhen will have for each scenario?


## Proposed API and Behavior

1. Add `propagateWhen` to `ResourceGraphDefinitionSpec` to control propagation to graph instances
Contributor

In what order would we process the instances?

@linux-foundation-easycla

CLA Missing ID CLA Not Signed

One or more co-authors of this pull request were not found. You must specify co-authors in commit message trailer via:

Co-authored-by: name <email>

Supported Co-authored-by: formats include:

  1. Anything <id+login@users.noreply.github.com> - it will locate your GitHub user by id part.
  2. Anything <login@users.noreply.github.com> - it will locate your GitHub user by login part.
  3. Anything <public-email> - it will locate your GitHub user by public-email part. Note that this email must be made public on Github.
  4. Anything <other-email> - it will locate your GitHub user by other-email part but only if that email was used before for any other CLA as a main commit author.
  5. login <any-valid-email> - it will locate your GitHub user by login part, note that login part must be at least 3 characters long.

Alternatively, if the co-author should not be included, remove the Co-authored-by: line from the commit message.

Please update your commit message(s) by doing git commit --amend and then git push [--force] and then request re-running CLA check via commenting on this pull request:

/easycla

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: ellistarn
Once this PR has been reviewed and has the lgtm label, please assign jlbutler for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. and removed cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Apr 16, 2026
michaelhtm added a commit to michaelhtm/kro that referenced this pull request Apr 17, 2026
This KREP discusses the implementation plan for instance propagation
control across GraphRevisions.

builds on:
* kubernetes-sigs#861
* kubernetes-sigs#1174
michaelhtm added a commit to michaelhtm/kro that referenced this pull request Apr 30, 2026
This KREP discusses the implementation plan for instance propagation
control across GraphRevisions.

builds on:
* kubernetes-sigs#861
* kubernetes-sigs#1174

Labels

cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. kind/krep size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
