Context
Consumers assemble a resource by registering mutations in an explicit slice
(NewBuilder(baseline).WithMutation(a, b, c).Build()), and the application
order is positional: the slice order is the apply order, and later mutations
win on collision (EnsureEnvVar replaces by name). Each Mutation carries a
Feature version gate that decides whether it fires for the concrete
spec.version.
A representative consumer pipeline (camunda-operator, zeebe StatefulSet) is
layered by hand:
baseline (latest-version shape)
-> defaults.* (version-resolved: ContainerImage, ClusterEnv, JVMEnv, ExporterEnv, ...)
-> compat.* (version-gated rollbacks of the baseline: Pre8p5RestAPI, Pre8p9LivenessProbe, ...)
-> overrides.* (user-driven: ExtraEnv, PodLabels, PodAnnotations, Scheduling)
-> ChecksumAnnotations (must be dead last)
The problem
The layering above is an implicit, positional contract maintained by hand in
the consumer's build function. Two concrete invariants from that pipeline that
are enforced only by where a line sits in the slice:
JVMEnv (no version gate) must be applied before overrides.ExtraEnv (no
version gate) so a user can override JAVA_TOOL_OPTIONS.
ChecksumAnnotations (no version gate) must be last so operator-managed
checksum keys always win.
Get the position wrong and a mutation silently shadows another with no
signal: build still succeeds, tests for the individual mutation still pass, and
the clobber only surfaces if a whole-resource golden happens to cover that exact
spec+version. Every code review of this pipeline spends effort eyeballing
registration order ("is this registered before ExtraEnv?").
Worth stating explicitly because it is the tempting wrong fix: version is not
the right ordering key. Gates are ranges, not points (>= 8.8.0-alpha7 && < 8.9.0-alpha2), so they have no total order; at reconcile only one concrete
version exists and the matching mutations are usually mutually exclusive anyway;
and the no-gate overrides must apply last, not first. The real ordering key
is role/phase, with version used purely for gating.
Workarounds today (all unsatisfying)
- Maintain order by hand in the build function and rely on code-review
vigilance. Fragile, and it is exactly the recurring review burden above.
- Whole-resource goldens catch the result of a bad clobber, but only for the
spec+version snapshotted, and they show the symptom (wrong final value), not
the cause (which mutation shadowed which).
- Directory/naming conventions (
defaults/ compat/ overrides/) approximate
phases but are not understood or enforced by the framework, so nothing stops
a mutation from being registered in the wrong band.
Proposed solution shape
Two composing pieces (sketch, not insisting on the exact API):
A. Phases as the ordering primitive. Give Mutation a declared Phase
(e.g. BaselineAdjust < Default < Compat < Override < Finalize) plus an optional
Priority tiebreak. The framework orders by (phase, priority, registration);
the version Feature stays a gate only. The consumer's WithMutation(...) set
becomes order-independent, so the positional footgun disappears.
statefulset.Mutation{ Name: "JVMEnv", Phase: phase.Default, Feature: gate, Mutate: ... }
statefulset.Mutation{ Name: "ExtraEnv", Phase: phase.Override, ... } // always after defaults
B. Static same-phase conflict detection. The Mutator already records each
edit, so it can record which mutation wrote each key (provenance). Combined
with version-constraint intersection and phase, flag:
KEY is a conflict when two mutations both write it, their version gates
intersect (non-empty range), they are in the same phase, and neither
declares an override of the other.
Cross-phase shadowing (Override beating Default) is intended and allowed;
same-phase shadowing over an overlapping version range is almost always an
accidental clobber. This is static (constraint intersection + provenance, no
need to enumerate every version), so it can ship as a test helper:
mutationtest.AssertNoConflicts(pipeline) // fails on undeclared same-phase overlap
plus an optional provenance snapshot (key -> ordered [(mutation, phase, value)]) so any change in who-sets-what is a reviewable diff. Explicit intent is
declared on the winning mutation, e.g. Overrides("JAVA_TOOL_OPTIONS"), and the
detector flags any clobber not declared. (A literal "no explicit remove was
called" rule does not work, since EnsureEnvVar-replace is the normal override
path and would false-flag every legitimate override; the discriminator is
undeclared same-phase overlap, not absence of a remove.)
Open questions
- Fixed phase enum vs a consumer-defined ordered set of phase names? Per
resource-kind or global?
- The conflict detector's intersection logic must handle the existing
pre-release matching gotcha (Masterminds matches pre-releases only within the
same major.minor.patch tuple, so a pure -alpha version can fall through
gates); otherwise it mis-judges overlap at the -alpha edges.
- Provenance tracking adds
Mutator overhead; gate it to test/debug builds, or
make it always-on and cheap?
Context
Consumers assemble a resource by registering mutations in an explicit slice
(
NewBuilder(baseline).WithMutation(a, b, c).Build()), and the applicationorder is positional: the slice order is the apply order, and later mutations
win on collision (
EnsureEnvVarreplaces by name). EachMutationcarries aFeatureversion gate that decides whether it fires for the concretespec.version.A representative consumer pipeline (camunda-operator, zeebe StatefulSet) is
layered by hand:
The problem
The layering above is an implicit, positional contract maintained by hand in
the consumer's build function. Two concrete invariants from that pipeline that
are enforced only by where a line sits in the slice:
JVMEnv(no version gate) must be applied beforeoverrides.ExtraEnv(noversion gate) so a user can override
JAVA_TOOL_OPTIONS.ChecksumAnnotations(no version gate) must be last so operator-managedchecksum keys always win.
Get the position wrong and a mutation silently shadows another with no
signal: build still succeeds, tests for the individual mutation still pass, and
the clobber only surfaces if a whole-resource golden happens to cover that exact
spec+version. Every code review of this pipeline spends effort eyeballing
registration order ("is this registered before
ExtraEnv?").Worth stating explicitly because it is the tempting wrong fix: version is not
the right ordering key. Gates are ranges, not points (
>= 8.8.0-alpha7 && < 8.9.0-alpha2), so they have no total order; at reconcile only one concreteversion exists and the matching mutations are usually mutually exclusive anyway;
and the no-gate overrides must apply last, not first. The real ordering key
is role/phase, with version used purely for gating.
Workarounds today (all unsatisfying)
vigilance. Fragile, and it is exactly the recurring review burden above.
spec+version snapshotted, and they show the symptom (wrong final value), not
the cause (which mutation shadowed which).
defaults/ compat/ overrides/) approximatephases but are not understood or enforced by the framework, so nothing stops
a mutation from being registered in the wrong band.
Proposed solution shape
Two composing pieces (sketch, not insisting on the exact API):
A. Phases as the ordering primitive. Give
Mutationa declaredPhase(e.g.
BaselineAdjust < Default < Compat < Override < Finalize) plus an optionalPrioritytiebreak. The framework orders by(phase, priority, registration);the version
Featurestays a gate only. The consumer'sWithMutation(...)setbecomes order-independent, so the positional footgun disappears.
B. Static same-phase conflict detection. The
Mutatoralready records eachedit, so it can record which mutation wrote each key (provenance). Combined
with version-constraint intersection and phase, flag:
Cross-phase shadowing (Override beating Default) is intended and allowed;
same-phase shadowing over an overlapping version range is almost always an
accidental clobber. This is static (constraint intersection + provenance, no
need to enumerate every version), so it can ship as a test helper:
plus an optional provenance snapshot (
key -> ordered [(mutation, phase, value)]) so any change in who-sets-what is a reviewable diff. Explicit intent isdeclared on the winning mutation, e.g.
Overrides("JAVA_TOOL_OPTIONS"), and thedetector flags any clobber not declared. (A literal "no explicit remove was
called" rule does not work, since
EnsureEnvVar-replace is the normal overridepath and would false-flag every legitimate override; the discriminator is
undeclared same-phase overlap, not absence of a remove.)
Open questions
resource-kind or global?
pre-release matching gotcha (Masterminds matches pre-releases only within the
same major.minor.patch tuple, so a pure
-alphaversion can fall throughgates); otherwise it mis-judges overlap at the
-alphaedges.Mutatoroverhead; gate it to test/debug builds, ormake it always-on and cheap?