Skip to content

Order mutations by declared phase, and statically detect same-phase version-overlap conflicts #128

@sourcehawk

Description

@sourcehawk

Context

Consumers assemble a resource by registering mutations in an explicit slice
(NewBuilder(baseline).WithMutation(a, b, c).Build()), and the application
order is positional: the slice order is the apply order, and later mutations
win on collision (EnsureEnvVar replaces by name). Each Mutation carries a
Feature version gate that decides whether it fires for the concrete
spec.version.

A representative consumer pipeline (camunda-operator, zeebe StatefulSet) is
layered by hand:

baseline (latest-version shape)
  -> defaults.*   (version-resolved: ContainerImage, ClusterEnv, JVMEnv, ExporterEnv, ...)
  -> compat.*     (version-gated rollbacks of the baseline: Pre8p5RestAPI, Pre8p9LivenessProbe, ...)
  -> overrides.*  (user-driven: ExtraEnv, PodLabels, PodAnnotations, Scheduling)
  -> ChecksumAnnotations (must be dead last)

The problem

The layering above is an implicit, positional contract maintained by hand in
the consumer's build function. Two concrete invariants from that pipeline that
are enforced only by where a line sits in the slice:

  • JVMEnv (no version gate) must be applied before overrides.ExtraEnv (no
    version gate) so a user can override JAVA_TOOL_OPTIONS.
  • ChecksumAnnotations (no version gate) must be last so operator-managed
    checksum keys always win.

Get the position wrong and a mutation silently shadows another with no
signal: build still succeeds, tests for the individual mutation still pass, and
the clobber only surfaces if a whole-resource golden happens to cover that exact
spec+version. Every code review of this pipeline spends effort eyeballing
registration order ("is this registered before ExtraEnv?").

Worth stating explicitly because it is the tempting wrong fix: version is not
the right ordering key.
Gates are ranges, not points (>= 8.8.0-alpha7 && < 8.9.0-alpha2), so they have no total order; at reconcile only one concrete
version exists and the matching mutations are usually mutually exclusive anyway;
and the no-gate overrides must apply last, not first. The real ordering key
is role/phase, with version used purely for gating.

Workarounds today (all unsatisfying)

  1. Maintain order by hand in the build function and rely on code-review
    vigilance. Fragile, and it is exactly the recurring review burden above.
  2. Whole-resource goldens catch the result of a bad clobber, but only for the
    spec+version snapshotted, and they show the symptom (wrong final value), not
    the cause (which mutation shadowed which).
  3. Directory/naming conventions (defaults/ compat/ overrides/) approximate
    phases but are not understood or enforced by the framework, so nothing stops
    a mutation from being registered in the wrong band.

Proposed solution shape

Two composing pieces (sketch, not insisting on the exact API):

A. Phases as the ordering primitive. Give Mutation a declared Phase
(e.g. BaselineAdjust < Default < Compat < Override < Finalize) plus an optional
Priority tiebreak. The framework orders by (phase, priority, registration);
the version Feature stays a gate only. The consumer's WithMutation(...) set
becomes order-independent, so the positional footgun disappears.

statefulset.Mutation{ Name: "JVMEnv", Phase: phase.Default, Feature: gate, Mutate: ... }
statefulset.Mutation{ Name: "ExtraEnv", Phase: phase.Override, ... } // always after defaults

B. Static same-phase conflict detection. The Mutator already records each
edit, so it can record which mutation wrote each key (provenance). Combined
with version-constraint intersection and phase, flag:

KEY is a conflict when two mutations both write it, their version gates
intersect (non-empty range), they are in the same phase, and neither
declares an override of the other.

Cross-phase shadowing (Override beating Default) is intended and allowed;
same-phase shadowing over an overlapping version range is almost always an
accidental clobber. This is static (constraint intersection + provenance, no
need to enumerate every version), so it can ship as a test helper:

mutationtest.AssertNoConflicts(pipeline)        // fails on undeclared same-phase overlap

plus an optional provenance snapshot (key -> ordered [(mutation, phase, value)]) so any change in who-sets-what is a reviewable diff. Explicit intent is
declared on the winning mutation, e.g. Overrides("JAVA_TOOL_OPTIONS"), and the
detector flags any clobber not declared. (A literal "no explicit remove was
called" rule does not work, since EnsureEnvVar-replace is the normal override
path and would false-flag every legitimate override; the discriminator is
undeclared same-phase overlap, not absence of a remove.)

Open questions

  • Fixed phase enum vs a consumer-defined ordered set of phase names? Per
    resource-kind or global?
  • The conflict detector's intersection logic must handle the existing
    pre-release matching gotcha (Masterminds matches pre-releases only within the
    same major.minor.patch tuple, so a pure -alpha version can fall through
    gates); otherwise it mis-judges overlap at the -alpha edges.
  • Provenance tracking adds Mutator overhead; gate it to test/debug builds, or
    make it always-on and cheap?

Metadata

Metadata

Assignees

No one assigned

    Labels

    ideaSpeculative design proposal / exploration, not a committed feature request

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions