Skip to content

Transient corrupts persistent base set via shared _keys array in Branch.add() #19

@whilo

Description

@whilo

Summary

Calling (transient base-set) and mutating the transient silently corrupts the original persistent base-set. Subsequent persistent operations on base-set lose elements.

Root Cause

In Branch.java, the add() method's "same len, not editable" path (line ~190) shares the _keys array between the original branch and the newly created copy when the maxKey doesn't change:

if (0 == cmp.compare(node.maxKey(), _keys[ins])) {
    newKeys = _keys;  // BUG: shares mutable array with original
}

When asTransient() creates a transient, both trees share the root. The first transient operation creates copies along the insertion path, but these copies share _keys with the originals. Subsequent transient operations find these copies editable and modify _keys[i] in place (line ~168: _keys[ins] = node.maxKey()), which corrupts the original persistent branch's separator keys.

This causes binary search to go to wrong children, silently losing elements.

The same pattern exists for _children and _addresses sharing at lines ~199-201, though that path appears to be dead code (the node == child(storage, ins) condition is never true on the non-editable path since children always create copies).

Reproduction

The bug is non-deterministic (JIT-dependent), failing ~60-80% of cold JVM runs but rarely in warm REPLs.

Save as repro.clj:

(require '[clojure.set :as cset]
         '[clojure.test.check :as tc]
         '[clojure.test.check.generators :as gen]
         '[clojure.test.check.properties :as prop]
         '[me.tonsky.persistent-sorted-set :as set])

(def gen-int (gen/choose -10000 10000))
(def gen-elements (gen/no-shrink (gen/vector gen-int 0 1000)))
(def gen-operation (gen/no-shrink (gen/tuple (gen/elements [:add :remove]) gen-int)))
(def gen-operations (gen/no-shrink (gen/vector gen-operation 0 500)))

(println "Testing: transient corrupts persistent base-set")
(let [result
      (tc/quick-check 5000
        (prop/for-all [initial-elements gen-elements
                       ops gen-operations]
          (let [base-set (into (set/sorted-set) initial-elements)
                ;; Transient mutates shared _keys, corrupting base-set
                _ (-> (transient base-set)
                      (#(reduce (fn [t [op val]]
                                  (case op
                                    :add (conj! t val)
                                    :remove (disj! t val)))
                                % ops))
                      persistent!)
                ;; Persistent ops on corrupted base-set lose elements
                persistent-result (reduce (fn [s [op val]]
                                            (case op
                                              :add (conj s val)
                                              :remove (disj s val)))
                                          base-set ops)
                ref-result (reduce (fn [s [op val]]
                                     (case op
                                       :add (clojure.core/conj s val)
                                       :remove (clojure.core/disj s val)))
                                   (apply sorted-set initial-elements) ops)]
            (= (vec persistent-result) (vec ref-result)))))]
  (println "Result:" (if (:pass? result) "PASSED" "FAILED")
           "after" (:num-tests result) "tests"))
(shutdown-agents)

Run in a cold JVM (not REPL):

clj -Sdeps '{:paths ["src-clojure" "target/classes"] :deps {org.clojure/clojure {:mvn/version "1.12.0"} org.clojure/test.check {:mvn/version "1.1.1"}}}' -M -i repro.clj

Typical failure output:

Testing: transient corrupts persistent base-set
Result: FAILED after 3559 tests

Running 3 times, 2/3 failed. The lost element is always near the max of the key range (e.g., 9971, 9993).

Fix

Always copy the _keys array instead of sharing it:

// same len, not editable
if (1 == nodes.length) {
    ANode<Key, Address> node = nodes[0];
    Key[] newKeys = Arrays.copyOfRange(_keys, 0, _len);
    newKeys[ins] = node.maxKey();
    // ...
}

This is safe because the Arrays.copyOfRange cost is negligible compared to the tree traversal, and it prevents transient editable paths from ever mutating a persistent branch's arrays.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions