Skip to content

Fix mutation persistence in persistent backends#553

Open
s-heppner wants to merge 4 commits into
eclipse-basyx:developfrom
rwth-iat:Fix/552
Open

Fix mutation persistence in persistent backends#553
s-heppner wants to merge 4 commits into
eclipse-basyx:developfrom
rwth-iat:Fix/552

Conversation

@s-heppner
Copy link
Copy Markdown
Member

PR #370 removed Referable.commit() and all call sites in the server
handlers without replacing the write-back mechanism. Since then, any
mutation to an object retrieved from LocalFileIdentifiableStore,
LocalFileDescriptorStore, or CouchDBIdentifiableStore was silently
lost on cache eviction, visible only within the same in-process
WeakValueDictionary cache entry. A different uWSGI worker, or any
request after the cache entry expired, would re-read the stale on-disk
or on-database state.

There was also a compounding bug: get_item() / get_identifiable_by_hash()
always re-read from storage even on a cache hit, then called
update_from() on the cached object, discarding any in-memory
mutations even within the same request.

This change fixes both issues across all three backends:

  • get_identifiable_by_hash() / get_identifiable_by_couchdb_id(): return the cached instance on a hit instead of overwriting it with a freshly-deserialized copy.
  • get_item(): check the cache first and return immediately on a hit.
  • Add commit() to LocalFileIdentifiableStore (re-serializes to .json), LocalFileDescriptorStore (same), and CouchDBIdentifiableStore (PUT with stored _rev, updates revision on success).
  • AbstractObjectStore.commit() is added as a no-op default so in-memory stores (DictIdentifiableStore) require no changes.
    All mutating handlers in server/app/interfaces/repository.py and registry.py now call self.object_store.commit() after each mutation.
  • A regression test test_mutation_persistence is added to sdk/test/backend/test_local_file.py.

Fixes #552

s-heppner added 4 commits May 22, 2026 18:38
Previously, mutations to objects retrieved from
`LocalFileIdentifiableStore` were never written back to disk.
Only `add()` wrote to the file store; subsequent in-memory changes
were invisible to other processes (e.g. different uWSGI workers)
once the `WeakValueDictionary` cache was evicted.

This was an issue introduced in eclipse-basyx#370, which removed
`Referable.commit()` and the `LocalFileBackend` class without
replacing the write-back mechanism.

Two fixes are made here:

- `AbstractObjectStore.commit()` is added as a no-op default.
  `LocalFileIdentifiableStore` overrides it to re-serialize the
  object to its `.json` file.
- `get_item()` now returns the cached object directly on a cache
  hit, instead of always re-reading from disk and overwriting
  in-memory state. This was a secondary bug that would silently
  discard mutations within the same process if the same object
  was retrieved more than once.

A regression test (`test_mutation_persistence`) is added that
mutates a stored object, evicts the cache via `gc.collect()`,
and verifies the updated value is read back from disk.
Previously, all mutating handlers in the repository called
\`Referable.commit()\` to persist changes back to the underlying
storage. This was removed in eclipse-basyx#370 together with the \`Backend\`
concept, leaving mutations silently un-persisted to disk when
using \`LocalFileIdentifiableStore\`.

Each of the mutating handlers now calls
\`self.object_store.commit()\` on the top-level \`Identifiable\`
after any in-memory mutation, restoring the write-back behaviour
that was lost in eclipse-basyx#370.
Previously, `LocalFileDescriptorStore` (and `LocalFileIdentifiableStore`
in the same iteration) had two compounding bugs: `get_item()` always
re-read from disk even on a cache hit, and mutations were never written
back to disk at all — so any change to a retrieved descriptor was lost
once its cache entry was evicted.

This change mirrors the fix already applied to `LocalFileIdentifiableStore`
(SDK) and `repository.py`:

- `get_descriptor_by_hash()` / `get_identifiable_by_hash()`: check the
  cache before inserting a freshly-deserialized object, so `__iter__`
  returns the same instance already in use rather than a new one.
- `get_item()` in `LocalFileDescriptorStore`: return the cached object
  directly on a hit without going to disk.
- Add `LocalFileDescriptorStore.commit()` to write the current in-memory
  state of a descriptor back to its file.
- `registry.py`: replace all `descriptor.commit()` / `aas_descriptor.commit()`
  / `submodel_descriptor.commit()` calls (which were no-ops on the model
  object) with `self.object_store.commit(x)`.
Previously, `get_identifiable_by_couchdb_id()` called `update_from()`
on the cached object when the same id was already in the
`WeakValueDictionary`, silently discarding any in-memory mutations.
`get_item()` had no cache-hit check and always went to the network.
`commit()` was never implemented despite the `_revision_store`
infrastructure already existing.

The fix mirrors the changes made to `LocalFileIdentifiableStore`:

- `get_identifiable_by_couchdb_id()`: return the cached instance on a
  hit (preserving mutations) without calling `update_from()`; the
  `_rev` from the GET response is still stored so future `commit()`
  calls use the correct revision.
- `get_item()`: check the cache first and return immediately on a hit
  to avoid an unnecessary network round-trip.
- Add `commit()`: PUT the serialized object with the stored `_rev`,
  update `_revision_store` with the new revision on success, and
  surface 409 Conflict as `CouchDBConflictError`.
@s-heppner s-heppner marked this pull request as ready for review May 22, 2026 17:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant