JDBCQuotaStore hits CannotSerializeTransactionException in clustered deployments

When more than one JVM runs `DiskQuotaMonitor` against the same database (any clustered GeoServer deployment), concurrent updates to the same `TILESET` or `TILEPAGE` row deterministically abort with PostgreSQL's `could not serialize access due to concurrent update`:

```
org.springframework.dao.CannotSerializeTransactionException:
  PreparedStatementCallback;
  SQL [UPDATE TILEPAGE SET FILL_FACTOR = ? WHERE KEY = ?];
  ERROR: could not serialize access due to concurrent update
```

There is no retry, so the operation just fails. Easy to trigger: set a global quota lower than the cached size on two nodes - both `LayerQuotaEnforcementTask` instances try the same `UPDATE TILEPAGE SET FILL_FACTOR = 0 ...`.

### Why it matters

The wire-up for clustered GeoServer is essentially complete in the rest of the stack; the JDBC quota store is the missing piece. Today every write surface in `JDBCQuotaStore` is vulnerable when two JVMs touch the same row. The hottest case is the global `___GLOBAL_QUOTA___` row in TILESET, which **every tile-writing node updates on every batched flush**. With N replicas, that row is the cluster-wide hotspot.

### Root cause

`JDBCQuotaStore.java`:

```java
this.tt.setIsolationLevel(TransactionTemplate.ISOLATION_SERIALIZABLE);
```

SERIALIZABLE was a defensive choice for single-JVM correctness. There is no retry on serialization failure. Two hot paths additionally use a tolerant SELECT-then-INSERT-or-UPDATE loop in Java rather than an atomic upsert:

- `upsertTilePageFillFactor()`
- `upsertTilePageHitAccessTime()`

Both loop up to 100 times; under `SERIALIZABLE` the read + write within a single transaction triggers Postgres [SSI's predicate locking](https://wiki.postgresql.org/wiki/Serializable#Predicate_Locking) and aborts when two nodes touch the same `TILEPAGE` row.

The TILESET quota update is already an in-place SQL increment (`SQLDialect.java:326`: `UPDATE TILESET SET BYTES = BYTES + (:bytes) ...`), so it's safe at READ COMMITTED today; only the isolation level needs to change for that path.

### Proposed fix

#### Part 1 (the fix, all dialects): drop SERIALIZABLE

Change `JDBCQuotaStore.java` to use the default isolation (READ COMMITTED). One line. No SQL changes.

This alone resolves the reported error. Under READ COMMITTED:

- The `TILESET` quota update is already an in-place SQL increment (`SQLDialect.java`: `UPDATE TILESET SET BYTES = BYTES + (:bytes) WHERE KEY = :tileSetId`). Two concurrent writers on the same row just serialize via row-level locks; second writer waits and proceeds with the latest committed value. No conflict.
- The `TILEPAGE` tolerant loops (`upsertTilePageFillFactor`, `upsertTilePageHitAccessTime`) are idempotent by design - they already retry on no-rows-modified. Without SSI predicate locking they hit fewer conflicts and the existing retry covers what's left.
- `setTruncated`'s SELECT-then-UPDATE is benign at READ COMMITTED: two concurrent calls both arrive at `FILL_FACTOR = 0`, last-writer-wins on the same value, no correctness issue.

#### Part 2 (Postgres only, optimization): native `ON CONFLICT` UPSERTs

Independent of Part 1. Replaces the 100-attempt loops with single atomic statements on Postgres.

`PostgreSQLDialect.java` is an empty subclass today. Add a capability flag `dialect.supportsAtomicUpsert()` (default `false`) and Postgres-native UPSERT overrides for the three insert-or-update paths:

- `getOrCreateTileSet` -> `INSERT INTO TILESET ... ON CONFLICT (KEY) DO NOTHING`. Replaces the loop at `JDBCQuotaStore.java`.
- `upsertTilePageFillFactor` -> `INSERT INTO TILEPAGE ... ON CONFLICT (KEY) DO UPDATE SET FILL_FACTOR = LEAST(1.0, GREATEST(0.0, TILEPAGE.FILL_FACTOR + EXCLUDED.FILL_FACTOR))`. The fill-factor delta moves from Java into the SQL `VALUES` clause.
- `upsertTilePageHitAccessTime` -> analogous, with `GREATEST` on `LAST_ACCESS_TIME_MINUTES` (commutative) and addition on `NUM_HITS`.

In `JDBCQuotaStore`, branch once on `dialect.supportsAtomicUpsert()` and call the new dialect methods when supported, fall back to the existing loop otherwise.

This is fewer round-trips per tile add (one statement vs the SELECT+UPDATE/INSERT pair) and removes the retry loop entirely on Postgres. It is **not required** to fix the reported error - Part 1 is.

### Scope

In scope:
- Part 1 (isolation drop) for everyone.
- Part 2 (native UPSERT) for `PostgreSQLDialect` only.

Out of scope (separate, unfunded task):
- Native UPSERT for `OracleDialect`. The existing tolerant loop remains the documented behavior for Oracle.
- H2 / HSQL parity beyond making sure existing tests keep passing.

### Affected versions

Present on `main` and on the `1.28.x` series. Backport target: `1.28.x` (consumed by GeoServer 2.28.x).

### Related

The simpler `JDBCQuotaStore.renameLayer` bug (only updates `LAYER_NAME`, leaves `TILESET.KEY` and `TILEPAGE.TILESET_ID` stale) is tracked in
- #1526


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

JDBCQuotaStore hits CannotSerializeTransactionException in clustered deployments #1527

Why it matters

Root cause

Proposed fix

Part 1 (the fix, all dialects): drop SERIALIZABLE

Part 2 (Postgres only, optimization): native `ON CONFLICT` UPSERTs

Scope

Affected versions

Related

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

JDBCQuotaStore hits CannotSerializeTransactionException in clustered deployments #1527

Description

Why it matters

Root cause

Proposed fix

Part 1 (the fix, all dialects): drop SERIALIZABLE

Part 2 (Postgres only, optimization): native ON CONFLICT UPSERTs

Scope

Affected versions

Related

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions

Part 2 (Postgres only, optimization): native `ON CONFLICT` UPSERTs