Skip to content

Implement change-notify on the MetaDB#629

Open
amrc-benmorrow wants to merge 19 commits intomainfrom
bmz/metadb-notify
Open

Implement change-notify on the MetaDB#629
amrc-benmorrow wants to merge 19 commits intomainfrom
bmz/metadb-notify

Conversation

@amrc-benmorrow
Copy link
Copy Markdown
Contributor

Only WATCH and SEARCH on v2/app/{app}/object/ are implemented for now as we don't yet have a mechanism for tracking the class structure changes.

Implement App config schema validation. Originally this was going to use the notify interface internally but it turned out to be necessary to update the schemas synchronously so we can produce appropriate 409 response when existing entries don't validate. This means a SPARQL update can insert invalid entries but that is currently a superuser facility anyway.

Replace the JsonValueBodyReader from Jakarta EE. It has an important bug where it cannot decode a single-character JSON document, such as 1. This is because of instead of handling charsets correctly it tries to guess a Unicode charset.

We cannot use the AppUpdater to track config changes, as it only handles
structured apps. So remove all tracking from that class and revert to
using a ModelChangedListener; this will pick up on the config entry
changes made by the AppUpdater.

In order to understand the changes made we need to query the current
state of the triplestore. Currently this happens in a separate read-only
transaction after the main read-write transaction. This means
technically there is a window for things to change again, though given
that we will notify the correct current state I don't think that's a
problem. If we could be sure it was not possible for a Jena transaction
commit to fail we could continue in the RW txn; another alternative
would be to generate the updates inside the txn but not post them until
afterwards.
I was trying to stick to Java Optional on the grounds that it might be
more familiar to people, but Vavr's Option is just better. So use it
consistently, instead of mix-and-match.
This uses the Notify framework internally. This means there is in theory
a small delay between a successful PUT to a schema entry and the schema
coming into force, but I don't see a problem with that.
This is an API-level validation only. SPARQL access is always going to
be an advanced / superuser option in any case.
Allow a ClientError to influence both the headers and the body of the
error response. In particular if I am going to use 405 responses they
MUST have an Allow header.

The InvalidConfig error is for configs which do not conform to their
app config schema.
We cannot allow invalid config entries; they are very difficult to
handle correctly. Initially I was going to implement a system where
invalid config entries were left in the database but returned 405 on
GET; this would mean that loading a new schema could not fail. However
it's very difficult to see how to handle notify updates in this case;
even something as simple as 'this config entry has moved into/out of
this class' is difficult to divine from the raw changed statements.

Go back to the approach in the ConfigDB: fail schema updates with 409 if
they conflict with existing entries. For now SPARQL still provides a
back door. I do not think the ConfigDB correctly validated schema
entries on dump load either.
We will need WebSockets support, and Grizzly's support is non-standard.
Also Jetty server will share components with the Jetty client used by
the service-client.
Some of this should really move into a service-api package. I'm not
going to do that but keep the namespacing suitable for a later move.
The MetaDB will probably not implement v1.
The JSON request body reader supplied with jakarta.json is buggy. It tries
to guess the charset instead of using the media type information, and fails
to do so for single-character inputs.
Currently we only support WATCH, and only have one endpoint. But the
framework is functioning correctly.
notify.Response expects an etag to be a UUID, and in fact it always will
be. Potentially we might change to tagging UUIDs in the RDF with a
dedicated data type, in which case the decoding will be easier to UUID.
We were removing fully orphaned entries, where the object had been removed,
but not those where the structure query produced no result.

We were also not correctly handling config updates where the object of a
config had been removed. We must also search the removed graph for the
object UUID.
I originally thought the SEARCH handlers would need to return a structure
containing an Observable of incremental updates and a function to fetch a
full update. But since the MetaDB is tracking full state for any watched
Apps anyway I can just use that implementation and always accept full
updates. This also means we can debounce the internal sequence and still
get correct differencing sent to the client.
This is implemented differently from the JS service-api, which accepts a
seq of updates and performs full requests as it needs. This API wants a
seq of full updates and calculates its own differences to send to the
client. This is important because we have to be careful about how we
query the database; we must ensure queries end up inside the appropriate
transactions.
The translation to NotifyUpdates is standard. We can detect nonexistent
Notify endpoints because they return 410 responses; we know these will
never change, so we can send a 404 notify update instead of a 200/410
HTTP response.
@amrc-benmorrow amrc-benmorrow self-assigned this Apr 9, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant