Skip to content

Implement a global lock#318

Open
rwb27 wants to merge 17 commits intomainfrom
global-lock
Open

Implement a global lock#318
rwb27 wants to merge 17 commits intomainfrom
global-lock

Conversation

@rwb27
Copy link
Copy Markdown
Collaborator

@rwb27 rwb27 commented Apr 22, 2026

This PR adds:

  • A server config key for the global lock
  • A class to implement the global lock
  • Context managers used when running actions or setting properties, to acquire said global lock
  • Arguments allowing actions or properties to opt out of the lock
  • Tests for global locking, in isolation and in a server
  • Error handling so that global lock related errors in properties result in helpful HTTP responses (these errors are already handled for Actions).

Properties will either use the lock for setting, or not use the lock. If functional properties should use the lock for reading, they can acquire it in the getter. This is the default behaviour I discussed briefly with @julianstirling based on the idea that reading properties shouldn't change state. If getting a property needs to access hardware, this should be protected by a lock specific to the hardware - the stage and camera already have this in OFM.

It's possible that something similar to the GlobalLock class would be useful to Thing subclasses for other locks, because it supports use as a context manager with a default timeout. Vanilla RLock instances will block forever, which can cause undesirable queuing.

We still need

  • some tests of the lock in the context of properties and actions, including the argument to opt out.
  • tests in the context of a ThingServer and using a TestClient so it's running in a real event loop.
  • error handling for HTTP requests.
  • create an OFM feature branch that enables locking.
  • a way for a Thing to require global locking (e.g. a thing setting?) - that may be best done after we decide on Remove feature flags #316 .
  • documentation!

Closes #101
OFM-Feature-Branch: v3-labthings-global-lock

@barecheck
Copy link
Copy Markdown

barecheck Bot commented Apr 23, 2026

@rwb27
Copy link
Copy Markdown
Collaborator Author

rwb27 commented Apr 27, 2026

I've now got passing tests with a TestClient however there is more to be done. Checking manually with a ThingClient and an HTTP server in a separate process gives different errors to the ones the test checks for. This suggests I may have found a difference between TestClient and actual HTTP, or that my test logic may be wrong.

Either way, I need to figure out how to handle the GlobalLockBusyError neatly, especially in the case of properties where there's not currently good error handling.

@rwb27
Copy link
Copy Markdown
Collaborator Author

rwb27 commented Apr 28, 2026

Either way, I need to figure out how to handle the GlobalLockBusyError neatly, especially in the case of properties where there's not currently good error handling.

Before I tidied up error handling for properties, the error handling behaved differently between the testclient and the HTTP server (the testclient-based test was raising GlobalLockBusyError instead of ClientPropertyError). I've now fixed this, and the behaviour is consistent. I believe this is down to how TestClient deals with unhandled exceptions, which is to propagate them back to the test code. This makes sense for a test suite, and the solution is not to have unhandled exceptions in property getter/setter code. My error handler means that GlobalLockBusyError exceptions now get handled by the FastAPI app, so there's no need to catch them in the Python code.

As has been mentioned elsewhere, it would be nice to add an integration test suite in addition to the OFM suite - largely because we can include more edge cases designed to test more rigorously. I am not proposing to do that here, as I think the TestClient version is already good enough, and adding an integration test suite will want proper review on its own.

Copy link
Copy Markdown
Contributor

@julianstirling julianstirling left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks nice. I think the point about hold_global_lock(False) we should discuss in the call.

Comment thread src/labthings_fastapi/actions.py Outdated
Comment thread src/labthings_fastapi/exceptions.py
Comment thread src/labthings_fastapi/thing_server_interface.py Outdated
@rwb27 rwb27 added this to the v0.2.0 milestone Apr 29, 2026
rwb27 added 9 commits May 5, 2026 15:48
This commit adds:
* A server config key for the global lock
* A class to implement the global lock
* Context managers used when running actions or setting properties, to acquire said global lock

Properties will either use the lock for setting, or not use the lock. If functional properties should use the lock for reading, they can acquire it in the getter.

No documentation yet, but there are some tests for the lock.

We still need some tests of the lock in the context of properties and actions, including the argument to opt out.
I've added tests for the ThingServerInterface functions related to the lock. This revealed a bug that is now fixed.

I also added a convenience function to mock Thing instances, which provides a few attributes that are needed by the various descriptors and should help keep test suites tidy.
This now checks both directly and in the context of a `ThingServer` (simulated by a `TestClient`) that the global lock works as expected, if it is enabled.
I've now installed a FastAPI error handler that will provide a more
graceful error when the global lock is not available.

Errors are handled for actions by the normal mechanism, but for property
code any exceptions previously resulted in a 500 Internal Server Error
with no details.

I now explicitly catch `GlobalLockBusyError` and provide a 409 response with a message.
I thought it was elegant to call the lock context manager and bind just
the generator object it returns to the wrapped function. Unfortunately,
said generator object may not be re-used.

I have now added a test (that started off failing) to reproduce that
problem, and have changed the offending code to use a fresh
generator each time.

This was caught by the OFM test suite :)
Some newly-added tests in this branch needed to be updated after rebasing.
This was because they passed a dictionary of
Things to `ThingServer`'s constructor. The tests
now correctly use `from_things` and are otherwise unchanged.
rwb27 added 3 commits May 5, 2026 16:03
I've added text to the ActionDescriptor docstring directing the user to the global lock, in case they
need to access it manually.
This is in response to review comment from @julianstirling.
This splits out the context manager that holds the lock from the action function, when the
action is called over HTTP. That means we will only log a single line if an action fails to start
because of the global lock.

I have deliberately left other global lock errors (e.g. for when an unlocked function calls locked
code) to be handled as usual (i.e. log a traceback) because in my view that is an error, and should
be handled by the action in the usual way.
Re-raising the error as an InvocationError is
appropriate, if terminating because the lock
is busy is something we expect to happen.
The public API `hold_global_lock` no longer uses the True/False/None scheme, and instead
has a more explicitly-named argument, as suggested by @julianstirling.
@rwb27
Copy link
Copy Markdown
Collaborator Author

rwb27 commented May 5, 2026

For now, if a Thing really wants to insist there is a global lock, it can check that self._thing_server_interface._global_lock is not None during __init__. I think that's OK for now, and we can consider adding a class setting if we feel it's useful.

Calling with self._thing_server_interface.hold_global_lock(): will also (by default) raise an error if the lock is missing.

@rwb27 rwb27 requested a review from julianstirling May 5, 2026 21:38
@rwb27 rwb27 marked this pull request as ready for review May 5, 2026 21:38
@rwb27 rwb27 requested a review from bprobert97 May 6, 2026 08:09
@rwb27
Copy link
Copy Markdown
Collaborator Author

rwb27 commented May 6, 2026

@julianstirling I've made all the suggested changes, so hopefully this is straightforward for you to rereview.

@bprobert97 there is reasonable testing for this, I think, but I'm very open to suggestions if you think we can do better - hopefully in a make-an-issue-and-fix-it-at-our-leisure sort of way, but now feels like a sensible time to ask for another pair of eyes, either pre- or -post-merge.

Comment thread src/labthings_fastapi/actions.py
Comment thread src/labthings_fastapi/exceptions.py
Comment thread tests/utilities.py
Comment thread tests/test_thing_server_interface.py
Comment thread tests/test_thing_server_interface.py
Comment thread tests/test_global_lock.py Outdated
Comment thread tests/test_global_lock.py Outdated
Comment thread tests/test_global_lock.py
Comment thread tests/test_global_lock.py
Comment thread tests/test_global_lock.py
rwb27 added 2 commits May 6, 2026 17:26
This fixes several things:
* Adds a test for the expected error if an un-acquired lock is released.
* Adds a test that the global lock is the same lock.
* Adds a test for non-default timeouts.
* Fixes some docstrings.
* Clarifies a confusing comment.

Thanks @bprobert97
@rwb27 rwb27 requested a review from bprobert97 May 6, 2026 16:41
Copy link
Copy Markdown
Contributor

@bprobert97 bprobert97 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Happy with changes! Thanks Richard.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Not clear how to lock a thing?

3 participants