Skip to content

Conversation

@amartya4256
Copy link
Contributor

@amartya4256 amartya4256 commented Dec 22, 2025

This PR handles the creation of handles / cached ImageData per zoom for Image objects in a thread-safe way by the means of double-checked locking. The implementation extends the ImageHandleManager in the following way:

  • ImageHandleManager#getHandleOrCreate now uses thread-safe compute methods
  • ImageHandleManager#executeOnHandle also uses ConcurrentHashMap#compute for thread-safety. This is utilized while calling getImageData to ensure thread safety.

Depends on #2964
This fixes vi-eclipse/Eclipse-Platform#562

@amartya4256 amartya4256 marked this pull request as draft December 22, 2025 12:24
Copy link
Contributor

@HeikoKlare HeikoKlare left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really like the idea to wrap the handle-for-zoom mapping into a separate class. That clearly assigns all related management functionality to that class. I would be in favor of extracting the pure refactoring of the management functionality in a dedicated class into a separate PR. That PR will just improve code quality and comprehensibility and not improve thread safety.

You find several questions and proposals in the detailed comments. The probably most essential point is that to my understanding the implementation is not fully thread safe.

In addition, I am not sure if the concept of using a synchronized data structure is fully sufficient for all cases here (e.g., if we also include the rest of the API such as getBounds() and getImageData()). At least the complexity could increase such that ensuring correctness and maintainability becomes complex. Maybe we better see that when we adapt those methods as well, but I could imagine that simply synchronizing write access might be sufficient (as we don't expect multiple consumers to write for different zooms) and provides reduced compexity and error proneness.

@amartya4256 amartya4256 force-pushed the amartya4256/image_concurrency_fix branch 2 times, most recently from e9cb651 to 0e6ad0d Compare December 30, 2025 15:56
@github-actions
Copy link
Contributor

github-actions bot commented Dec 30, 2025

Test Results (win32)

   34 files  ±0     34 suites  ±0   4m 34s ⏱️ +3s
4 636 tests ±0  4 563 ✅ ±0  73 💤 ±0  0 ❌ ±0 
  170 runs  ±0    167 ✅ ±0   3 💤 ±0  0 ❌ ±0 

Results for commit 8dc9a9f. ± Comparison against base commit c2e0429.

♻️ This comment has been updated with latest results.

@amartya4256 amartya4256 marked this pull request as ready for review January 2, 2026 14:45
@amartya4256 amartya4256 force-pushed the amartya4256/image_concurrency_fix branch 2 times, most recently from 1494ed0 to 22f6ab7 Compare January 2, 2026 15:34
@amartya4256 amartya4256 changed the title Using ImageHandleManager for managing multiple Image handles in Image class Implemented Double-Checked Locking in ImagehandleManager Jan 5, 2026
@amartya4256 amartya4256 force-pushed the amartya4256/image_concurrency_fix branch from 22f6ab7 to 4da78e2 Compare January 8, 2026 10:53
Copy link
Contributor

@HeikoKlare HeikoKlare left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As written here #2905 (review), getBounds() is also affected by potential race conditions.
I went through all call sequences of methods in ImageHandleManager and only found getBounds() with a potential remaining race condition. Still it would be good to have someone confirm this by checking the sequences.

In addition, I wonder why/if we need this runSynchronized method to synchronize on the map of ImageHandleManager or whether we cannot simply synchronize on ImageHandleManager itself? Then we would not need that method.
Edit: Maybe it would even be better to keep the synchronization inside ImageHandleManager but not make it explicit via an runSynchronized method but having a specific method for the use case that does the synchronizaiton, i.e., have something like a <R> R executeOnHandle(int zoom, Function<Optional<ImageHandle>,R> execution) method for the scenario(s) using runSynchronized like getting the image data.

@laeubi
Copy link
Contributor

laeubi commented Jan 8, 2026

I have not looked into the full details here but using syncronized in SWT code looks wrong. The whole SWT code assumes it is always executed single threaded on the display thread, so where are different Threads are coming from here? If it is because of multiple Displays I think the cache should more be per display and avoid any complicate synchronization here.

@HeikoKlare
Copy link
Contributor

The point is that resources used to be accessible from any thread (even though it is not recommended, but it worked and consumers used it) and this behavior broke when introducing the multi-handle support. This change is just restoring that capability.

@laeubi
Copy link
Contributor

laeubi commented Jan 8, 2026

The point is that resources used to be accessible from any thread (even though it is not recommended, but it worked and consumers used it) and this behavior broke when introducing the multi-handle support

Looking at the Resource (and its subclasses) I can't find any sign neither in code nor API that it is safe to access these from different threads.

If we still want to support this I think synchronized primitives are not the right way and instead concurrent data-structures must be used instead and we are prone to deadlocks here.

private final ImageHandleManager imageHandleManager = new ImageHandleManager();

private class ImageHandleManager {
private Map<Integer, DestroyableImageHandle> zoomLevelToImageHandle = new HashMap<>();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
private Map<Integer, DestroyableImageHandle> zoomLevelToImageHandle = new ConurrentHashMap<>();

Comment on lines 162 to 170
synchronized (zoomLevelToImageHandle) {
imageHandle = (DestroyableImageHandle) get(zoom);
if (imageHandle == null) {
imageHandle = creator.get();
zoomLevelToImageHandle.put(zoom, imageHandle);
}
return imageHandle;
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use zoomLevelToImageHandle.computeIfAbsent(...) or compute(...) on the concurrent map instead of own synchronization primitives

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

93108dc7d4
That's how I pretty much started off earlier but since we also need to implement this for getImageData which either retrieves it from ImageHandle#getImageData or creates an ImageData and caches it; we needed different mechanism and overall it made sense to use a consistent locking mechanism rather than using 2 different approaches in 2 different places. That's why we decided to go with double-checked locking using synchronized.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I must confess I don't understand the befit of this approach and why it should not work for getImageData.

Assume Thread 1 is entering here into the map.computeIfAbsent.

Now Thread 2 is entering getImageData and will be blocked on the get call until Thread 1 completes, what has the same effect as Thread 1 entering the synchronized and Thread 2 is hitting the null and then entering the synchronization and needs to wait until Thread 1 finishes.

The main difference here is, that if we have more threads then with your approach ALL threads will be blocked regardless of zoomlevel (both in getImageData and getOrCreate) even if they aim for different zoom levels!

... not talking even about the case that we have a race condition between two threads one that is removing it an one that is getting it (what will make a possibly destroyed image handle visible to the getting thread.

So actually I think one even needs to use Map.compute here and do two checks: If current value is null or current value is disposed we need to create a new handle...

Copy link
Contributor

@HeikoKlare HeikoKlare Jan 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why it should not work for getImageData.

getImageData() is not doing an atomic call on the ImageHandleManager, which would be necessary to make a synchronization via the data structure. Maybe there is some possibility to refactor the code such that it is wrapped into an atomic data structure access.

ALL threads will be blocked regardless of zoomlevel (both in getImageData and getOrCreate) even if they aim for different zoom levels

That's correct but not a problem at all. This synchronization is only necessary to avoid issues due to the extremely rare case of a race condition, so there is no necessity to ensure that access to the data structure with different zooms concurrently works in an efficient way.

Copy link
Contributor

@laeubi laeubi Jan 12, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I must confess I don't understand the "not doing an atomic call" it is not doing an "atomic call" either now and why double checked locking should work/help here it all.

I have already outlined why I think my approach works and that it is semantically working the same (except the problematic locking), so maybe one can explain what is the real case to solve here and what threads are involved and why a synchronized works better than a plain get.

If it is "extremely rare" the only issue is that maybe two image data are created (what are not cached anyways) so I would argue this is acceptable in this situation as it only might hurt performance a very little bit in "extreme rare cases".

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I must confess I don't understand the "not doing an atomic call" it is not doing an "atomic call" either now and why double checked locking should work/help here it all.

With the used locking/synchronization, we ensure that the effect of the synchronized code block is atomically visible to other threads that synchronize on the manager as well. They will not operate on any intermediate state. When you first retrieve the handle from the manager and, if it is not present, execute some other code based on that decision, this may lead to a race condition with another thread accessing the manager. What precisely will happen is that another thread creates the handle, the thread trying to access the image data will create image data based on a temporary handle (so far it's just inefficient) but also caches the created image data without them ever being consumed again (this is the erroneous state that will be reached), as the image data is only removed from the cache when an actual handle is created out of it, which will not happen as the other thread concurrently created that handle already. That's also why this is not acceptable contrary to:

If it is "extremely rare" the only issue is that maybe two image data are created (what are not cached anyways) so I would argue this is acceptable in this situation as it only might hurt performance a very little bit in "extreme rare cases".

If there is a concrete proposal as an alternative the current one, we might agree on that or directly show the race condition on it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

precisely will happen is that another thread creates the handle, the thread trying to access the image data will create image data based on a temporary handle

This it what does not makes sense to me: ImageData has no handle (only image has) and the created ImageData (in case we have no cache hit) is never stored anywhere!

So can we probably have a a concrete call sequence that describes the issue to be happens as I tried with #2905 (comment) like

  • We have Thread A + B
  • Thread A calls ...
  • Then a context switch happens and Thread B calls ...
  • What leads to object X is put into map ...
  • ...

Its really hard otherwise to catch up. From what I can see we have a Map that caches ImageData and handles (the map) for a zoom level and one method that wants to reuse the ImageData cached (if any). If nothing is cached we will get at worst an ImageData created twice (one because Thread A has created it while Thread B got a temporary one). And I don't see that with the current code structure this can be avoided.

Furthermore we have an (unsyncronized) map that performs non atomic get-and-set operation what is problematic. This would be fixed by using a ConcurrentHashMap and computeIfAbsent what makes the get-and-set an atomic operation and also ensures that while one thread is currently computing an ImageData another thread arriving after that will wait until this computation is complete and use the cached data.

Copy link
Contributor

@HeikoKlare HeikoKlare Jan 12, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This it what does not makes sense to me: ImageData has no handle (only image has) and the created ImageData (in case we have no cache hit) is never stored anywhere!

The image data created by getImageData() is cached in case there is no handle for that zoom in this map:

private final Map<Integer, ImageData> cachedImageData = new HashMap<>();

I tried to sketch a sequence already, let's try more concretely:

  • We have an image based on an ImageDataProvider and threads A + B
  • A calls getImageData(zoom) and finds the manager not to contain a handle for that zoom
  • Context switch happens to B which calls getHandle(zoom, zoom) and creates a handle for that zoom in the manager (stored in the zoom to handle map of the manager)
  • Context switch happens to A which proceeds with imageProvider.newImageData(zoom); this creates a temporary handle for that zoom, extracts the image data out of it and stores it in the provider's cachedImageData

In consequence, cachedImageData contains image data for that zoom which will never be accessed or removed as the current contract (for single-threaded access) ensures that this cache only contains image data for zooms for which the manager does not contain a handle yet and, in case such a handle gets created, the image data is removed from that cache. So the sketched sequence leads to a kind of memory leak.

Furthermore we have an (unsyncronized) map that performs non atomic get-and-set operation what is problematic. This would be fixed by using a ConcurrentHashMap and computeIfAbsent what makes the get-and-set an atomic operation and also ensures that while one thread is currently computing an ImageData another thread arriving after that will wait until this computation is complete and use the cached data.

We could (and maybe even should) synchronize the map, yes. For most use cases it's not necessary as in case a thread sees an inconsistent state (especially missing a handle that has already been created), it will synchronize afterwards, which produces a memory barrier and results in properly updated data (that's why the double-checked locking works). What might go wrong is that you retrieve and access a handle that has already been disposed, as for that case there is no memory barrier involved.

As an additional note: I would be fully in favor of simply using a synchronized collection (which we probably need to use additionally anyway), I currently just don't see how to synchronize the image data creation just based on a synchronized zoom-to-handle collection. That would require some compute method on the collection that doesn ot yield a value of the map (in our case a handle) but something else (in our case an image data object). What might be possible is (mis) use a compute method on the map-to-handle collection and inside that method calculate the image data which is returned via writing a separate AtomicReference (that is local to the calling method). That more feels like a hack but could actually work and relieve us from all the synchronized stuff. @amartya4256 maybe you want to give this a try?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the current contract (for single-threaded access) ensures that this cache only contains image data for zooms for which the manager does not contain a handle yet and, in case such a handle gets created, the image data is removed from that cache.

Thanks, I think I now better understand what one want to handle here. I'll try to look a bit deeper into this the next days but what currently makes it quite hard is that this class is bloated with a lot of nested classes/interfaces that have implicit references to the image object.

Is there any compelling reason not to extract all these into (package private) classes and passing any required pointer to an Image instance explicitly in the constructor? I think this will make not only the code easier to read/understand but also make the responsibilities more visible.

My current impression is, that the distributed caches are actually a problem and should better be handled in a way where we have one Map (indexed by the zoom level) that returns a ImageDataState. This manger would then have (possibly synchronized) methods that acquire image datas and manage the handle and the can easily make sure things are cleared out (either handle or image data) as it only has to cope with a single field of each item instead of these distributed values all over the place. This would then also be more memory efficient (currently we create multiple maps wich each has a capacity of 16 items while likely one won't have that much zoom levels ever stored.

@HeikoKlare
Copy link
Contributor

Looking at the Resource (and its subclasses) I can't find any sign neither in code nor API that it is safe to access these from different threads.

That's true, there is nothing explicitly implemented to ensure thread-safety. However, most of the functionality was effectively thread-safe as after object initialization most of the state was effectively final (in particular the handles themselves), such that without any additional effort all access scenarios that did not modify state worked find in a concurrent access scenario.

@amartya4256 amartya4256 force-pushed the amartya4256/image_concurrency_fix branch 3 times, most recently from eb01fd2 to 9ab5ed9 Compare January 9, 2026 13:42
@amartya4256 amartya4256 force-pushed the amartya4256/image_concurrency_fix branch 2 times, most recently from 10378c8 to 70a4d45 Compare January 12, 2026 11:08
@amartya4256 amartya4256 force-pushed the amartya4256/image_concurrency_fix branch from 70a4d45 to 80fe2d2 Compare January 13, 2026 12:50
@amartya4256 amartya4256 force-pushed the amartya4256/image_concurrency_fix branch 4 times, most recently from b32f676 to 6ebfc14 Compare January 14, 2026 11:13
@amartya4256 amartya4256 force-pushed the amartya4256/image_concurrency_fix branch 3 times, most recently from bdbd1bc to 858ad75 Compare January 14, 2026 15:31
This commit makes the ImageHandle creation and ImageData caching process
synchronous to avoid any racing condition and inefficient memory usages.

# Conflicts:
#	bundles/org.eclipse.swt/Eclipse SWT/win32/org/eclipse/swt/graphics/Image.java
# Conflicts:
#	bundles/org.eclipse.swt/Eclipse SWT/win32/org/eclipse/swt/graphics/Image.java
@amartya4256 amartya4256 force-pushed the amartya4256/image_concurrency_fix branch from 858ad75 to 8dc9a9f Compare January 16, 2026 09:59
Copy link
Contributor

@HeikoKlare HeikoKlare left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code seems to be fine regarding what is supposed to be achieved. In general, the implementations based on the concurrent data structure looks much better/simpler than before. The more complex executeOnHandle call(s) can probably be simplified for better understanding, as currently those calls are quite hard to understand. I made a concrete proposal for that. With that, I think this is the best we can achieve, and we actually don't introduce any relevant cognitive overhead.

});
}

<R> R executeOnHandle(int zoom, Supplier<DestroyableImageHandle> handleCreator, Function<Optional<InternalImageHandle>,R> executable) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seeing the implementation (especially in getImageData()), I wonder if it would not be easier to understand if there was only was lambda that allows to create the ImageHandle as well. Precisely, this method could be something like:

<R> R executeOnHandle(int zoom, Function<AtomicReference<DestroyableImageHandle>,R> executable) {
	AtomicReference<R> computedValue = new AtomicReference<>();
	zoomLevelToImageHandle.compute(zoom, (key, value) -> {
		AtomicReference<DestroyableImageHandle> handleReference = new AtomicReference<>(value);
		computedValue.set(executable.apply(handleReference));
		return handleReference.get();
	});
	return computedValue.get();
}

And then the callers could set the handle reference if they need to create it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Make Image class thread safe

3 participants