[dylink] Fix deadlock in _emscripten_dlsync_threads()#27000
Conversation
| DBG("waking main runtime thread using _emscripten_thread_notify"); | ||
| if (ret) { | ||
| // Wake up the target thread in case it is blocked in | ||
| // `emscripten_futex_wait`. |
There was a problem hiding this comment.
This is little confusing (and maybe wasteful) though since normal pthread do not run their proxy queue in emscripten_yield, and therefore wakeing them them up should not be needed.
pthreads do however call _emscripten_process_dlopen_queue from emscripten_yield .. so it should only be necessary to wake them up if/when need need to call _emscripten_process_dlopen_queue, and not whereever work gets proxied to them.
There was a problem hiding this comment.
I'm not sure if _emscripten_thread_notify() is actually that expensive, since it already returns early when the thread isn't waiting.
I tried limiting this notify to the dlopen() proxying queue with commit 72cb72a, but reintroducing && pthread_equal(target_thread, emscripten_main_runtime_thread_id()) regressed this again.
There was a problem hiding this comment.
pthreads do however call _emscripten_process_dlopen_queue from
emscripten_yield[...]
I think that's probably why reintroducing && pthread_equal(target_thread, emscripten_main_runtime_thread_id()) caused this regression again, as the queue is processed from pthreads rather than from the main runtime thread.
There was a problem hiding this comment.
I'm not really worried about the code of doing the _emscripten_thread_notify. But I would really like to know why it is needed. Because pthreads don't run their messge queue on wakeup I don't see any reason to wake them when we put something in the queue.
pthreads should only need to wakeup if they are need to process the dlopen queue, right?
This extends commit 662cb06 by ensuring `_emscripten_thread_notify()` is also called for the synchronous `_emscripten_dlsync_threads()` path. This ensures that threads process the dlopen catch-up queue promptly, even when blocked in `emscripten_futex_wait()`. To implement this cleanly, `dlopen_proxying_queue` is moved from a dynamic pointer in `dynlink.c` to a statically initialized queue in `proxying.c`. Fixes: emscripten-core#26913.
This is an automatic change generated by tools/maint/rebaseline_tests.py. The following (2) test expectation files were updated by running the tests with `--rebaseline`: ``` codesize/test_codesize_minimal_pthreads.json: 26180 => 26179 [-1 bytes / -0.00%] codesize/test_codesize_minimal_pthreads_memgrowth.json: 26589 => 26588 [-1 bytes / -0.00%] Average change: -0.00% (-0.00% - -0.00%) ```
72cb72a to
d7be9f4
Compare
_emscripten_dlsync_threads() path_emscripten_dlsync_threads()
This is an automatic change generated by tools/maint/rebaseline_tests.py. The following (2) test expectation files were updated by running the tests with `--rebaseline`: ``` codesize/test_codesize_minimal_pthreads.json: 26158 => 26157 [-1 bytes / -0.00%] codesize/test_codesize_minimal_pthreads_memgrowth.json: 26567 => 26566 [-1 bytes / -0.00%] Average change: -0.00% (-0.00% - -0.00%) ```
|
I'd really like to get this fixed for the 6.0.0 release, but I'd also love to understand the issue better rather then just always waking pthreads when they we add a message to their queue (which still seem like it should not be needed in theory). |
This extends commit 662cb06 by ensuring
_emscripten_thread_notify()is also called for the synchronous
_emscripten_dlsync_threads()path. This ensures that threads process the dlopen catch-up queue
promptly, even when blocked in
emscripten_futex_wait().To implement this cleanly,
dlopen_proxying_queueis moved from adynamic pointer in
dynlink.cto a statically initialized queue inproxying.c.Fixes: #26913.