From 9b3ea1b856eea03496c236bb9f57517268fb70c4 Mon Sep 17 00:00:00 2001 From: Edmond <1571649+EdmondDantes@users.noreply.github.com> Date: Sat, 6 Jun 2026 18:59:35 +0300 Subject: [PATCH 1/3] fix(threadpool): run each task body in a per-task nursery Scope (snapshot UAF) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit submit() deep-copies the task closure and every nested closure into a per-task snapshot arena. The non-coroutine worker freed that arena the instant the task body returned, but any coroutine the task had spawn()ed was still pending in the scheduler — its op_array lived in the just-freed arena, so the scheduler ran it against freed memory: 0xC0000005 on the Windows debug heap, latent (ASAN-detectable) on Linux. Fix: run each task body under its own per-task Scope in NOT-safe mode (a nursery). Async\spawn uses the current coroutine's scope, so coroutines the task spawns now land in task_scope. On task exit the worker restores its own scope, cancels task_scope and awaits its teardown (cancel + waker-resume on the scope event + SUSPEND, mirroring Scope::awaitAfterCancellation at the C level) BEFORE the snapshot is freed — so no spawned coroutine can outlive the snapshot. NOT-safe cancel means un-awaited children are cancelled at task exit, never awaited as zombies (bounded by the runtime grace). Verified on Windows: true-async/server core/007-server-transfer-handler- dispatch goes red->green; a minimal repro (ThreadPool task that spawns an un-awaited coroutine) no longer crashes; ext/async thread_pool suite 65 pass / 0 fail (incl. the #146/#154 lifetime tests). Regression test tests/thread_pool/065-task_scope_nursery_no_uaf.phpt. Follow-up: coroutine-mode submit already runs each task in its own scope but frees the snapshot on the task coroutine's dispose rather than on scope completion — same latent UAF, not yet fixed here. --- CHANGELOG.md | 1 + .../065-task_scope_nursery_no_uaf.phpt | 46 +++++++++++++++++ thread_pool.c | 49 +++++++++++++++++++ 3 files changed, 96 insertions(+) create mode 100644 tests/thread_pool/065-task_scope_nursery_no_uaf.phpt diff --git a/CHANGELOG.md b/CHANGELOG.md index ae03f0c..26d2a2a 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -19,6 +19,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 - **`Scope::allowZombies(): Scope`** — opt back into safe disposal on a scope (sets `DISPOSE_SAFELY`); returns `$this` for chaining. Inverse of the existing `asNotSafely()`. Use after `new Scope()` when the scope is expected to outlive coroutines parked in `delay()`/`recv()`/etc., turning them into zombies instead of cancelling them at dispose time. ### Fixed +- **ThreadPool task freed a spawned coroutine's bytecode out from under it (cross-thread snapshot UAF)** — `submit()` deep-copies the task closure and every nested closure into a per-task snapshot arena. The non-coroutine worker freed that arena the instant the task body returned (`thread_pool.c` task_cleanup), but any coroutine the task had `spawn()`ed was still pending in the worker's scheduler — its `op_array` lived in the just-freed arena, so the scheduler ran it against freed memory: `0xC0000005` on the Windows debug heap, latent (and ASAN-detectable) on Linux. Fixed by running each task body under its own per-task `Scope` in NOT-safe mode — a nursery: coroutines the task spawns now land in that scope, and on task exit the worker cancels the scope and awaits its teardown (mirroring `Scope::awaitAfterCancellation` at the C level) before the snapshot is freed, so no spawned coroutine can outlive the snapshot. Un-awaited children are cancelled at task exit, never awaited as zombies. Regression test `tests/thread_pool/065-task_scope_nursery_no_uaf.phpt` (surfaced by true-async/server `core/007-server-transfer-handler-dispatch`). Coroutine-mode `submit` already runs each task in its own scope but still frees the snapshot on the task coroutine's dispose rather than on scope completion — the same latent UAF, tracked as a follow-up. - **file → socket `sendfile` silently dropped the body on Windows** — `ZEND_ASYNC_IO_SENDFILE` is documented as "Windows `TransmitFile`" (see the API entry under Added), but the implementation routed every platform through `uv_fs_sendfile`, whose Windows backend writes the destination via CRT `_write()`. A TCP destination is a Winsock `SOCKET`, not a CRT file descriptor — three distinct Windows descriptor namespaces (HANDLE / SOCKET / CRT fd) that POSIX collapses into one `int`, which is exactly why `uv_fs_sendfile(out->crt_fd, …)` works on Linux and writes nowhere on Windows. Headers (sent via `uv_write`) arrived; the file body vanished — so any static file above the 64 KiB slurp threshold served an empty/garbage body over plaintext HTTP/1 and HTTP/2. Fixed in `libuv_reactor.c`: `libuv_io_sendfile` now dispatches socket destinations to the real `TransmitFile` — destination socket from `descriptor.socket`, source as a Win32 `HANDLE` via `_get_osfhandle(crt_fd)` (never the conflated `crt_fd`-as-socket value) — run synchronously on a libuv threadpool worker (`uv_queue_work`), exactly as `uv_fs_sendfile` runs its own copy loop off-loop, so completion still arrives on the loop thread through the unchanged `sendfile_complete()` notify path. POSIX is untouched (the new path is entirely under `#ifdef PHP_WIN32`). Regression backstop: true-async/server `tests/phpt/server/static/011-static-precompressed-cache-uaf.phpt` (96 KiB precompressed sidecar over plain TCP). - **#154 `ThreadPool` worker crash on `exit()`/`die()` in a task or bootloader** — a graceful `exit()`/`die()` inside a submitted task or the bootloader threw an unwind-exit token that the worker either passed to `reject()` or re-raised via `zend_bailout()` — both crash the worker fiber with `"Error transfer requires a throwable value"` (assert + ASAN stack-use-after-return at `zend_fibers.c:491`). Fixed in `thread_pool.c`: the sync task call and the bootloader call now run under a `zend_try`, and the worker checks `zend_is_unwind_exit()`/`zend_is_graceful_exit()` on `EG(exception)`. Behaviour: `exit()`/`die()` in a **task** is graceful "this task is done" — the worker's request survives it, so the task's future resolves to **`null`** and the worker keeps serving subsequent tasks (verified mixing `exit()` with throwing and normal tasks on a single worker). A real fatal error (e.g. OOM `zend_bailout`) or `exit()`/`die()` in the **bootloader** instead delivers an `Async\ThreadTransferException` to pending awaiters and tears the pool down — the worker can't safely continue. Regression tests `tests/thread_pool/061-bootloader_exit.phpt`, `062-task_exit_sync.phpt`, `064-task_exit_worker_survives.phpt`. - **#154 `ThreadPool` swallowed the real error when a `$this`-bound bootloader could not load on the worker** — a bootloader bound to `$this` of a class not defined on the worker (or whose body threw) had its exception converted to an `E_WARNING` and discarded, while pending tasks were rejected with a generic `"task was cancelled before execution"` cancellation — hiding the cause. `thread_pool_drain_tasks` gained a `reject_with` parameter so the worker now propagates the actual error (e.g. `Cannot load transferred object: class "C" not found`, or the thrown exception) to every awaiter. Regression tests `tests/thread_pool/060-bootloader_this_transfer_error.phpt`, `063-bootloader_exception.phpt`. diff --git a/tests/thread_pool/065-task_scope_nursery_no_uaf.phpt b/tests/thread_pool/065-task_scope_nursery_no_uaf.phpt new file mode 100644 index 0000000..d06e095 --- /dev/null +++ b/tests/thread_pool/065-task_scope_nursery_no_uaf.phpt @@ -0,0 +1,46 @@ +--TEST-- +ThreadPool: task scope is a nursery — un-awaited spawned coroutines are cancelled at task exit (snapshot UAF regression) +--SKIPIF-- + +--FILE-- +submit(function () { + /* Un-awaited child — still pending when the task body returns. Before + * the fix this crashed the worker (use-after-free of the freed snapshot + * arena); now the task's nursery scope cancels it at exit, so its body + * ("child") never runs and nothing is freed out from under it. */ + spawn(function () { echo "child\n"; }); + return 'task-done'; +}); + +var_dump(await($f)); + +$pool->close(); +echo "ok\n"; +--EXPECT-- +string(9) "task-done" +ok diff --git a/thread_pool.c b/thread_pool.c index 0c23cbd..83d3040 100644 --- a/thread_pool.c +++ b/thread_pool.c @@ -338,6 +338,13 @@ static void thread_pool_worker_handler(zend_async_thread_event_t *event, void *c ZVAL_UNDEF(&retval); ZVAL_UNDEF(&callable); + /* Per-task nursery Scope (structured concurrency); used by the + * non-coroutine run path below. Declared up here so the early + * error gotos to task_cleanup never jump past an initializer. */ + zend_async_scope_t *task_scope = NULL; + zend_async_scope_t *task_saved_scope = NULL; + zend_coroutine_t *task_worker_coro = NULL; + async_thread_create_closure(&snapshot->entry, &callable); if (UNEXPECTED(EG(exception))) { @@ -433,6 +440,21 @@ static void thread_pool_worker_handler(zend_async_thread_event_t *event, void *c goto task_cleanup; } + /* Structured concurrency: run the task body under its own + * per-task Scope in NOT-safe mode (a nursery). Async\spawn uses + * the current coroutine's scope, so coroutines the task spawns + * land in task_scope. Temporarily point the worker coroutine's + * scope at task_scope around the call; the worker itself is NOT a + * member of task_scope (never spawned into it), so it can await it + * below without tripping the same-scope deadlock guard. */ + task_worker_coro = ZEND_ASYNC_CURRENT_COROUTINE; + task_scope = ZEND_ASYNC_NEW_SCOPE(ZEND_ASYNC_CURRENT_SCOPE); + if (EXPECTED(task_scope != NULL && task_worker_coro != NULL)) { + ZEND_ASYNC_SCOPE_CLR_DISPOSE_SAFELY(task_scope); /* not-safe */ + task_saved_scope = task_worker_coro->scope; + task_worker_coro->scope = task_scope; + } + /* A real fatal error longjmps (zend_bailout); exit()/die() instead * throws an unwind-exit token into EG(exception). Neither may be * re-raised or passed to reject() — the token can't cross the fiber @@ -444,6 +466,33 @@ static void thread_pool_worker_handler(zend_async_thread_event_t *event, void *c task_bailed = true; } zend_end_try(); + /* Task body returned. Restore the worker's own scope, then cancel + * any coroutines the task spawned but left running and await their + * teardown BEFORE the snapshot arena (which backs every spawned + * closure's op_array) is freed at task_cleanup — so no child + * outlives the snapshot. NOT-safe cancel = stragglers/zombies are + * cancelled, never awaited forever (bounded by the runtime grace). + * Mirrors Scope::awaitAfterCancellation at the C level. */ + if (task_scope != NULL && task_worker_coro != NULL) { + task_worker_coro->scope = task_saved_scope; + ZEND_ASYNC_SCOPE_CANCEL(task_scope, NULL, false, false); + + if (!ZEND_ASYNC_SCOPE_IS_COMPLETELY_DONE(task_scope)) { + ZEND_ASYNC_WAKER_NEW(task_worker_coro); + if (EXPECTED(EG(exception) == NULL)) { + zend_async_resume_when(task_worker_coro, &task_scope->event, + false, zend_async_waker_callback_resolve, NULL); + if (EXPECTED(EG(exception) == NULL)) { + ZEND_ASYNC_SUSPEND(); + } + zend_async_waker_clean(task_worker_coro); + } + } + + ZEND_ASYNC_SCOPE_RELEASE(task_scope); + task_scope = NULL; + } + /* Decrement running and bump completed BEFORE notifying the awaiter * via complete/reject — otherwise a coroutine waking from await() * would observe stale running_count and a missing completed bump. */ From 765d3073861fcfe6d43e587c826b398be8ac4616 Mon Sep 17 00:00:00 2001 From: Edmond <1571649+EdmondDantes@users.noreply.github.com> Date: Sat, 6 Jun 2026 19:32:24 +0300 Subject: [PATCH 2/3] fix(threadpool): await physical child disposal (IS_CLOSED guard + coroutine count) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The first cut gated the post-cancel await on IS_COMPLETELY_DONE, which (can_be_disposed with_zombies=true) flips true the moment children are cancelled — while their coroutine objects, holding op_arrays that live in the snapshot arena, are still queued for disposal on a later scheduler tick. The worker skipped the await and freed the arena; the deferred coroutine_dispose then ran destroy_op_array against freed memory (ASAN heap-use-after-free on Linux; masked by timing on the Windows debug heap). It also called resume_when on an already-terminated scope event when the task body had run its own children to completion (core/007: "event cannot be used after it has been terminated"). Mirror Scope::awaitAfterCancellation: skip when the scope is already CLOSED (children disposed, event gone), otherwise await while live coroutines or child scopes remain (async_scope_t.coroutines.length). The SUSPEND now lets each child run to full disposal while the arena is alive; cooperative scheduling guarantees they finish before the worker resumes and frees it. core/007 green 3/3; ext/async thread_pool 66/0. --- thread_pool.c | 39 +++++++++++++++++++++++++++++++-------- 1 file changed, 31 insertions(+), 8 deletions(-) diff --git a/thread_pool.c b/thread_pool.c index 83d3040..57db909 100644 --- a/thread_pool.c +++ b/thread_pool.c @@ -17,6 +17,7 @@ #include "thread_pool.h" #include "thread_pool_arginfo.h" #include "thread.h" +#include "scope.h" #include "async_API.h" #include "exceptions.h" #include "php_async.h" @@ -475,17 +476,39 @@ static void thread_pool_worker_handler(zend_async_thread_event_t *event, void *c * Mirrors Scope::awaitAfterCancellation at the C level. */ if (task_scope != NULL && task_worker_coro != NULL) { task_worker_coro->scope = task_saved_scope; - ZEND_ASYNC_SCOPE_CANCEL(task_scope, NULL, false, false); - if (!ZEND_ASYNC_SCOPE_IS_COMPLETELY_DONE(task_scope)) { - ZEND_ASYNC_WAKER_NEW(task_worker_coro); - if (EXPECTED(EG(exception) == NULL)) { - zend_async_resume_when(task_worker_coro, &task_scope->event, - false, zend_async_waker_callback_resolve, NULL); + /* If every child already ran to completion during the task body + * (e.g. the body ran an event loop), the scope is already closed + * and its coroutines disposed — nothing to cancel or await, and + * touching the terminated scope event would throw. Only engage + * the nursery teardown while the scope is still open. */ + if (!ZEND_ASYNC_SCOPE_IS_CLOSED(task_scope)) { + ZEND_ASYNC_SCOPE_CANCEL(task_scope, NULL, false, false); + + /* Await until every coroutine the task spawned has PHYSICALLY + * disposed — not merely until the scope reports "completed". + * IS_COMPLETELY_DONE (can_be_disposed with_zombies=true) flips + * true the instant the children are cancelled, while their + * coroutine objects (and the op_arrays they hold, which live + * in the snapshot arena) are still queued for disposal on a + * later scheduler tick. Gate on the live coroutine/child-scope + * count instead — mirrors Scope::awaitAfterCancellation — so + * the SUSPEND lets each child run to full disposal (op_array + * freed) while the arena is still alive; cooperative scheduling + * guarantees they finish before we resume and free it. */ + if (!ZEND_ASYNC_SCOPE_IS_CLOSED(task_scope) + && (((async_scope_t *) task_scope)->coroutines.length != 0 + || task_scope->scopes.length != 0) + && EXPECTED(EG(exception) == NULL)) { + ZEND_ASYNC_WAKER_NEW(task_worker_coro); if (EXPECTED(EG(exception) == NULL)) { - ZEND_ASYNC_SUSPEND(); + zend_async_resume_when(task_worker_coro, &task_scope->event, + false, zend_async_waker_callback_resolve, NULL); + if (EXPECTED(EG(exception) == NULL)) { + ZEND_ASYNC_SUSPEND(); + } + zend_async_waker_clean(task_worker_coro); } - zend_async_waker_clean(task_worker_coro); } } From 9ba383e42590d98d8fdf011cd84c01a221ee2c16 Mon Sep 17 00:00:00 2001 From: Edmond <1571649+EdmondDantes@users.noreply.github.com> Date: Sat, 6 Jun 2026 19:57:33 +0300 Subject: [PATCH 3/3] fix(threadpool): pin task scope across cancel/await so it can't self-free MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit ASAN caught a second UAF — this one on the task scope itself. When the last child coroutine finishes during the worker's SUSPEND, async_scope_notify_ coroutine_finished -> scope_try_to_dispose -> scope_dispose frees the scope; the worker then resumes and touches the freed scope (CLR_OWNER_PINNED / RELEASE) -> heap-use-after-free at thread_pool.c. Pin the scope with ZEND_ASYNC_SCOPE_SET_OWNER_PINNED before the cancel/await (same pattern pool_scope uses to outlive its tasks) and clear it right before RELEASE, so the scope survives the await and the worker owns its disposal. core/007 green 3/3; ext/async thread_pool 66/0. --- thread_pool.c | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/thread_pool.c b/thread_pool.c index 57db909..f978250 100644 --- a/thread_pool.c +++ b/thread_pool.c @@ -452,6 +452,11 @@ static void thread_pool_worker_handler(zend_async_thread_event_t *event, void *c task_scope = ZEND_ASYNC_NEW_SCOPE(ZEND_ASYNC_CURRENT_SCOPE); if (EXPECTED(task_scope != NULL && task_worker_coro != NULL)) { ZEND_ASYNC_SCOPE_CLR_DISPOSE_SAFELY(task_scope); /* not-safe */ + /* Pin the scope so it survives the cancel/await below: when the + * last child finishes mid-SUSPEND the scope would otherwise + * try_to_dispose and free itself out from under us. Same pattern + * as pool_scope. Cleared right before RELEASE. */ + ZEND_ASYNC_SCOPE_SET_OWNER_PINNED(task_scope); task_saved_scope = task_worker_coro->scope; task_worker_coro->scope = task_scope; } @@ -512,6 +517,7 @@ static void thread_pool_worker_handler(zend_async_thread_event_t *event, void *c } } + ZEND_ASYNC_SCOPE_CLR_OWNER_PINNED(task_scope); ZEND_ASYNC_SCOPE_RELEASE(task_scope); task_scope = NULL; }