Skip to content
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -19,6 +19,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- **`Scope::allowZombies(): Scope`** — opt back into safe disposal on a scope (sets `DISPOSE_SAFELY`); returns `$this` for chaining. Inverse of the existing `asNotSafely()`. Use after `new Scope()` when the scope is expected to outlive coroutines parked in `delay()`/`recv()`/etc., turning them into zombies instead of cancelling them at dispose time.

### Fixed
- **ThreadPool task freed a spawned coroutine's bytecode out from under it (cross-thread snapshot UAF)** — `submit()` deep-copies the task closure and every nested closure into a per-task snapshot arena. The non-coroutine worker freed that arena the instant the task body returned (`thread_pool.c` task_cleanup), but any coroutine the task had `spawn()`ed was still pending in the worker's scheduler — its `op_array` lived in the just-freed arena, so the scheduler ran it against freed memory: `0xC0000005` on the Windows debug heap, latent (and ASAN-detectable) on Linux. Fixed by running each task body under its own per-task `Scope` in NOT-safe mode — a nursery: coroutines the task spawns now land in that scope, and on task exit the worker cancels the scope and awaits its teardown (mirroring `Scope::awaitAfterCancellation` at the C level) before the snapshot is freed, so no spawned coroutine can outlive the snapshot. Un-awaited children are cancelled at task exit, never awaited as zombies. Regression test `tests/thread_pool/065-task_scope_nursery_no_uaf.phpt` (surfaced by true-async/server `core/007-server-transfer-handler-dispatch`). Coroutine-mode `submit` already runs each task in its own scope but still frees the snapshot on the task coroutine's dispose rather than on scope completion — the same latent UAF, tracked as a follow-up.
- **file → socket `sendfile` silently dropped the body on Windows** — `ZEND_ASYNC_IO_SENDFILE` is documented as "Windows `TransmitFile`" (see the API entry under Added), but the implementation routed every platform through `uv_fs_sendfile`, whose Windows backend writes the destination via CRT `_write()`. A TCP destination is a Winsock `SOCKET`, not a CRT file descriptor — three distinct Windows descriptor namespaces (HANDLE / SOCKET / CRT fd) that POSIX collapses into one `int`, which is exactly why `uv_fs_sendfile(out->crt_fd, …)` works on Linux and writes nowhere on Windows. Headers (sent via `uv_write`) arrived; the file body vanished — so any static file above the 64 KiB slurp threshold served an empty/garbage body over plaintext HTTP/1 and HTTP/2. Fixed in `libuv_reactor.c`: `libuv_io_sendfile` now dispatches socket destinations to the real `TransmitFile` — destination socket from `descriptor.socket`, source as a Win32 `HANDLE` via `_get_osfhandle(crt_fd)` (never the conflated `crt_fd`-as-socket value) — run synchronously on a libuv threadpool worker (`uv_queue_work`), exactly as `uv_fs_sendfile` runs its own copy loop off-loop, so completion still arrives on the loop thread through the unchanged `sendfile_complete()` notify path. POSIX is untouched (the new path is entirely under `#ifdef PHP_WIN32`). Regression backstop: true-async/server `tests/phpt/server/static/011-static-precompressed-cache-uaf.phpt` (96 KiB precompressed sidecar over plain TCP).
- **#154 `ThreadPool` worker crash on `exit()`/`die()` in a task or bootloader** — a graceful `exit()`/`die()` inside a submitted task or the bootloader threw an unwind-exit token that the worker either passed to `reject()` or re-raised via `zend_bailout()` — both crash the worker fiber with `"Error transfer requires a throwable value"` (assert + ASAN stack-use-after-return at `zend_fibers.c:491`). Fixed in `thread_pool.c`: the sync task call and the bootloader call now run under a `zend_try`, and the worker checks `zend_is_unwind_exit()`/`zend_is_graceful_exit()` on `EG(exception)`. Behaviour: `exit()`/`die()` in a **task** is graceful "this task is done" — the worker's request survives it, so the task's future resolves to **`null`** and the worker keeps serving subsequent tasks (verified mixing `exit()` with throwing and normal tasks on a single worker). A real fatal error (e.g. OOM `zend_bailout`) or `exit()`/`die()` in the **bootloader** instead delivers an `Async\ThreadTransferException` to pending awaiters and tears the pool down — the worker can't safely continue. Regression tests `tests/thread_pool/061-bootloader_exit.phpt`, `062-task_exit_sync.phpt`, `064-task_exit_worker_survives.phpt`.
- **#154 `ThreadPool` swallowed the real error when a `$this`-bound bootloader could not load on the worker** — a bootloader bound to `$this` of a class not defined on the worker (or whose body threw) had its exception converted to an `E_WARNING` and discarded, while pending tasks were rejected with a generic `"task was cancelled before execution"` cancellation — hiding the cause. `thread_pool_drain_tasks` gained a `reject_with` parameter so the worker now propagates the actual error (e.g. `Cannot load transferred object: class "C" not found`, or the thrown exception) to every awaiter. Regression tests `tests/thread_pool/060-bootloader_this_transfer_error.phpt`, `063-bootloader_exception.phpt`.
Expand Down
46 changes: 46 additions & 0 deletions tests/thread_pool/065-task_scope_nursery_no_uaf.phpt
Original file line number Diff line number Diff line change
@@ -0,0 +1,46 @@
--TEST--
ThreadPool: task scope is a nursery — un-awaited spawned coroutines are cancelled at task exit (snapshot UAF regression)
--SKIPIF--
<?php
if (!PHP_ZTS) die('skip ZTS required');
if (!class_exists('Async\ThreadPool')) die('skip ThreadPool not available');
?>
--FILE--
<?php
/*
* Regression: a ThreadPool task deep-copies its closure (and every nested
* closure) into a per-task snapshot arena. Before the fix the worker freed
* that arena the moment the task body returned — while a coroutine the task
* had spawned was still pending in the scheduler. The pending coroutine's
* op_array lived in the just-freed arena, so running it dereferenced freed
* memory (hard crash on the Windows debug heap; latent + ASAN-caught on
* Linux).
*
* The fix runs each task body under its own per-task Scope in NOT-safe mode
* (a nursery): coroutines the task spawns land in that scope, and on task
* exit the scope is cancelled and awaited before the snapshot is freed — so
* no spawned coroutine can outlive the snapshot. An un-awaited child is
* therefore cancelled at task exit and never runs.
*/
use Async\ThreadPool;
use function Async\spawn;
use function Async\await;

$pool = new ThreadPool(1);

$f = $pool->submit(function () {
/* Un-awaited child — still pending when the task body returns. Before
* the fix this crashed the worker (use-after-free of the freed snapshot
* arena); now the task's nursery scope cancels it at exit, so its body
* ("child") never runs and nothing is freed out from under it. */
spawn(function () { echo "child\n"; });
return 'task-done';
});

var_dump(await($f));

$pool->close();
echo "ok\n";
--EXPECT--
string(9) "task-done"
ok
78 changes: 78 additions & 0 deletions thread_pool.c
Original file line number Diff line number Diff line change
Expand Up @@ -17,6 +17,7 @@
#include "thread_pool.h"
#include "thread_pool_arginfo.h"
#include "thread.h"
#include "scope.h"
#include "async_API.h"
#include "exceptions.h"
#include "php_async.h"
Expand Down Expand Up @@ -338,6 +339,13 @@ static void thread_pool_worker_handler(zend_async_thread_event_t *event, void *c
ZVAL_UNDEF(&retval);
ZVAL_UNDEF(&callable);

/* Per-task nursery Scope (structured concurrency); used by the
* non-coroutine run path below. Declared up here so the early
* error gotos to task_cleanup never jump past an initializer. */
zend_async_scope_t *task_scope = NULL;
zend_async_scope_t *task_saved_scope = NULL;
zend_coroutine_t *task_worker_coro = NULL;

async_thread_create_closure(&snapshot->entry, &callable);

if (UNEXPECTED(EG(exception))) {
Expand Down Expand Up @@ -433,6 +441,26 @@ static void thread_pool_worker_handler(zend_async_thread_event_t *event, void *c
goto task_cleanup;
}

/* Structured concurrency: run the task body under its own
* per-task Scope in NOT-safe mode (a nursery). Async\spawn uses
* the current coroutine's scope, so coroutines the task spawns
* land in task_scope. Temporarily point the worker coroutine's
* scope at task_scope around the call; the worker itself is NOT a
* member of task_scope (never spawned into it), so it can await it
* below without tripping the same-scope deadlock guard. */
task_worker_coro = ZEND_ASYNC_CURRENT_COROUTINE;
task_scope = ZEND_ASYNC_NEW_SCOPE(ZEND_ASYNC_CURRENT_SCOPE);
if (EXPECTED(task_scope != NULL && task_worker_coro != NULL)) {
ZEND_ASYNC_SCOPE_CLR_DISPOSE_SAFELY(task_scope); /* not-safe */
/* Pin the scope so it survives the cancel/await below: when the
* last child finishes mid-SUSPEND the scope would otherwise
* try_to_dispose and free itself out from under us. Same pattern
* as pool_scope. Cleared right before RELEASE. */
ZEND_ASYNC_SCOPE_SET_OWNER_PINNED(task_scope);
task_saved_scope = task_worker_coro->scope;
task_worker_coro->scope = task_scope;
}

/* A real fatal error longjmps (zend_bailout); exit()/die() instead
* throws an unwind-exit token into EG(exception). Neither may be
* re-raised or passed to reject() — the token can't cross the fiber
Expand All @@ -444,6 +472,56 @@ static void thread_pool_worker_handler(zend_async_thread_event_t *event, void *c
task_bailed = true;
} zend_end_try();

/* Task body returned. Restore the worker's own scope, then cancel
* any coroutines the task spawned but left running and await their
* teardown BEFORE the snapshot arena (which backs every spawned
* closure's op_array) is freed at task_cleanup — so no child
* outlives the snapshot. NOT-safe cancel = stragglers/zombies are
* cancelled, never awaited forever (bounded by the runtime grace).
* Mirrors Scope::awaitAfterCancellation at the C level. */
if (task_scope != NULL && task_worker_coro != NULL) {
task_worker_coro->scope = task_saved_scope;

/* If every child already ran to completion during the task body
* (e.g. the body ran an event loop), the scope is already closed
* and its coroutines disposed — nothing to cancel or await, and
* touching the terminated scope event would throw. Only engage
* the nursery teardown while the scope is still open. */
if (!ZEND_ASYNC_SCOPE_IS_CLOSED(task_scope)) {
ZEND_ASYNC_SCOPE_CANCEL(task_scope, NULL, false, false);

/* Await until every coroutine the task spawned has PHYSICALLY
* disposed — not merely until the scope reports "completed".
* IS_COMPLETELY_DONE (can_be_disposed with_zombies=true) flips
* true the instant the children are cancelled, while their
* coroutine objects (and the op_arrays they hold, which live
* in the snapshot arena) are still queued for disposal on a
* later scheduler tick. Gate on the live coroutine/child-scope
* count instead — mirrors Scope::awaitAfterCancellation — so
* the SUSPEND lets each child run to full disposal (op_array
* freed) while the arena is still alive; cooperative scheduling
* guarantees they finish before we resume and free it. */
if (!ZEND_ASYNC_SCOPE_IS_CLOSED(task_scope)
&& (((async_scope_t *) task_scope)->coroutines.length != 0
|| task_scope->scopes.length != 0)
&& EXPECTED(EG(exception) == NULL)) {
ZEND_ASYNC_WAKER_NEW(task_worker_coro);
if (EXPECTED(EG(exception) == NULL)) {
zend_async_resume_when(task_worker_coro, &task_scope->event,
false, zend_async_waker_callback_resolve, NULL);
if (EXPECTED(EG(exception) == NULL)) {
ZEND_ASYNC_SUSPEND();
}
zend_async_waker_clean(task_worker_coro);
}
}
}

ZEND_ASYNC_SCOPE_CLR_OWNER_PINNED(task_scope);
ZEND_ASYNC_SCOPE_RELEASE(task_scope);
task_scope = NULL;
}

/* Decrement running and bump completed BEFORE notifying the awaiter
* via complete/reject — otherwise a coroutine waking from await()
* would observe stale running_count and a missing completed bump. */
Expand Down
Loading