From b07c1af637d1bc3fcd4100bb822b02c7150848c6 Mon Sep 17 00:00:00 2001
From: Matthew Parkinson
Date: Wed, 6 May 2026 16:51:00 +0100
Subject: [PATCH 1/2] Add corruption-detection test for probabilistic
 mitigations

Adds a new functional test, func/corruption_detection, that validates
snmalloc's probabilistic memory-safety mitigations actually fire on the
corruption patterns they are designed to catch. Without it, regressions
that silently weakened a mitigation would be invisible to the existing
suite, since every other test exercises only the non-failing arm of the
integrity checks.

Each scenario runs in a forked child so the expected abort does not kill
the harness. Detection is reported as the child being killed by
SIGABRT/SIGSEGV/SIGBUS/SIGILL; a clean exit means the corruption went
undetected and the test fails.

Six scenarios are covered, spanning the local-thread, remote-thread and
large-allocation paths:

* double_free - small alloc, two local frees of the same slot. Detected
  by freelist_backward_edge when the resulting cycle is later traversed.

* uaf_freelist - small alloc, free, then write garbage into the freed
  slot's first two words (the obfuscated next/prev). Detected by
  check_prev on the next freelist consumption.

* oob_into_neighbor - tiny allocs, free even slots, overrun from an odd
  live slot into freed neighbours. Detected by check_prev when the
  neighbour is later allocated.

* remote_double_free - small alloc, free locally, then free again from a
  different thread (the second free travels via the remote message
  queue). Detected as !meta->is_unused() in the dealloc path.

* remote_uaf - small alloc, free via a different thread, then write
  garbage through the dangling pointer while the slot sits on the owning
  allocator's pending-remote queue. Detected by check_prev during
  handle_message_queue_slow's drain - a code path no other test
  exercises.
* large_double_free - allocation larger than any small sizeclass
  (handled by the chunk allocator and per-chunk metadata rather than
  the slab freelist), freed twice. Detected as !meta->is_unused() in
  the large-dealloc path.

The test is Linux-only (uses fork()/waitpid()) and is a no-op when
SNMALLOC_CHECK_CLIENT is not defined, since the mitigations it relies
on are then compiled out.

The test is also instrumented to cooperate with clang source-based
coverage: the forked child re-resolves LLVM_PROFILE_FILE with its own
pid (the parent's %p expansion is otherwise inherited and all children
would write to the same file) and a signal handler flushes .profraw
before re-raising the fatal signal. The runtime entry points are
declared as weak symbols so the test still links in non-coverage
builds.

Picked up automatically by make_tests so it runs as both
func-corruption_detection-fast and func-corruption_detection-check;
the fast variant immediately exits with the "skip" message because the
mitigations are off.
---
 .../corruption_detection.cc | 401 ++++++++++++++++++
 1 file changed, 401 insertions(+)
 create mode 100644 src/test/func/corruption_detection/corruption_detection.cc

diff --git a/src/test/func/corruption_detection/corruption_detection.cc b/src/test/func/corruption_detection/corruption_detection.cc
new file mode 100644
index 000000000..4410ca78c
--- /dev/null
+++ b/src/test/func/corruption_detection/corruption_detection.cc
@@ -0,0 +1,401 @@
+/**
+ * Tests that snmalloc's probabilistic mitigations detect several
+ * classes of memory-safety violation:
+ *
+ * - double-free of a small allocation (local-thread path),
+ * - use-after-free that corrupts the intra-slab free list,
+ * - out-of-bounds writes that spill into a freed neighbour,
+ * - double-free crossing thread boundaries (the second free goes
+ *   down the remote-message-queue path),
+ * - use-after-free of a slot that has been freed remotely (i.e.
+ *   written through the dangling pointer while the slot sits on
+ *   the owning allocator's pending-remote queue),
+ * - double-free of a large allocation that does not fit in any
+ *   small sizeclass and is therefore handled by the chunk
+ *   allocator and metadata path rather than the slab free list.
+ *
+ * snmalloc detects free-list corruption by checking the integrity
+ * of the obfuscated forward and backward edges of the intra-slab
+ * free list when the list is later consumed (allocated from), and
+ * detects double-free of large allocations by inspecting the
+ * per-chunk metadata. Detection is therefore probabilistic per
+ * round, but deterministic at the scale used here: each scenario
+ * performs many rounds across many slabs, and at least one of them
+ * is overwhelmingly likely to traverse the corrupted edge or hit
+ * the metadata check before the test would otherwise complete.
+ *
+ * Each scenario runs in a forked child so that the expected abort
+ * does not kill the test harness. Detection is reported as
+ * `WIFSIGNALED && WTERMSIG ∈ {SIGABRT, SIGSEGV, SIGBUS, SIGILL}`;
+ * a clean exit means the corruption was *not* detected and the test
+ * fails.
+ *
+ * The test is Linux-only (uses `fork()`/`waitpid()`). It is a no-op
+ * when `SNMALLOC_CHECK_CLIENT` is not defined, because none of the
+ * mitigations these tests rely on are compiled in.
+ */
+
+#include <snmalloc/snmalloc.h>
+#include <test/setup.h>
+#include <cstdint>
+#include <cstdio>
+#include <cstdlib>
+#include <cstring>
+#include <thread>
+
+#if defined(__linux__)
+# include <signal.h>
+# include <sys/wait.h>
+# include <unistd.h>
+#endif
+
+using namespace snmalloc;
+
+// Forward declarations of clang's source-based-coverage runtime
+// entry points. Declared as weak symbols so the test still links
+// against builds without `-fprofile-instr-generate -fcoverage-mapping`.
+//
+// `__llvm_profile_set_filename` is needed because the LLVM profile
+// runtime resolves `%p` in `LLVM_PROFILE_FILE` exactly once at
+// startup.
+// Forked children inherit the parent's resolved filename
+// and so all write to the same file, overwriting each other. Each
+// child has to set its own filename (with its own pid) before
+// calling `__llvm_profile_write_file`.
+extern "C" int __llvm_profile_write_file(void) __attribute__((weak));
+extern "C" void __llvm_profile_set_filename(const char*) __attribute__((weak));
+
+namespace
+{
+  // Per-scenario knobs. ROUNDS amplifies the per-round detection
+  // probability; N is the number of objects allocated per round; SIZE
+  // picks a small sizeclass so a few KiB of slab is exercised per
+  // round.
+  constexpr size_t ROUNDS = 1024;
+  constexpr size_t N = 64;
+  constexpr size_t SMALL_SIZE = 32;
+  constexpr size_t TINY_SIZE = 16;
+  // Cross-thread scenarios use fewer rounds; each round pays a
+  // thread-create/join cost and we still need only one detection.
+  constexpr size_t REMOTE_ROUNDS = 64;
+  // A size that is guaranteed to fall outside every small sizeclass
+  // and therefore exercises the chunk-allocator/metadata dealloc
+  // path rather than the slab free list.
+  constexpr size_t LARGE_SIZE = MIN_CHUNK_SIZE * 4;
+
+  void try_double_free()
+  {
+    for (size_t r = 0; r < ROUNDS; r++)
+    {
+      void* ps[N];
+      for (size_t i = 0; i < N; i++)
+        ps[i] = snmalloc::alloc(SMALL_SIZE);
+
+      // Double-free a single slot. With sanity_checks, the second
+      // dealloc may fire immediately. With freelist_backward_edge
+      // alone, the resulting cycle in the doubly-linked free list is
+      // detected when the list is later traversed.
+      snmalloc::dealloc(ps[N / 2]);
+      snmalloc::dealloc(ps[N / 2]);
+
+      // Free the rest (skipping the double-freed slot to avoid
+      // freeing an unrelated live allocation that happens to have
+      // been handed out from the same address) and reallocate to
+      // drive freelist consumption.
+      for (size_t i = 0; i < N; i++)
+        if (i != N / 2)
+          snmalloc::dealloc(ps[i]);
+
+      for (size_t i = 0; i < N; i++)
+        ps[i] = snmalloc::alloc(SMALL_SIZE);
+      for (size_t i = 0; i < N; i++)
+        snmalloc::dealloc(ps[i]);
+    }
+  }
+
+  void try_uaf_freelist_corruption()
+  {
+    for (size_t r = 0; r < ROUNDS; r++)
+    {
+      void* ps[N];
+      for (size_t i = 0; i < N; i++)
+        ps[i] = snmalloc::alloc(SMALL_SIZE);
+      for (size_t i = 0; i < N; i++)
+        snmalloc::dealloc(ps[i]);
+
+      // UAF: write into a freed slot. The first two pointer-sized
+      // words of a freed slot hold the obfuscated forward edge (and,
+      // with freelist_backward_edge enabled, a backward edge).
+      // Either de-obfuscation produces a wild pointer that fails
+      // domestication, or the doubly-linked invariant breaks.
+      auto* victim = static_cast<uintptr_t*>(ps[N / 2]);
+      victim[0] = 0xDEADBEEFCAFEBABEULL;
+      victim[1] = 0xBADC0FFEE0DDF00DULL;
+
+      // Drive the freelist by reallocating from the same sizeclass.
+      void* qs[N];
+      for (size_t i = 0; i < N; i++)
+        qs[i] = snmalloc::alloc(SMALL_SIZE);
+      for (size_t i = 0; i < N; i++)
+        snmalloc::dealloc(qs[i]);
+    }
+  }
+
+  // Free `p` from a freshly created thread, so the dealloc takes the
+  // remote-message-queue path rather than the local-freelist path.
+  // The thread is joined before returning, so `p` has definitely
+  // been handed off to the owning allocator's pending-remote queue
+  // (or already drained from it) by the time we return.
+  void remote_dealloc(void* p)
+  {
+    std::thread t([p]() { snmalloc::dealloc(p); });
+    t.join();
+  }
+
+  void try_remote_double_free()
+  {
+    for (size_t r = 0; r < REMOTE_ROUNDS; r++)
+    {
+      void* ps[N];
+      for (size_t i = 0; i < N; i++)
+        ps[i] = snmalloc::alloc(SMALL_SIZE);
+
+      // First free is local (this thread allocated). Second free is
+      // from a different thread, so it goes through the remote
+      // message queue and ends up being inserted onto the owning
+      // allocator's free list a second time.
+      // The resulting cycle is
+      // detected on the next traversal.
+      void* victim = ps[N / 2];
+      snmalloc::dealloc(victim);
+      remote_dealloc(victim);
+
+      for (size_t i = 0; i < N; i++)
+        if (i != N / 2)
+          snmalloc::dealloc(ps[i]);
+      // Drive freelist consumption so the corruption is observed.
+      for (size_t i = 0; i < N; i++)
+        ps[i] = snmalloc::alloc(SMALL_SIZE);
+      for (size_t i = 0; i < N; i++)
+        snmalloc::dealloc(ps[i]);
+    }
+  }
+
+  void try_remote_uaf()
+  {
+    for (size_t r = 0; r < REMOTE_ROUNDS; r++)
+    {
+      void* ps[N];
+      for (size_t i = 0; i < N; i++)
+        ps[i] = snmalloc::alloc(SMALL_SIZE);
+
+      // Free everything via a different thread. The slots travel
+      // through the remote message queue back to this allocator and
+      // end up on its free list. While in flight (or once parked on
+      // the free list) the obfuscated next/prev fields live in the
+      // first words of the slot.
+      for (size_t i = 0; i < N; i++)
+        remote_dealloc(ps[i]);
+
+      // UAF write through the now-dangling pointer. This corrupts
+      // the freelist node that the owning allocator will traverse
+      // when it next allocates from this slab.
+      auto* victim = static_cast<uintptr_t*>(ps[N / 2]);
+      victim[0] = 0xDEADBEEFCAFEBABEULL;
+      victim[1] = 0xBADC0FFEE0DDF00DULL;
+
+      void* qs[N];
+      for (size_t i = 0; i < N; i++)
+        qs[i] = snmalloc::alloc(SMALL_SIZE);
+      for (size_t i = 0; i < N; i++)
+        snmalloc::dealloc(qs[i]);
+    }
+  }
+
+  void try_large_double_free()
+  {
+    // Large allocations bypass the slab free list. Detection here
+    // comes from the chunk-allocator/metadata path: the second
+    // dealloc finds the per-chunk metadata in a state inconsistent
+    // with an owned live allocation. One round is normally enough,
+    // but loop a few times so a single missed detection (e.g. a
+    // metadata layout that masks the problem) still trips on a
+    // later round.
+    for (size_t r = 0; r < 16; r++)
+    {
+      void* p = snmalloc::alloc(LARGE_SIZE);
+      snmalloc::dealloc(p);
+      snmalloc::dealloc(p);
+    }
+  }
+
+  void try_oob_into_neighbor()
+  {
+    for (size_t r = 0; r < ROUNDS; r++)
+    {
+      void* ps[N];
+      for (size_t i = 0; i < N; i++)
+        ps[i] = snmalloc::alloc(TINY_SIZE);
+
+      // Free even-indexed slots so their freelist headers occupy
+      // their first bytes.
+      for (size_t i = 0; i < N; i += 2)
+        snmalloc::dealloc(ps[i]);
+
+      // From an odd (still-allocated) slot, write a generous overrun
+      // past its bounds. The exact layout of adjacent slots within a
+      // slab is implementation-defined, so we splatter several
+      // sizeclass-widths of garbage to ensure we land on at least one
+      // freed neighbour's freelist node header regardless of layout.
+      auto* p = static_cast<uint8_t*>(ps[1]);
+      // Use a volatile write loop rather than memset so the compiler
+      // does not emit a -Wstringop-overflow diagnostic on the
+      // intentionally out-of-bounds write.
+      for (size_t k = TINY_SIZE; k < TINY_SIZE * 4; k++)
+        const_cast<volatile uint8_t*>(p)[k] = 0xAB;
+
+      // Free the surviving slots and reallocate to drive freelist
+      // traversal; the corrupted neighbour will be encountered.
+      for (size_t i = 1; i < N; i += 2)
+        snmalloc::dealloc(ps[i]);
+      for (size_t i = 0; i < N; i++)
+        ps[i] = snmalloc::alloc(TINY_SIZE);
+      for (size_t i = 0; i < N; i++)
+        snmalloc::dealloc(ps[i]);
+    }
+  }
+
+#if defined(__linux__)
+  // Signal handler that runs in the forked child when snmalloc's
+  // mitigation paths abort/segfault. It flushes coverage data (if the
+  // process is instrumented) and then re-raises the signal with its
+  // default disposition so the parent observes WIFSIGNALED. Without
+  // this the abort kills the child before the LLVM profile runtime
+  // gets a chance to write its .profraw, so the detection paths show
+  // up as uncovered.
+  extern "C" void corruption_signal_handler(int sig)
+  {
+    if (&__llvm_profile_write_file != nullptr)
+      __llvm_profile_write_file();
+    signal(sig, SIG_DFL);
+    raise(sig);
+  }
+
+  // Run `fn` in a forked child and return 0 if the child died with a
+  // fatal signal (corruption detected) or 1 otherwise (corruption
+  // missed, or unexpected exit).
+  int run_in_child(const char* name, void (*fn)())
+  {
+    pid_t pid = fork();
+    if (pid < 0)
+    {
+      perror("fork");
+      return 1;
+    }
+    if (pid == 0)
+    {
+      // Re-evaluate the LLVM profile filename so this child's
+      // .profraw doesn't collide with its siblings' or its parent's.
+      // The parent's `LLVM_PROFILE_FILE` (with `%p`) was resolved at
+      // startup using the parent's pid; without resetting it here,
+      // every fork+abort writes to the same path. Substitute `%p`
+      // with the child's pid explicitly because the profile runtime
+      // may have already cached the parent's expansion.
+      if (
+        &__llvm_profile_set_filename != nullptr &&
+        getenv("LLVM_PROFILE_FILE") != nullptr)
+      {
+        char buf[1024];
+        const char* tmpl = getenv("LLVM_PROFILE_FILE");
+        size_t out = 0;
+        for (size_t i = 0; tmpl[i] != '\0' && out + 16 < sizeof(buf); i++)
+        {
+          if (tmpl[i] == '%' && tmpl[i + 1] == 'p')
+          {
+            int n = snprintf(
+              buf + out, sizeof(buf) - out, "%d", static_cast<int>(getpid()));
+            out += static_cast<size_t>(n);
+            i++;
+          }
+          else
+          {
+            buf[out++] = tmpl[i];
+          }
+        }
+        buf[out] = '\0';
+        __llvm_profile_set_filename(buf);
+      }
+      // Install a coverage-flushing handler for the signals snmalloc
+      // raises on detected corruption. The handler re-raises with
+      // default disposition so the parent still sees WIFSIGNALED.
+      for (int s : {SIGABRT, SIGSEGV, SIGBUS, SIGILL})
+        signal(s, corruption_signal_handler);
+      fn();
+      // If we get here, none of the mitigations fired across all
+      // rounds. The parent will treat a clean exit as a test failure.
+      fprintf(stderr, "%s: corruption NOT detected after all rounds\n", name);
+      _exit(0);
+    }
+    int status = 0;
+    waitpid(pid, &status, 0);
+    if (WIFSIGNALED(status))
+    {
+      int sig = WTERMSIG(status);
+      if (sig == SIGABRT || sig == SIGSEGV || sig == SIGBUS || sig == SIGILL)
+      {
+        printf("%s: detected (signal %d)\n", name, sig);
+        return 0;
+      }
+      fprintf(stderr, "%s: child died with unexpected signal %d\n", name, sig);
+      return 1;
+    }
+    if (WIFEXITED(status))
+    {
+      fprintf(
+        stderr,
+        "%s: child exited normally (corruption not detected, exit %d)\n",
+        name,
+        WEXITSTATUS(status));
+      return 1;
+    }
+    fprintf(stderr, "%s: unexpected child wait status 0x%x\n", name, status);
+    return 1;
+  }
+#endif
+} // namespace
+
+int main()
+{
+  setup();
+
+#if !defined(__linux__)
+  printf(
+    "Skipping corruption-detection test: requires Linux fork()/waitpid()\n");
+  return 0;
+#else
+  if constexpr (!CHECK_CLIENT)
+  {
+    printf(
+      "Skipping corruption-detection test: SNMALLOC_CHECK_CLIENT off\n");
+    return 0;
+  }
+
+  int failures = 0;
+  failures += run_in_child("double_free", try_double_free);
+  failures += run_in_child("uaf_freelist", try_uaf_freelist_corruption);
+  failures += run_in_child("oob_into_neighbor", try_oob_into_neighbor);
+  failures += run_in_child("remote_double_free", try_remote_double_free);
+  failures += run_in_child("remote_uaf", try_remote_uaf);
+  failures += run_in_child("large_double_free", try_large_double_free);
+
+  if (failures != 0)
+  {
+    fprintf(
+      stderr,
+      "FAILED: %d corruption-detection sub-test(s) reported the corruption "
+      "was not caught by the allocator's mitigations.\n",
+      failures);
+    return 1;
+  }
+  printf("PASSED\n");
+  return 0;
+#endif
+}

From 474c2ce0ccfa2f1e85da377fb9c507c40eb6b6b5 Mon Sep 17 00:00:00 2001
From: Matthew Parkinson
Date: Sun, 10 May 2026 09:43:33 +0100
Subject: [PATCH 2/2] Fix corruption-detection test on Mac, Windows, and Linux
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Three
issues surfaced in CI for the new corruption-detection test:

1. Linux: `large_double_free` did not detect any corruption. The
   subtest used `LARGE_SIZE = MIN_CHUNK_SIZE * 4 = 64 KiB`, which on
   the default Linux config is `MAX_SMALL_SIZECLASS_SIZE` — i.e. the
   largest *small* sizeclass — so the allocations went through the
   slab free-list path and never reached the chunk-allocator
   double-free check at all. Use `MAX_SMALL_SIZECLASS_SIZE * 2` so the
   size unambiguously falls into the large range. Once the test
   actually exercises the right path, the existing
   `is_backend_owned()` check in `dealloc_remote` (gated on the
   `sanity_checks` mitigation, which is part of `full_checks` in a
   default `SNMALLOC_CHECK_CLIENT` build) flags the double-free.

2. Mac: `-Wunused-function` errors for every `try_*` helper. The
   helpers are referenced only from `run_in_child`, which is already
   gated on `__linux__`. Move the helpers and the LLVM profile externs
   inside the same `#if defined(__linux__)` block so non-Linux builds
   compile cleanly. The non-Linux `main` already prints a "skipping"
   message and returns 0.

3. Windows: `__attribute__((weak))` is not portable to MSVC and there
   is no `SNMALLOC_WEAK` macro in `defines.h`. The weak symbols are
   only used by the Linux-only fork harness for coverage-flush, so
   gating them on `__linux__` is the natural fix.

Also use `static_cast<uintptr_t>(0xDEADBEEFu)`-style literals for the
UAF freelist-corruption writes so MSVC does not warn about narrowing
on 32-bit Windows (C4305/C4309). The exact bit pattern does not
matter: any non-zero garbage in the freelist node header will fail
domestication or the doubly-linked invariant check.

Verified locally: all 6 subtests now detect corruption (including
large_double_free, which detects via signal 4 / SIGILL from the
sanity_checks mitigation).
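As context for reviewers, the fork/waitpid detection pattern the harness relies on reduces to the following standalone sketch (hypothetical, snmalloc-free; `dies_with_fatal_signal` is an illustrative name, not an identifier from the patch, and the real harness additionally handles coverage flushing and distinguishes exit reasons):

```cpp
#include <csignal>
#include <cstdlib>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Run fn in a forked child; report true iff the child was killed by one
// of the fatal signals the test accepts as "corruption detected".
// A clean exit (fn returned) reports false, mirroring the harness's
// treatment of an undetected corruption.
static bool dies_with_fatal_signal(void (*fn)())
{
  pid_t pid = fork();
  if (pid < 0)
    return false; // fork failure: conservatively report "not detected"
  if (pid == 0)
  {
    fn();
    _exit(0); // reached only if no mitigation fired
  }
  int status = 0;
  waitpid(pid, &status, 0);
  if (!WIFSIGNALED(status))
    return false;
  int sig = WTERMSIG(status);
  return sig == SIGABRT || sig == SIGSEGV || sig == SIGBUS || sig == SIGILL;
}
```

The parent inverts this predicate into a failure count, which is why a scenario that runs to completion without aborting is the failing case.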
---
 .../corruption_detection.cc | 56 +++++++++++--------
 1 file changed, 32 insertions(+), 24 deletions(-)

diff --git a/src/test/func/corruption_detection/corruption_detection.cc b/src/test/func/corruption_detection/corruption_detection.cc
index 4410ca78c..5e80c76fb 100644
--- a/src/test/func/corruption_detection/corruption_detection.cc
+++ b/src/test/func/corruption_detection/corruption_detection.cc
@@ -17,12 +17,14 @@
  * snmalloc detects free-list corruption by checking the integrity
  * of the obfuscated forward and backward edges of the intra-slab
  * free list when the list is later consumed (allocated from), and
- * detects double-free of large allocations by inspecting the
- * per-chunk metadata. Detection is therefore probabilistic per
- * round, but deterministic at the scale used here: each scenario
- * performs many rounds across many slabs, and at least one of them
- * is overwhelmingly likely to traverse the corrupted edge or hit
- * the metadata check before the test would otherwise complete.
+ * detects double-free of large allocations via the
+ * `is_backend_owned()` check on the per-chunk metadata in
+ * `dealloc_remote` (gated on the `sanity_checks` mitigation).
+ * Detection is therefore probabilistic per round, but deterministic
+ * at the scale used here: each scenario performs many rounds across
+ * many slabs, and at least one of them is overwhelmingly likely to
+ * traverse the corrupted edge or hit the metadata check before the
+ * test would otherwise complete.
  *
  * Each scenario runs in a forked child so that the expected abort
  * does not kill the test harness. Detection is reported as
@@ -47,13 +49,13 @@
 # include <signal.h>
 # include <sys/wait.h>
 # include <unistd.h>
-#endif
-
-using namespace snmalloc;
 
 // Forward declarations of clang's source-based-coverage runtime
 // entry points. Declared as weak symbols so the test still links
 // against builds without `-fprofile-instr-generate -fcoverage-mapping`.
+// Gated to Linux because (a) the entire fork-based test harness is
+// Linux-only, and (b) `__attribute__((weak))` is not portable to
+// MSVC and there is no equivalent `SNMALLOC_WEAK` macro.
 //
 // `__llvm_profile_set_filename` is needed because the LLVM profile
 // runtime resolves `%p` in `LLVM_PROFILE_FILE` exactly once at
@@ -63,7 +65,11 @@
 // calling `__llvm_profile_write_file`.
 extern "C" int __llvm_profile_write_file(void) __attribute__((weak));
 extern "C" void __llvm_profile_set_filename(const char*) __attribute__((weak));
+#endif
+
+using namespace snmalloc;
 
+#if defined(__linux__)
 namespace
 {
   // Per-scenario knobs. ROUNDS amplifies the per-round detection
@@ -79,8 +85,10 @@
   constexpr size_t REMOTE_ROUNDS = 64;
   // A size that is guaranteed to fall outside every small sizeclass
   // and therefore exercises the chunk-allocator/metadata dealloc
-  // path rather than the slab free list.
-  constexpr size_t LARGE_SIZE = MIN_CHUNK_SIZE * 4;
+  // path rather than the slab free list. Using `MAX_SMALL_SIZECLASS_SIZE`
+  // directly would still produce a small allocation (it is the upper
+  // bound, inclusive), so use twice that.
+  constexpr size_t LARGE_SIZE = MAX_SMALL_SIZECLASS_SIZE * 2;
 
   void try_double_free()
   {
@@ -127,9 +135,11 @@
       // with freelist_backward_edge enabled, a backward edge).
       // Either de-obfuscation produces a wild pointer that fails
       // domestication, or the doubly-linked invariant breaks.
+      // Use literals that fit in 32-bit uintptr_t too, so MSVC
+      // doesn't warn about narrowing on 32-bit Windows builds.
       auto* victim = static_cast<uintptr_t*>(ps[N / 2]);
-      victim[0] = 0xDEADBEEFCAFEBABEULL;
-      victim[1] = 0xBADC0FFEE0DDF00DULL;
+      victim[0] = static_cast<uintptr_t>(0xDEADBEEFu);
+      victim[1] = static_cast<uintptr_t>(0xBADC0FFEu);
 
       // Drive the freelist by reallocating from the same sizeclass.
       void* qs[N];
@@ -199,8 +209,8 @@
       // the freelist node that the owning allocator will traverse
       // when it next allocates from this slab.
       auto* victim = static_cast<uintptr_t*>(ps[N / 2]);
-      victim[0] = 0xDEADBEEFCAFEBABEULL;
-      victim[1] = 0xBADC0FFEE0DDF00DULL;
+      victim[0] = static_cast<uintptr_t>(0xDEADBEEFu);
+      victim[1] = static_cast<uintptr_t>(0xBADC0FFEu);
 
       void* qs[N];
       for (size_t i = 0; i < N; i++)
@@ -212,13 +222,12 @@
 
   void try_large_double_free()
   {
-    // Large allocations bypass the slab free list. Detection here
-    // comes from the chunk-allocator/metadata path: the second
-    // dealloc finds the per-chunk metadata in a state inconsistent
-    // with an owned live allocation. One round is normally enough,
-    // but loop a few times so a single missed detection (e.g. a
-    // metadata layout that masks the problem) still trips on a
-    // later round.
+    // Large allocations bypass the slab free list. The second
+    // dealloc reaches `dealloc_remote` with a metaentry that
+    // `claim_for_backend()` has marked `is_backend_owned()`, and
+    // the `sanity_checks` mitigation flags this directly.
+    // One round is normally enough, but loop a few times so a
+    // single missed detection still trips on a later round.
     for (size_t r = 0; r < 16; r++)
     {
       void* p = snmalloc::alloc(LARGE_SIZE);
@@ -263,7 +272,6 @@
       }
     }
 
-#if defined(__linux__)
   // Signal handler that runs in the forked child when snmalloc's
   // mitigation paths abort/segfault. It flushes coverage data (if the
   // process is instrumented) and then re-raises the signal with its
@@ -359,8 +367,8 @@
     fprintf(stderr, "%s: unexpected child wait status 0x%x\n", name, status);
     return 1;
   }
-#endif
 } // namespace
+#endif
 
 int main()
 {
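As an aside for reviewers, the `%p` re-expansion the forked child performs on `LLVM_PROFILE_FILE` reduces to the string transform below (illustrative standalone helper under assumed semantics; `expand_profile_template` is a hypothetical name, and the in-tree code instead writes into a fixed `char buf[1024]` and hands it to `__llvm_profile_set_filename`):

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Expand each "%p" in an LLVM_PROFILE_FILE-style template with the
// given pid, so every forked child writes a distinct .profraw rather
// than inheriting the parent's already-resolved filename.
static std::string expand_profile_template(const char* tmpl, int pid)
{
  std::string out;
  for (std::size_t i = 0; tmpl[i] != '\0'; i++)
  {
    if (tmpl[i] == '%' && tmpl[i + 1] == 'p')
    {
      char num[16];
      std::snprintf(num, sizeof(num), "%d", pid);
      out += num;
      i++; // skip the 'p' as well
    }
    else
    {
      out += tmpl[i];
    }
  }
  return out;
}
```

For example, a template of `cov-%p.profraw` expanded with the child's pid yields a per-child filename, which is exactly what keeps sibling children from clobbering each other's coverage output.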