Add corruption-detection test for probabilistic mitigations by mjp41 · Pull Request #848 · microsoft/snmalloc

mjp41 · 2026-05-10T08:18:47Z

Adds a new functional test, func/corruption_detection, that validates snmalloc's probabilistic memory-safety mitigations actually fire on the corruption patterns they are designed to catch. Without it, regressions that silently weakened a mitigation would be invisible to the existing suite, since every other test exercises only the non-failing arm of the integrity checks.

Each scenario runs in a forked child so the expected abort does not kill the harness. Detection is reported as the child being killed by SIGABRT/SIGSEGV/SIGBUS/SIGILL; a clean exit means the corruption went undetected and the test fails.
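The fork-and-classify pattern described above can be sketched roughly as follows. This is a minimal illustration, not the harness from the PR: `detected_in_child`, `scenario_that_aborts` and `scenario_that_returns` are hypothetical names, and the two scenarios are stand-ins for real corruption tests.

```cpp
#include <cassert>
#include <csignal>
#include <cstdlib>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: runs `scenario` in a forked child and reports whether
// the child was killed by one of the signals counted as "corruption detected".
inline bool detected_in_child(void (*scenario)())
{
  pid_t pid = fork();
  if (pid < 0)
    return false; // fork failed; report "not detected" so the caller fails
  if (pid == 0)
  {
    scenario();
    _exit(0); // reached only if nothing fired: the corruption went undetected
  }
  int status = 0;
  if (waitpid(pid, &status, 0) != pid)
    return false;
  if (!WIFSIGNALED(status))
    return false; // clean (or non-signal) exit counts as undetected
  int sig = WTERMSIG(status);
  return sig == SIGABRT || sig == SIGSEGV || sig == SIGBUS || sig == SIGILL;
}

// Stand-in scenarios for illustration only.
inline void scenario_that_aborts() { std::abort(); }
inline void scenario_that_returns() {}
```

Because the abort happens in the child, the parent survives to run the next scenario and can turn "child exited cleanly" into a test failure.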

Six scenarios are covered, spanning the local-thread, remote-thread and large-allocation paths:

* double_free - small alloc, two local frees of the same slot. Detected by freelist_backward_edge when the resulting cycle is later traversed.
* uaf_freelist - small alloc, free, then write garbage into the freed slot's first two words (the obfuscated next/prev). Detected by check_prev on the next freelist consumption.
* oob_into_neighbor - tiny allocs, free even slots, overrun from an odd live slot into freed neighbours. Detected by check_prev when the neighbour is later allocated.
* remote_double_free - small alloc, free locally, then free again from a different thread (the second free travels via the remote message queue). Detected as !meta->is_unused() in the dealloc path.
* remote_uaf - small alloc, free via a different thread, then write garbage through the dangling pointer while the slot sits on the owning allocator's pending-remote queue. Detected by check_prev during handle_message_queue_slow's drain - a code path no other test exercises.
* large_double_free - allocation larger than any small sizeclass (handled by the chunk allocator and per-chunk metadata rather than the slab freelist), freed twice. Detected as !meta->is_unused() in the large-dealloc path.
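To make the backward-edge idea behind the double_free and uaf_freelist scenarios concrete, here is a toy model of a free list whose nodes carry a signed "previous" word that is checked on traversal. It is a sketch under stated assumptions, not snmalloc's implementation: `ToyNode`, `ToyFreeList`, `sign_edge` and the fixed keys are all invented for illustration, and the check returns -1 where a real allocator would abort.

```cpp
#include <cassert>
#include <cstdint>

// Toy model only: field names, keys and the signing function are illustrative
// and are not taken from snmalloc's sources.
struct ToyNode
{
  ToyNode* next = nullptr;
  std::uintptr_t prev_encoded = 0;
};

// Fixed keys keep the example deterministic; a real allocator would use
// secret random values so that signatures cannot be forged.
constexpr std::uintptr_t KEY1 = 0x9E3779B9u;
constexpr std::uintptr_t KEY2 = 0x85EBCA6Bu;

inline std::uintptr_t sign_edge(const ToyNode* prev, const ToyNode* curr)
{
  return (reinterpret_cast<std::uintptr_t>(prev) + KEY1) *
    (reinterpret_cast<std::uintptr_t>(curr) + KEY2);
}

struct ToyFreeList
{
  ToyNode* head = nullptr;
  ToyNode* tail = nullptr;

  // "Free": append the slot and sign the edge from the previous tail.
  void append(ToyNode* n)
  {
    n->next = nullptr;
    n->prev_encoded = sign_edge(tail, n);
    if (tail != nullptr)
      tail->next = n;
    else
      head = n;
    tail = n;
  }

  // Walks the list as a consumer would, checking each backward edge before
  // following that node's next pointer. Returns the number of nodes seen,
  // or -1 as soon as an edge fails its check.
  long walk_checked() const
  {
    const ToyNode* prev = nullptr;
    long count = 0;
    for (const ToyNode* c = head; c != nullptr; c = c->next)
    {
      if (c->prev_encoded != sign_edge(prev, c))
        return -1;
      prev = c;
      ++count;
    }
    return count;
  }
};
```

In this model, appending the same node twice re-signs it against itself, so the first traversal sees a mismatched edge at the head. Overwriting a freed node's two words fails the same check before the garbage `next` pointer is ever followed.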

The test is Linux-only (uses fork()/waitpid()) and is a no-op when SNMALLOC_CHECK_CLIENT is not defined, since the mitigations it relies on are then compiled out.
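The gating described here might look roughly like the sketch below. `corruption_test_entry` and `run_all_scenarios` are hypothetical names and the scenario body is a placeholder; only the two preprocessor conditions come from the description.

```cpp
#include <cassert>
#include <cstdio>

#if defined(__linux__) && defined(SNMALLOC_CHECK_CLIENT)
// Placeholder for the fork-based scenarios; the real test would run each
// corruption pattern in a child and return non-zero if any went undetected.
inline int run_all_scenarios()
{
  return 0;
}

inline int corruption_test_entry()
{
  return run_all_scenarios();
}
#else
// Mitigations are compiled out, or fork()/waitpid() are unavailable: report
// the skip and succeed so the test is a no-op in this configuration.
inline int corruption_test_entry()
{
  std::puts("skipping: requires Linux and SNMALLOC_CHECK_CLIENT");
  return 0;
}
#endif
```

A `main` would then simply `return corruption_test_entry();`. Keeping every Linux-only helper inside the first branch also avoids unused-function warnings on other platforms.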

The test is also instrumented to cooperate with clang source-based coverage: the forked child re-resolves LLVM_PROFILE_FILE with its own pid (the parent's %p expansion is otherwise inherited and all children would write to the same file) and a signal handler flushes .profraw before re-raising the fatal signal. The runtime entry points are declared as weak symbols so the test still links in non-coverage builds.
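One plausible shape for this coverage cooperation is sketched below. `__llvm_profile_write_file` and `__llvm_profile_set_filename` are entry points of clang's profiling runtime; everything else (`reresolve_profile_file`, `flush_and_reraise`, `install_flush_handlers`) is a hypothetical name, and the assumption that re-applying the pattern is how the child re-expands `%p` is mine rather than something the description spells out.

```cpp
#include <cassert>
#include <csignal>
#include <cstdlib>
#include <initializer_list>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Profiling-runtime entry points, declared weak so they resolve to null when
// the binary is built without coverage instrumentation.
extern "C" int __llvm_profile_write_file(void) __attribute__((weak));
extern "C" void __llvm_profile_set_filename(const char*) __attribute__((weak));

// Call in the child right after fork(): re-apply the filename pattern so a
// %p in it is expanded with the child's pid rather than the inherited one.
inline void reresolve_profile_file()
{
  const char* pattern = std::getenv("LLVM_PROFILE_FILE");
  if (pattern != nullptr && __llvm_profile_set_filename != nullptr)
    __llvm_profile_set_filename(pattern);
}

// Flush the profile, then restore the default action and re-raise so the
// parent still observes death by the original signal. Writing a file from a
// signal handler is not async-signal-safe; this is tolerable only because
// the process is about to die anyway.
inline void flush_and_reraise(int sig)
{
  if (__llvm_profile_write_file != nullptr)
    __llvm_profile_write_file();
  std::signal(sig, SIG_DFL);
  std::raise(sig);
}

inline void install_flush_handlers()
{
  for (int sig : {SIGABRT, SIGSEGV, SIGBUS, SIGILL})
    std::signal(sig, flush_and_reraise);
}
```

In a non-coverage build both weak symbols are null, so the handler reduces to "reset and re-raise" and the child's termination signal is unchanged.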

Picked up automatically by make_tests so it runs as both func-corruption_detection-fast and func-corruption_detection-check; the fast variant immediately exits with the "skip" message because the mitigations are off.


Three issues surfaced in CI for the new corruption-detection test:

1. Linux: `large_double_free` did not detect any corruption. The subtest used `LARGE_SIZE = MIN_CHUNK_SIZE * 4 = 64 KiB`, which on the default Linux config is `MAX_SMALL_SIZECLASS_SIZE` (i.e. the largest *small* sizeclass), so the allocations went through the slab free-list path and never reached the chunk-allocator double-free check at all. Use `MAX_SMALL_SIZECLASS_SIZE * 2` so the size unambiguously falls into the large range. Once the test actually exercises the right path, the existing `is_backend_owned()` check in `dealloc_remote` (gated on the `sanity_checks` mitigation, which is part of `full_checks` in a default `SNMALLOC_CHECK_CLIENT` build) flags the double-free.

2. Mac: `-Wunused-function` errors for every `try_*` helper. The helpers are referenced only from `run_in_child`, which is already gated on `__linux__`. Move the helpers and the LLVM profile externs inside the same `#if defined(__linux__)` block so non-Linux builds compile cleanly. The non-Linux `main` already prints a "skipping" message and returns 0.

3. Windows: `__attribute__((weak))` is not portable to MSVC and there is no `SNMALLOC_WEAK` macro in `defines.h`. The weak symbols are only used by the Linux-only fork harness for coverage-flush, so gating them on `__linux__` is the natural fix. Also use `static_cast<uintptr_t>(0xDEADBEEFu)`-style literals for the UAF freelist-corruption writes so MSVC does not warn about narrowing on 32-bit Windows (C4305/C4309). The exact bit pattern does not matter: any non-zero garbage in the freelist node header will fail domestication or the doubly-linked invariant check.

Verified locally: all 6 subtests now detect corruption (including large_double_free, which detects via signal 4 / SIGILL from the sanity_checks mitigation).
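The size-boundary mistake in the first issue is easy to see with the numbers quoted above. The sketch below restates those values as local constants (they are not read from snmalloc's headers) and uses a hypothetical predicate, `is_large_request`, that assumes any size up to and including the largest small sizeclass stays on the slab path.

```cpp
#include <cassert>
#include <cstddef>

// Values as quoted for the default Linux configuration: MIN_CHUNK_SIZE * 4
// equals 64 KiB, which is also MAX_SMALL_SIZECLASS_SIZE.
constexpr std::size_t MIN_CHUNK_SIZE = 16 * 1024;
constexpr std::size_t MAX_SMALL_SIZECLASS_SIZE = 64 * 1024;

// Hypothetical predicate: only sizes strictly above the largest small
// sizeclass are treated as large allocations.
constexpr bool is_large_request(std::size_t size)
{
  return size > MAX_SMALL_SIZECLASS_SIZE;
}

constexpr std::size_t OLD_LARGE_SIZE = MIN_CHUNK_SIZE * 4;
constexpr std::size_t NEW_LARGE_SIZE = MAX_SMALL_SIZECLASS_SIZE * 2;

// The old size lands exactly on the boundary and is still "small".
static_assert(!is_large_request(OLD_LARGE_SIZE), "64 KiB is a small sizeclass");
// Doubling the boundary puts the request unambiguously in the large range.
static_assert(is_large_request(NEW_LARGE_SIZE), "128 KiB is a large request");
```

Deriving the test size from `MAX_SMALL_SIZECLASS_SIZE` rather than from `MIN_CHUNK_SIZE` keeps the subtest on the large path even if the relationship between the two constants changes.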

mjp41 added 2 commits May 9, 2026 21:25

mjp41 wants to merge 2 commits into microsoft:main from mjp41:coverage_failures
