
Fix safepoint stack slot reuse #13469

Closed
angelnereira wants to merge 1 commit into bytecodealliance:main from angelnereira:fix-gc-safepoint-slot-reuse
Conversation

@angelnereira
Contributor

Summary

Fixes #13461.

The safepoint spiller walks instructions backwards and can free a stack slot for a value defined by a safepoint instruction before assigning stack-map slots for values live across that same safepoint. If the freed slot is reused, the stack map can point at a slot that contains the instruction result rather than the value that must remain live across the call.

This changes the rewrite order so safepoint stack-map entries are assigned before result slots for that instruction are freed for reuse. It also adds the GC regression test from the issue to cover the null-reference trap that exposed this.
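
The ordering problem described above can be illustrated with a toy model. This is a minimal sketch, not Cranelift's code: `SlotPool`, `rewrite_one`, and the value names `"result"` and `"live"` are all hypothetical, and the only assumption is that stack slots are handed out from a free list, as the description implies.

```rust
use std::collections::HashMap;

/// Toy stack-slot pool with a free list.
/// Hypothetical; this is NOT Cranelift's real stack-slot bookkeeping.
#[derive(Default)]
struct SlotPool {
    next: u32,
    free: Vec<u32>,
    assigned: HashMap<&'static str, u32>,
}

impl SlotPool {
    /// Return the slot already assigned to `value`, otherwise reuse a
    /// freed slot, otherwise allocate a fresh one.
    fn get_or_create(&mut self, value: &'static str) -> u32 {
        if let Some(&slot) = self.assigned.get(value) {
            return slot;
        }
        let slot = match self.free.pop() {
            Some(s) => s,
            None => {
                let s = self.next;
                self.next += 1;
                s
            }
        };
        self.assigned.insert(value, slot);
        slot
    }

    /// Return `value`'s slot to the free list, as a backwards walk would
    /// do on reaching the instruction that defines `value`.
    fn release(&mut self, value: &'static str) {
        if let Some(slot) = self.assigned.remove(value) {
            self.free.push(slot);
        }
    }
}

/// Model one safepoint instruction that defines "result" while "live" must
/// stay in the stack map across it. Returns (result slot, live slot).
fn rewrite_one(free_result_first: bool) -> (u32, u32) {
    let mut pool = SlotPool::default();
    // Instructions visited earlier in the backwards walk gave "result" a slot.
    let result_slot = pool.get_or_create("result");
    let live_slot = if free_result_first {
        // Problematic order: the result slot is on the free list when the
        // live-across value asks for a slot.
        pool.release("result");
        pool.get_or_create("live")
    } else {
        // Reordered: reserve the live-across slot, then free the result slot.
        let slot = pool.get_or_create("live");
        pool.release("result");
        slot
    };
    (result_slot, live_slot)
}

fn main() {
    let (r, l) = rewrite_one(true);
    println!("free first:    result slot {r}, live slot {l}");
    let (r, l) = rewrite_one(false);
    println!("reserve first: result slot {r}, live slot {l}");
}
```

In this toy model, freeing first gives both values slot 0, while reserving first gives them distinct slots (0 and 1).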

Testing

  • cargo fmt --all -- --check
  • cargo test -p cranelift-frontend safepoints
  • cargo build -p wasmtime-cli
  • target/debug/wasmtime wast -Wgc -Wfunction-references -Wwide-arithmetic -Wsimd -Wthreads -Wreference-types /tmp/safepoint-reload-aliased-ref-null.wast
  • WASMTIME_TEST_GC_KEYWORDS=safepoint-reload-aliased-ref-null cargo test -p wasmtime-cli --test wast safepoint-reload-aliased-ref-null

@angelnereira angelnereira requested review from a team as code owners May 24, 2026 04:51
@angelnereira angelnereira requested review from fitzgen and removed request for a team May 24, 2026 04:51
gfx added a commit to wado-lang/wasmtime that referenced this pull request May 24, 2026
Backport of bytecodealliance#13469 (fixes bytecodealliance#13461).

The safepoint spiller walks instructions backwards and can free a stack
slot for a value defined by a safepoint instruction before assigning
stack-map slots for values live across that same safepoint. If the freed
slot is reused, the stack map can point at a slot that contains the
instruction result rather than the value that must remain live across the
call. This surfaced as a `null reference` trap on a provably non-null GC
ref carried across a call safepoint.

Reorder the rewrite so safepoint stack-map entries are assigned before
result slots for that instruction are freed for reuse.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@gfx
Contributor

gfx commented May 24, 2026

Confirmed — this fixes #13461 on our end.

We originally hit the null reference trap in Wado, whose compiler emits Wasm GC code with (ref null $t) values that stay live across call safepoints.

What I did:

Thanks for the quick turnaround.

gfx added a commit to wado-lang/wado that referenced this pull request May 24, 2026
Bump the vendor/wasmtime fork (wado-lang/wasmtime, gfx/wasmtime-45) to
pick up the backport of bytecodealliance/wasmtime#13469, which fixes the
Cranelift safepoint-spiller regression (#13461) that miscompiled GC refs
live across call safepoints into null reads.

This unblocks the wasmtime 44->45 upgrade: the 7 previously failing e2e
tests (serde_json_* and value_copy_nested_array_helper_chain) now pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@github-actions github-actions Bot added the cranelift Issues related to the Cranelift code generator label May 24, 2026
@fitzgen
Member

fitzgen commented May 26, 2026

Hi @angelnereira, thanks for the PR. However, it seems like the write-up is AI-generated text. Please review https://github.com/bytecodealliance/governance/blob/main/AI_TOOL_POLICY.md. In particular, regardless of whether you are using AI as a tool yourself, you must review and own that output, and you must not simply use AI output for comments, PR descriptions, etc. You must fully own your contributions and take responsibility for them, not foist the work of understanding what the LLM did on project maintainers.

Member

This test is too large and effectively useless. Maintainers cannot understand it, and if it ever regressed in the future due to a reintroduction of this bug or one like it, we wouldn't be able to diagnose what is happening. Please make a test case that is no more than ~100 lines long. You should be able to do this as the contributor taking responsibility for your own pull request and understanding the bug it is fixing.

@angelnereira
Contributor Author

> Hi @angelnereira, thanks for the PR. However, it seems like the write-up is AI-generated text. Please review https://github.com/bytecodealliance/governance/blob/main/AI_TOOL_POLICY.md. In particular, regardless of whether you are using AI as a tool yourself, you must review and own that output, and you must not simply use AI output for comments, PR descriptions, etc. You must fully own your contributions and take responsibility for them, not foist the work of understanding what the LLM did on project maintainers.

Thanks for pointing this out.

My native language is Spanish, so I sometimes use tools to help translate or synthesize comments in English. That said, I understand the concern and the policy: I am responsible for fully reviewing, understanding, and owning anything I post.

I’ll be more careful going forward and make sure my comments and PR descriptions are written and reviewed by me, and only posted when I can fully stand behind them.

Sorry for the noise, and thanks for the clarification.

@fitzgen
Member

fitzgen commented May 26, 2026

> My native language is Spanish, so I sometimes use tools to help translate or synthesize comments in English. That said, I understand the concern and the policy: I am responsible for fully reviewing, understanding, and owning anything I post.

To be clear, using an LLM to translate a human-written comment from Spanish to English is perfectly fine. Having the LLM write an English comment based on a Spanish prompt is not.

Thanks! Appreciate that you are receptive to this feedback.

@angelnereira angelnereira force-pushed the fix-gc-safepoint-slot-reuse branch from 4c52f1d to 748f399 Compare May 27, 2026 00:10
@angelnereira
Contributor Author

Thanks for the review. I removed the large WAST regression test and replaced it with a focused unit test for the safepoint spiller.

The new test directly checks the slot-reuse condition fixed here: a safepoint result's stack slot must not be reused for another value that is live across that same safepoint.

I verified it with:

cargo test -p cranelift-frontend safepoint_reserves_live_slots_before_freeing_result_slots
cargo test -p cranelift-frontend safepoints

  The safepoint spiller walks instructions backwards. Before this change, it
  could free a stack slot for the value defined by a safepoint instruction before
  reserving stack-map slots for values live across that same safepoint. That made
  it possible to reuse the same slot for both values.

  Rewrite safepoints before rewriting the instruction results, so live-across
  values reserve their stack-map slots before result slots are returned to the
  free list.

  Replace the large WAST regression test with a focused safepoint-spiller unit
  test that directly checks this slot-reuse condition.

  Testing:
  - cargo test -p cranelift-frontend safepoint_reserves_live_slots_before_freeing_result_slots
  - cargo test -p cranelift-frontend safepoints
@angelnereira angelnereira force-pushed the fix-gc-safepoint-slot-reuse branch from 748f399 to 33c80eb Compare May 27, 2026 00:16
Comment on lines +3149 to +3166
let mut spiller = SafepointSpiller::default();
spiller.liveness.post_order.push(block0);
spiller.liveness.live_across_any_safepoint.insert(live);
spiller
    .liveness
    .safepoints
    .insert(call, [live].into_iter().collect());

let result_slot = spiller
    .stack_slots
    .get_or_create_stack_slot(&mut func, result);
spiller.rewrite(&mut func);

let live_slot = spiller.stack_slots.get(live).unwrap();
assert_ne!(
    result_slot, live_slot,
    "the safepoint result slot must not be reused for a value live across that same safepoint"
);
Member

This test is a little too low-level to really be very useful, because the bug is not in the low-level get-or-create-stack-slot APIs, it is in the order that those APIs are called when rewriting the whole function based on the liveness analysis.

It should be possible to make a test at the whole CLIF function level which asserts the expected output CLIF after running the safepoint spiller via assert_eq_output!(...), similar to e.g. the needs_stack_map_and_loop test in this test module, but which exercises this bug and checks for regressions. If I understand correctly, what is needed is something like this:

block0(v0: i64):
    v1 = call f(v0)
    ;; v1 needs inclusion in stack maps
    v2 = call f(v1)
    ;; v2 needs inclusion in stack maps
    v3 = call f(v2)
    return v3

That is, we have two values that need inclusion in stack maps, have the same type and non-overlapping live ranges and therefore could possibly reuse the same stack slot, and the live range for one ends at a safepoint.

This might not be the exact shape necessary to trigger the bug. It might require another call or that the values have longer live ranges across additional safepoints. I'm not exactly sure, but you should be able to come up with something based off this initial starting point. Basically just look at the low-level API call sequence you're currently making and craft a CLIF function that will trigger that same low-level API call sequence.

Please make sure that the invalid stack slot reuse is present in this test without the fix, and then that the invalid stack slot reuse goes away after the fix is reapplied.
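
As a rough aid for exploring candidate shapes like the one sketched above, the following toy model walks a straight-line block backwards and reports which stack-map values are live across each call. It is a simplified illustration, not Cranelift's liveness analysis: `Inst`, `live_across`, `chain`, and `variation` are hypothetical names, and "live across" is assumed to mean "still needed after the call returns". Whether either shape reproduces the original bug is not verified here.

```rust
use std::collections::BTreeSet;

/// One instruction in a toy straight-line block (hypothetical, not Cranelift IR).
struct Inst {
    def: Option<&'static str>,
    uses: Vec<&'static str>,
    is_safepoint: bool,
}

/// Build a call instruction; every call is treated as a safepoint.
fn call(def: &'static str, uses: &[&'static str]) -> Inst {
    Inst { def: Some(def), uses: uses.to_vec(), is_safepoint: true }
}

/// Build a return instruction; it defines nothing and is not a safepoint.
fn ret(uses: &[&'static str]) -> Inst {
    Inst { def: None, uses: uses.to_vec(), is_safepoint: false }
}

/// Values assumed to need inclusion in stack maps (v0 is a plain i64).
const GC_REFS: [&str; 3] = ["v1", "v2", "v3"];

/// Walk the block backwards; for each safepoint, in program order, return
/// the set of stack-map values that are live after it.
fn live_across(block: &[Inst], gc_refs: &[&'static str]) -> Vec<BTreeSet<&'static str>> {
    let mut live: BTreeSet<&'static str> = BTreeSet::new();
    let mut out = Vec::new();
    for inst in block.iter().rev() {
        // A value is not live before its own definition.
        if let Some(d) = inst.def {
            live.remove(&d);
        }
        // Whatever is still live here survives the call.
        if inst.is_safepoint {
            out.push(live.clone());
        }
        // Operands are live before the instruction executes.
        for u in &inst.uses {
            if gc_refs.contains(u) {
                live.insert(*u);
            }
        }
    }
    out.reverse();
    out
}

/// The chain shape: each result is consumed by the very next call.
fn chain() -> Vec<Inst> {
    vec![call("v1", &["v0"]), call("v2", &["v1"]), call("v3", &["v2"]), ret(&["v3"])]
}

/// A hypothetical variation: v1 survives the call defining v2, and v2
/// survives the call defining v3.
fn variation() -> Vec<Inst> {
    vec![call("v1", &["v0"]), call("v2", &["v0"]), call("v3", &["v1"]), ret(&["v2"])]
}

fn main() {
    println!("chain:     {:?}", live_across(&chain(), &GC_REFS));
    println!("variation: {:?}", live_across(&variation(), &GC_REFS));
}
```

In this model the chain shape has no value live across any call, because each result dies at the next call's argument list. The variation has `v1` live across the call that defines `v2`, and `v2` live across the call that defines `v3`, so it may be a closer starting point.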

@vouillon
Contributor

I think this is superseded by #13480. The root cause is that loop-invariant values aren't tracked properly in the rewrite walk. Reordering rewrite_safepoint and rewrite_def only covers the case where the def is also a safepoint. In #13480's regression test the slot is freed by a plain def (an iconst), so the reorder doesn't help. #13480 fixes the general case by reserving the slots for loop-invariant values up front, after which the order of the two calls no longer matters.

@fitzgen
Copy link
Copy Markdown
Member

fitzgen commented May 27, 2026

Closing in favor of #13498 but if you can create a test case that still fails and isn't fixed by that PR, then please open a new PR/issue! Thanks

@fitzgen fitzgen closed this May 27, 2026

Labels

cranelift Issues related to the Cranelift code generator

Development

Successfully merging this pull request may close these issues.

Cranelift: GC reference reads back as null after a call safepoint (regression from #13228)

4 participants