Define a new "wasm-multivalue" calling convention #268

Open
alexcrichton wants to merge 2 commits into WebAssembly:main from alexcrichton:wasm-multivalue-calling-convention

@alexcrichton
Collaborator

This commit is an attempt to define a new calling convention for WebAssembly which I've provisionally decided to call "wasm-multivalue". This new calling convention is inspired by discussions on #247 and #88, and the primary motivation of this calling convention is to be able to use multi-return signatures for intrinsics/functions in the component model.

This is a definition of a new, parallel, calling convention to the existing, now called "C", calling convention. The intention is that this avoids breakage to existing programs. The new calling convention is opt-in, requiring an annotation. The multivalue target feature enables usage of "wasm-multivalue", but enabling or disabling multivalue-the-target-feature has no effect on the "C" calling convention.

The technical definition of this new "wasm-multivalue" calling convention is to alter the previous calling convention in three ways:

  • Primarily, struct returns are now returned directly if all fields are (optionally recursively) scalars. This means that returning a struct-by-value gets translated to multiple return values.

  • To account for #88 ("Pass small structs in parameters instead of memory"), this new calling convention additionally passes two-scalar-argument struct definitions directly instead of indirectly, matching what native calling conventions do, for example.

  • Finally, __int128 and long double return values are now returned directly instead of indirectly.
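
The first two rules above can be read as a small classification procedure: flatten a struct (recursively) into its scalar fields, then decide whether it travels directly or through memory. The following is a minimal sketch of that reading, not the LLVM implementation. The `type_desc` model and the names `flatten_count`, `returned_directly`, and `passed_directly` are hypothetical, and the treatment of nested structs for parameters and of empty structs is an assumption that the text above does not settle.

```c
#include <stddef.h>

/* Hypothetical, simplified type description used only for this sketch. */
enum kind { KIND_SCALAR, KIND_STRUCT, KIND_OTHER /* arrays, unions, ... */ };

struct type_desc {
    enum kind kind;
    const struct type_desc *const *fields; /* only used for KIND_STRUCT */
    size_t num_fields;
};

/* Number of scalars a type flattens to, or -1 if any (possibly nested)
 * field is not a scalar. */
long flatten_count(const struct type_desc *t) {
    if (t->kind == KIND_SCALAR) return 1;
    if (t->kind != KIND_STRUCT) return -1;
    long total = 0;
    for (size_t i = 0; i < t->num_fields; i++) {
        long n = flatten_count(t->fields[i]);
        if (n < 0) return -1;
        total += n;
    }
    return total;
}

/* Rule 1: a return value is returned directly when it flattens to scalars.
 * Empty structs are left out of this sketch (assumption). */
int returned_directly(const struct type_desc *t) {
    return flatten_count(t) >= 1;
}

/* Rule 2: a parameter is passed directly when it flattens to one or two
 * scalars. Counting nested fields here is an assumption. */
int passed_directly(const struct type_desc *t) {
    long n = flatten_count(t);
    return n == 1 || n == 2;
}

/* Sample types: a scalar, { scalar, scalar }, { pair, scalar },
 * and { scalar, non-scalar }. */
static const struct type_desc SCALAR = { KIND_SCALAR, NULL, 0 };
static const struct type_desc OTHER = { KIND_OTHER, NULL, 0 };
static const struct type_desc *const PAIR_FIELDS[] = { &SCALAR, &SCALAR };
static const struct type_desc PAIR = { KIND_STRUCT, PAIR_FIELDS, 2 };
static const struct type_desc *const NESTED_FIELDS[] = { &PAIR, &SCALAR };
static const struct type_desc NESTED = { KIND_STRUCT, NESTED_FIELDS, 2 };
static const struct type_desc *const MIXED_FIELDS[] = { &SCALAR, &OTHER };
static const struct type_desc MIXED = { KIND_STRUCT, MIXED_FIELDS, 2 };
```

Under this reading, `NESTED` (three scalars) would be returned as three values but still passed indirectly as a parameter, while `MIXED` stays indirect in both positions.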

On the implementation side, I'd like to first gain consensus on this calling convention definition in the tool-conventions repository. Once the definition is settled I plan to go to LLVM/Clang to implement this calling convention, and after that I plan to go to the Rust compiler and implement the calling convention there. My plan for Rust is to implement extern "wasm-multivalue" fn(..) syntactically, and for C to use __attribute__((wasm_multivalue)) as the opt-in. Note that this is all bikesheddable, of course.
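
To make the proposed C opt-in concrete, here is a minimal sketch of how a header might use it. The attribute spelling is the one proposed above and is not assumed to exist in any released compiler. The `HAVE_WASM_MULTIVALUE_CC` guard, the `WASM_MULTIVALUE` macro, and the `divmod` function are hypothetical names introduced only for illustration, and the WebAssembly signatures in the comments are the expected lowerings rather than verified compiler output.

```c
#include <stdint.h>

/* Hypothetical guard: only emit the proposed attribute when the build says
 * the toolchain supports it, so the code still compiles elsewhere. */
#if defined(__wasm__) && defined(HAVE_WASM_MULTIVALUE_CC)
#  define WASM_MULTIVALUE __attribute__((wasm_multivalue))
#else
#  define WASM_MULTIVALUE /* falls back to the default "C" convention */
#endif

struct divmod_result {
    int32_t quot;
    int32_t rem;
};

/* Expected lowering under the existing "C" convention:
 *   (func (param i32 i32 i32))              ;; first param is a return pointer
 * Expected lowering under the proposed "wasm-multivalue" convention:
 *   (func (param i32 i32) (result i32 i32)) */
WASM_MULTIVALUE struct divmod_result divmod(int32_t a, int32_t b) {
    struct divmod_result r = { a / b, a % b };
    return r;
}
```

Because the annotation sits on the function itself, callers and other functions in the same translation unit keep the "C" convention unless they opt in too.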

@tlively
Member

tlively commented May 24, 2026

SGTM! I like that making this opt-in lets us avoid the question of how large a struct should be allowed to be returned by value.

@alexcrichton
Collaborator Author

I've got a concrete implementation of this at llvm/llvm-project#200076, although I'm by no means an LLVM expert, so that's unlikely to be its final form. It's at least the general thrust, however.

@dschuff
Member

dschuff commented May 28, 2026

This sounds good to me too. But I don't think we can/should avoid the question of picking a concrete size, right? Once this starts being used by component model intrinsics, we'll want it to be stable, no?

@dschuff
Member

dschuff commented May 28, 2026

Oh, I guess @tlively you're saying that the convention will be to just return everything byval, and just never make this the default, and then the opt-in would have to be done by the programmer or whoever defines an API?
It seems like it might be nice to have it as the default in the future though if it's really beneficial.

One thing that we could do: The GlobalOpt pass knows how to change functions to use the FastCC calling convention when they are private enough. Maybe we can make FastCC an alias for this, or make a TTI hook to choose which CC to upgrade to instead of just using FastCC. But that might only make sense if the calling convention was stable and known to be almost always beneficial.
(Or we could have some more sophisticated analysis in GlobalOpt to decide whether it's beneficial... or just have FastCC and wasm-multivalue CC be the same except for the number of return values used... anyway there are several options, but we wouldn't need to do anything fancy if wasm-multivalue were stable and mostly an improvement).

@alexcrichton
Collaborator Author

@dschuff to confirm, you mean a limit on the number of return values, right? If so, I was wondering the same, but testing locally it looked like there was no limit on the number of parameters that would be generated, so I figured I'd follow the same. I think it'd be reasonable to pick a limit, however, and start spilling to the stack afterwards. Implementing that scale of ABI change is probably going a bit beyond my LLVM abilities personally, but otherwise, yes, we'll want to consider this stable in relatively short order to be able to use it in the component model.

I do agree that it'd probably be worthwhile to make this the default ABI one day (or at least a variant thereof). I also agree that it'd be reasonable to switch internal functions to this ABI automatically, and this is something I plan on measuring for Rust at some point and probably applying (the precise ABI of fn() is perma-unstable unlike extern "C" fn() in Rust, exactly to be able to make changes like this).

For now though my plan is to start relatively conservative and mostly just unblock the ability to use multi-return in wasm modules (and subsequently component model intrinsics/lowerings/etc). I don't have immediate plans personally to investigate optimization passes or plan out a switch for this to become the default ABI.

@fitzgen
Contributor

fitzgen commented May 28, 2026

I think at minimum the limit would need to match the web spec’s implementation limits, no?

@alexcrichton
Collaborator Author

Testing locally, currently there's no limit in LLVM to match the web 1000 parameter limit. IIRC historically LLVM can also emit more locals than the web allows, so I'm not sure how many preexisting limits are adhered to within LLVM.

If a limit were to be added, though, it'd probably be something like ~400 ABI-level parameters and ~400-field structs. That's half of the 1000 limit, because some types become two ABI values (e.g. i128), and then a little less to account for things like return pointers being injected.
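
The arithmetic behind that estimate can be sketched as a worst-case count. This is an illustration, not part of the proposal: the function names are hypothetical, the factor of two assumes every source-level value expands to two ABI values (as i128 does), and `injected` stands for extra values such as a return pointer.

```c
/* The web embedding's limit on function parameters, as cited in the thread. */
enum { WEB_PARAM_LIMIT = 1000 };

/* Worst case: every source-level value becomes two ABI values (e.g. i128),
 * plus any values the ABI injects, such as a return pointer. */
unsigned worst_case_abi_values(unsigned source_values, unsigned injected) {
    return source_values * 2u + injected;
}

/* Nonzero when the worst case still fits under the web limit. */
int fits_web_limit(unsigned source_values, unsigned injected) {
    return worst_case_abi_values(source_values, injected) <= WEB_PARAM_LIMIT;
}
```

With one injected value, 400 source-level values come to at most 801 ABI values, and the bound is only crossed at 500 (1001), so ~400 leaves comfortable headroom.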

I'm happy to write this down, but I'm also a bit wary of diverging too much from the preexisting ABI. In practice I suspect we can retroactively apply limits at any time since 400 parameters is a lot and this isn't super widespread yet, too.

@tlively
Member

tlively commented May 28, 2026

> Oh, I guess @tlively you're saying that the convention will be to just return everything byval, and just never make this the default, and then the opt-in would have to be done by the programmer or whoever defines an API?
> It seems like it might be nice to have it as the default in the future though if it's really beneficial.

Right. Having it be opt-in completely avoids the question of how many return values to support before spilling. I agree that it would be nice to have a default-on or FastCC multivalue ABI that only uses a reasonable, finite number of return values (2? 4? 8?), but by starting with something opt-in we can make incremental progress while continuing to defer a decision on that to the future.

@alexcrichton, another consideration is that there are several known bugs with the multivalue implementation at the moment (e.g. llvm/llvm-project#98323, llvm/llvm-project#92995). The last status here was that @sunfishcode was looking into some of them.

@dschuff
Member

dschuff commented May 28, 2026

Yeah, previously we had discussed having an ABI that was meant to be similar to existing machine ABIs that can pass and return a limited number of elements in registers, for a performance boost compared to using the stack. The idea was that this could hopefully be implemented by also using registers for at least most of those parameters/return values, but that beyond a small number (probably less than 10, as Thomas mentioned), keeping parameters that would certainly need to be in memory as explicitly in memory made sense. The reason there is currently no limit (on the number of parameters used to pass struct elements, or the number of return values) is just that nobody has done the performance measurement work to figure out what the best number is.

It sounds like you are looking for something a little different though, i.e. that a potentially-huge number of params or returns is what you might actually want? As Thomas says, this has the nice property that nobody has to do that performance measurement (and that performance maybe isn't really the motivation for your use case anyway?).

I think I would really like us to actually just do some experiments though, and pick a small number to use as default. As I mentioned, I think it can unlock some easy performance wins for LTO and potentially as a default. We hadn't really considered the possibility of a large or unlimited number of params. It's certainly a simpler ABI, but I would want to look at the consequences (in terms of code size, generated code size, and performance) of that. I could imagine a code size blowup (pushing 100 values onto the stack at the callsites vs just passing a pointer), a performance cliff (2x copies for all those parameters? hitting some pathological case in the register allocator?); or potentially even a performance gain (having 100 values on the wasm stack instead of linear memory means no bounds checks in wasm64 mode). But we don't know.

@dschuff
Member

dschuff commented May 28, 2026

> I'm happy to write this down, but I'm also a bit wary of diverging too much from the preexisting ABI. In practice I suspect we can retroactively apply limits at any time since 400 parameters is a lot and this isn't super widespread yet, too.

Do you mean the preexisting multivalue ABI? Don't worry about divergence from that; it's not used anywhere and, as Thomas mentioned, it doesn't even work yet.

@QuantumSegfault

QuantumSegfault commented May 28, 2026

What's the rationale for limiting the parameter aggregate split to only two scalars? I understand there should be a limit, but I imagine a higher limit, say 4, would allow for better utilization of the new ABI (the example in mind is 3D math & graphics; vec3, vec4, quat, color).
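
As an illustration of that point, the sketch below counts the scalars in the kinds of types mentioned and checks them against a parameter-splitting limit of 2 versus 4. The struct layouts (all `float` fields, RGBA for `color`) and the names `FLOAT_FIELDS` and `fits_param_limit` are assumptions for this example only, and the field count relies on these all-`float` structs having no padding.

```c
/* Assumed layouts for typical 3D math and graphics types. */
struct vec3  { float x, y, z; };
struct vec4  { float x, y, z, w; };
struct quat  { float x, y, z, w; };
struct color { float r, g, b, a; };

/* Number of float fields, assuming the struct contains only floats
 * and therefore has no padding. */
#define FLOAT_FIELDS(T) ((unsigned)(sizeof(T) / sizeof(float)))

/* Nonzero when a struct with `scalars` fields would be split into direct
 * parameters under a given limit. */
int fits_param_limit(unsigned scalars, unsigned limit) {
    return scalars >= 1 && scalars <= limit;
}
```

With a limit of 2, none of these types would be passed directly; with a limit of 4, all of them would.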

@dschuff
Copy link
Copy Markdown
Member

dschuff commented May 28, 2026

If that were the limit (it isn't yet), it would be because we found empirically that it performed better. It would be interesting to know for example, how many parameters (if any?) can be passed in registers in wasm implementations and whether the builtin spilling is any worse than passing in wasm memory would be.
But again, nobody has done this experimentation yet. I agree it seems likely that we'd want at least 4 though.

@dschuff
Member

dschuff commented May 28, 2026

Oh, sorry, I missed that the current proposal and PR do actually use exactly 2 scalar arguments. I think what I said above about our old discussions still applies; I think it might make sense to have more than 2, based on performance and/or your usability argument. @alexcrichton may have just picked it based on the component model's needs; I would actually be curious if there was more to it than that.
