Define a new "wasm-multivalue" calling convention #268
This commit is an attempt to define a new calling convention for WebAssembly which I've provisionally decided to call "wasm-multivalue". This new calling convention is inspired by discussions on WebAssembly#247 and WebAssembly#88, and the primary motivation of this calling convention is to be able to use multi-return signatures for intrinsics/functions in the component model.

This is a definition of a new, parallel, calling convention to the existing, now called "C", calling convention. The intention is that this avoids breakage to existing programs. The new calling convention is opt-in, requiring an annotation. The `multivalue` target feature enables usage of "wasm-multivalue", but enabling or disabling `multivalue`-the-target-feature has no effect on the "C" calling convention.

The technical definition of this new "wasm-multivalue" calling convention is to alter the previous calling convention in three ways:

* Primarily, `struct` returns are now returned directly if all fields are (optionally recursively) scalars. This means that returning a `struct`-by-value gets translated to multiple return values.
* To account for WebAssembly#88 this new calling convention additionally passes two-scalar-argument `struct` definitions directly instead of indirectly, matching what native calling conventions do for example.
* Finally, `__int128` and `long double` return values are now returned directly instead of indirectly.

On an implementation side of things, what I'd like to do is to gain consensus on this calling convention definition in the tool-conventions repository as a first step. Once the definition is settled I plan to go to LLVM/Clang to implement this calling convention, and after that I plan to go to the Rust compiler and implement the calling convention there. My plan for Rust is to implement `extern "wasm-multivalue" fn(..)` syntactically, and for C to use `__attribute__((wasm_multivalue))` as the opt-in. Note that this is all bikesheddable, of course.
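As a rough illustration of the first two rules, here is a minimal C sketch. It assumes the proposed `wasm_multivalue` attribute name from this description, which may not exist in any released compiler, so a hypothetical `WASM_MULTIVALUE` macro applies it only when the compiler reports support. The `pair`, `div_mod`, and `pair_sum` names are invented for the example, and the signatures in the comments are what the rules above would appear to imply on wasm32, not output from a real toolchain.

```c
/* Hypothetical opt-in macro: `wasm_multivalue` is only the attribute name
 * proposed in this PR, so it is used only if the compiler reports it. */
#if defined(__has_attribute)
#  if __has_attribute(wasm_multivalue)
#    define WASM_MULTIVALUE __attribute__((wasm_multivalue))
#  endif
#endif
#ifndef WASM_MULTIVALUE
#  define WASM_MULTIVALUE /* unavailable: falls back to the "C" convention */
#endif

/* Both fields are scalars, so the proposal would flatten this struct. */
struct pair {
    int quot;
    int rem;
};

/* Existing "C" convention: the struct is returned indirectly through a
 * pointer parameter, roughly
 *   (func (param i32 i32 i32))
 * Proposed "wasm-multivalue": the fields become return values, roughly
 *   (func (param i32 i32) (result i32 i32))
 */
WASM_MULTIVALUE struct pair div_mod(int a, int b) {
    struct pair p = { a / b, a % b };
    return p;
}

/* A two-scalar struct argument would be passed as two values,
 *   (func (param i32 i32) (result i32))
 * instead of as a pointer to a copy in linear memory. */
WASM_MULTIVALUE int pair_sum(struct pair p) {
    return p.quot + p.rem;
}
```

Because the macro expands to nothing when the attribute is unavailable, the same source builds with today's compilers and simply keeps the "C" convention there.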
|
SGTM! I like that making this opt-in lets us avoid the question of how large a struct should be allowed to be returned by value. |
|
I've got a concrete implementation of this at llvm/llvm-project#200076, although I'm by no means an LLVM expert so that's unlikely to be its final form. It's at least the general thrust, however. |
|
This sounds good to me too. But I don't think we can/should avoid the question of picking a concrete size though, right? Once this starts being used by component model intrinsics, we'll want it to be stable, no? |
|
Oh, I guess @tlively you're saying that the convention will be to just return everything byval, and just never make this the default, and then the opt-in would have to be done by the programmer or whoever defines an API? One thing that we could do: The GlobalOpt pass knows how to change functions to use the |
|
@dschuff to confirm you mean a limit to the number of return values, right? If so, I was wondering the same, but testing locally it looked like there was no limit to the number of parameters that would be generated so I figured I'd follow the same. I think it'd be reasonable to pick a limit, however, and start spilling to the stack afterwards. Implementing that scale of ABI change is probably going a bit beyond my LLVM abilities personally, but otherwise, yes, we'll want to consider this stable in relatively short order to be able to use it in the component model. I do agree that it'd probably be worthwhile to make this the default ABI one day (or at least a variant thereof). I also agree that it'd be reasonable to switch internal functions to this ABI automatically, and this is something I plan on measuring for Rust at some point and probably applying (the precise ABI of For now though my plan is to start relatively conservative and mostly just unblock the ability to use multi-return in wasm modules (and subsequently component model intrinsics/lowerings/etc). I don't have immediate plans personally to investigate optimization passes or plan out a switch for this to become the default ABI. |
|
I think at minimum the limit would need to match the web spec’s implementation limits, no? |
|
Testing locally, currently there's no limit in LLVM to match the web 1000 parameter limit. IIRC historically LLVM can also emit more locals than the web allows, so I'm not sure how many preexisting limits are adhered to within LLVM. If a limit were to be added, though, it'd probably be something like ~400 ABI-level parameters and ~400-field structs. Half of the 1000 limit due to some types becoming two ABI values (e.g. I'm happy to write this down, but I'm also a bit wary of diverging too much from the preexisting ABI. In practice I suspect we can retroactively apply limits at any time since 400 parameters is a lot and this isn't super widespread yet, too. |
Right. Having it be opt-in completely avoids the question of how many return values to support before spilling. I agree that it would be nice to have a default-on or FastCC multivalue ABI that only uses a reasonable, finite number of return values (2? 4? 8?), but by starting with something opt-in we can make incremental progress while continuing to defer a decision on that to the future. @alexcrichton, another consideration is that there are several known bugs with the multivalue implementation at the moment (e.g. llvm/llvm-project#98323, llvm/llvm-project#92995). The last status here was that @sunfishcode was looking into some of them. |
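The idea of capping the number of direct return values and falling back to memory beyond that could look roughly like the sketch below. Everything in it is hypothetical: the `struct type` descriptor, `count_scalars`, `classify_return`, and the cap of 4 are invented for illustration (the thread floats 2, 4, or 8 without settling on one), and none of it reflects how LLVM actually classifies types.

```c
#include <stddef.h>

/* Toy type descriptor invented for this sketch (not an LLVM/Clang type). */
enum kind { KIND_SCALAR, KIND_STRUCT, KIND_OTHER };

struct type {
    enum kind kind;
    size_t nfields;                   /* used only for KIND_STRUCT */
    const struct type *const *fields; /* used only for KIND_STRUCT */
};

/* Counts the scalars left after recursively flattening nested structs.
 * Returns -1 if anything other than a scalar or struct is encountered,
 * meaning the value cannot be flattened at all. */
long count_scalars(const struct type *t) {
    if (t->kind == KIND_SCALAR) return 1;
    if (t->kind != KIND_STRUCT) return -1;
    long total = 0;
    for (size_t i = 0; i < t->nfields; i++) {
        long n = count_scalars(t->fields[i]);
        if (n < 0) return -1;
        total += n;
    }
    return total;
}

/* Hypothetical cap on direct results; a real value would need measurement. */
#define MAX_DIRECT_RESULTS 4

enum ret_mode { RET_DIRECT, RET_INDIRECT };

/* Direct (multiple wasm results) when the flattened value fits under the
 * cap, otherwise indirect (returned through memory). */
enum ret_mode classify_return(const struct type *t) {
    long n = count_scalars(t);
    if (n < 0 || n > MAX_DIRECT_RESULTS) return RET_INDIRECT;
    return RET_DIRECT;
}
```

With an opt-in convention and no cap, the second condition in `classify_return` simply disappears, which is the question the opt-in approach defers.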
|
Yeah, previously we had discussed having an ABI that was meant to be similar to existing machine ABIs that can pass and return a limited number of elements in registers, for a performance boost compared to using the stack. The idea was that this could hopefully be implemented by also using registers for at least most of those parameters/return values, but that beyond a small number (probably less than 10, as Thomas mentioned), keeping parameters that would certainly need to be in memory as explicitly in memory made sense. The reason there is currently no limit (on the number of parameters used to pass struct elements, or the number of return values) is just that nobody has done the performance measurement work to figure out what the best number is. It sounds like you are looking for something a little different though, i.e. that a potentially-huge number of params or returns is what you might actually want? As Thomas says, this has the nice property that nobody has to do that performance measurement (and that performance maybe isn't really the motivation for your use case anyway?). I think I would really like us to actually just do some experiments though, and pick a small number to use as default. As I mentioned, I think it can unlock some easy performance wins for LTO and potentially as a default. We hadn't really considered the possibility of a large or unlimited number of params. It's certainly a simpler ABI, but I would want to look at the consequences (in terms of code size, generated code size, and performance) of that. I could imagine a code size blowup (pushing 100 values onto the stack at the callsites vs just passing a pointer), a performance cliff (2x copies for all those parameters? hitting some pathological case in the register allocator?); or potentially even a performance gain (having 100 values on the wasm stack instead of linear memory means no bounds checks in wasm64 mode). But we don't know. |
Do you mean the preexisting multi-value ABI? Don't worry about divergence from that, it's not used anywhere and as Thomas mentioned, it doesn't even work yet. |
|
What's the rationale for limiting the parameter aggregate split to only two scalars? I understand there should be a limit, but I imagine a higher limit, say 4, would allow for better utilization of the new ABI (the example in mind is 3D math & graphics; vec3, vec4, quat, color). |
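To make the graphics case concrete, here is a small C sketch of a three-field vector type. The `vec3`, `vec3_dot`, and `vec3_splat` names are invented for the example. The signatures in the comments are a reading of the proposal's rules as written, assuming wasm32 and assuming the functions were opted in to the new convention; they are not compiler output.

```c
/* Three scalar fields: one more than the two-scalar parameter rule covers. */
struct vec3 {
    float x, y, z;
};

/* Parameters: under the two-scalar rule each vec3 would presumably still
 * be passed indirectly, as a pointer into linear memory:
 *   (func (param i32 i32) (result f32))
 * With a hypothetical limit of 4 scalars it could be flattened instead:
 *   (func (param f32 f32 f32 f32 f32 f32) (result f32))
 */
float vec3_dot(struct vec3 a, struct vec3 b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

/* Return value: the return rule has no field-count limit as written, so
 * an all-scalar struct would come back as three results:
 *   (func (param f32) (result f32 f32 f32))
 */
struct vec3 vec3_splat(float v) {
    struct vec3 r = { v, v, v };
    return r;
}
```

Read this way, a `vec3` would be returned directly but still passed through memory, which is the asymmetry the question is pointing at.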
|
If that were the limit (it isn't yet), it would be because we found empirically that it performed better. It would be interesting to know for example, how many parameters (if any?) can be passed in registers in wasm implementations and whether the builtin spilling is any worse than passing in wasm memory would be. |
|
Oh, sorry, I missed that the current proposal and PR do actually use exactly 2 scalar arguments. I think what I said above about our old discussions still applies; I think it might make sense to have more than 2, based on performance and/or your usability argument. @alexcrichton may have just picked it based on the component model's needs; I would actually be curious if there was more to it than that. |