AVX types (__m256{,i,d}) should be castable to each other #26352

@btolsch

Description

Version of emscripten/emsdk:

emcc (Emscripten gcc/clang-like replacement + linker emulating GNU ld) 5.0.1-git (8c5f43157a3f069ade75876e23061330521eabde)
clang version 23.0.0git (/startdir/llvm-project b447f5d9763010f8c6806c578533291aef2bd484)
Target: wasm32-unknown-emscripten
Thread model: posix
InstalledDir: /opt/emscripten-llvm/bin

Failing command line in full:
For the file test.c:

#include <immintrin.h>

int
main(int argc, char** argv) {
  __m256i data, mask;
  data = _mm256_setzero_si256();
  mask = _mm256_setzero_si256();
  data = (__m256i)_mm256_and_ps((__m256)data, (__m256)mask);
  return 0;
}

$ emcc -msimd128 -mavx test.c
test.c:8:33: error: used type '__m256' where arithmetic or pointer type is required
    8 |   data = (__m256i)_mm256_and_ps((__m256)data, (__m256)mask);
      |                                 ^       ~~~~
test.c:8:47: error: used type '__m256' where arithmetic or pointer type is required
    8 |   data = (__m256i)_mm256_and_ps((__m256)data, (__m256)mask);
      |                                               ^       ~~~~
2 errors generated.
emcc: error: '/opt/emscripten-llvm/bin/clang -target wasm32-unknown-emscripten -fignore-exceptions -mllvm -combiner-global-alias-analysis=false -mllvm -enable-emscripten-sjlj -mllvm -disable-lsr --sysroot=/home/btolsch/.cache/emscripten/sysroot -DEMSCRIPTEN -D__SSE__=1 -D__SSE2__=1 -D__SSE3__=1 -D__SSSE3__=1 -D__SSE4_1__=1 -D__SSE4_2__=1 -D__AVX__=1 -Xclang -iwithsysroot/include/fakesdl -Xclang -iwithsysroot/include/compat -msimd128 -c test.c -o /tmp/emscripten_temp_xnre9uu1/test.o' failed (returned 1)

Desired behavior
Both clang and gcc allow casting between __m256, __m256i, and __m256d, because their headers declare these as vector types (i.e. __attribute__((__vector_size__(n)))) rather than structs or unions. The emscripten definitions are structs like:

typedef struct {
  __m128d v0;
  __m128d v1;
} __m256d;

Struct types can't be cast to each other, so instead you have to write something like *(__m256*)&data, or memcpy into a new variable (which is hopefully optimized out). Moreover, the emscripten __m128* definitions are vectors, so this problem is specific to the AVX types.
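To make the contrast concrete, here is a minimal sketch of both situations. All `demo_*` names are hypothetical stand-ins and not the real `<immintrin.h>` or emscripten definitions; the sketch assumes a GCC- or Clang-compatible compiler with vector extensions and needs no SIMD flags.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins, NOT the real <immintrin.h> types. */

/* Vector types: same-sized vectors can be cast to each other. */
typedef float   demo_v8f __attribute__((__vector_size__(32)));
typedef int32_t demo_v8i __attribute__((__vector_size__(32)));

/* Struct types, shaped like the definitions quoted above. */
typedef float   demo_v4f __attribute__((__vector_size__(16)));
typedef int32_t demo_v4i __attribute__((__vector_size__(16)));
typedef struct { demo_v4f v0; demo_v4f v1; } demo_m256;
typedef struct { demo_v4i v0; demo_v4i v1; } demo_m256i;

/* Direct cast between vector types: the 32 bytes are reinterpreted,
   no value conversion takes place. Returns the first float lane. */
static float demo_vector_cast_lane0(int32_t bits) {
  demo_v8i i = {bits, 0, 0, 0, 0, 0, 0, 0};
  demo_v8f f = (demo_v8f)i;
  return f[0];
}

/* Struct types: (demo_m256)i does not compile, so the bits are
   copied with memcpy instead. Returns the first float lane. */
static float demo_struct_cast_lane0(int32_t bits) {
  demo_m256i i;
  demo_m256 f;
  memset(&i, 0, sizeof i);
  i.v0[0] = bits;
  memcpy(&f, &i, sizeof f); /* both objects are 32 bytes */
  return f.v0[0];
}
```

Both helpers return the same value for the same input bits; the only difference is that the struct version needs the explicit copy, which an optimizing compiler can usually elide.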

The struct definitions are convenient for the implementation, which falls back entirely to 128-bit operations, but they can break existing AVX code that has no SSE or scalar fallback of its own.
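As an illustration of why the struct layout is convenient, a 256-bit operation can be written as two independent 128-bit operations on the halves. This is only a sketch under assumptions: `sketch_v4i`, `sketch_m256i`, and `sketch_and_si256` are hypothetical names, and the actual emscripten header may be written differently.

```c
#include <stdint.h>

/* Hypothetical 128-bit integer vector and a 256-bit struct built
   from two of them, mirroring the v0/v1 layout shown earlier. */
typedef int32_t sketch_v4i __attribute__((__vector_size__(16)));
typedef struct {
  sketch_v4i v0; /* low 128 bits */
  sketch_v4i v1; /* high 128 bits */
} sketch_m256i;

/* Bitwise AND of two 256-bit values, done as two 128-bit ANDs. */
static inline sketch_m256i sketch_and_si256(sketch_m256i a, sketch_m256i b) {
  sketch_m256i r;
  r.v0 = a.v0 & b.v0;
  r.v1 = a.v1 & b.v1;
  return r;
}
```

Each half maps directly onto one 128-bit operation, which is why the struct form is easy to implement even though it is not castable.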
