Developers who want to run PyTorch deep learning workloads need to install only the drivers and pip install PyTorch wheels binaries. The runtime package for the Intel® Deep Learning Essentials is installed automatically during the pip installation of the PyTorch wheels binaries.
— Intel
Dr. Suarez found CTranslate2 on stream through cibuildwheel. My guess is that OpenBLAS is deprecated; I have no experience with oneDNN or other oneAPI resources apart from Level Zero, and I haven't used that yet.
Developers building PyTorch from source code need to install both the driver and Intel Deep Learning Essentials.
— Intel
Instead of the installer shown above, I'm using the standalone installer available for the compiler. If Intel's Deep Neural Network Library and Math Kernel Library are useful, please comment below.
$ icx
Intel(R) oneAPI DPC++/C++ Compiler for applications running on Intel(R) 64, Version 2025.1.0 Build 20250317
Copyright (C) 1985-2025 Intel Corporation. All rights reserved.
icx: error: no input files
$ icpx
icpx: error: no input files
With C:\Program Files (x86)\Intel\oneAPI\compiler\2025.1\bin\common_clang64.dll present, does this mean icx/icpx is clang-compatible, and usable in other projects? That aside, I'd really like to use it if it includes features that facilitate hardware acceleration.
... As a continuous effort, more performance tuning and optimizations will be added into Intel oneAPI LLVM-based compilers and GCC compilers for Intel CPUs AVX-512 and AVX-512-FP16/VNNI ISA and Intel GPUs Gen12 ISA.
— Intel
Visual Studio Build Tools 2022 isn't available on Linux; compilation fails when vcruntime.h is needed while using icx or icpx. Noticed that -std=, as expected with icx on Linux, appears to be -Qstd= with icx on Windows.
Note: The current implementation of the DPC++ extension only supports Linux.
— Intel
As for pufferlib - bbd22d - starting with device = xpu in pufferlib/config/ocean/target.ini, Linux shows AssertionError: Torch not compiled with XPU enabled, which confirms the possibility. Windows is officially unsupported as of 2.0; after pip install -e . --break-system-packages, I get LINK : error LNK2001: unresolved external symbol PyInit_ocean\target\binding with this merge commit.
Found Intel's install through their tutorial and example; both pip install commands completed successfully, seemingly without hitting any Known Issue:
C:\Program Files (x86)\Intel\oneAPI>python -c "import torch; import intel_extension_for_pytorch as ipex; print(torch.__version__); print(ipex.__version__); [print(f'[{i}]: {torch.xpu.get_device_properties(i)}') for i in range(torch.xpu.device_count())];"
[W530 13:29:29.000000000 OperatorEntry.cpp:161] Warning: Warning only once for all operators, other operators may also be overridden.
Overriding a previously registered kernel for the same operator and the same dispatch key
operator: aten::geometric_(Tensor(a!) self, float p, *, Generator? generator=None) -> Tensor(a!)
registered at C:\actions-runner\_work\pytorch\pytorch\pytorch\build\aten\src\ATen\RegisterSchema.cpp:6
dispatch key: XPU
previous kernel: registered at C:\actions-runner\_work\pytorch\pytorch\pytorch\aten\src\ATen\VmapModeRegistrations.cpp:37
new kernel: registered at I:\frameworks.ai.pytorch.ipex-gpu\build\Release\csrc\gpu\csrc\gpu\xpu\ATen\RegisterXPU_0.cpp:186 (function operator ())
2.7.0+xpu
2.7.10+xpu
C:\Users\jayg8\AppData\Local\Programs\Python\Python313\Lib\site-packages\torch\xpu\__init__.py:60: UserWarning: XPU device count is zero! (Triggered internally at C:\actions-runner\_work\pytorch\pytorch\pytorch\c10\xpu\XPUFunctions.cpp:115.)
return torch._C._xpu_getDeviceCount()
C:\Program Files (x86)\Intel\oneAPI>
(didn't include torchvision or torchaudio in either pip install, and guessing the Microsoft runtime isn't needed since Visual Studio Build Tools 2022 is already in use)
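The verification one-liner can also be written defensively; a minimal sketch that degrades gracefully where torch isn't installed or the build lacks XPU support (torch.xpu mirrors the torch.cuda API surface):

```python
# Minimal XPU probe; safe to run on any machine.
try:
    import torch
    xpu = getattr(torch, "xpu", None)  # absent on older torch builds
    if xpu is None:
        status = "this torch build has no torch.xpu namespace"
    else:
        status = f"xpu available: {xpu.is_available()}, devices: {xpu.device_count()}"
except ImportError:
    status = "torch not installed"

print(status)
```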
As for Linux, Intel has pip, source and docker selections if needed.
May need Level Zero:
garner@linux:~$ python -c "import torch; import intel_extension_for_pytorch as ipex; print(torch.__version__); print(ipex.__version__); [print(f'[{i}]: {torch.xpu.get_device_properties(i)}') for i in range(torch.xpu.device_count())];"
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/usr/local/lib/python3.12/dist-packages/intel_extension_for_pytorch/__init__.py", line 122, in <module>
from .utils._proxy_module import *
File "/usr/local/lib/python3.12/dist-packages/intel_extension_for_pytorch/utils/_proxy_module.py", line 2, in <module>
import intel_extension_for_pytorch._C
ImportError: libze_loader.so.1: cannot open shared object file: No such file or directory
garner@linux:~$
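The missing library can be probed before importing IPEX; a stdlib-only sketch (find_library consults ldconfig, the same search the failing import goes through):

```python
import ctypes.util

# libze_loader is the oneAPI Level Zero loader that
# intel_extension_for_pytorch._C links against; if ldconfig can't see
# it, the ImportError above is expected.
ze = ctypes.util.find_library("ze_loader")
if ze is None:
    print("libze_loader not found; install the level-zero runtime")
else:
    print("Level Zero loader present:", ze)
```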
After checking out the level-zero tag v1.9.9 and installing the generated .deb - level-zero_1.9.9+l22.1_amd64.deb:
garner@linux:~$ python -c "import torch; import intel_extension_for_pytorch as ipex; print(torch.__version__); print(ipex.__version__); [print(f'[{i}]: {torch.xpu.get_device_properties(i)}') for i in range(torch.xpu.device_count())];"
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/usr/local/lib/python3.12/dist-packages/intel_extension_for_pytorch/__init__.py", line 122, in <module>
from .utils._proxy_module import *
File "/usr/local/lib/python3.12/dist-packages/intel_extension_for_pytorch/utils/_proxy_module.py", line 2, in <module>
import intel_extension_for_pytorch._C
ImportError: /opt/intel/compiler/2025.1/lib/libur_loader.so.0: version `LIBUR_LOADER_0.10' not found (required by /usr/local/lib/python3.12/dist-packages/intel_extension_for_pytorch/lib/../../../../libsycl.so.8)
garner@linux:~$
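This looks like two copies of the Unified Runtime loader competing: the one in the oneAPI compiler directory (put first on LD_LIBRARY_PATH by setvars.sh, and apparently too old to export LIBUR_LOADER_0.10) versus the newer one shipped inside the pip wheels. A small sketch to see what the dynamic linker resolves in a given shell:

```python
import ctypes
import os

# Show the search path the linker will honor, then try to load the
# Unified Runtime loader that the SYCL runtime complained about.
print("LD_LIBRARY_PATH:", os.environ.get("LD_LIBRARY_PATH", "<unset>"))
try:
    ctypes.CDLL("libur_loader.so.0")
    loaded = True
except OSError:
    loaded = False
print("libur_loader.so.0 loadable:", loaded)
```

Running Python from a shell that hasn't sourced setvars.sh lets the wheel's bundled copy win, which may be why libur_loader later stopped being the blocker.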
garner@linux:/opt/puffer$ puffer train puffer_target
Traceback (most recent call last):
File "/usr/local/bin/puffer", line 5, in <module>
from pufferlib.pufferl import main
File "/opt/puffer/pufferlib/pufferl.py", line 28, in <module>
import torch
File "/usr/local/lib/python3.12/dist-packages/torch/__init__.py", line 409, in <module>
from torch._C import * # noqa: F403
^^^^^^^^^^^^^^^^^^^^^^
ImportError: /opt/intel/compiler/2025.1/lib/libur_loader.so.0: version `LIBUR_LOADER_0.10' not found (required by /usr/local/lib/python3.12/dist-packages/torch/lib/../../../../libsycl.so.8)
garner@linux:/opt/puffer$
Note for legacy hardware: Linux Mint currently has intel-opencl-icd (23.43.27642.40-1ubuntu3) rather than 24.35.
Is this as expected?
Processing triggers for libc-bin (2.39-0ubuntu8.4) ...
/sbin/ldconfig.real: /usr/local/lib/libccl.so.1 is not a symbolic link
/sbin/ldconfig.real: /usr/local/lib/libtbb.so.12 is not a symbolic link
/sbin/ldconfig.real: /usr/local/lib/libmpi.so.12 is not a symbolic link
/sbin/ldconfig.real: /usr/local/lib/libtbbbind_2_5.so.3 is not a symbolic link
/sbin/ldconfig.real: /usr/local/lib/libtbbbind.so.3 is not a symbolic link
/sbin/ldconfig.real: /usr/local/lib/libpti_view.so.0.10 is not a symbolic link
/sbin/ldconfig.real: /usr/local/lib/libmpijava.so.1 is not a symbolic link
/sbin/ldconfig.real: /usr/local/lib/libpstloffload.so.1 is not a symbolic link
/sbin/ldconfig.real: /usr/local/lib/libmpicxx.so.12 is not a symbolic link
/sbin/ldconfig.real: /usr/local/lib/libur_adapter_opencl.so.0 is not a symbolic link
/sbin/ldconfig.real: /usr/local/lib/libOpenCL.so.1 is not a symbolic link
/sbin/ldconfig.real: /usr/local/lib/libur_adapter_level_zero.so.0 is not a symbolic link
/sbin/ldconfig.real: /usr/local/lib/libtbbbind_2_0.so.3 is not a symbolic link
/sbin/ldconfig.real: /usr/local/lib/libtbbmalloc.so.2 is not a symbolic link
/sbin/ldconfig.real: /usr/local/lib/libsycl-preview.so.8 is not a symbolic link
/sbin/ldconfig.real: /usr/local/lib/libmpifort.so.12 is not a symbolic link
/sbin/ldconfig.real: /usr/local/lib/libur_loader.so.0 is not a symbolic link
/sbin/ldconfig.real: /usr/local/lib/libtbbmalloc_proxy.so.2 is not a symbolic link
/sbin/ldconfig.real: /usr/local/lib/libsycl.so.8 is not a symbolic link
/sbin/ldconfig.real: /usr/local/lib/libumf.so.0 is not a symbolic link
/sbin/ldconfig.real: /usr/local/lib/libhwloc.so.15 is not a symbolic link
/sbin/ldconfig.real: /usr/local/lib/libtcm.so.1 is not a symbolic link
/sbin/ldconfig.real: /usr/local/lib/libtcm_debug.so.1 is not a symbolic link
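Those ldconfig messages are harmless: the files in /usr/local/lib carry soname-style names (e.g. libtbb.so.12) but are regular files, while ldconfig expects the soname to be a symlink to the fully-versioned real file. A sketch of the layout it wants (names are illustrative):

```python
import os
import tempfile

# Recreate the layout ldconfig expects: the real versioned library,
# plus a soname symlink pointing at it.
with tempfile.TemporaryDirectory() as d:
    real = os.path.join(d, "libfoo.so.1.2.3")   # the actual file
    open(real, "w").close()
    soname = os.path.join(d, "libfoo.so.1")     # what programs dlopen
    os.symlink("libfoo.so.1.2.3", soname)
    is_link = os.path.islink(soname)
    target = os.readlink(soname)

print(is_link, target)  # True libfoo.so.1.2.3
```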
Possible libsycl issue as /opt/intel/compiler/2025.1/lib/libur_loader.so.0.11.10 exists. Would that be added here?
As for pufferlib - bbd22d - starting with device = xpu in pufferlib/config/ocean/target.ini, Linux shows AssertionError: Torch not compiled with XPU enabled, which confirms the possibility. Windows is officially unsupported as of 2.0; after pip install -e . --break-system-packages, I get LINK : error LNK2001: unresolved external symbol PyInit_ocean\target\binding with this merge commit.
— elevatorguy
Didn't take note yesterday in windows, but somehow got past libur_loader as the blocker in linux.
garner@linux:~$ puffer train puffer_squared
/home/garner/.local/lib/python3.12/site-packages/torch/xpu/__init__.py:120: UserWarning: XPU device count is zero! (Triggered internally at /pytorch/c10/xpu/XPUFunctions.cpp:115.)
torch._C._xpu_init()
...
RuntimeError: No XPU devices are available.
The ... above elides a Traceback. Used --user with pip when reinstalling Intel's torch; on Linux, uninstalled pufferlib first yesterday.
Today, added AppData\Roaming\Python\Python313\Scripts to PATH; while reinstalling Intel's torch on Windows, followed a similar process to yesterday's but didn't notice a path difference with pip's --user.
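Where pip --user puts things can be checked from Python itself; a stdlib sketch (the Windows result is what made AppData\Roaming\Python\Python313\Scripts necessary on PATH):

```python
import os
import site
import sysconfig

# site.USER_BASE is the root of pip --user installs:
#   Windows: %APPDATA%\Python\Python3XY     Linux: ~/.local
scheme = f"{os.name}_user"  # "nt_user" on Windows, "posix_user" on Linux
user_scripts = sysconfig.get_path("scripts", scheme)
print("user base   :", site.USER_BASE)
print("user scripts:", user_scripts)
```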
Merged in c951bfd here, resulting in the same LINK error as above when using python setup.py build_ext --inplace.
Inserting export_symbols=[ path.rstrip('.c').replace('/', '.').replace('\\','_') ], into Extension, python setup.py build_ext --inplace results in two unresolved external symbol errors - PyInit_ocean\target\binding and pufferlib.ocean_target_binding - on Windows.
Just hardcoding the first parameter to "binding" results in successful linking, though with a runtime error: ImportError: cannot import name 'binding' from 'pufferlib.ocean.target' (unknown location)
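The LNK2001 symbol name itself hints at the cause: distutils derives the exported init symbol from the last dot-separated component of the Extension name, so a name assembled from a file path with backslashes (no dots) yields the bogus PyInit_ocean\target\binding, while hardcoding the name to "binding" exports the right symbol but registers the module at the wrong import location. A sketch of both behaviors (get_export_symbol mimics distutils' build_ext.get_export_symbols; pyinit_symbol is a hypothetical helper, not part of any library):

```python
from pathlib import Path

def get_export_symbol(ext_name: str) -> str:
    # Mirrors distutils/setuptools: the symbol comes from the LAST
    # dot-separated component of the Extension name.
    return "PyInit_" + ext_name.split(".")[-1]

def pyinit_symbol(source_path: str) -> str:
    # Derive the symbol from a source file path instead. Note that
    # path.rstrip('.c') is a character-set strip, not a suffix strip
    # ("static.c".rstrip('.c') == "stati"); Path.stem is the safe way.
    return "PyInit_" + Path(source_path).stem

print(get_export_symbol("pufferlib.ocean.target.binding"))  # PyInit_binding
print(get_export_symbol("ocean\\target\\binding"))          # the broken symbol above
print(pyinit_symbol("pufferlib/ocean/target/binding.c"))    # PyInit_binding
```

So naming the Extension with its full dotted module path - pufferlib.ocean.target.binding - should both export PyInit_binding and import from the right package, without hand-written export_symbols.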
Without a non-zero XPU device count, might Linux need reinstalling, since both the non-legacy and legacy1 runtimes were installed? (used apt-get remove on one of the pairs; the 'rc' state in the listing below means intel-opencl-icd was removed with only its config files remaining)
dpkg --list intel* gives the following:
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name Version Architecture Description
+++-=====================================-==========================================-============-================================================================================
ii intel-igc-core 1.0.17537.20 amd64 Intel(R) Graphics Compiler for OpenCL(TM)
ii intel-igc-opencl 1.0.17537.20 amd64 Intel(R) Graphics Compiler for OpenCL(TM)
ii intel-level-zero-gpu-legacy1 1.3.30872.22 amd64 Intel(R) Graphics Compute Runtime for oneAPI Level Zero.
ii intel-level-zero-gpu-legacy1-dbgsym 1.3.30872.22 amd64 debug symbols for intel-level-zero-gpu-legacy1
ii intel-media-va-driver:amd64 24.1.0+dfsg1-1 amd64 VAAPI driver for the Intel GEN8+ Graphics family
un intel-media-va-driver-non-free <none> <none> (no description available)
ii intel-microcode 3.20250512.0ubuntu0.24.04.1 amd64 Processor microcode firmware for Intel CPUs
un intel-opencl <none> <none> (no description available)
rc intel-opencl-icd 24.35.30872.22 amd64 Intel graphics compute runtime for OpenCL
ii intel-opencl-icd-legacy1 24.35.30872.22 amd64 Intel graphics compute runtime for OpenCL
ii intel-opencl-icd-legacy1-dbgsym 24.35.30872.22 amd64 debug symbols for intel-opencl-icd-legacy1
garner@linux:/opt/neo$
This was with stock torch - version 2.7.0.
With Intel's torch, the familiar RuntimeError: No XPU devices are available. occurs. As for puffer eval puffer_target --load-model-path latest: AssertionError: Torch not compiled with CUDA enabled. An unfamiliar error with puffer train puffer_target --train.device=cpu (0xc0000139 is STATUS_ENTRYPOINT_NOT_FOUND, i.e. a DLL entry point failed to resolve at load time):
Windows fatal exception: code 0xc0000139
Thread 0x00001db0 (most recent call first):
File "D:\puffer\pufferlib\pufferl.py", line 781 in run
File "C:\Users\jayg8\AppData\Local\Programs\Python\Python313\Lib\threading.py", line 1041 in _bootstrap_inner
File "C:\Users\jayg8\AppData\Local\Programs\Python\Python313\Lib\threading.py", line 1012 in _bootstrap
Current thread 0x0000266c (most recent call first):
File "C:\Users\jayg8\AppData\Local\Programs\Python\Python313\Lib\site-packages\torch\autograd\graph.py", line 824 in _engine_run_backward
File "C:\Users\jayg8\AppData\Local\Programs\Python\Python313\Lib\site-packages\torch\autograd\__init__.py", line 353 in backward
File "C:\Users\jayg8\AppData\Local\Programs\Python\Python313\Lib\site-packages\torch\_tensor.py", line 648 in backward
File "D:\puffer\pufferlib\pufferl.py", line 428 in train
File "C:\Users\jayg8\AppData\Local\Programs\Python\Python313\Lib\site-packages\torch\distributed\elastic\multiprocessing\errors\__init__.py", line 355 in wrapper
File "D:\puffer\pufferlib\pufferl.py", line 914 in train
File "D:\puffer\pufferlib\pufferl.py", line 1203 in main
File "C:\Users\jayg8\AppData\Local\Programs\Python\Python313\Scripts\puffer.exe\__main__.py", line 7 in <module>
File "<frozen runpy>", line 88 in _run_code
File "<frozen runpy>", line 198 in _run_module_as_main
C:\Users\jayg8\AppData\Local\Programs\Python\Python313\Lib\site-packages\torch\autograd\graph.py:824: UserWarning: XPU device count is zero! (Triggered internally at C:\actions-runner\_work\pytorch\pytorch\pytorch\c10\xpu\XPUFunctions.cpp:115.)
return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass
Changed to 2026 build tools. With torch 2.10.0:
C:\Users\jayg8\AppData\Local\Programs\Python\Python312\Lib\site-packages\torch\_inductor\lowering.py:1904: FutureWarning: torch._prims_common.check is deprecated and will be removed in the future. Please use torch._check* functions instead.
No longer needing intel-extension-for-pytorch with torch 2.10.0; have a fresh Linux install without oneAPI just yet. puffer eval --train.device=cpu --load-model-path=latest:
[insert screenshot here]
/home/garner/.local/lib/python3.12/site-packages/torch/_inductor/lowering.py:1904: FutureWarning: torch._prims_common.check is deprecated and will be removed in the future. Please use torch._check* functions instead.
(ubuntu 24.04 instead of linux mint)
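The FutureWarning originates inside torch's inductor code, not user code, so until it disappears upstream it can be filtered by message; a sketch (the demonstration warning is synthetic, shaped like the real one):

```python
import warnings

# Ignore just this deprecation message wherever torch emits it.
warnings.filterwarnings(
    "ignore",
    message=r"torch\._prims_common\.check is deprecated",
    category=FutureWarning,
)

# Demonstrate the filter with a synthetic warning of the same shape:
with warnings.catch_warnings(record=True) as caught:
    warnings.warn(
        "torch._prims_common.check is deprecated and will be removed "
        "in the future. Please use torch._check* functions instead.",
        FutureWarning,
    )
print("warnings recorded:", len(caught))  # 0
```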
sudo apt-get install intel-gpu-tools for intel_gpu_top.
Found this; of interest may be this file.
If needed: https://www.intel.com/content/www/us/en/developer/articles/technical/vectorization-llvm-gcc-cpus-gpus.html
https://intel.github.io/intel-extension-for-pytorch/
Pending pytorch issue; got the libur_loader error shown earlier.
Needed set DISTUTILS_USE_SDK=1. To achieve linkage, some changes need to be made to setup.py, as it fails otherwise; this setup.py is different already, but further changes were needed.
Have yet to clone gmmlib and intel-graphics-compiler.