Developers who want to run PyTorch deep learning workloads need to install only the drivers and pip install PyTorch wheels binaries. The runtime package for the Intel® Deep Learning Essentials is installed automatically during the pip installation of the PyTorch wheels binaries.
— Intel
Dr. Suarez found CTranslate2 on stream through cibuildwheel. My guess is that OpenBLAS is deprecated; I have no experience with oneDNN or other oneAPI resources apart from Level Zero, and I haven't used that yet.
Developers building PyTorch from source code need to install both the driver and Intel Deep Learning Essentials.
— Intel
Instead of the installer shown above, I'm using the standalone installer available for the compiler. If Intel's Deep Neural Network Library and Math Kernel Library are useful, please comment below.
$ icx
Intel(R) oneAPI DPC++/C++ Compiler for applications running on Intel(R) 64, Version 2025.1.0 Build 20250317
Copyright (C) 1985-2025 Intel Corporation. All rights reserved.
icx: error: no input files
$ icpx
icpx: error: no input files
With C:\Program Files (x86)\Intel\oneAPI\compiler\2025.1\bin\common_clang64.dll present, does this mean icx/icpx is clang-compatible, and usable in other projects? That aside, I'd really like to use it if it includes features that facilitate hardware acceleration.
... As a continuous effort, more performance tuning and optimizations will be added into Intel oneAPI LLVM-based compilers and GCC compilers for Intel CPUs AVX-512 and AVX-512-FP16/VNNI ISA and Intel GPUs Gen12 ISA.
— Intel
Visual Studio Build Tools 2022 isn't available on Linux; compilation fails when vcruntime.h is needed while using icx or icpx. Noticed that -std=, as expected with icx on Linux, appears to be -Qstd= with icx on Windows.
Note: The current implementation of the DPC++ extension only supports Linux.
— Intel
As for pufferlib - bbd22d - starting with device = xpu in pufferlib/config/ocean/target.ini, Linux shows AssertionError: Torch not compiled with XPU enabled, which confirms the possibility. Windows is officially unsupported as of 2.0; after pip install -e . --break-system-packages, I get LINK : error LNK2001: unresolved external symbol PyInit_ocean\target\binding with this merge commit.
Found Intel's install through their tutorial and example; both pip install commands completed successfully, seemingly without hitting any Known Issue:
C:\Program Files (x86)\Intel\oneAPI>python -c "import torch; import intel_extension_for_pytorch as ipex; print(torch.__version__); print(ipex.__version__); [print(f'[{i}]: {torch.xpu.get_device_properties(i)}') for i in range(torch.xpu.device_count())];"
[W530 13:29:29.000000000 OperatorEntry.cpp:161] Warning: Warning only once for all operators, other operators may also be overridden.
Overriding a previously registered kernel for the same operator and the same dispatch key
operator: aten::geometric_(Tensor(a!) self, float p, *, Generator? generator=None) -> Tensor(a!)
registered at C:\actions-runner\_work\pytorch\pytorch\pytorch\build\aten\src\ATen\RegisterSchema.cpp:6
dispatch key: XPU
previous kernel: registered at C:\actions-runner\_work\pytorch\pytorch\pytorch\aten\src\ATen\VmapModeRegistrations.cpp:37
new kernel: registered at I:\frameworks.ai.pytorch.ipex-gpu\build\Release\csrc\gpu\csrc\gpu\xpu\ATen\RegisterXPU_0.cpp:186 (function operator ())
2.7.0+xpu
2.7.10+xpu
C:\Users\jayg8\AppData\Local\Programs\Python\Python313\Lib\site-packages\torch\xpu\__init__.py:60: UserWarning: XPU device count is zero! (Triggered internally at C:\actions-runner\_work\pytorch\pytorch\pytorch\c10\xpu\XPUFunctions.cpp:115.)
return torch._C._xpu_getDeviceCount()
C:\Program Files (x86)\Intel\oneAPI>
(didn't include torchvision or torchaudio in either pip install, and guessing the Microsoft runtime isn't needed since Visual Studio Build Tools 2022 is already in use)
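The verification one-liner can also be written defensively; a minimal sketch that degrades gracefully where torch isn't installed or the build lacks XPU support (torch.xpu mirrors the torch.cuda API surface):

```python
# Minimal XPU probe; safe to run on any machine.
try:
    import torch
    xpu = getattr(torch, "xpu", None)  # absent on older torch builds
    if xpu is None:
        status = "this torch build has no torch.xpu namespace"
    else:
        status = f"xpu available: {xpu.is_available()}, devices: {xpu.device_count()}"
except ImportError:
    status = "torch not installed"

print(status)
```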
As for Linux, Intel has pip, source and docker selections if needed.
May need Level Zero:
garner@linux:~$ python -c "import torch; import intel_extension_for_pytorch as ipex; print(torch.__version__); print(ipex.__version__); [print(f'[{i}]: {torch.xpu.get_device_properties(i)}') for i in range(torch.xpu.device_count())];"
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/usr/local/lib/python3.12/dist-packages/intel_extension_for_pytorch/__init__.py", line 122, in <module>
from .utils._proxy_module import *
File "/usr/local/lib/python3.12/dist-packages/intel_extension_for_pytorch/utils/_proxy_module.py", line 2, in <module>
import intel_extension_for_pytorch._C
ImportError: libze_loader.so.1: cannot open shared object file: No such file or directory
garner@linux:~$
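The missing library can be probed before importing IPEX; a stdlib-only sketch (find_library consults ldconfig, the same search the failing import goes through):

```python
import ctypes.util

# libze_loader is the oneAPI Level Zero loader that
# intel_extension_for_pytorch._C links against; if ldconfig can't see
# it, the ImportError above is expected.
ze = ctypes.util.find_library("ze_loader")
if ze is None:
    print("libze_loader not found; install the level-zero runtime")
else:
    print("Level Zero loader present:", ze)
```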
After checking out the level-zero tag v1.9.9 and installing the generated .deb - level-zero_1.9.9+l22.1_amd64.deb:
garner@linux:~$ python -c "import torch; import intel_extension_for_pytorch as ipex; print(torch.__version__); print(ipex.__version__); [print(f'[{i}]: {torch.xpu.get_device_properties(i)}') for i in range(torch.xpu.device_count())];"
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/usr/local/lib/python3.12/dist-packages/intel_extension_for_pytorch/__init__.py", line 122, in <module>
from .utils._proxy_module import *
File "/usr/local/lib/python3.12/dist-packages/intel_extension_for_pytorch/utils/_proxy_module.py", line 2, in <module>
import intel_extension_for_pytorch._C
ImportError: /opt/intel/compiler/2025.1/lib/libur_loader.so.0: version `LIBUR_LOADER_0.10' not found (required by /usr/local/lib/python3.12/dist-packages/intel_extension_for_pytorch/lib/../../../../libsycl.so.8)
garner@linux:~$
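This looks like two copies of the Unified Runtime loader competing: the one in the oneAPI compiler directory (put first on LD_LIBRARY_PATH by setvars.sh, and apparently too old to export LIBUR_LOADER_0.10) versus the newer one shipped inside the pip wheels. A small sketch to see what the dynamic linker resolves in a given shell:

```python
import ctypes
import os

# Show the search path the linker will honor, then try to load the
# Unified Runtime loader that the SYCL runtime complained about.
print("LD_LIBRARY_PATH:", os.environ.get("LD_LIBRARY_PATH", "<unset>"))
try:
    ctypes.CDLL("libur_loader.so.0")
    loaded = True
except OSError:
    loaded = False
print("libur_loader.so.0 loadable:", loaded)
```

Running Python from a shell that hasn't sourced setvars.sh lets the wheel's bundled copy win, which may be why libur_loader later stopped being the blocker.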
garner@linux:/opt/puffer$ puffer train puffer_target
Traceback (most recent call last):
File "/usr/local/bin/puffer", line 5, in <module>
from pufferlib.pufferl import main
File "/opt/puffer/pufferlib/pufferl.py", line 28, in <module>
import torch
File "/usr/local/lib/python3.12/dist-packages/torch/__init__.py", line 409, in <module>
from torch._C import * # noqa: F403
^^^^^^^^^^^^^^^^^^^^^^
ImportError: /opt/intel/compiler/2025.1/lib/libur_loader.so.0: version `LIBUR_LOADER_0.10' not found (required by /usr/local/lib/python3.12/dist-packages/torch/lib/../../../../libsycl.so.8)
garner@linux:/opt/puffer$
Note for legacy hardware: Linux Mint currently has intel-opencl-icd (23.43.27642.40-1ubuntu3) rather than 24.35.
Is this as expected?
Processing triggers for libc-bin (2.39-0ubuntu8.4) ...
/sbin/ldconfig.real: /usr/local/lib/libccl.so.1 is not a symbolic link
/sbin/ldconfig.real: /usr/local/lib/libtbb.so.12 is not a symbolic link
/sbin/ldconfig.real: /usr/local/lib/libmpi.so.12 is not a symbolic link
/sbin/ldconfig.real: /usr/local/lib/libtbbbind_2_5.so.3 is not a symbolic link
/sbin/ldconfig.real: /usr/local/lib/libtbbbind.so.3 is not a symbolic link
/sbin/ldconfig.real: /usr/local/lib/libpti_view.so.0.10 is not a symbolic link
/sbin/ldconfig.real: /usr/local/lib/libmpijava.so.1 is not a symbolic link
/sbin/ldconfig.real: /usr/local/lib/libpstloffload.so.1 is not a symbolic link
/sbin/ldconfig.real: /usr/local/lib/libmpicxx.so.12 is not a symbolic link
/sbin/ldconfig.real: /usr/local/lib/libur_adapter_opencl.so.0 is not a symbolic link
/sbin/ldconfig.real: /usr/local/lib/libOpenCL.so.1 is not a symbolic link
/sbin/ldconfig.real: /usr/local/lib/libur_adapter_level_zero.so.0 is not a symbolic link
/sbin/ldconfig.real: /usr/local/lib/libtbbbind_2_0.so.3 is not a symbolic link
/sbin/ldconfig.real: /usr/local/lib/libtbbmalloc.so.2 is not a symbolic link
/sbin/ldconfig.real: /usr/local/lib/libsycl-preview.so.8 is not a symbolic link
/sbin/ldconfig.real: /usr/local/lib/libmpifort.so.12 is not a symbolic link
/sbin/ldconfig.real: /usr/local/lib/libur_loader.so.0 is not a symbolic link
/sbin/ldconfig.real: /usr/local/lib/libtbbmalloc_proxy.so.2 is not a symbolic link
/sbin/ldconfig.real: /usr/local/lib/libsycl.so.8 is not a symbolic link
/sbin/ldconfig.real: /usr/local/lib/libumf.so.0 is not a symbolic link
/sbin/ldconfig.real: /usr/local/lib/libhwloc.so.15 is not a symbolic link
/sbin/ldconfig.real: /usr/local/lib/libtcm.so.1 is not a symbolic link
/sbin/ldconfig.real: /usr/local/lib/libtcm_debug.so.1 is not a symbolic link
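Those ldconfig messages are harmless: the files in /usr/local/lib carry soname-style names (e.g. libtbb.so.12) but are regular files, while ldconfig expects the soname to be a symlink to the fully-versioned real file. A sketch of the layout it wants (names are illustrative):

```python
import os
import tempfile

# Recreate the layout ldconfig expects: the real versioned library,
# plus a soname symlink pointing at it.
with tempfile.TemporaryDirectory() as d:
    real = os.path.join(d, "libfoo.so.1.2.3")   # the actual file
    open(real, "w").close()
    soname = os.path.join(d, "libfoo.so.1")     # what programs dlopen
    os.symlink("libfoo.so.1.2.3", soname)
    is_link = os.path.islink(soname)
    target = os.readlink(soname)

print(is_link, target)  # True libfoo.so.1.2.3
```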
Possible libsycl issue as /opt/intel/compiler/2025.1/lib/libur_loader.so.0.11.10 exists. Would that be added here?
As for pufferlib - bbd22d - starting with device = xpu in pufferlib/config/ocean/target.ini, Linux shows AssertionError: Torch not compiled with XPU enabled, which confirms the possibility. Windows is officially unsupported as of 2.0; after pip install -e . --break-system-packages, I get LINK : error LNK2001: unresolved external symbol PyInit_ocean\target\binding with this merge commit.
— elevatorguy
Didn't take note yesterday in windows, but somehow got past libur_loader as the blocker in linux.
garner@linux:~$ puffer train puffer_squared
/home/garner/.local/lib/python3.12/site-packages/torch/xpu/__init__.py:120: UserWarning: XPU device count is zero! (Triggered internally at /pytorch/c10/xpu/XPUFunctions.cpp:115.)
torch._C._xpu_init()
...
RuntimeError: No XPU devices are available.
The ... above elides a Traceback. Used --user with pip when reinstalling Intel's torch; on Linux, uninstalled pufferlib first yesterday.
Today, added AppData\Roaming\Python\Python313\Scripts to PATH; while reinstalling Intel's torch on Windows, followed a similar process to yesterday's but didn't notice a path difference with pip's --user.
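Where pip --user puts things can be checked from Python itself; a stdlib sketch (the Windows result is what made AppData\Roaming\Python\Python313\Scripts necessary on PATH):

```python
import os
import site
import sysconfig

# site.USER_BASE is the root of pip --user installs:
#   Windows: %APPDATA%\Python\Python3XY     Linux: ~/.local
scheme = f"{os.name}_user"  # "nt_user" on Windows, "posix_user" on Linux
user_scripts = sysconfig.get_path("scripts", scheme)
print("user base   :", site.USER_BASE)
print("user scripts:", user_scripts)
```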
Merged in c951bfd here, resulting in the same LINK error as above when using python setup.py build_ext --inplace.
Inserting export_symbols=[ path.rstrip('.c').replace('/', '.').replace('\\','_') ], into Extension, python setup.py build_ext --inplace results in two unresolved external symbol errors - PyInit_ocean\target\binding and pufferlib.ocean_target_binding - on Windows.
Just hardcoding the first parameter to "binding" results in successful linking, though with a runtime error: ImportError: cannot import name 'binding' from 'pufferlib.ocean.target' (unknown location)
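The LNK2001 symbol name itself hints at the cause: distutils derives the exported init symbol from the last dot-separated component of the Extension name, so a name assembled from a file path with backslashes (no dots) yields the bogus PyInit_ocean\target\binding, while hardcoding the name to "binding" exports the right symbol but registers the module at the wrong import location. A sketch of both behaviors (get_export_symbol mimics distutils' build_ext.get_export_symbols; pyinit_symbol is a hypothetical helper, not part of any library):

```python
from pathlib import Path

def get_export_symbol(ext_name: str) -> str:
    # Mirrors distutils/setuptools: the symbol comes from the LAST
    # dot-separated component of the Extension name.
    return "PyInit_" + ext_name.split(".")[-1]

def pyinit_symbol(source_path: str) -> str:
    # Derive the symbol from a source file path instead. Note that
    # path.rstrip('.c') is a character-set strip, not a suffix strip
    # ("static.c".rstrip('.c') == "stati"); Path.stem is the safe way.
    return "PyInit_" + Path(source_path).stem

print(get_export_symbol("pufferlib.ocean.target.binding"))  # PyInit_binding
print(get_export_symbol("ocean\\target\\binding"))          # the broken symbol above
print(pyinit_symbol("pufferlib/ocean/target/binding.c"))    # PyInit_binding
```

So naming the Extension with its full dotted module path - pufferlib.ocean.target.binding - should both export PyInit_binding and import from the right package, without hand-written export_symbols.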
Without a non-zero XPU device count, might Linux need reinstalling, since both the non-legacy and legacy1 runtimes were installed? (used apt-get remove on one of the pairs; the 'rc' state in the listing below means intel-opencl-icd was removed with only its config files remaining)
dpkg --list intel* gives the following:
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name Version Architecture Description
+++-=====================================-==========================================-============-================================================================================
ii intel-igc-core 1.0.17537.20 amd64 Intel(R) Graphics Compiler for OpenCL(TM)
ii intel-igc-opencl 1.0.17537.20 amd64 Intel(R) Graphics Compiler for OpenCL(TM)
ii intel-level-zero-gpu-legacy1 1.3.30872.22 amd64 Intel(R) Graphics Compute Runtime for oneAPI Level Zero.
ii intel-level-zero-gpu-legacy1-dbgsym 1.3.30872.22 amd64 debug symbols for intel-level-zero-gpu-legacy1
ii intel-media-va-driver:amd64 24.1.0+dfsg1-1 amd64 VAAPI driver for the Intel GEN8+ Graphics family
un intel-media-va-driver-non-free <none> <none> (no description available)
ii intel-microcode 3.20250512.0ubuntu0.24.04.1 amd64 Processor microcode firmware for Intel CPUs
un intel-opencl <none> <none> (no description available)
rc intel-opencl-icd 24.35.30872.22 amd64 Intel graphics compute runtime for OpenCL
ii intel-opencl-icd-legacy1 24.35.30872.22 amd64 Intel graphics compute runtime for OpenCL
ii intel-opencl-icd-legacy1-dbgsym 24.35.30872.22 amd64 debug symbols for intel-opencl-icd-legacy1
garner@linux:/opt/neo$
This was with stock torch - version 2.7.0.
With Intel's torch, the familiar RuntimeError: No XPU devices are available. occurs. As for puffer eval puffer_target --load-model-path latest: AssertionError: Torch not compiled with CUDA enabled. An unfamiliar error with puffer train puffer_target --train.device=cpu (0xc0000139 is STATUS_ENTRYPOINT_NOT_FOUND, i.e. a DLL entry point failed to resolve at load time):
Windows fatal exception: code 0xc0000139
Thread 0x00001db0 (most recent call first):
File "D:\puffer\pufferlib\pufferl.py", line 781 in run
File "C:\Users\jayg8\AppData\Local\Programs\Python\Python313\Lib\threading.py", line 1041 in _bootstrap_inner
File "C:\Users\jayg8\AppData\Local\Programs\Python\Python313\Lib\threading.py", line 1012 in _bootstrap
Current thread 0x0000266c (most recent call first):
File "C:\Users\jayg8\AppData\Local\Programs\Python\Python313\Lib\site-packages\torch\autograd\graph.py", line 824 in _engine_run_backward
File "C:\Users\jayg8\AppData\Local\Programs\Python\Python313\Lib\site-packages\torch\autograd\__init__.py", line 353 in backward
File "C:\Users\jayg8\AppData\Local\Programs\Python\Python313\Lib\site-packages\torch\_tensor.py", line 648 in backward
File "D:\puffer\pufferlib\pufferl.py", line 428 in train
File "C:\Users\jayg8\AppData\Local\Programs\Python\Python313\Lib\site-packages\torch\distributed\elastic\multiprocessing\errors\__init__.py", line 355 in wrapper
File "D:\puffer\pufferlib\pufferl.py", line 914 in train
File "D:\puffer\pufferlib\pufferl.py", line 1203 in main
File "C:\Users\jayg8\AppData\Local\Programs\Python\Python313\Scripts\puffer.exe\__main__.py", line 7 in <module>
File "<frozen runpy>", line 88 in _run_code
File "<frozen runpy>", line 198 in _run_module_as_main
C:\Users\jayg8\AppData\Local\Programs\Python\Python313\Lib\site-packages\torch\autograd\graph.py:824: UserWarning: XPU device count is zero! (Triggered internally at C:\actions-runner\_work\pytorch\pytorch\pytorch\c10\xpu\XPUFunctions.cpp:115.)
return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass
Changed to 2026 build tools. With torch 2.10.0:
C:\Users\jayg8\AppData\Local\Programs\Python\Python312\Lib\site-packages\torch\_inductor\lowering.py:1904: FutureWarning: torch._prims_common.check is deprecated and will be removed in the future. Please use torch._check* functions instead.
No longer needing intel-extension-for-pytorch with torch 2.10.0; have a fresh Linux install without oneAPI just yet. puffer eval --train.device=cpu --load-model-path=latest:
[insert screenshot here]
/home/garner/.local/lib/python3.12/site-packages/torch/_inductor/lowering.py:1904: FutureWarning: torch._prims_common.check is deprecated and will be removed in the future. Please use torch._check* functions instead.
(ubuntu 24.04 instead of linux mint)
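The FutureWarning originates inside torch's inductor code, not user code, so until it disappears upstream it can be filtered by message; a sketch (the demonstration warning is synthetic, shaped like the real one):

```python
import warnings

# Ignore just this deprecation message wherever torch emits it.
warnings.filterwarnings(
    "ignore",
    message=r"torch\._prims_common\.check is deprecated",
    category=FutureWarning,
)

# Demonstrate the filter with a synthetic warning of the same shape:
with warnings.catch_warnings(record=True) as caught:
    warnings.warn(
        "torch._prims_common.check is deprecated and will be removed "
        "in the future. Please use torch._check* functions instead.",
        FutureWarning,
    )
print("warnings recorded:", len(caught))  # 0
```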
sudo apt-get install intel-gpu-tools for intel_gpu_top.
Found this; of interest may be this file.
If needed: https://www.intel.com/content/www/us/en/developer/articles/technical/vectorization-llvm-gcc-cpus-gpus.html
https://intel.github.io/intel-extension-for-pytorch/
Pending pytorch issue; got the libur_loader error shown earlier.
Needed set DISTUTILS_USE_SDK=1. To achieve linkage, some changes need to be made to setup.py, as it fails otherwise; this setup.py is different already, but further changes were needed.
Have yet to clone gmmlib and intel-graphics-compiler.