intel-extension-for-pytorch: RuntimeError: Number of dpcpp devices should be greater than zero!

Hello, I used the GPU configuration and oneAPI is installed correctly. I am working in a Python virtual environment named ai_tr, and I have this issue with PyTorch. The two imports are at the top of the file:

import torch
import intel_extension_for_pytorch as ipex

I ran source ${ONEAPI_ROOT}/setvars.sh, with this output:

(ai_tr) axel@Artishima:~/ai_tr/cod$ source ${ONEAPI_ROOT}/setvars.sh

:: WARNING: setvars.sh has already been run. Skipping re-execution.
   To force a re-execution of setvars.sh, use the '--force' option.
   Using '--force' can result in excessive use of your environment variables.

usage: source setvars.sh [--force] [--config=file] [--help] [...]
  --force        Force setvars.sh to re-run, doing so may overload environment.
  --config=file  Customize env vars using a setvars.sh configuration file.
  --help         Display this help message and exit.
  ...            Additional args are passed to individual env/vars.sh scripts
                 and should follow this script's arguments.

  Some POSIX shells do not accept command-line options. In that case, you can pass
  command-line options via the SETVARS_ARGS environment variable. For example:

  $ SETVARS_ARGS="ia32 --config=config.txt" ; export SETVARS_ARGS
  $ . path/to/setvars.sh

  The SETVARS_ARGS environment variable is cleared on exiting setvars.sh.

With --force:

(ai_tr) axel@Artishima:~/ai_tr/cod$ source ${ONEAPI_ROOT}/setvars.sh --force

:: initializing oneAPI environment ...
   -bash: BASH_VERSION = 5.1.16(1)-release
   args: Using "$@" for setvars.sh arguments: --force
:: advisor -- latest
:: ccl -- latest
:: compiler -- latest
:: dal -- latest
:: debugger -- latest
:: dev-utilities -- latest
:: dnnl -- latest
:: dpcpp-ct -- latest
:: dpl -- latest
:: ipp -- latest
:: ippcp -- latest
:: mkl -- latest
:: mpi -- latest
:: tbb -- latest
:: vpl -- latest
:: vtune -- latest
:: oneAPI environment initialized ::

But this error keeps appearing whenever I try to run my training Python file:

xpu
/home/axel/ai_tr/lib/python3.10/site-packages/intel_extension_for_pytorch/xpu/lazy_init.py:73: UserWarning: DPCPP Device count is zero! (Triggered internally at /build/intel-pytorch-extension/csrc/gpu/runtime/Device.cpp:120.)
  _C._initExtension()
/home/axel/ai_tr/lib/python3.10/site-packages/torch/nn/modules/module.py:985: UserWarning: dpcppSetDevice: device_id is out of range (Triggered internally at /build/intel-pytorch-extension/csrc/gpu/runtime/Device.cpp:159.)
  return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None, non_blocking)
Traceback (most recent call last):
  File "/home/axel/ai_tr/cod/train.py", line 190, in <module>
    m = model.to(device)
  File "/home/axel/ai_tr/lib/python3.10/site-packages/torch/nn/modules/module.py", line 987, in to
    return self._apply(convert)
  File "/home/axel/ai_tr/lib/python3.10/site-packages/torch/nn/modules/module.py", line 639, in _apply
    module._apply(fn)
  File "/home/axel/ai_tr/lib/python3.10/site-packages/torch/nn/modules/module.py", line 662, in _apply
    param_applied = fn(param)
  File "/home/axel/ai_tr/lib/python3.10/site-packages/torch/nn/modules/module.py", line 985, in convert
    return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None, non_blocking)
RuntimeError: Number of dpcpp devices should be greater than zero!

Everything related to MKL is installed correctly, and the paths are set correctly and working. I am on Ubuntu 22.04 under WSL 2 on Windows 11, using torch 1.13.1, with the Intel drivers installed on Windows 11, an Arc A770 GPU, and an i9-13900K.

The error is triggered here:

#the line below is triggering the error
m = model.to(device)
m = ipex.optimize(m)
# print the number of parameters in the model
print(sum(p.numel() for p in m.parameters())/1e6, 'M parameters')

# create a PyTorch optimizer
optimizer = torch.optim.AdamW(model.parameters(), lr=learning_rate)

Also, optimize is an odd name for this function.

It does seem to be an out-of-range issue, and I have no idea how to solve it.
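For anyone hitting the same error, a small guard before model.to(device) can turn the crash into a clear message. The following is only a sketch: pick_device is a hypothetical helper name, and it assumes that importing intel_extension_for_pytorch registers torch.xpu with is_available() and device_count(), as the XPU builds of the extension do. It does not fix the missing device; it only reports it and falls back to the CPU.

```python
def pick_device() -> str:
    """Return "xpu" when an XPU device is visible, otherwise "cpu".

    Hypothetical helper: the name and the CPU fallback policy are
    assumptions for illustration, not part of the extension.
    """
    try:
        import torch
        # Importing the extension is what registers the torch.xpu backend.
        import intel_extension_for_pytorch  # noqa: F401
    except (ImportError, OSError):
        # torch or the extension (or one of its shared libraries) is missing.
        return "cpu"

    xpu = getattr(torch, "xpu", None)
    try:
        if xpu is not None and xpu.is_available() and xpu.device_count() > 0:
            return "xpu"
    except RuntimeError:
        # Treat a failing runtime query the same as "no device found".
        pass
    return "cpu"


if __name__ == "__main__":
    device = pick_device()
    print(f"using device: {device}")
```

If this prints cpu on a machine that has an Arc GPU, the problem is below PyTorch (driver or runtime visibility) rather than in the training script.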

About this issue

  • State: open
  • Created a year ago
  • Comments: 22 (9 by maintainers)

Most upvoted comments

Please use oneAPI Base Toolkit 2023.0 with intel-extension-for-pytorch 1.13.10. Also, it seems like you don't have Level Zero installed; please install it as well. The driver version should preferably be 540, as shown in the installation guide.

DPCPP runtime version: 2023.1.0 <====================
MKL version: 2023.1.0 <===========================
GPU models and configuration: 
 
Intel OpenCL ICD version: 23.05.25593.18-1.fc38
Level Zero version: N/A <==========================

dnf/yum install -y intel-opencl level-zero intel-level-zero-gpu
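After installing the packages, a quick check from Python can show whether the Level Zero loader is visible to the dynamic linker. This is a rough sketch, not an official diagnostic: gpu_runtime_report is a hypothetical helper, it assumes the loader library is named ze_loader (libze_loader.so), and the /dev/dxg check only applies to WSL 2, where that device node exposes the Windows GPU.

```python
import ctypes.util
import os


def gpu_runtime_report() -> dict:
    """Collect a few hints about whether a Level Zero runtime is visible.

    Hypothetical helper for illustration; the keys are made-up names.
    """
    return {
        # Library name if the dynamic linker can find the Level Zero loader, else None.
        "ze_loader": ctypes.util.find_library("ze_loader"),
        # On WSL 2 the GPU is exposed to Linux through this device node.
        "wsl_dxg_device": os.path.exists("/dev/dxg"),
        # True when ONEAPI_ROOT (used in the setvars.sh command) is set in this shell.
        "oneapi_root_set": "ONEAPI_ROOT" in os.environ,
    }


if __name__ == "__main__":
    for key, value in gpu_runtime_report().items():
        print(f"{key}: {value}")
```

A ze_loader value of None would be consistent with the "Level Zero version: N/A" line in the environment report, though a found library alone does not guarantee that the GPU driver is usable.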

By the way, we will have a new release soon, so you can probably try the new version directly once it is out.