There are a couple of things you can try. First, can you confirm that nvidia-smi reports a device and a matching CUDA version?
Secondly, you can set the environment variable SYCL_PI_TRACE=-1 before running the sycl-ls command. There will be a lot of output, but in the first few lines there should be something along the lines of
SYCL_PI_TRACE[basic]: Plugin found and successfully loaded: libpi_cuda.so [ PluginVersion: 12.21.1 ]
If there isn’t, then hopefully there will be information about why it didn’t load.
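To avoid scrolling through the full trace, the output can be filtered for the CUDA plugin lines. This is only a minimal sketch: check_cuda_plugin is a hypothetical helper name, and the two patterns it looks for are taken from the trace lines quoted in this thread, so other toolkit versions may word them differently.

```shell
# Hypothetical helper: classify SYCL_PI_TRACE output read from stdin.
# Prints "loaded", "failed", or "unknown" for the CUDA plugin.
check_cuda_plugin() {
    trace=$(cat)
    if printf '%s\n' "$trace" | grep -q 'successfully loaded: libpi_cuda.so'; then
        echo "loaded"
    elif printf '%s\n' "$trace" | grep -q 'libpi_cuda.so) failed'; then
        echo "failed"
    else
        echo "unknown"
    fi
}

# Usage (assumes sycl-ls is already on PATH in your oneAPI environment):
#   SYCL_PI_TRACE=-1 sycl-ls 2>&1 | check_cuda_plugin
```

The 2>&1 redirect is there so the filter sees the trace whether it is written to stdout or stderr.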
This is what nvidia-smi reports:
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.86.10              Driver Version: 535.86.10    CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce RTX 3070        On  | 00000000:01:00.0  On |                  N/A |
|  0%   50C    P8              12W / 220W |    316MiB /  8192MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|    0   N/A  N/A      1411      G   /usr/lib/xorg/Xorg                           76MiB |
|    0   N/A  N/A      1873    C+G   …libexec/gnome-remote-desktop-daemon        155MiB |
|    0   N/A  N/A      1918      G   /usr/bin/gnome-shell                         69MiB |
+---------------------------------------------------------------------------------------+
When I enable the SYCL trace, it shows that libpi_cuda.so cannot be found:
SYCL_PI_TRACE[-1]: dlopen(/opt/intel/oneapi/compiler/2023.2.0/linux/lib/libpi_cuda.so) failed with </opt/intel/oneapi/compiler/2023.2.0/linux/lib/libpi_cuda.so: cannot open shared object file: No such file or directory>
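To see which backend plugins the install actually shipped, the directory named in the dlopen error can be listed. This is a rough sketch only: list_pi_plugins is a hypothetical helper name, and it assumes the plugins follow the libpi_*.so naming seen in the trace.

```shell
# Hypothetical helper: print the names of all libpi_*.so files in a directory.
list_pi_plugins() {
    dir=$1
    for f in "$dir"/libpi_*.so; do
        # Skip the unexpanded glob pattern when nothing matches.
        [ -e "$f" ] || continue
        basename "$f"
    done
}

# Usage, with the path taken from the dlopen error above:
#   list_pi_plugins /opt/intel/oneapi/compiler/2023.2.0/linux/lib
```

If libpi_cuda.so is not in the list, the CUDA plugin is not present in that directory, which matches the "No such file or directory" error.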
Question:
Is libpi_cuda.so supposed to be included when the oneAPI Base Toolkit is installed, or do I need to add it manually?
If I need to add it manually, can you walk me through it or point me to a guide?