sycl-ls does not detect NVIDIA GPU

Hi,

Can someone help?

Here is the problem: when I run sycl-ls, my NVIDIA GPU is not detected.

cbrd@cbrd-MPG-B460-Trident-AS-MS-B926:~$ sycl-ls
[opencl:cpu:0] Intel(R) OpenCL, Intel(R) Core™ i7-10700F CPU @ 2.90GHz 3.0 [2023.16.6.0.22_223734]
[opencl:cpu:1] Intel(R) OpenCL, Intel(R) Core™ i7-10700F CPU @ 2.90GHz 3.0 [2023.16.6.0.22_223734]
[opencl:acc:2] Intel(R) FPGA Emulation Platform for OpenCL™, Intel(R) FPGA Emulation Device 1.2 [2023.16.6.0.22_223734]

The card does show up on the PCIe bus:
cbrd@cbrd-MPG-B460-Trident-AS-MS-B926:~$ lspci | grep NVIDIA
01:00.0 VGA compatible controller: NVIDIA Corporation GA104 [GeForce RTX 3070] (rev a1)
01:00.1 Audio device: NVIDIA Corporation GA104 High Definition Audio Controller (rev a1)

Can you give me some tips?

Hi @wei.seng.yeap,

There are a couple of things you can try. First, can you confirm that nvidia-smi reports a device and a matching CUDA version?

Secondly, you can set the environment variable SYCL_PI_TRACE=-1 before running the sycl-ls command. There will be a lot of output, but in the first few lines there should be something along the lines of

SYCL_PI_TRACE[basic]: Plugin found and successfully loaded: libpi_cuda.so [ PluginVersion: 12.21.1 ]

If there isn’t, then hopefully there will be information about why it didn’t load.
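The two checks above can be run together with something like the sketch below. The helper name cuda_plugin_lines is made up for illustration, and the --query-gpu fields are only one way to ask nvidia-smi for the device name and driver version (the CUDA version is shown in the header of the plain nvidia-smi output instead).

```shell
# Keep only the trace lines that mention the CUDA plugin library.
# (Hypothetical helper name, used only in this sketch.)
cuda_plugin_lines() {
    grep -F 'libpi_cuda'
}

# Check 1: does the NVIDIA driver see the GPU?
if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi --query-gpu=name,driver_version --format=csv \
        || echo "nvidia-smi is installed but failed to query the GPU"
else
    echo "nvidia-smi is not on PATH"
fi

# Check 2: did the SYCL runtime load (or fail to load) the CUDA plugin?
if command -v sycl-ls >/dev/null 2>&1; then
    # Merge stderr into stdout so the trace is filtered wherever it is printed.
    SYCL_PI_TRACE=-1 sycl-ls 2>&1 | cuda_plugin_lines \
        || echo "no trace line mentions libpi_cuda"
else
    echo "sycl-ls is not on PATH"
fi
```

Filtering on libpi_cuda keeps both the "successfully loaded" line and any dlopen failure for that library, so either outcome is visible without scrolling through the full trace.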

Thanks Duncan.

Here is the output from nvidia-smi:
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.86.10              Driver Version: 535.86.10    CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce RTX 3070        On  | 00000000:01:00.0  On |                  N/A |
|  0%   50C    P8              12W / 220W |    316MiB /  8192MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|    0   N/A  N/A      1411      G   /usr/lib/xorg/Xorg                           76MiB |
|    0   N/A  N/A      1873    C+G   ...libexec/gnome-remote-desktop-daemon      155MiB |
|    0   N/A  N/A      1918      G   /usr/bin/gnome-shell                         69MiB |
+---------------------------------------------------------------------------------------+

When I enabled the SYCL trace, it showed that libpi_cuda.so could not be found:

SYCL_PI_TRACE[-1]: dlopen(/opt/intel/oneapi/compiler/2023.2.0/linux/lib/libpi_cuda.so) failed with </opt/intel/oneapi/compiler/2023.2.0/linux/lib/libpi_cuda.so: cannot open shared object file: No such file or directory>

Question:
Is libpi_cuda.so supposed to be installed as part of the oneAPI Base Toolkit, or do I need to add it manually?

If I need to add it manually, can you walk me through it or point me to a guide?

Thanks!!

@wei.seng.yeap

libpi_cuda.so is installed by the Codeplay plugin installer.

You can use the --install-dir option to point the installer to the location where oneAPI is installed.
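As a rough sketch, the install step could look like the following. The installer file name is a placeholder (use whatever file you actually downloaded), and the oneAPI root and plugin path are simply taken from the dlopen error in the trace above, so adjust them if your installation lives elsewhere.

```shell
# Hypothetical file name: substitute the installer you actually downloaded.
installer="./oneapi-for-nvidia-gpus.sh"
# oneAPI root, assumed from the dlopen path in the trace above.
oneapi_root="/opt/intel/oneapi"
# The library the runtime tried and failed to load in the trace above.
plugin="$oneapi_root/compiler/2023.2.0/linux/lib/libpi_cuda.so"

if [ -f "$installer" ]; then
    # Writing under /opt usually needs root privileges.
    sudo sh "$installer" --install-dir "$oneapi_root"
else
    echo "installer not found: $installer"
fi

# Afterwards, the library that dlopen could not find should exist.
if [ -f "$plugin" ]; then
    echo "found $plugin"
else
    echo "still missing $plugin"
fi
```

Once the library is in place, running sycl-ls again should list the NVIDIA GPU alongside the OpenCL devices.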

@vitduck ,

Can you point me to where I can get the Codeplay plugin installer?

Thanks!

You can download the binary plugin here