According to this page (oneAPI for CUDA® - Codeplay Software Ltd), oneMKL is supported on NVIDIA GPUs. However, a simple example provided with oneAPI (fcorr_1d_buffers.cpp) fails with the error below. It seems to happen right when the oneapi::mkl::rng::generate() function is called, around line 48 of the code.
Running on: NVIDIA GeForce RTX 3050 Laptop GPU
terminate called after throwing an instance of ‘sycl::_V1::runtime_error’
what(): Native API failed. Native API returns: -42 (PI_ERROR_INVALID_BINARY) -42 (PI_ERROR_INVALID_BINARY)
Aborted (core dumped)
The exact line used to compile is:
clang++ -O2 -DMKL_ILP64 -fsycl -fsycl-targets=nvptx64-nvidia-cuda -qmkl=parallel -o CMakeFiles/CodePlay.dir/simple-sycl-app.cpp.o -c simple-sycl-app.cpp
The exact line used to link is:
clang++ -O2 -DMKL_ILP64 -fsycl -fsycl-targets=nvptx64-nvidia-cuda -qmkl=parallel CMakeFiles/CodePlay.dir/simple-sycl-app.cpp.o -o CodePlay
The results are the same with the -O2 flag omitted. The MKL flags are needed for the MKL portions of the code. The contents of simple-sycl-app.cpp are the same as fcorr_1d_buffers.cpp (the example provided with oneAPI).
You will need to build oneMKL with the right backend to use these custom kernels, including the ones for RNG.
The project includes CMake flags to enable the backends; I believe the appropriate one here is ENABLE_CURAND_BACKEND.
Just to check, did you get the oneMKL binaries with the oneAPI base toolkit release?
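As a rough illustration of the build step mentioned above, the configure command could be assembled like this. It is only a sketch: the ONEMKL_SRC path is a hypothetical checkout location, ENABLE_CURAND_BACKEND is the only option taken from this thread, and any other options the project may require (compiler choice, disabling other backends) are not shown and should be checked against the project README. The command is printed rather than executed so it can be reviewed first.

```shell
# Hypothetical location of the oneMKL interfaces checkout; adjust to your setup.
ONEMKL_SRC="${ONEMKL_SRC:-$HOME/oneMKL}"

# Backend option named in this thread; all other options are left at their defaults.
BACKEND_FLAGS="-DENABLE_CURAND_BACKEND=ON"

# Assemble the configure command and print it for review instead of running it.
CONFIGURE_CMD="cmake -S $ONEMKL_SRC -B $ONEMKL_SRC/build $BACKEND_FLAGS"
echo "$CONFIGURE_CMD"
```

Once the printed command looks right, it can be pasted into the terminal and followed by the usual build step.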
As a follow-up, can oneMKL be compiled with multiple backends enabled (e.g. ENABLE_MKLCPU_BACKEND, ENABLE_CURAND_BACKEND and ENABLE_CUBLAS_BACKEND)? I assume this is what I need in order to run common code (on either the CPU or the GPU, selected at runtime) on my laptop, which has an Intel CPU and an NVIDIA GPU.
Did you resolve your problem? I have the same issue here. I’ve rebuilt the oneMKL library with cuFFT, cuBLAS and cuRAND support, but with no effect. The exception “Native API failed. Native API returns: -42 (PI_ERROR_INVALID_BINARY) -42 (PI_ERROR_INVALID_BINARY)” still occurs.
Hi @vandyke,
is your sycl::queue using a device selector which selects the NVIDIA GPU? It could be that the library is correctly compiled for NVIDIA backend, but your queue defaults to selecting an OpenCL or Level Zero backend, hence the “invalid binary” error.
You can check this by running your application with the environment variable SYCL_PI_TRACE=1, for example:
SYCL_PI_TRACE=1 ./my-app
This should print (among other things) something like:
SYCL_PI_TRACE[all]: Selected device: -> final score = 1500
SYCL_PI_TRACE[all]: platform: NVIDIA CUDA BACKEND
SYCL_PI_TRACE[all]: device: NVIDIA A100-SXM4-40GB
The score and device name will differ for you, but it should be the NVIDIA CUDA BACKEND.
If a different backend is selected, you can either choose the device explicitly in your code, or keep using the default selector (used by the default constructor of sycl::queue) and narrow down the list of devices using the ONEAPI_DEVICE_SELECTOR environment variable.
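A minimal sketch of the environment-variable approach, assuming a DPC++ runtime that understands the `cuda:*` filter syntax (the value is my assumption for this setup, not something quoted from the original post):

```shell
# Restrict the SYCL runtime to devices exposed through the CUDA backend.
# The value is quoted so the shell never treats '*' as a filename pattern.
export ONEAPI_DEVICE_SELECTOR="cuda:*"

# Keep the trace enabled so the selected platform is printed when the queue is created.
export SYCL_PI_TRACE=1
```

Running ./my-app in the same shell afterwards should then report the NVIDIA CUDA BACKEND platform in the trace output, provided the CUDA plugin is installed.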
Thanks for your response. I’ve found where the problem is, and I don’t think it is related to the selected backend. I tried compiling and running the example shown at this link:
Hi @vandyke,
I see now what the issue is. You are using a function which is part of the Vector Math domain in the oneMKL API specification. Unfortunately, the oneMKL interfaces library does not yet implement the full API specification, and the Vector Math domain is one of the missing parts. You can find more information in the README of the project:
The instructions from intel.com you’re following assume you’re using the last one, and unfortunately don’t fully apply to the second one as its implementation is still in progress.