I would like to use the NVIDIA profiler nvprof to profile some SYCL code compiled with the ptx64 backend to run on an NVIDIA GPU. I have no problems running the profiler on code that uses CUDA, but when I view the timeline generated for the SYCL executable, it is empty. Is there something special I need to do, or is this not possible at all?
If this is not possible, are there alternatives for profiling SYCL code on NVIDIA (or AMD) hardware? The profiling section of the manual only mentions Intel hardware.