Can code generated w/ ptx64 backend be profiled by nvprof?

leggett · 13 September 2019 19:07

I would like to use the nvidia profiler nvprof to profile some SYCL code generated with the ptx64 backend to run on an NVidia GPU. I have no problems running the profiler on code that uses CUDA, but when I view the generated timeline of the SYCL executable, it’s empty. Is there something special I need to do, or is this not possible at all?

If this is not possible, are there alternatives for profiling SYCL code on NVidia (or AMD) hardware? The profiling section of the manual only mentions Intel hardware.

rod · 16 September 2019 07:29

There’s an article on how to manually profile with the Community Edition that might help you. It will be migrated to the developer website documentation shortly.

duncan · 16 September 2019 09:22

Additionally, the nvprof tool doesn’t seem to work with OpenCL kernels, which is what ComputeCpp ultimately ends up running on nvidia devices. There used to be workarounds, but as far as I’m aware none of them have worked for a couple of years now.

If you’d like to profile on nvidia devices, I would recommend contacting nvidia directly to request that their profiling tools work for both OpenCL and CUDA. I would also say that performance is likely to appear quite bad on nvidia devices at the moment, because our profiling will tell you that kernels take multiples of 100ms to run, and basically never less than that. Unfortunately we don’t have a timeframe for fixing this, but we are aware of it.
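
As a rough cross-check on the reported kernel times, one option is to time the work from the host with a wall clock. The following is only a minimal sketch under stated assumptions: `time_ms` and `placeholder_work` are hypothetical names that belong to neither SYCL nor ComputeCpp, and the placeholder merely stands in for a kernel submission followed by a wait. The result includes launch and synchronisation overhead, so it is an upper bound on kernel time rather than a replacement for a real profiler.

```cpp
#include <chrono>
#include <thread>
#include <utility>

// Hypothetical helper (not part of SYCL or ComputeCpp): runs `work` once and
// returns the elapsed host wall-clock time in milliseconds.
template <typename Work>
double time_ms(Work&& work) {
  const auto start = std::chrono::steady_clock::now();
  std::forward<Work>(work)();  // e.g. submit a kernel, then wait for completion
  const auto stop = std::chrono::steady_clock::now();
  return std::chrono::duration<double, std::milli>(stop - start).count();
}

// Stand-in for real device work, so the sketch builds without a SYCL toolchain.
inline void placeholder_work() {
  std::this_thread::sleep_for(std::chrono::milliseconds(20));
}
```

In real code the callable would wrap the submission and a blocking wait on the queue, for example `time_ms([&] { q.submit(/* command group */); q.wait(); })`, where `q` is assumed to be your `cl::sycl::queue`. Running it a few times and discarding the first measurement helps separate one-off start-up costs from steady-state time.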
