How to debug using GDB on the host device if sycl::queue(host_selector) isn't supported anymore?

This article explains how to debug your code with GDB on the host device by instantiating a sycl::queue with the sycl::host_selector{}.

However, when trying to do this with the following snippet of code:

#include <CL/sycl.hpp>
using namespace sycl;

static const int N = 16;

int main()
{
    queue q(sycl::host_selector{});
    std::cout << "Device: " << q.get_device().get_info<info::device::name>() << std::endl;

    std::vector<int> v(N);
    for(int i=0; i<N; i++) v[i] = i;

    buffer<int, 1> buf(v.data(), v.size());
    q.submit([&] (handler &h)
    {
        auto A = buf.get_access<access::mode::read_write>(h);
        h.parallel_for(range<1>(N), [=](id<1> i)
        {
            A[i] *= 2;
        });
    });

    buf.get_access<access::mode::read>(); // <--- Host Accessor to Synchronize Memory

    for(int i=0; i<N; i++) std::cout << v[i] << std::endl;

    return 0;
}

I get the runtime error:

No device of requested type available. Please check https://software.intel.com/content/www/us/en/develop/articles/intel-oneapi-dpcpp-system-requirements.html -1 (PI_ERROR_DEVICE_NOT_FOUND)

and a compiler warning about host_selector not being supported anymore.

So I think I understand that host_selector isn’t usable anymore but is there an alternative to debug a SYCL application using GDB? I couldn’t find anything on the internet.

I managed to debug this test application with the help of sycl::stream and printing values in the terminal, but that’s far less convenient than being able to use GDB.

Is there a recent alternative to host_selector and the host device in general?

I’m not sure how to check the version of my SYCL / Intel DPC++ installation, but all SYCL headers are installed in /opt/intel/oneapi/compiler/2023.2.1/linux/include/sycl/, so I assume this is a 2023 version. I installed the Intel oneAPI Base Toolkit using this method.

Hi @Adhesive_Bagels,
indeed sycl::host_selector is not part of the SYCL 2020 specification, which is implemented by the DPC++ compiler (including the compiler version you have, 2023.2.1).

In SYCL 2020, the way to run on a CPU would be to use a CPU target device. You can select one using the built-in CPU device selector sycl::cpu_selector_v like this:

sycl::queue q{sycl::cpu_selector_v};
q.submit([&](sycl::handler& cgh) {
  // your task here
});

or with the DPC++ implementation you can also use the default device selector:

sycl::queue q{};
q.submit([&](sycl::handler& cgh) {
  // your task here
});

and narrow down the list of available devices when starting the application with the environment variable ONEAPI_DEVICE_SELECTOR=*:cpu documented here.

The CPU device is implemented by DPC++ in Intel’s oneAPI Base Toolkit using the OpenCL backend and drivers. If you installed the toolkit through a package manager, OpenCL CPU support should already be available. You can confirm this by running sycl-ls, which should list an opencl:cpu device like this (among others):

[opencl:cpu:1] Intel(R) OpenCL, 12th Gen Intel(R) Core(TM) i9-12900K 3.0 [2023.16.6.0.22_223734]

Hi @rbielski, thanks for your answer.

I could select the CPU for the execution of my code using cpu_selector_v but I still couldn’t debug my code line by line using GDB (using the integration of GDB in my QtCreator IDE to be precise).

I tried using GDB on the command line (and not through QtCreator), but even after setting a breakpoint in the code of my functor kernel class (and compiling the code with cpu_selector_v), GDB doesn’t break.

How can I debug my code line by line?

Hi @Adhesive_Bagels,
yes, it should certainly work. Did you compile with -g -O0 for debugging? The following example code works for me:

#include <sycl/sycl.hpp>
#include <vector>
#include <iostream>

constexpr static size_t N{1024};

int main() {
  std::vector<int> v(N);
  for (size_t i{0}; i<N; ++i) {
    v[i] = i;
  }

  sycl::queue q{sycl::cpu_selector_v};
  sycl::buffer buf{v};
  q.submit([&](sycl::handler& cgh) {
    auto acc{buf.get_access(cgh,sycl::read_write)};
    cgh.parallel_for(N, [=](sycl::id<1> id) {
      int value{acc[id]};
      acc[id] = 2 * value;
      acc[id] += 1;
    });
  });

  auto acc{buf.get_host_access(sycl::read_only)};
  for (size_t i : {0, 63, 255, 511, 1023}) {
    std::cout << "v[" << i << "] = " << acc[i] << std::endl;
  }

  return 0;
}

Compiled and debugged in the following way:

$ icpx -fsycl -g -O0 -o test ./test.cpp
$ gdb ./test 
(gdb) list test.cpp:19
14	  sycl::buffer buf{v};
15	  q.submit([&](sycl::handler& cgh) {
16	    auto acc{buf.get_access(cgh,sycl::read_write)};
17	    cgh.parallel_for(N, [=](sycl::id<1> id) {
18	      int value{acc[id]};
19	      acc[id] = 2 * value;
20	      acc[id] += 1;
21	    });
22	  });
23	
(gdb) break test.cpp:19
Breakpoint 1 at 0x406502: file ./test.cpp, line 19.
(gdb) run
Starting program: /tmp/test 
[Switching to Thread 0x7fff923ff640 (LWP 12195)]
Thread 8 "test" hit Breakpoint 1, main::{lambda(sycl::_V1::handler&)#1}::operator()(sycl::_V1::handler&) const::{lambda(sycl::_V1::id<1>)#1}::operator()(sycl::_V1::id<1>) const (this=0x7fff923fe230, id=...) at test.cpp:19
19	      acc[id] = 2 * value;
(gdb) print id
$1 = {<sycl::_V1::detail::array<1>> = {common_array = {448}}, <No data fields>}
(gdb) print value
$2 = 448
(gdb) continue
Continuing.
[Switching to Thread 0x7fff5bfff640 (LWP 12210)]
Thread 23 "test" hit Breakpoint 1, main::{lambda(sycl::_V1::handler&)#1}::operator()(sycl::_V1::handler&) const::{lambda(sycl::_V1::id<1>)#1}::operator()(sycl::_V1::id<1>) const (this=0x7fff5bffe230, id=...) at test.cpp:19
19	      acc[id] = 2 * value;
(gdb) print id
$3 = {<sycl::_V1::detail::array<1>> = {common_array = {256}}, <No data fields>}
(gdb) print value
$4 = 256

You can see that I could set a breakpoint, stop at it, and inspect variables inside the kernel function in different threads.

I’m using the DPC++ compiler version 2023.2.1 and gdb version 12.1.

Hope this helps,
Rafal

Even with your simple example, I cannot seem to get my GDB to hit the breakpoint:

$ icpx -fsycl -g -O0 -o test ./test.cpp
$ gdb ./test
(gdb) list test.cpp:19
14	  sycl::buffer buf{v};
15	  q.submit([&](sycl::handler& cgh) {
16	    auto acc{buf.get_access(cgh,sycl::read_write)};
17	    cgh.parallel_for(N, [=](sycl::id<1> id) {
18	      int value{acc[id]};
19	      acc[id] = 2 * value;
20	      acc[id] += 1;
21	    });
22	  });
23	
(gdb) break test.cpp:19
Breakpoint 1 at 0x406692: file ./test.cpp, line 19.
(gdb) run
Starting program: /home/bagels/test 
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[New Thread 0x7fffdc3ac700 (LWP 8472)]
[New Thread 0x7fffd88c2700 (LWP 8473)]
[New Thread 0x7fffd84c1700 (LWP 8474)]
[New Thread 0x7fffc5f55700 (LWP 8475)]
[New Thread 0x7fffc5b54700 (LWP 8476)]
[New Thread 0x7fffc5352700 (LWP 8478)]
[New Thread 0x7fffc5753700 (LWP 8477)]
[New Thread 0x7fffc4f51700 (LWP 8479)]
v[0] = 1
v[63] = 127
v[255] = 511
v[511] = 1023
v[1023] = 2047
[Thread 0x7fffdc3ac700 (LWP 8472) exited]
[Thread 0x7fffc5352700 (LWP 8478) exited]
[Thread 0x7fffc4f51700 (LWP 8479) exited]
[Thread 0x7fffc5753700 (LWP 8477) exited]
[Thread 0x7fffc5b54700 (LWP 8476) exited]
[Thread 0x7fffc5f55700 (LWP 8475) exited]
[Thread 0x7fffd88c2700 (LWP 8473) exited]
[Thread 0x7ffff596ff80 (LWP 8468) exited]
[Inferior 1 (process 8468) exited normally]

My GDB doesn’t hit the breakpoint. Maybe this has to do with my version of GDB? I’m using GDB 9.2, but this post gives a working example with GDB 8.1.

I’m also using DPC++ 2023.2.1.

Hi @Adhesive_Bagels,
that’s odd, but could indeed be due to the gdb version or due to the OpenCL driver version. Could you post the output of your sycl-ls and clinfo?

Hi @rbielski

Here’s the output of sycl-ls:

[opencl:acc:0] Intel(R) FPGA Emulation Platform for OpenCL(TM), Intel(R) FPGA Emulation Device 1.2 [2023.16.7.0.21_160000]
[opencl:cpu:1] Intel(R) OpenCL, 11th Gen Intel(R) Core(TM) i5-1135G7 @ 2.40GHz 3.0 [2023.16.7.0.21_160000]
[opencl:gpu:2] Intel(R) OpenCL Graphics, Intel(R) Iris(R) Xe Graphics 3.0 [23.30.26918.9]

and the pastebin of clinfo.

Hi @Adhesive_Bagels,
thanks for the extra details, your OpenCL installation looks good. I just tested gdb 9.2 on Ubuntu 20.04 and confirmed that indeed the breakpoint doesn’t work in that version. I see the same behaviour as you.

Fortunately, the oneAPI Base Toolkit comes packaged with a gdb-oneapi installation, which is built on top of a recent gdb version (gdb 13.1 for oneAPI 2023.2). It’s a version of gdb extended with functionality for debugging Intel GPUs, but it still contains all the elements of the standard gdb. Just use it like the regular version:

$ gdb-oneapi ./test
(gdb) break test.cpp:19
(gdb) run

Hi @rbielski,

Using gdb-oneapi works as expected and the debugger hits the breakpoint.

Thanks for the help!