SYCL CPU backend implementation and SIMD

I’m just starting with SYCL and wanted to know whether Codeplay’s SYCL implementation uses any kind of SIMD instructions on the CPU backend. According to the docs, when there is no acceleration device (GPU, FPGA…), it falls back to a CPU implementation by default. I assume that it uses multithreading, but I’m not sure whether it uses any kind of SIMD.

Thanks in advance.

Hi @daniel.vansa, there is a host device available in SYCL implementations, and ComputeCpp is no exception. The host device corresponds exactly to the code output by the host compiler, so if your host compiler outputs vector instructions, that’s what you’ll get. I think ComputeCpp launches about 8 threads.
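To make that concrete, here is a rough sketch of what "the host device is just host-compiled code" can look like. This is not ComputeCpp's actual implementation: `host_parallel_for` and `vector_add` are hypothetical names invented for illustration, and the thread count of 8 only mirrors the guess above. The kernel body ends up in an ordinary C++ loop, so whether it becomes SIMD code depends entirely on the host compiler and its flags.

```cpp
#include <algorithm>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical helper (not part of SYCL or ComputeCpp): runs kernel(i) for
// every i in [0, n), split into contiguous chunks across num_threads threads.
template <typename Kernel>
void host_parallel_for(std::size_t n, unsigned num_threads, Kernel kernel) {
  if (num_threads == 0) num_threads = 1;
  const std::size_t chunk = (n + num_threads - 1) / num_threads;
  std::vector<std::thread> workers;
  for (unsigned t = 0; t < num_threads; ++t) {
    const std::size_t begin = std::min(n, t * chunk);
    const std::size_t end = std::min(n, begin + chunk);
    if (begin == end) break;  // no work left for this or later threads
    workers.emplace_back([=] {
      // Plain loop compiled by the host compiler. It is only vectorised if
      // that compiler decides to (for example with -O3 -march=native).
      for (std::size_t i = begin; i < end; ++i) kernel(i);
    });
  }
  for (auto& w : workers) w.join();
}

// Hypothetical example kernel: c[i] = a[i] + b[i].
// Assumes b has at least as many elements as a.
std::vector<float> vector_add(const std::vector<float>& a,
                              const std::vector<float>& b) {
  std::vector<float> c(a.size());
  const float* pa = a.data();
  const float* pb = b.data();
  float* pc = c.data();
  // 8 threads is an assumption taken from the guess in this thread.
  host_parallel_for(a.size(), 8, [=](std::size_t i) { pc[i] = pa[i] + pb[i]; });
  return c;
}
```

Building this with optimisation enabled and inspecting the generated assembly is one way to check whether your host compiler actually emitted vector instructions for the inner loop.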

That said, I would warn you that the host device is pretty slow. It’s a fairly limited use case, since there are good OpenCL CPU implementations (for example, pocl and Intel’s CPU runtime) which can run user code efficiently. In contrast, ComputeCpp simply calls a function in the host device implementation, which means it’s very reliant on the host compiler. It is also harder to provide optimised versions of some functions on the host device (for example, barriers, which a CPU OpenCL implementation could implement efficiently).
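As an illustration of the barrier point, a CPU OpenCL runtime that compiles the kernel itself can split it at the barrier and run each half as a loop over the work-items of a group, so no thread synchronisation is needed at all. The sketch below hand-writes that transformation in plain C++ for a toy kernel that reverses the elements within each work-group. The function name `reverse_within_groups` is made up for this example, and the code shows the general idea only, not how any particular runtime is implemented.

```cpp
#include <cstddef>
#include <vector>

// Toy kernel, per work-item:
//   local[lid] = in[gid];  barrier();  out[gid] = local[group_size - 1 - lid];
// Written here as two loops per work-group instead of one thread per
// work-item. Elements past the last full group are left as zero.
std::vector<int> reverse_within_groups(const std::vector<int>& in,
                                       std::size_t group_size) {
  std::vector<int> out(in.size());
  if (group_size == 0) return out;
  std::vector<int> local(group_size);  // stands in for work-group local memory
  for (std::size_t base = 0; base + group_size <= in.size();
       base += group_size) {
    // Code before the barrier, executed for every work-item in the group.
    for (std::size_t lid = 0; lid < group_size; ++lid)
      local[lid] = in[base + lid];
    // The barrier is implicit: the first loop has finished, so every
    // work-item's write to local memory is visible to the code below.
    for (std::size_t lid = 0; lid < group_size; ++lid)
      out[base + lid] = local[group_size - 1 - lid];
  }
  return out;
}
```

A host device that just calls the user's kernel function once per work-item cannot restructure the code like this, so it typically has to run work-items on separate threads and make them wait for each other at the barrier.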


Makes sense. Thanks for the clarification.