Question about USM

Please provide a working USM example for ComputeCpp 2.0

Thanks!

[Computecpp:CC0008]: a variable of type ‘float *’ cannot be captured by a SYCL kernel…

Can you be a bit more specific about what example code you are talking about? We don’t currently provide any USM example code with the ComputeCpp samples.
Thank you.

I got the CC0008 error when trying to compile a USM example. I would appreciate it if a USM example (e.g. vector add) were available for testing and understanding.

Can you post your code here please?

Currently the only USM example that exists is in the SYCL Academy code exercises, but keep in mind that there are currently variations between ComputeCpp and DPC++, so there are two solution files (one for ComputeCpp and one for DPC++).

Please see the USM example below.

It is fine that there is some difference in namespaces, but that may be the only difference between different USM designs. An elegant design.

We’ll need to take a look at that code to see if it works with ComputeCpp. In the meantime can you use the two versions in the Academy exercises I linked to?

Okay.
Sorry, I am not comfortable with your USM design.

@rod Thanks for sharing the SYCL Academy example. I am using USM with DPC++ to port the GPU multi-d array library gtensor to SYCL, see https://github.com/wdmapp/gtensor/pull/19. I was hoping to test with ComputeCpp 2 as well, and I am trying to figure out whether I can jury-rig compatibility between the two.

So far I have it working with some toy examples, here: https://github.com/bd4/sycl-test/commit/d79208bcec2068559b8a911c7afed58a14a52ef3. The header difference is awkward because I can’t include CL/sycl.hpp first to test for COMPUTECPP, since according to the example the wrapper header must be included first, so I define my own preprocessor variable to test for. In gtensor it’s easy enough to work around the includes, and the namespace differences just require a few ifdefs: there are already malloc/free interfaces to support CUDA and HIP, so the change is only needed in one place. The usm_wrapper is more awkward, because the USM memory pointers exist inside a complex nested CRTP data structure, and the wrapped pointers can only be used inside kernel lambdas (for q.memcpy in particular, the non-wrapped version is required). What is the latest thinking on this - is it an interim implementation, or is this on the table as a change to USM before it becomes an official part of the next SYCL spec?

The current gtensor port is doing things that are not strictly allowed by the SYCL spec: it passes types that are neither standard layout nor trivially copyable. But in practice they should be simple enough that they don’t cause any problems (and in fact they work with Intel oneAPI beta6). I had to remove usage of std::tuple in one of the data structures; after that it basically just worked. The CRTP data structure approach used in gtensor also works fine in CUDA and HIP.
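For anyone who wants to check their own types against the two properties mentioned above, here is a minimal sketch in standard C++ (no SYCL headers needed). All of the type names (expr_base, span_view, flat_view, counted_view) and the helper is_plain_kernel_arg are hypothetical and made up for illustration; they are not gtensor's actual types, and the check is only my reading of "standard layout and trivially copyable", not an official SYCL conformance test.

```cpp
#include <cstddef>
#include <type_traits>

// Hypothetical helper: true when T has both properties discussed above.
template <typename T>
constexpr bool is_plain_kernel_arg =
    std::is_standard_layout<T>::value && std::is_trivially_copyable<T>::value;

// Plain aggregate: a raw pointer plus a size.
struct flat_view {
  float* data;
  std::size_t size;
};

// Hypothetical CRTP base that carries its own data member.
template <typename Derived>
struct expr_base {
  int rank = 1;
  const Derived& derived() const { return static_cast<const Derived&>(*this); }
};

// Data members live in both the base and the derived class, so the
// type is not standard layout, although it is still trivially copyable.
struct span_view : expr_base<span_view> {
  float* data = nullptr;
  std::size_t size = 0;
};

// A user-provided copy constructor makes the type non-trivially copyable.
struct counted_view {
  float* data = nullptr;
  std::size_t size = 0;
  counted_view() = default;
  counted_view(const counted_view& other) : data(other.data), size(other.size) {}
};

static_assert(is_plain_kernel_arg<flat_view>, "flat_view has both properties");
static_assert(!std::is_standard_layout<span_view>::value,
              "members split across base and derived");
static_assert(std::is_trivially_copyable<span_view>::value,
              "no user-provided copy operations");
static_assert(!std::is_trivially_copyable<counted_view>::value,
              "user-provided copy constructor");
```

Dropping a static_assert like these next to each type that gets captured by a kernel lambda is a cheap way to see which of the two properties a given data structure loses, and where.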

@bdallen thanks for the note. I’d start by saying that the part of the SYCL spec for USM is not yet finalised. So while we have done some work on an initial implementation we are waiting for the final agreed spec before we complete our implementation. It’s clear that the early implementation we have done and the DPC++ one have helped the group to progress towards the final spec.

The wrapper is an interim implementation and we will work towards making ComputeCpp compliant with the spec once it is finalised. It’s hard to put any timescales on that at the moment; hopefully the timelines for the spec will be clearer soon.