SYCL problem when defining a data class containing a buffer as a member

Ryan_xjh · 11 December 2020 06:37

I wrote a template data class containing a buffer and an accessor, which automatically creates the buffer and the corresponding accessor based on the template arguments. But when I used an instance of that class in a SYCL kernel, the compiler reported an error saying that SYCL cannot capture a non-standard_layout class. How can I fix that?

rod · 11 December 2020 09:21

Hi,
Can you post a code snippet of what you are trying to do so that we can figure out a solution for you?
Thanks,
Rod.

Ryan_xjh · 11 December 2020 10:11

Hi, Rod

Here’s what I am trying to do.

template <class T,size_t N>
struct Data{
	buffer<T, N> buf;
	
	Data(){
		//initialize buf...
	}

	T& operator()(const int i) const{
		//get accessor from buf and return accessor[i]; 
	}
};

struct Func{
	Data<..> data;
	
	Func(Data<..> _data):data(_data){
		
	} 
	
	void operator()(const int i) const{
		//do some computation using data(i)
	}
};

class my_kernel;

template <class Functor, size_t N>
void kernel(Functor functor){
	//...
	cgh.parallel_for<class my_kernel>(nd_range<1>(N,N),[=] (nd_item<1> item) {
		size_t global_id = item.get_global_linear_id();
		functor(global_id);
	});
	//...
}

int main(){
	//...
	kernel(/*An instance of Func as parameter*/);
	//...
}

Is there a way to make this work?
Thanks,
Ryan.

duncan · 11 December 2020 12:48

You can format code by indenting it with four spaces. You might also be able to introduce a block with three backquotes (```).

To put it simply, you can’t have buffers in device code. Specifically, types you instantiate on device must be standard_layout and trivially_copyable, which buffers are not. In this case you’ll need to split the buffer-holding and accessor-holding classes apart. Personally, when programming in SYCL, I find it useful to wrap the buffer class a lot of the time, but just use raw accessors in the kernel interface, so I can pull the accessor out of the buffer easily and then pass it to the kernel.
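To make the layout requirement concrete, here is a minimal host-only sketch in standard C++ (no SYCL headers). `FakeBuffer`, `HostSide`, `DeviceSide` and `device_capturable` are hypothetical names made up for illustration; `FakeBuffer` only imitates the reference-counted nature of a buffer and is not the SYCL type. It shows why a class that owns such a handle fails the check while a plain pointer-plus-size view passes it.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <type_traits>
#include <vector>

// Hypothetical stand-in for a reference-counted buffer handle (NOT cl::sycl::buffer).
template <class T>
struct FakeBuffer {
    std::shared_ptr<std::vector<T>> storage;  // shared ownership: non-trivial copy
};

// Wrapper in the spirit of the original Data class: it owns the buffer.
template <class T>
struct HostSide {
    FakeBuffer<T> buf;
};

// Device-friendly view: only a pointer and a length, no ownership.
template <class T>
struct DeviceSide {
    T* ptr;
    std::size_t size;
};

// Assumed approximation of the rule quoted above: both traits must hold.
template <class T>
constexpr bool device_capturable() {
    return std::is_standard_layout<T>::value &&
           std::is_trivially_copyable<T>::value;
}

static_assert(!device_capturable<HostSide<float>>(),
              "a type that owns a ref-counted handle is not trivially copyable");
static_assert(device_capturable<DeviceSide<float>>(),
              "a pointer-plus-size view satisfies both traits");
```

Real accessors get special treatment from the SYCL implementation, so this check is only an analogy for user-defined types that end up captured by a kernel.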

Ryan_xjh · 11 December 2020 13:20

Thanks, Duncan. It seems like we can only get an accessor from a target buffer inside a command group, since the function get_access takes a handler as an argument. Is it possible to get accessors outside queue.submit() so that I can use them directly in device kernels?

//build a buffer and get its accessor here
queue.submit([&] (cl::sycl::handler& cgh){
	//... do something
	cgh.parallel_for<class my_kernel1>(nd_range<1>(5,5),[=] (nd_item<1> item) {
		size_t global_id = item.get_global_linear_id();
		//use the accessor here
	});
});

duncan · 11 December 2020 13:42

In your example, I would simply construct the accessor at the line // ... do something, since that code is still host code at that point. If you want an alternative you could look for placeholder accessors in the specification, though they’re used much less frequently (but might match your problem better).

Ryan_xjh · 11 December 2020 13:52

I had considered using a placeholder accessor. But if I am not wrong, we need to use cgh.require() to link it with a buffer, which means it still happens inside a command group, just like using get_access() at the line // ... do something.

duncan · 11 December 2020 14:30

Yes, so in this instance I would create the accessor at the // do something line. Is there a reason why you can’t do that in this instance?

Ryan_xjh · 11 December 2020 14:56

I want to build an abstraction on top of SYCL. As in the code I first posted (I have formatted it), I want to design a template struct myData that contains an accessor, so that the underlying data can be accessed using operator()(const int i). Then I will have a function object class Func which has a myData instance fData as a member, so that I can do some computation on the data of fData using operator()(const int i) (such as fData(i) = i * 5, etc.). Once I have myData and Func set up, I can simply pass the Func object to the device kernel to run.
To put it simply, I want to simplify SYCL programming by exposing a data API (what kind of data you want to set) and a function object API (what kind of computation you want to run on that data).

duncan · 16 December 2020 21:00

Hi @Ryan_xjh,

I am not sure you can write the code as you have intended it here. Accessor construction must happen on the host, and buffers can only exist on the host, so you can’t really have your Data struct as it currently exists in SYCL.

I’m kind of assuming that you want to do this abstraction in order to paper over the differences between SYCL and some other kind of compute solution, like CUDA or similar, otherwise I don’t really see the point. In my opinion, ultimately you just end up sort of rewriting the API you’re abstracting but a little different for a small gain. I might be wrong in this instance, of course, but in this answer I am suggesting a nontrivial amount of work which (again IMO) generally would only pay off if there were some radically different systems underneath that you wanted to pretend were the same to the users writing compute kernels.

Looking at your sample in particular, I see you have // do some computation using data(i) in Func::operator()(const int). If that’s the case, instead of passing in an index and letting the user’s operator dereference data (which I’m assuming in 90% of cases would be return acc[i]), why not pass them a (mutable?) reference to a templated data type? That way you would be able to write the SYCL code which constructs the accessors in a way that you control, passing them to the user-written function object. You’d then be able to basically eliminate the Data class.
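A rough host-only sketch of that idea, in standard C++ rather than SYCL: the launch code owns the data access and hands the user functor a reference to each element. `for_each_element` and `TimesFive` are hypothetical names, a `std::vector` stands in for the buffer/accessor pair, and a serial loop stands in for `parallel_for`. Passing the index alongside the reference is an assumption made here so that the `fData(i) = i * 5` example from earlier in the thread can still be expressed.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical user functor: it receives the element itself (plus its index)
// and never touches a buffer, an accessor, or a Data wrapper.
struct TimesFive {
    void operator()(std::size_t i, int& value) const {
        value = static_cast<int>(i) * 5;
    }
};

// Host-only stand-in for the framework's launch code. In a SYCL version this
// is where the accessor would be created inside the command group, and the
// kernel body would call functor(i, acc[i]).
template <class T, class Functor>
void for_each_element(std::vector<T>& data, Functor functor) {
    for (std::size_t i = 0; i < data.size(); ++i) {
        functor(i, data[i]);
    }
}
```

With this shape the user-written functor holds no data members at all, so the question of what it is allowed to capture does not arise.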

Of course if you’d like to do more complicated stuff like halos then this idea won’t really work. At this stage, I would consider templating the user kernel on a pointer type of some kind, then instantiating the user template for each API you are abstracting over with the appropriate pointer type (for example, cl::sycl::multi_ptr<T>). That might get you to the stage where, for example, you could instantiate kernels for both CUDA and SYCL from the same user code. In this scheme Data would be a using declaration, or in a more extreme case a wrapper that papers over the differences between how APIs treat data buffers.
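A minimal sketch of templating the user kernel on a pointer type, again in host-only standard C++. `Scale` and `run_on_host` are hypothetical names; only the raw-pointer instantiation is shown, and the assumption is that a SYCL or CUDA backend would instantiate the same user template with its own pointer type.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical user kernel, written once and templated on the pointer type
// that the backend supplies. It only relies on operator[] being available.
template <class Ptr>
struct Scale {
    Ptr data;
    void operator()(std::size_t i) const { data[i] = data[i] * 2; }
};

// Hypothetical host backend: instantiates the user template with a raw
// pointer and runs it in a serial loop. Another backend would pick a
// different pointer type and a different launch mechanism.
template <template <class> class Kernel, class T>
void run_on_host(T* data, std::size_t n) {
    Kernel<T*> kernel{data};
    for (std::size_t i = 0; i < n; ++i) {
        kernel(i);
    }
}
```

The user code names no API-specific type, which is what would let a second backend reuse it unchanged.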

Honestly, though, it’s quite a lot of faff. Getting the Eigen library working on both CUDA and SYCL (which we added support for) was a lot of development effort.

Apologies if I have misunderstood your intentions here, but I’d rather not suggest some flippantly easy solution that ends up biting you later on.

Actually on second thoughts: you could maybe split Data into two classes, one which is a HostData class, and provides some DeviceData class from a member function, which hide the buffer and accessor respectively. Then you really are just rewriting the SYCL API though! (In this scheme, you would use placeholder accessors; in the HostData::makeDeviceData() function you would bind the buffer to the accessor, then require() it at the same time - then the user Func would be able to access the member through DeviceData::operator() but again I see this as a straight rewrite of SYCL.)
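The HostData/DeviceData split could look roughly like this. It is a host-only mock in standard C++, not SYCL code: a `std::vector` stands in for the buffer, a raw pointer stands in for the placeholder accessor, and `launch` is a hypothetical serial replacement for submitting a kernel. `host_view` is also a made-up helper for reading results back. The names `HostData`, `DeviceData` and `makeDeviceData` come from the post above; everything else is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <vector>

// Device-side half: a small copyable view. In a SYCL version this would wrap
// an accessor instead of a raw pointer.
template <class T>
struct DeviceData {
    T* ptr;
    std::size_t size;
    T& operator()(std::size_t i) const { return ptr[i]; }
};

// Host-side half: owns the storage and hands out the device-side view.
template <class T>
class HostData {
public:
    explicit HostData(std::size_t n) : storage_(n) {}

    // In SYCL, this is where the accessor would be bound to the buffer.
    DeviceData<T> makeDeviceData() {
        return DeviceData<T>{storage_.data(), storage_.size()};
    }

    const std::vector<T>& host_view() const { return storage_; }

private:
    std::vector<T> storage_;
};

// User functor in the spirit of Func: it holds only the device-side half.
struct Func {
    DeviceData<int> data;
    void operator()(std::size_t i) const { data(i) = static_cast<int>(i) * 5; }
};

static_assert(std::is_trivially_copyable<Func>::value,
              "the functor carries no owning members");

// Hypothetical serial stand-in for a kernel launch over n work-items.
template <class F>
void launch(std::size_t n, F f) {
    for (std::size_t i = 0; i < n; ++i) {
        f(i);
    }
}
```

The functor is copied by value into `launch`, mirroring how a kernel captures it, and the writes still land in the storage owned by `HostData`.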
