Passing multiple accessors to the kernel

Hello,

I’ve been looking for a way to pass an arbitrary number of accessors to the kernel, but I’ve been having trouble with it. The code compiles and executes properly on the host device; however, it throws an exception at runtime on CL devices.

The general idea is to receive a variable number of iterators through a template to then create a buffer and accessor for each. Finally, an operation will be applied with the received data and stored somewhere else.

The general structure of the code is as follows:

First the iterators are passed by means of a tuple to the function.

    template <typename ... Input, typename Output, typename Operation>
    void my_function(
        std::tuple<Input...> in_iters,
        Output out,
        Operation && op) const
    {

Then a buffer is created for each iterator, and the buffers are stored in a std::array.

    std::array in_buffers = {std::apply([size](const auto&... inputs){
        std::array collection{sycl::buffer<T,1>{inputs, inputs + size}...};
        return collection;
    }, in_iters)};
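
For readers who want to try the pack expansion in isolation, here is a minimal host-only sketch of the same idea. It assumes std::vector<T> as a stand-in for sycl::buffer<T,1> so that it builds without a SYCL implementation, and the helper name make_collections is hypothetical (it is not part of the code above).

```cpp
#include <array>
#include <cstddef>
#include <tuple>
#include <vector>

// Hypothetical helper: std::vector<T> stands in for sycl::buffer<T,1>.
// Every iterator in the tuple must point to at least `size` elements.
template <typename T, typename... Input>
std::array<std::vector<T>, sizeof...(Input)>
make_collections(const std::tuple<Input...>& in_iters, std::size_t size)
{
    return std::apply(
        [size](const auto&... inputs) {
            // One container per iterator, copied from [inputs, inputs + size).
            return std::array<std::vector<T>, sizeof...(Input)>{
                std::vector<T>(inputs, inputs + size)...};
        },
        in_iters);
}
```

Spelling out the array type instead of relying on class template argument deduction makes it explicit that the result holds one element per input iterator.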

A similar approach is taken with the creation of the accessors.

    std::array in_accs = {std::apply([&](auto&... buffers){
        std::array accessors{buffers.template get_access<sycl::access::mode::read>(cgh)...};
        return accessors;
    }, in_buffers)};

Finally we have the kernel code that calls the lambda function with all of the other accessors and stores the result.

    cgh.template parallel_for<myKernel>(sycl::range<1>{size}, [=] (sycl::id<1> index) {
        out_acc[index] = std::apply([&](const auto &...accessors){
            return op(accessors[index]...);
        }, in_accs);
    });
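
To make the intended per-element behaviour concrete, here is a rough host-only sketch of what the kernel body computes. It is an illustration under assumptions, not the SYCL code itself: std::vector<T> plays the role of the read accessors, a plain loop replaces parallel_for, and the name apply_elementwise is hypothetical.

```cpp
#include <array>
#include <cstddef>
#include <tuple>
#include <vector>

// Hypothetical host-side stand-in for the kernel body.
// Each input vector must hold at least `size` elements.
template <typename T, std::size_t N, typename Operation>
std::vector<T> apply_elementwise(const std::array<std::vector<T>, N>& inputs,
                                 std::size_t size, Operation op)
{
    std::vector<T> out(size);
    for (std::size_t index = 0; index < size; ++index) {
        // std::apply accepts std::array because it is tuple-like; the pack
        // expansion passes the index-th element of every input to op.
        out[index] = std::apply(
            [&](const auto&... in) { return op(in[index]...); }, inputs);
    }
    return out;
}
```

With three inputs, op is called with three arguments per index, which is why all of the accessors need to be visible together inside the kernel.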

When run on a device other than the host, the error CL_INVALID_ARG_SIZE is caught. The problem occurs when ‘in_accs’ is used. I suppose this is because the variable is not an accessor but an array of accessors. I’ve also tried to use an accessor array (sycl::accessor[.]) with no luck. I’ve seen that DPC++ can execute this code, but I’m unsure whether that is because of an extension in the compiler. I was wondering if there’s another approach, since access to all of the variables at the same time is required due to the way the lambda function is structured.

Thank you in advance

Apologies for the time it took to respond to this question; there is no obvious solution. The code is indeed valid C++, but it looks like it may cause some sort of layout problem.
Can you provide the compile command and its output, along with the host compiler and version you are using? This might help us understand a bit more.

CL_INVALID_ARG_SIZE is the error OpenCL reports when the size given for a kernel argument does not match what the kernel expects, so this is probably a miscommunication between the device compiler and the host.

Hello Rod,

Thanks for the response, I’ll try to provide the requested information. I’m currently using the CMake module available in the ComputeCpp-SDK repository, together with ComputeCpp Community Edition v2.4.0. Two host compilers were used, both producing the same issue.

gcc version 9.3.0 (Ubuntu 9.3.0-17ubuntu1~20.04)
clang version 10.0.0-4ubuntu1

The CMake output is the following:

cmake ..  -DComputeCpp_DIR=~/Desktop/ComputeCpp-CE-2.4.0-x86_64-linux-gnu/ -DSYCL_LANGUAGE_VERSION=202001
-- The C compiler identification is GNU 9.3.0
-- The CXX compiler identification is GNU 9.3.0
-- Check for working C compiler: /usr/bin/cc
-- Check for working C compiler: /usr/bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: /usr/bin/c++
-- Check for working CXX compiler: /usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Detected processor is x86_64
-- CMAKE_C_COMPILER: /usr/bin/cc
-- CMAKE_CXX_COMPILER: /usr/bin/c++
-- CMAKE_CXX_COMPILER_ID: GNU
-- C++ standard not set
-- C++ standard set to 17
-- C++ Compiler is GNU version 9.3.0
-- Looking for CL_VERSION_2_2
-- Looking for CL_VERSION_2_2 - found
-- Found OpenCL: /usr/lib/x86_64-linux-gnu/libOpenCL.so (found version "2.2") 
-- platform - your system can support ComputeCpp
-- Found ComputeCpp: /home/yago/Desktop/ComputeCpp-CE-2.4.0-x86_64-linux-gnu (found version "CE 2.4.0 2021/02/19") 
-- compute++ flags - -O2;-mllvm;-inline-threshold=1000;-intelspirmetadata;-DSYCL_LANGUAGE_VERSION=202001;-sycl-target;spir64
-- Configuring done
-- Generating done
-- Build files have been written to: /home/yago/Desktop/grppi/build

The output of the make command is the following:

make sycl_add_sequences
Scanning dependencies of target sycl_add_sequences_main.cpp_0_ih
[ 33%] Building ComputeCpp integration header file /home/yago/Desktop/grppi/build/samples/sycl/map/sycl_add_sequences/sycl_add_sequences_main.cpp.sycl
In file included from /home/yago/Desktop/grppi/samples/sycl/map/sycl_add_sequences/main.cpp:26:
In file included from /home/yago/Desktop/grppi/include/grppi/grppi.h:20:
In file included from /home/yago/Desktop/grppi/include/grppi/dyn/dynamic_execution.h:20:
In file included from /home/yago/Desktop/grppi/include/grppi/dyn/../native/parallel_execution_native.h:19:
/home/yago/Desktop/grppi/include/grppi/dyn/../native/worker_pool.h:48:19: warning: explicitly defaulted move assignment operator is implicitly deleted [-Wdefaulted-function-deleted]
    worker_pool & operator=(worker_pool &&) noexcept = default;
                  ^
/home/yago/Desktop/grppi/include/grppi/dyn/../native/worker_pool.h:87:15: note: move assignment operator of 'worker_pool' is implicitly deleted because field 'num_threads_' is of const-qualified type 'const int'
    const int num_threads_;
              ^
remark: [Computecpp:CC0027]: Some memcpy/memset intrinsics added by the llvm optimizer were replaced by serial functions. This is a workaround for OpenCL drivers that do not support those intrinsics. This may impact performance, consider using -no-serial-memop. [-Rsycl-serial-memop]
1 warning generated.
[ 33%] Built target sycl_add_sequences_main.cpp_0_ih
Scanning dependencies of target sycl_add_sequences
[ 66%] Building CXX object samples/sycl/map/sycl_add_sequences/CMakeFiles/sycl_add_sequences.dir/main.cpp.o
[100%] Linking CXX executable sycl_add_sequences
[100%] Built target sycl_add_sequences

I hope this is of help, and thank you for your support.

Thank you for the information. That was helpful and we managed to make a repro case. There are a couple of issues that need to be resolved in ComputeCpp and we plan to fix those fairly soon. I’ll update this thread when they are resolved and ready in a new release of ComputeCpp. Hopefully it shouldn’t take too long, but it will depend on the timing of the next release.


That’s great to hear. I’ll be awaiting the update for when it comes. Thank you for taking the time to look into it.

We have not forgotten about this; however, we’ve not been able to address the issues yet. I’ll update you when we do.
