ComputeCpp adds address space attribute to raw pointers, thus breaking templates

Minimal working example below. I’m not sure if this is a compiler bug or a disagreement between compilers on how to interpret the specification. The code will happily compile with the Intel SYCL compiler but fail with ComputeCpp 1.1.4.

Code:

#include <cstddef>
    
#include <CL/sycl.hpp>

template <class TAcc>
auto get_pointer(TAcc acc)
{
    return static_cast<typename TAcc::value_type*>(acc.get_pointer());
}


template <class T>
auto add(T* ptr, T val, int idx);

template <>
auto add<int>(int* ptr, int val, int idx)
{
    ptr[idx] += val;
}

template <class T>
auto call_add(T* ptr, int idx)
{
    add(ptr, 42, idx);
}

class adder;

auto main() -> int
{
    auto queue = cl::sycl::queue{cl::sycl::default_selector{}};
    auto buf = cl::sycl::buffer<int>{cl::sycl::range<1>{1024}};
    
    queue.submit([&](cl::sycl::handler& cgh)
    {
        auto acc = buf.get_access<cl::sycl::access::mode::read_write>(cgh);

        cgh.single_task<adder>([=]()
        {
            auto raw_ptr = get_pointer(acc);
            for(auto i = 0; i < 1024; ++i)
                call_add(raw_ptr, i);
        });
    });
    queue.wait();

    return 0;
}

Compiler command line:

compute++ -std=c++2a -sycl-driver -sycl-target spirv64 -I/home/jan/software/sycl/computecpp/include template-ptr.cpp -ftemplate-backtrace-limit=0 -o template-ptr -lComputeCpp

Error message:

template-ptr.cpp:24:5: error: no matching function for call to 'add'
    add(ptr, 42, idx);
    ^~~
template-ptr.cpp:22:6: note: in instantiation of function template specialization 'call_add<__global int>' requested here
auto call_add(T* ptr, int idx)
    ^
/home/jan/software/sycl/computecpp/include/SYCL/apis.h:1301:5: note: in instantiation of function template specialization 'cl::sycl::kernelgen_single_task<adder, (lambda at template-ptr.cpp:38:32)>'
  requested here
    kernelgen_single_task<
    ^
template-ptr.cpp:38:13: note: in instantiation of function template specialization 'cl::sycl::handler::single_task<adder, (lambda at template-ptr.cpp:38:32)>' requested here
        cgh.single_task<adder>([=]()
            ^
template-ptr.cpp:13:6: note: candidate template ignored: deduced conflicting types for parameter 'T' ('__global int' vs. 'int')
auto add(T* ptr, T val, int idx);
     ^
1 error generated.

Hello Jan,

Thanks for reporting this! This is an issue that we’ve been aware of for a while, and unfortunately it’s another tricky one. Clang treats address space qualifiers such as __global as part of the type, which results in error messages like the one you get with this code. As far as I am aware, Intel consider pointers to be in the “generic” address space, which is an OpenCL 2.0 only feature. The benefit there is that it can compile code like your sample here; the downside is that it won’t run on OpenCL 1.2 devices.

Incidentally I’ve seen your issues on the Alpaka project and on the SYCL spec GitHub - you’re grappling with some interesting questions!

I hope this helps,
Duncan.

(Also, I don’t want to jump into your other thread on GitHub uninvited, but the idea of grabbing the address of the local memory from a pure OpenCL C kernel won’t really work on (say) most non-CPU platforms, as far as I know. The atomics stuff is looking hard as well - there’s not really any way to do atomics in OpenCL 1.2 without knowing exactly which address space your pointer is qualified with.)
(One final thing: I’d be very surprised if a function with a single argument couldn’t be deduced correctly. The problem here is that two arguments conflict in the type deduction: ptr deduces T as '__global int', while val deduces it as 'int'.)
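
For readers without ComputeCpp at hand, the deduction conflict can be reproduced in plain standard C++. This is only a rough sketch under an assumption: volatile stands in for the __global address space qualifier, since both end up as part of the pointee type. The names add_strict and strict_call_ok are hypothetical and not part of SYCL or ComputeCpp.

```cpp
#include <type_traits>
#include <utility>

// Same shape as `add` in the original post: T appears in two deduced positions.
template <class T>
void add_strict(T* ptr, T val, int idx)
{
    ptr[idx] = ptr[idx] + val;
}

// Hypothetical detection helper: the first overload is viable only if
// add_strict(P, V, int) survives template argument deduction.
template <class P, class V>
auto strict_call_ok(int)
    -> decltype(add_strict(std::declval<P>(), std::declval<V>(), 0), std::true_type{});

// Fallback overload, chosen when deduction for add_strict fails.
template <class P, class V>
auto strict_call_ok(...) -> std::false_type;

// Plain pointer: both arguments deduce T = int, so the call is accepted.
static_assert(decltype(strict_call_ok<int*, int>(0))::value,
              "int* and int agree on T = int");

// Qualified pointee (volatile as a stand-in for __global): ptr deduces
// T = volatile int while val deduces T = int, so the candidate is ignored.
static_assert(!decltype(strict_call_ok<volatile int*, int>(0))::value,
              "a qualifier on the pointee makes the two deductions conflict");
```

Both static_asserts are checked at compile time, so the snippet compiles only if deduction behaves as described. It says nothing about how ComputeCpp attaches __global internally; it only mirrors the 'deduced conflicting types' diagnostic above.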

Hi Duncan,

Thanks for clearing things up. I wasn’t aware that this actually comes from differences between OpenCL versions. Is there a known workaround for this problem (the template instantiation issue, not the atomics)?

“I don’t want to jump into your other thread on GitHub uninvited”

Please do! Getting input from the SYCL perspective would certainly benefit the discussion :)

Hi Jan,

I’m afraid I don’t know of any workarounds that will fix this current situation. There is an argument that code like this should “just work” which I know lots of people in the company agree with. However, quite how that translates into “actually making it work” is a much harder proposition. I think we will be able to make it work, I just couldn’t say when.

There is one last thing to try, actually. It might look a bit unclean, but you could try adding an explicit template parameter at the call site inside call_add, so that the call to add looks something like add<T>(ptr, val, idx);. I appreciate this isn’t a good solution in general, but it might get you past this hurdle for now.
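
A related idea, sketched here in plain standard C++ rather than tested against ComputeCpp, is to stop the value argument from taking part in deduction at all. This swaps the explicit template argument for a different technique (a non-deduced context), and it again uses volatile as a stand-in for __global. The names add_relaxed and call_add_relaxed are hypothetical. Standard C++ has no trait that strips an address space qualifier, so whether an equivalent exists for ComputeCpp is an open assumption.

```cpp
#include <type_traits>

// T is deduced from the pointer only. std::remove_cv_t<T> names a nested
// type, which is a non-deduced context, so `val` cannot conflict with `ptr`.
template <class T>
void add_relaxed(T* ptr, std::remove_cv_t<T> val, int idx)
{
    ptr[idx] = ptr[idx] + val;
}

// Mirrors call_add from the original post, forwarding the literal 42.
template <class T>
void call_add_relaxed(T* ptr, int idx)
{
    add_relaxed(ptr, 42, idx);
}
```

With this signature both call_add_relaxed(int*, ...) and call_add_relaxed(volatile int*, ...) compile, because the literal 42 is converted to the unqualified value type instead of being used to deduce T. Note that the original post relies on an explicit specialization add<int>, which a qualified T would still not select; the sketch sidesteps that by defining the primary template.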

Thank you for tagging me in the thread, as and when I see stuff I think I can help with I’ll leave a comment.

Cheers,
Duncan.