A minimal working example is below. I’m not sure whether this is a compiler bug or a disagreement between compilers about how to interpret the specification. The code compiles with the Intel SYCL compiler but fails with ComputeCpp 1.1.4.
Code:
#include <cstddef>

#include <CL/sycl.hpp>

template <class TAcc>
auto get_pointer(TAcc acc)
{
    return static_cast<typename TAcc::value_type*>(acc.get_pointer());
}

template <class T>
auto add(T* ptr, T val, int idx);

template <>
auto add<int>(int* ptr, int val, int idx)
{
    ptr[idx] += val;
}

template <class T>
auto call_add(T* ptr, int idx)
{
    add(ptr, 42, idx);
}

class adder;

auto main() -> int
{
    auto queue = cl::sycl::queue{cl::sycl::default_selector{}};
    auto buf = cl::sycl::buffer<int>{cl::sycl::range<1>{1024}};

    queue.submit([&](cl::sycl::handler& cgh)
    {
        auto acc = buf.get_access<cl::sycl::access::mode::read_write>(cgh);

        cgh.single_task<adder>([=]()
        {
            auto raw_ptr = get_pointer(acc);
            for(auto i = 0; i < 1024; ++i)
                call_add(raw_ptr, i);
        });
    });

    queue.wait();
    return 0;
}
Compiler command line:
compute++ -std=c++2a -sycl-driver -sycl-target spirv64 -I/home/jan/software/sycl/computecpp/include template-ptr.cpp -ftemplate-backtrace-limit=0 -o template-ptr -lComputeCpp
Error message:
template-ptr.cpp:24:5: error: no matching function for call to 'add'
    add(ptr, 42, idx);
    ^~~
template-ptr.cpp:22:6: note: in instantiation of function template specialization 'call_add<__global int>' requested here
auto call_add(T* ptr, int idx)
     ^
/home/jan/software/sycl/computecpp/include/SYCL/apis.h:1301:5: note: in instantiation of function template specialization 'cl::sycl::kernelgen_single_task<adder, (lambda at template-ptr.cpp:38:32)>' requested here
    kernelgen_single_task<
    ^
template-ptr.cpp:38:13: note: in instantiation of function template specialization 'cl::sycl::handler::single_task<adder, (lambda at template-ptr.cpp:38:32)>' requested here
        cgh.single_task<adder>([=]()
            ^
template-ptr.cpp:13:6: note: candidate template ignored: deduced conflicting types for parameter 'T' ('__global int' vs. 'int')
auto add(T* ptr, T val, int idx);
     ^
1 error generated.
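
For reference, the last note can be reproduced on the host without SYCL. The sketch below is only an analogy: it assumes that ComputeCpp carries the `__global` address space as a qualifier on the pointee type, and it uses `volatile` as a stand-in for that qualifier. The names `non_deduced`, `add_both`, `add_ptr_only` and `run_demo` are hypothetical and not part of the original example. It shows why deducing `T` from both `T* ptr` and `T val` conflicts once the pointee is qualified, and how taking `val` in a non-deduced context avoids the conflict.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical helper: naming T through a nested typedef puts it in a
// non-deduced context, so that parameter takes no part in deduction.
template <class T>
struct non_deduced { using type = T; };

// Same shape as add() in the question: T is deduced from both ptr and val.
template <class T>
void add_both(T* ptr, T val, int idx)
{
    ptr[idx] = ptr[idx] + val;
}

// Variant: T is deduced from ptr alone; val has the unqualified type of T.
template <class T>
void add_ptr_only(T* ptr,
                  typename non_deduced<typename std::remove_cv<T>::type>::type val,
                  int idx)
{
    ptr[idx] = ptr[idx] + val;
}

// Calls the variant through a qualified pointer and returns the updated element.
inline int run_demo()
{
    volatile int data[2] = {1, 2};
    volatile int* ptr = data;  // volatile stands in for __global (assumption)

    // add_both(ptr, 42, 0);   // does not compile: T is 'volatile int' vs. 'int'
    add_ptr_only(ptr, 42, 1);  // T = volatile int, val is a plain int

    assert(data[0] == 1);      // element 0 was never touched
    return data[1];
}
```

Note that this only illustrates the deduction conflict. In the original code the only definition of `add` is the `add<int>` specialization, so a call that deduces `T` as `__global int` would still have no definition to use even if deduction succeeded.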