Error compiling kernel containing image accessor

#1

Trying to compile the following kernel results in different errors on AMD and Intel runtimes.

    compute_queue.submit([&](cl::sycl::handler& cgh)
        {
            auto old_lattice = latticeImages[Buffer::Front]->get_access<cl::sycl::float4, cl::sycl::access::mode::read>(cgh);
            auto new_lattice = latticeImages[Buffer::Back]->get_access<cl::sycl::float4, cl::sycl::access::mode::write>(cgh);
            
            cgh.parallel_for<kernels::ConwayStep>(cl::sycl::range<2>{ old_lattice.get_range() },
                                                  [=](const cl::sycl::item<2> item)
            {
                using namespace cl::sycl;
                using elem_type = cl::sycl::float4::element_type;
        
                sampler sampler(coordinate_normalization_mode::unnormalized,
                                addressing_mode::repeat,
                                filtering_mode::nearest);
        
                auto old = [=](cl::sycl::id<2> id) { return old_lattice.read((cl::sycl::int2)id, sampler).r(); };
        
                auto id = item.get_id();
        
                std::array<elem_type, 8> neighbours =
                    { old(id + id<2>(-1,+1)), old(id + id<2>(0,+1)), old(id + id<2>(+1,+1)),
                      old(id + id<2>(-1,0)),                         old(id + id<2>(+1,0)),
                      old(id + id<2>(-1,-1)), old(id + id<2>(0,-1)), old(id + id<2>(+1,-1))
                    };
                elem_type self = old(id);
        
                auto count = std::count_if(neighbours.cbegin(), neighbours.cend(), [](const cl::sycl::cl_float val) { return val > 0.5f; });
        
                auto val = self > 0.5f ?
                    (count < 2 || count > 3 ? 0.f : 1.f) :
                    (count == 3 ? 1.f : 0.f);
        
                new_lattice.write((cl::sycl::int2)id, cl::sycl::float4{ val, val, val, 1.f });
            });
        });

Using AMD Radeon Software 18.2.3 on Windows results in

Exception thrown at 0x00007FFA6D717A0C (amdocl64.dll) in SYCL-Conway.exe: 0xC0000005: Access violation reading location 0xFFFFFFFFFFFFFFE8.

inside amdocl64.dll which cannot be caught in any way.

Using the Intel runtime (still on Windows), it results in a cl::sycl::exception whose log entry contains:

+        message    "Error: [ComputeCpp:RT0100] Failed to build program (<Build log for program 00000264885F2510 device 0 (size 363):\nerror: undefined reference to `ZN2cl4sycl7samplerC1E_ns1NS0_29coordinate_normalizatio...    std::basic_string<char,std::char_traits<char>,std::allocator<char> >

This looks as if not just the abacus methods are missing, but the sampler references as well. Unfortunately, we have no Linux box that is both SYCL and OpenGL enabled, so we cannot test this code on Linux.

The code extracted from the original repo can be found here [OneDrive]. (It may still contain minor bugs inside the kernel, as we’ve never had the chance to actually run it, but it should work.)
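
Since the kernel has never run, a host-only reference of the rule may help when checking its output later. The following is a minimal sketch in plain C++, not taken from the repo: `step_reference` is a hypothetical helper, the grid is assumed to be stored row-major with one float per cell, and the edges are assumed to wrap around, which is what the kernel aims for with addressing_mode::repeat.

```cpp
#include <cstddef>
#include <vector>

// Host-only reference for one Conway step on a wrap-around (toroidal) grid.
// Hypothetical helper for comparison purposes; assumes row-major storage,
// width >= 1, height >= 1, and "alive" meaning a cell value above 0.5.
inline std::vector<float> step_reference(const std::vector<float>& old_cells,
                                         std::size_t width,
                                         std::size_t height)
{
    std::vector<float> next(old_cells.size(), 0.f);

    // Offsets -1, 0, +1 expressed modulo the extent, so unsigned indices never underflow.
    const std::size_t dxs[3] = { width - 1, 0, 1 };
    const std::size_t dys[3] = { height - 1, 0, 1 };

    for (std::size_t y = 0; y < height; ++y)
    {
        for (std::size_t x = 0; x < width; ++x)
        {
            int count = 0;
            for (std::size_t j = 0; j < 3; ++j)
            {
                for (std::size_t i = 0; i < 3; ++i)
                {
                    if (i == 1 && j == 1) continue; // skip the cell itself
                    const std::size_t nx = (x + dxs[i]) % width;
                    const std::size_t ny = (y + dys[j]) % height;
                    if (old_cells[ny * width + nx] > 0.5f) ++count;
                }
            }

            const bool alive = old_cells[y * width + x] > 0.5f;
            const bool lives = alive ? (count == 2 || count == 3) : (count == 3);
            next[y * width + x] = lives ? 1.f : 0.f;
        }
    }
    return next;
}
```

Feeding the same initial lattice to this function and to the device kernel, then comparing the two results cell by cell, should show whether the kernel logic itself is at fault.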

#2

Hi Máté,

In your Conway code, you’re creating the sampler inside the parallel_for - it should instead be created outside the parallel_for but still inside the scope of the command group. We’re sorry there isn’t a better error for this - see the Gaussian blur sample for some image code.

Cheers,
Duncan.

#3

Hi Duncan,

Thanks for your suggestion. Indeed, the missing symbol went away once I moved the sampler into the command group handler scope. There was another error, however: addressing_mode::repeat only works in tandem with coordinate_normalization_mode::normalized, and my sampler used unnormalized coordinates. Would the spec allow the constructor to throw if one initializes the sampler in an invalid state? Since it’s not valid to create an instance in device code anyway, it might as well throw.
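
A minimal sketch of such a check, in plain C++ so it can be tried without a SYCL implementation. The enums and `sampler_desc` are hypothetical stand-ins that only mirror the SYCL names, and `make_checked_sampler_desc` is an invented helper, not part of ComputeCpp or the spec. The rule it enforces is the one described above: the repeat modes need normalized coordinates.

```cpp
#include <stdexcept>

// Hypothetical stand-ins mirroring the SYCL enum names; these are NOT the cl::sycl types.
enum class coordinate_normalization_mode { normalized, unnormalized };
enum class addressing_mode { mirrored_repeat, repeat, clamp_to_edge, clamp, none };
enum class filtering_mode { nearest, linear };

// Plain description of a sampler configuration (hypothetical type).
struct sampler_desc
{
    coordinate_normalization_mode coords;
    addressing_mode addressing;
    filtering_mode filtering;
};

// Hypothetical helper: rejects the invalid combination at construction time
// instead of letting it fail later, when the sampler is first used.
inline sampler_desc make_checked_sampler_desc(coordinate_normalization_mode coords,
                                              addressing_mode addressing,
                                              filtering_mode filtering)
{
    const bool repeats = addressing == addressing_mode::repeat ||
                         addressing == addressing_mode::mirrored_repeat;

    if (repeats && coords != coordinate_normalization_mode::normalized)
        throw std::invalid_argument("repeat addressing requires normalized coordinates");

    return sampler_desc{ coords, addressing, filtering };
}
```

With this helper, the combination from my original kernel (unnormalized, repeat, nearest) would be rejected on the host before any kernel is built, while the normalized variant passes through unchanged.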

Thanks once again. Along with Gordon’s advice, the code finally works.

#4

Ah, that’s good to hear. Having the constructor throw sounds like a good idea. I looked at the spec, and the way it’s written seems to indicate that construction should succeed but later fail on use. Personally, I don’t like that; it’s not very C++ to have constructed-but-invalid objects. I agree that throwing is an appropriate behaviour.

I think the best way for you to proceed now is to open an issue on the SYCL specification. You could point them to this discussion, too. If the specification says that making such a sampler is not allowed, ComputeCpp (and all other implementations) will make it throw.

Best,
Duncan.