Running code on cpu_selector

aatif.kiani · 15 April 2019 14:54

Hi,
I am trying to run the matrix multiplication code with cpu_selector, but I am getting this error at runtime:

Allocate memory…
Running on Intel® Core™ i7-8750H CPU @ 2.20GHz - Intel® Corporation
The Device Max Work Group Size is : 8192
The order is : 16777216
The blockSize is : 64
Internal compiler error invalid llvm.linker.options
Please report the issue on Intel OpenCL forum
https://software.intel.com/en-us/forums/opencl for assistance.

Any pointers about what I am missing here?

Any help is appreciated.
Thank you,
Aatif

rod · 16 April 2019 07:56

Hi Aatif,
The error message suggests that this is a problem within the OpenCL driver provided by Intel. It would be best to ask for help on their forum.
Which Intel OpenCL drivers are you using, including the version number?
What is the output of computecpp_info?
Thanks.

aatif.kiani · 16 April 2019 08:25

Hi Rod,

Thank you for the response. I was able to resolve this error. I had to uninstall the drivers that were already installed on my machine and reinstall the OpenCL drivers from Intel; after that it worked. Thank you.

aatif.kiani · 17 April 2019 07:49

Hi Rod,

I was able to run my code on both CPU and GPU, but the results I am getting are not what we would expect. I am doing a matrix multiplication, using the code you provide in the sample. Here are the timings of this operation.

Please have a look and let me know whether they are plausible or whether we could be doing something wrong.

rod · 17 April 2019 11:09

The sample is purely for learning purposes and is not heavily optimized.
Which CUDA code are you comparing this to?

We have some BLAS benchmarks that include matrix multiplication that I will share with you. The PR is in progress but will be available soon. That will offer a better way to compare.
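
Until then, one common way to put timings from different implementations on the same footing is to convert them to GFLOP/s, using the roughly 2 * n^3 floating-point operations of a dense n x n multiply. The following is a minimal plain C++ sketch of that idea, not the sample's code and not the benchmark mentioned above; the names naive_matmul, gflops and time_naive_matmul are hypothetical.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Naive row-major n x n matrix multiplication: c = a * b.
// Hypothetical helper for illustration only.
void naive_matmul(const std::vector<float>& a, const std::vector<float>& b,
                  std::vector<float>& c, std::size_t n) {
  assert(a.size() == n * n && b.size() == n * n && c.size() == n * n);
  for (std::size_t i = 0; i < n; ++i) {
    for (std::size_t j = 0; j < n; ++j) {
      float sum = 0.0f;
      for (std::size_t k = 0; k < n; ++k) {
        sum += a[i * n + k] * b[k * n + j];
      }
      c[i * n + j] = sum;
    }
  }
}

// A dense n x n multiply does about 2 * n^3 floating-point operations
// (one multiply and one add per inner-loop step).
double gflops(std::size_t n, double seconds) {
  assert(seconds > 0.0);
  const double nd = static_cast<double>(n);
  return (2.0 * nd * nd * nd) / (seconds * 1e9);
}

// Times a single call of naive_matmul and returns the elapsed seconds.
double time_naive_matmul(std::size_t n) {
  std::vector<float> a(n * n, 1.0f), b(n * n, 2.0f), c(n * n, 0.0f);
  const auto start = std::chrono::steady_clock::now();
  naive_matmul(a, b, c, n);
  const auto stop = std::chrono::steady_clock::now();
  return std::chrono::duration<double>(stop - start).count();
}
```

For a reasonably large n, gflops(n, time_naive_matmul(n)) gives a rate that can be compared with a rate computed the same way for another implementation, provided both are timed on the same problem size and the same operation.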

aatif.kiani · 17 April 2019 11:30

I used the existing implementation of the sgeam method in cuBLAS.
Please do share the code that you are referring to.
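
For context, cuBLAS documents sgeam as matrix-matrix addition/transposition (C = alpha * op(A) + beta * op(B)), while sgemm is the matrix multiplication routine (C = alpha * op(A) * op(B) + beta * C). If sgeam really was the routine that was timed, the comparison would be between work on the order of n^2 and work on the order of n^3. The plain C++ sketch below only illustrates what the two operations compute. It assumes square matrices, no transposes and row-major storage (cuBLAS itself is column-major), and the names geam_ref and gemm_ref are hypothetical, not cuBLAS calls.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// GEAM-style reference: returns alpha * A + beta * B (element-wise).
// Work grows with n^2.
std::vector<float> geam_ref(float alpha, const std::vector<float>& a,
                            float beta, const std::vector<float>& b,
                            std::size_t n) {
  assert(a.size() == n * n && b.size() == n * n);
  std::vector<float> c(n * n);
  for (std::size_t i = 0; i < n * n; ++i) {
    c[i] = alpha * a[i] + beta * b[i];
  }
  return c;
}

// GEMM-style reference: returns alpha * A * B + beta * C (row-major).
// Work grows with n^3.
std::vector<float> gemm_ref(float alpha, const std::vector<float>& a,
                            const std::vector<float>& b, float beta,
                            std::vector<float> c, std::size_t n) {
  assert(a.size() == n * n && b.size() == n * n && c.size() == n * n);
  for (std::size_t i = 0; i < n; ++i) {
    for (std::size_t j = 0; j < n; ++j) {
      float sum = 0.0f;
      for (std::size_t k = 0; k < n; ++k) {
        sum += a[i * n + k] * b[k * n + j];
      }
      c[i * n + j] = alpha * sum + beta * c[i * n + j];
    }
  }
  return c;
}
```

With the same inputs the two functions return different matrices, so timings of one cannot stand in for the other.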

rod · 22 April 2019 12:39

The benchmark code and README have now been pushed and are available here. Among other things, the README explains how to run the GEMM benchmarks. Let me know how you get on.
