Does ComputeCpp support the Intel Arria 10 FPGA?

Hello there,
This is my first post on the Codeplay forum.
I am an ML developer working on an FPGA platform. My code is currently written in OpenCL. It was suggested that I try ComputeCpp's OpenCL acceleration for TensorFlow, so I ran the computecpp_info executable from the bin folder.
The output below reports both devices as not supported, with the reason "Device does not support SPIR":

Toolchain information:

GLIBC version: 2.27
GLIBCXX: 20160609
This version of libstdc++ is supported.


Device Info:

Discovered 2 devices matching:
platform :
device type :

Device 0:

Device is supported : NO - Device does not support SPIR
CL_DEVICE_NAME : GeForce RTX 2080 Ti
CL_DEVICE_VENDOR : NVIDIA Corporation
CL_DRIVER_VERSION : 410.57
CL_DEVICE_TYPE : CL_DEVICE_TYPE_GPU

Device 1:

Device is supported : NO - Device does not support SPIR
CL_DEVICE_NAME : a10gx : Arria 10 Reference Platform (acla10_ref0)
CL_DEVICE_VENDOR : Intel® Corporation
CL_DRIVER_VERSION : 19.2
CL_DEVICE_TYPE : CL_DEVICE_TYPE_ACCELERATOR

The clinfo output follows, in case the additional information is needed.

Number of platforms 2
Platform Name NVIDIA CUDA
Platform Vendor NVIDIA Corporation
Platform Version OpenCL 1.2 CUDA 10.0.154
Platform Profile FULL_PROFILE
Platform Extensions cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_fp64 cl_khr_byte_addressable_store cl_khr_icd cl_khr_gl_sharing cl_nv_compiler_options cl_nv_device_attribute_query cl_nv_pragma_unroll cl_nv_copy_opts cl_nv_create_buffer
Platform Extensions function suffix NV

Platform Name Intel® FPGA SDK for OpenCL™
Platform Vendor Intel® Corporation
Platform Version OpenCL 1.0 Intel® FPGA SDK for OpenCL™, Version 19.2
Platform Profile EMBEDDED_PROFILE
Platform Extensions cl_khr_byte_addressable_store cles_khr_int64 cl_khr_icd
Platform Extensions function suffix IntelFPGA

Platform Name NVIDIA CUDA
Number of devices 1
Device Name GeForce RTX 2080 Ti
Device Vendor NVIDIA Corporation
Device Vendor ID 0x10de
Device Version OpenCL 1.2 CUDA
Driver Version 410.57
Device OpenCL C Version OpenCL C 1.2
Device Type GPU
Device Topology (NV) PCI-E, 02:00.0
Device Profile FULL_PROFILE
Device Available Yes
Compiler Available Yes
Linker Available Yes
Max compute units 68
Max clock frequency 1635MHz
Compute Capability (NV) 7.5
Device Partition (core)
Max number of sub-devices 1
Supported partition types None
Max work item dimensions 3
Max work item sizes 1024x1024x64
Max work group size 1024
Preferred work group size multiple 32
Warp size (NV) 32
Preferred / native vector sizes
char 1 / 1
short 1 / 1
int 1 / 1
long 1 / 1
half 0 / 0 (n/a)
float 1 / 1
double 1 / 1 (cl_khr_fp64)
Half-precision Floating-point support (n/a)
Single-precision Floating-point support (core)
Denormals Yes
Infinity and NANs Yes
Round to nearest Yes
Round to zero Yes
Round to infinity Yes
IEEE754-2008 fused multiply-add Yes
Support is emulated in software No
Correctly-rounded divide and sqrt operations Yes
Double-precision Floating-point support (cl_khr_fp64)
Denormals Yes
Infinity and NANs Yes
Round to nearest Yes
Round to zero Yes
Round to infinity Yes
IEEE754-2008 fused multiply-add Yes
Support is emulated in software No
Address bits 64, Little-Endian
Global memory size 11520638976 (10.73GiB)
Error Correction support No
Max memory allocation 2880159744 (2.682GiB)
Unified memory for Host and Device No
Integrated memory (NV) No
Minimum alignment for any data type 128 bytes
Alignment of base address 4096 bits (512 bytes)
Global Memory cache type Read/Write
Global Memory cache size 1114112 (1.062MiB)
Global Memory cache line size 128 bytes
Image support Yes
Max number of samplers per kernel 32
Max size for 1D images from buffer 134217728 pixels
Max 1D or 2D image array size 2048 images
Max 2D image size 32768x32768 pixels
Max 3D image size 16384x16384x16384 pixels
Max number of read image args 256
Max number of write image args 32
Local memory type Local
Local memory size 49152 (48KiB)
Registers per block (NV) 65536
Max number of constant args 9
Max constant buffer size 65536 (64KiB)
Max size of kernel argument 4352 (4.25KiB)
Queue properties
Out-of-order execution Yes
Profiling Yes
Prefer user sync for interop No
Profiling timer resolution 1000ns
Execution capabilities
Run OpenCL kernels Yes
Run native kernels No
Kernel execution timeout (NV) No
Concurrent copy and kernel execution (NV) Yes
Number of async copy engines 3
printf() buffer size 1048576 (1024KiB)
Built-in kernels
Device Extensions cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_fp64 cl_khr_byte_addressable_store cl_khr_icd cl_khr_gl_sharing cl_nv_compiler_options cl_nv_device_attribute_query cl_nv_pragma_unroll cl_nv_copy_opts cl_nv_create_buffer

Platform Name Intel® FPGA SDK for OpenCL™
Number of devices 1
Device Name a10gx : Arria 10 Reference Platform (acla10_ref0)
Device Vendor Intel® Corporation
Device Vendor ID 0x1172
Device Version OpenCL 1.0 Intel® FPGA SDK for OpenCL™, Version 19.2
Driver Version 19.2
Device OpenCL C Version OpenCL C 1.0
Device Type Accelerator
Device Profile EMBEDDED_PROFILE
Device Available Yes
Compiler Available No
Max compute units 1
Max clock frequency 1000MHz
Max work item dimensions 3
Max work item sizes 2147483647x2147483647x2147483647
Max work group size 2147483647
Preferred / native vector sizes
char 4 / 4
short 2 / 2
int 1 / 1
long 1 / 1
half 0 / 0 (n/a)
float 1 / 1
double 0 / 0 (n/a)
Half-precision Floating-point support (n/a)
Single-precision Floating-point support (core)
Denormals No
Infinity and NANs Yes
Round to nearest Yes
Round to zero No
Round to infinity No
IEEE754-2008 fused multiply-add No
Support is emulated in software No
Correctly-rounded divide and sqrt operations No
Double-precision Floating-point support (n/a)
Address bits 64, Little-Endian
Global memory size 2147482624 (2GiB)
Error Correction support No
Max memory allocation 2147482624 (2GiB)
Unified memory for Host and Device No
Minimum alignment for any data type 1024 bytes
Alignment of base address 8192 bits (1024 bytes)
Global Memory cache type Read-Only
Global Memory cache size 32768 (32KiB)
Global Memory cache line size 0 bytes
Image support No
Local memory type Local
Local memory size 16384 (16KiB)
Max number of constant args 8
Max constant buffer size 536870656 (512MiB)
Max size of kernel argument 256
Queue properties
Out-of-order execution No
Profiling Yes
Profiling timer resolution 1ns
Execution capabilities
Run OpenCL kernels Yes
Run native kernels No
Device Extensions cl_khr_byte_addressable_store cles_khr_int64 cl_khr_icd

NULL platform behavior
clGetPlatformInfo(NULL, CL_PLATFORM_NAME, …) No platform
clGetDeviceIDs(NULL, CL_DEVICE_TYPE_ALL, …) No platform
clCreateContext(NULL, …) [default] No platform
clCreateContext(NULL, …) [other] Success [NV]
clCreateContextFromType(NULL, CL_DEVICE_TYPE_DEFAULT) No platform
clCreateContextFromType(NULL, CL_DEVICE_TYPE_CPU) No devices found in platform
clCreateContextFromType(NULL, CL_DEVICE_TYPE_GPU) No platform
clCreateContextFromType(NULL, CL_DEVICE_TYPE_ACCELERATOR) No devices found in platform
clCreateContextFromType(NULL, CL_DEVICE_TYPE_CUSTOM) Invalid device type for platform
clCreateContextFromType(NULL, CL_DEVICE_TYPE_ALL) No platform

Even so, I would like the experts to confirm: is it possible to use the Intel Arria 10 FPGA with ComputeCpp?
Thanks in advance :)

Best Regards, Mohit

Hi Mohit, welcome.
ComputeCpp is only able to support OpenCL drivers that can accept SPIR or SPIR-V instructions. It looks like the device you are trying to use does not have drivers that support this. ComputeCpp is capable of supporting Intel CPUs and GPUs, but not FPGAs.
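To make that concrete, here is a rough sketch of the kind of check you can do yourself on the extension lists that clinfo prints. It assumes that SPIR support is advertised through the standard Khronos extension cl_khr_spir and SPIR-V through cl_khr_il_program. The helper name supported_ils is made up for this example, and this is not necessarily how computecpp_info makes its decision (OpenCL 2.1+ drivers can also report SPIR-V through the device IL version, which this sketch ignores).

```python
# Extension strings copied from the "Device Extensions" lines of the
# clinfo output earlier in this thread.
NVIDIA_EXTENSIONS = (
    "cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics "
    "cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics "
    "cl_khr_fp64 cl_khr_byte_addressable_store cl_khr_icd cl_khr_gl_sharing "
    "cl_nv_compiler_options cl_nv_device_attribute_query cl_nv_pragma_unroll "
    "cl_nv_copy_opts cl_nv_create_buffer"
)
FPGA_EXTENSIONS = "cl_khr_byte_addressable_store cles_khr_int64 cl_khr_icd"

# Assumption: these two Khronos extensions are the signal we look for.
IL_EXTENSIONS = {
    "cl_khr_spir": "SPIR",
    "cl_khr_il_program": "SPIR-V",
}


def supported_ils(extensions):
    """Return the intermediate languages advertised in an extension string.

    `supported_ils` is a hypothetical helper for illustration only.
    """
    present = set(extensions.split())
    return [il for ext, il in IL_EXTENSIONS.items() if ext in present]


if __name__ == "__main__":
    for name, exts in [("GeForce RTX 2080 Ti", NVIDIA_EXTENSIONS),
                       ("Arria 10 (a10gx)", FPGA_EXTENSIONS)]:
        ils = supported_ils(exts)
        print(name, "->", ", ".join(ils) if ils else "no SPIR / SPIR-V extension listed")
```

Neither of the two extension lists in your post contains cl_khr_spir or cl_khr_il_program, which is consistent with the "Device does not support SPIR" lines in the computecpp_info output.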
I’d add, however, that I’m not sure FPGA devices are capable of running TensorFlow anyway. There seem to be some research papers on the topic, but nothing that indicates actual support. I think you would be best off asking Intel directly about this.
Rod.

Hi Rod,
Many thanks for the quick clarification.
Have a nice time.
Best Regards, Mohit