CLBlast: Float16 GEMM on Adreno 330

I followed the samples/haxpy.c example to create float16 matrices and tried CLBlastSgemm on an Adreno 330. I am getting error number -1011 (CL_INVALID_D3D9_RESOURCE_NV or CL_INVALID_DX9_RESOURCE_INTEL).

About this issue

State: closed
Created 7 years ago
Comments: 35 (31 by maintainers)

Most upvoted comments

Apologies for the long delay; something else came up and I haven’t had a chance to look at this. I’ve done some sanity checks, and the problem I was having was a bug in my own code that caused alpha to be set incorrectly.

GuntherSchuler on Oct 22, 2018

OK, maybe I should buy such a dev-kit as well, if they are not too expensive. But that will take a bit longer. Which board do you have/recommend?

There is an interface in CLBlast to set the tuning parameters manually. You could do that just before launching your kernel, possibly in an if-statement, e.g. if m == 16 then set_parameters(A) else set_parameters(B). But right now this would require re-compilation every time. I’m working on the preparation_for_size_specific_parameters branch to make it possible to save multiple compiled kernels in the cache, which would speed things up significantly.
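The if-statement idea above can be sketched as a small selection function. This is only an illustration under stated assumptions: `gemm_params`, `PARAMS_A`, `PARAMS_B` and `select_parameters` are hypothetical names that stand in for the set_parameters(A) / set_parameters(B) choice, the field names mirror CLBlast's Xgemm tuning parameters MWG, NWG and KWG, and the values are placeholders rather than tuned results. The actual CLBlast call that applies the parameters is deliberately left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical container for a few GEMM tuning parameters. The field names
 * mirror CLBlast's Xgemm parameters MWG, NWG and KWG; the values below are
 * placeholders, not tuned results. */
typedef struct {
    size_t mwg;
    size_t nwg;
    size_t kwg;
} gemm_params;

/* Hypothetical parameter set "A", intended for the small m == 16 case. */
static const gemm_params PARAMS_A = {16, 16, 16};

/* Hypothetical parameter set "B", used for every other size. */
static const gemm_params PARAMS_B = {64, 64, 32};

/* Pick a parameter set from the matrix dimension m, mirroring
 * "if m == 16 then set_parameters(A) else set_parameters(B)". */
static const gemm_params *select_parameters(size_t m) {
    assert(m > 0);
    return (m == 16) ? &PARAMS_A : &PARAMS_B;
}
```

A caller would invoke `select_parameters(m)` just before the GEMM call and hand the result to whatever mechanism applies the parameters. Because each switch currently triggers a re-compilation, keeping the number of distinct sets small seems sensible.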

Still, I think there might be an issue related to the kernel not being optimal, in some sense, for this hardware. I’ll have to get access to a device first in order to investigate further.

CNugteren on Sep 28, 2017

I have not tested that yet unfortunately. I’ll see if I can make time for it soon.

CNugteren on Sep 25, 2017

I checked the supported extensions on an NVIDIA 1080 Ti (Pascal architecture); it looks like it doesn’t support cl_khr_fp16:

=== 1 OpenCL platform(s) found: ===
  PROFILE = FULL_PROFILE
  VERSION = OpenCL 1.2 CUDA 8.0.0
  NAME = NVIDIA CUDA
  VENDOR = NVIDIA Corporation
  EXTENSIONS = cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_fp64 cl_khr_byte_addressable_store cl_khr_icd cl_khr_gl_sharing cl_nv_compiler_options cl_nv_device_attribute_query cl_nv_pragma_unroll cl_nv_copy_opts cl_nv_create_buffer
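The same check can be done programmatically instead of reading the list by eye. The sketch below assumes the extensions string has already been fetched (in OpenCL that would be `clGetDeviceInfo` with `CL_DEVICE_EXTENSIONS`); the query itself is omitted so the example has no OpenCL dependency. `has_extension` is a hypothetical helper name. It matches whole space-separated tokens, so a prefix of a longer extension name is not mistaken for a match.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Returns true if `name` appears as a whole space-separated token in
 * `extensions`, i.e. in the string a device reports as its extension list.
 * Hypothetical helper, not part of OpenCL or CLBlast. */
static bool has_extension(const char *extensions, const char *name) {
    assert(extensions != NULL && name != NULL);
    size_t len = strlen(name);
    if (len == 0) {
        return false;
    }
    const char *p = extensions;
    while ((p = strstr(p, name)) != NULL) {
        /* A real match starts at the beginning or right after a space... */
        bool starts_token = (p == extensions) || (p[-1] == ' ');
        /* ...and ends at the end of the string or right before a space. */
        bool ends_token = (p[len] == '\0') || (p[len] == ' ');
        if (starts_token && ends_token) {
            return true;
        }
        p += 1; /* keep searching after this partial match */
    }
    return false;
}
```

Calling `has_extension(extensions, "cl_khr_fp16")` before creating half-precision buffers lets the program fall back to single precision on devices that do not list the extension.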

sivagnanamn on Aug 9, 2017