Hi,
I'm integrating ArrayFire into an existing OpenCL application that uses two NVIDIA GPUs on an Intel-based Windows 10 host. The application's OpenCL context includes both GPUs, and I would like ArrayFire to share that context so it can access the existing OpenCL buffers efficiently.
Issue:
I am able to create the custom AF device using afcl::addDevice(dev, ctx, que) and set the active device using afcl::setDevice(dev, ctx), and I can see the corresponding changes in the output from af::info().
Furthermore, I can perform simple operations such as array creation and element-wise addition. However, an exception is thrown if I attempt to call af::matmul.
CLBlast: OpenCL error: clGetProgramInfo: -30
In function void __cdecl opencl::gemm<float>(class opencl::Array<float> &,af_mat_prop,af_mat_prop,const float *,const class opencl::Array<float> &,const class opencl::Array<float> &,const float *)
In file src\backend\opencl\blas.cpp:130
CLBlast Error (-30): CL_INVALID_VALUE
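For readers who want to decode the numeric status in a log like the one above, here is a minimal sketch of a lookup helper. The function name clErrorName is hypothetical and is not part of ArrayFire, CLBlast, or OpenCL. The numeric values are those defined in the standard CL/cl.h header, and only a small subset is listed.

```cpp
#include <string>

// Hypothetical helper (not an ArrayFire/CLBlast/OpenCL API): maps a few
// OpenCL status codes, using the values defined in CL/cl.h, to their names.
// Only a subset is covered; any other code gets a generic label.
std::string clErrorName(int code) {
    switch (code) {
        case 0:   return "CL_SUCCESS";
        case -5:  return "CL_OUT_OF_RESOURCES";
        case -6:  return "CL_OUT_OF_HOST_MEMORY";
        case -11: return "CL_BUILD_PROGRAM_FAILURE";
        case -30: return "CL_INVALID_VALUE";
        case -33: return "CL_INVALID_DEVICE";
        case -34: return "CL_INVALID_CONTEXT";
        case -36: return "CL_INVALID_COMMAND_QUEUE";
        case -44: return "CL_INVALID_PROGRAM";
        case -45: return "CL_INVALID_PROGRAM_EXECUTABLE";
        default:  return "UNKNOWN_CL_ERROR(" + std::to_string(code) + ")";
    }
}
```

Calling clErrorName(-30) yields the same CL_INVALID_VALUE name that CLBlast reports, which confirms that the failure comes from an argument passed to clGetProgramInfo rather than from a build or resource error.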
Detail:
Once the custom device has been added and selected, the following AF test code is executed. Evaluation of tmp3 works fine, but the exception is raised in the call to af::matmul(). The complete console output, including the stack trace, can be found here: af_exception.txt
// Placeholder for addDevice() and setDevice() calls
// Add custom device/context/queue
// afcl::addDevice(dev, ctx, que);
// Set the active device
// afcl::setDevice(dev, ctx);
// Setup test arrays
auto tmp1 = af::identity({ 3,3 }, f32);
auto tmp2 = af::constant(1.0F, { 3, 3 });
fmt::print("\nEvaluate tmp1 + tmp2...\n\n");
auto tmp3 = tmp1 + tmp2;
tmp3.eval();
af::print("tmp1 + tmp2 = ", tmp3);
fmt::print("\nEvaluate matmul(tmp1,tmp2)...\n\n");
auto tmp4 = af::matmul(tmp1, tmp2);
tmp4.eval();
af::print("matmul(tmp1,tmp2) = ", tmp4);
If the custom OpenCL context ctx is created with a single GPU, then the above code works fine.
Apologies for not providing a minimal working example at this stage; I should be able to put one together if required.
Many thanks,
Russ