Fix convolve3 launch configuration in CUDA backend #2519

9prady9 · 2019-05-22T08:56:13Z

Batched input to Convolve3 was incorrectly folding the batch size
into the y and z dimensions of the CUDA launch grid. Since batching for 3D
inputs can only happen along the 4th dimension, folding it into the x dimension
of the CUDA grid is sufficient.
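
For readers who want to see the folding concretely, here is a minimal host-side sketch in plain C++. It is not the actual ArrayFire kernel code: `Dim3`, `conv3Grid`, `batchOf`, and `tileOf` are hypothetical names, and the thread-block shape is an arbitrary assumption. It only illustrates multiplying the x block count by the batch size (the 4th output dimension) and recovering the batch index from the x block index.

```cpp
#include <cassert>

// Stand-in for CUDA's dim3 so the sketch compiles without the CUDA toolkit.
struct Dim3 {
    unsigned x, y, z;
};

// Ceiling division: number of tiles of size b needed to cover a elements.
constexpr unsigned divup(unsigned a, unsigned b) { return (a + b - 1) / b; }

// Hypothetical helper: launch grid for a batched 3D convolution whose output
// has dimensions dims[0..2] (spatial) and dims[3] (batch).
// Only the x dimension of the grid carries the batch; y and z are untouched.
Dim3 conv3Grid(const unsigned dims[4], Dim3 threads) {
    assert(threads.x > 0 && threads.y > 0 && threads.z > 0);
    const unsigned blk_x = divup(dims[0], threads.x);
    const unsigned blk_y = divup(dims[1], threads.y);
    const unsigned blk_z = divup(dims[2], threads.z);
    return Dim3{blk_x * dims[3], blk_y, blk_z};
}

// Inside a kernel, the batch index and the x tile index would be recovered
// from blockIdx.x like this (blk_x is the per-batch block count along x).
constexpr unsigned batchOf(unsigned blockIdxX, unsigned blk_x) {
    return blockIdxX / blk_x;
}
constexpr unsigned tileOf(unsigned blockIdxX, unsigned blk_x) {
    return blockIdxX % blk_x;
}
```

With an assumed 8x8x4 thread block and an output of 100x60x20 with a batch of 5, the grid's x dimension becomes 13 * 5 blocks, while y and z stay at the per-volume counts of 8 and 5.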

arrayfire/arrayfire-data#14 has to be merged before this PR is merged.

syurkevi · 2019-05-27T04:38:36Z

Would it be possible to update the convolve3 documentation to describe the expected batching behavior?

9prady9 · 2019-05-28T04:30:47Z

@syurkevi We already explain the batching behavior for all convolve functions in our documentation. This change is an internal fix for the CUDA kernel launch configuration and doesn't change the external batching behavior.

syurkevi · 2019-05-29T15:27:56Z

> @syurkevi We already explain the batching behavior for all convolve functions in our documentation. This change is an internal fix for the CUDA kernel launch configuration and doesn't change the external batching behavior.

The documentation you linked is copy-pasted from convolve2 and doesn't apply to convolve3. It would be nice to add the correct batching behavior to the documentation, as it is difficult to understand what it should be without reading the source code.

9prady9 · 2019-05-30T15:11:01Z

9prady9 · 2019-05-30T15:11:33Z

@arrayfire/core-devel Alright, here's the updated documentation for convolution. Please have a look and share feedback.

9prady9 added CUDA fix labels May 22, 2019

9prady9 requested a review from umar456 May 22, 2019 08:56

umar456 requested review from syurkevi and removed request for umar456 May 26, 2019 23:22

9prady9 added 2 commits May 30, 2019 13:43

Fix convolve3 launch configuration in CUDA backend

83c45b1

Batched input to Convolve3 was incorrectly folding the batch size into the y and z dimensions of the CUDA launch grid. Since batching for 3D inputs can only happen along the 4th dimension, folding it into the x dimension of the CUDA grid is sufficient.

Fix documentation of convolution functions

ec6a481

9prady9 force-pushed the fix_conv3_launch branch from e5f0c32 to ec6a481 Compare May 30, 2019 15:09

syurkevi approved these changes May 30, 2019

umar456 merged commit c889f36 into arrayfire:master May 30, 2019

9prady9 deleted the fix_conv3_launch branch May 30, 2019 20:29
