Skip to content

Fix convolve3 launch configuration in CUDA backend #2519

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
May 30, 2019

Conversation

9prady9
Copy link
Member

@9prady9 9prady9 commented May 22, 2019

Batched input of Convolve3 was incorrectly folding batch size
into CUDA launch grid's y & z dimensions. Since the batch for 3d
inputs can happen along 4th dimension only, folding it onto x dimesion
of CUDA grid is sufficient.

Have to merge arrayfire/arrayfire-data#14 before this is merged.

@9prady9 9prady9 requested a review from umar456 May 22, 2019 08:56
@umar456 umar456 requested review from syurkevi and removed request for umar456 May 26, 2019 23:22
@syurkevi
Copy link
Contributor

Would it be possible to update the convolve3 documentation to describe the expected batching behavior?

@9prady9
Copy link
Member Author

9prady9 commented May 28, 2019

@syurkevi We are already explain the batching behavior for all convolve functions in our documentation. This change is an internal fix for CUDA kernel launch configuration and doesn't change the external batching behavior.

@syurkevi
Copy link
Contributor

@syurkevi We are already explain the batching behavior for all convolve functions in our documentation. This change is an internal fix for CUDA kernel launch configuration and doesn't change the external batching behavior.

The documentation you linked is copy-pasted from convolve2 and doesn't apply to convolve3. It would be nice to update the correct batching behavior to the documentation as it is difficult to understand what it should be without reading the source code.

9prady9 added 2 commits May 30, 2019 13:43
Batched input of Convolve3 was incorrectly folding batch size
into CUDA launch grid's y & z dimensions. Since the batch for 3d
inputs can happen along 4th dimension only, folding it onto x dimesion
of CUDA grid is sufficient.
@9prady9 9prady9 force-pushed the fix_conv3_launch branch from e5f0c32 to ec6a481 Compare May 30, 2019 15:09
@9prady9
Copy link
Member Author

9prady9 commented May 30, 2019

Screenshot from 2019-05-30 20-37-33
Screenshot from 2019-05-30 20-37-38
Screenshot from 2019-05-30 20-37-59
Screenshot from 2019-05-30 20-38-12
Screenshot from 2019-05-30 20-38-21
Screenshot from 2019-05-30 20-38-27

@9prady9
Copy link
Member Author

9prady9 commented May 30, 2019

@arrayfire/core-devel Alright, here's the new updated documentation for convolution. have a look and share feedback pls.

@umar456 umar456 merged commit c889f36 into arrayfire:master May 30, 2019
@9prady9 9prady9 deleted the fix_conv3_launch branch May 30, 2019 20:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants