Dilated convolve2 and backward gradients #2359
Conversation
Did a quick run of the code. I am suspicious about some of the operations on the CPU backend and the number of times reorder is called. I think we can do something more efficient there. Is there an MKL routine that does this operation?
I looked at the style and code flow in general. Haven't looked at the kernels themselves; I will give it another pass soon.
Also, rebase your branch.
Force-pushed from 7fafa72 to 27f75f3
Made another pass. Need to step away from this for now, but I will give it another go soon.
const T* iptr = iptr_ + col * istrides[d];

// Calculate input window index
dim_t winy = (col / nx);
Maybe increment winy after nx iterations instead of performing a division. Same for winx.
That would bring in a branch based on the value of nx. Do you think it would be faster?
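For reference, a minimal sketch of the two alternatives under discussion. The loop bounds and the name ncols are assumptions; winx, winy, nx, and dim_t follow the snippet above.

```c
// Alternative 1 (current): derive the window coordinates from the flat
// column index with a division (and modulo) on every iteration.
for (dim_t col = 0; col < ncols; ++col) {
    dim_t winy = col / nx;
    dim_t winx = col % nx;
    // ... body uses winx, winy ...
}

// Alternative 2 (suggested): carry the coordinates across iterations and
// bump winy every nx columns, trading the per-iteration division for the
// compare-and-increment branch discussed above.
dim_t winx = 0, winy = 0;
for (dim_t col = 0; col < ncols; ++col) {
    // ... body uses winx, winy ...
    if (++winx == nx) {
        winx = 0;
        ++winy;
    }
}
```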
// Compute output index local to column
const int outIdx = IS_COLUMN ?
    (i * get_local_size(0) + get_local_id(0)) :
You can just increment i by the local size and initialize it with the local id.
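A sketch of that suggestion, shown for the IS_COLUMN case only and assuming outIdx advances by exactly one work-group stride per iteration; niters is an illustrative bound, while get_local_size and get_local_id are the OpenCL built-ins used above.

```c
// Current form: recompute the local output index from the loop counter.
for (int i = 0; i < niters; ++i) {
    const int outIdx = i * get_local_size(0) + get_local_id(0);
    // ... write output element outIdx ...
}

// Suggested form: initialize the index with the local id and stride by
// the local size, removing the per-iteration multiply.
for (int outIdx = (int)get_local_id(0);
     outIdx < niters * (int)get_local_size(0);
     outIdx += get_local_size(0)) {
    // ... write output element outIdx ...
}
```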
Force-pushed from 3e4afb4 to ccad15f
Force-pushed from 71783d3 to c24c314
Rebased from master and made minor tweaks.
The unique_handle move constructors were failing on Windows because they didn't zero out the moved-from handle.
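For reference, a minimal sketch of the fix described; the names below are illustrative, not ArrayFire's actual unique_handle.

```cpp
// The move constructor must zero the moved-from handle, otherwise the
// source object's destructor releases the resource a second time.
template <typename H>
class unique_handle {
    H handle_{};

   public:
    explicit unique_handle(H h) noexcept : handle_(h) {}

    unique_handle(unique_handle&& other) noexcept : handle_(other.handle_) {
        other.handle_ = H{};  // the step that was missing
    }

    ~unique_handle() {
        if (handle_ != H{}) { /* release(handle_) for the real type */ }
    }
};
```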
Just questions.
@@ -12,6 +12,7 @@ dependency_check(CUDA_FOUND "CUDA not found.")

find_cuda_helper_libs(nvrtc)
find_cuda_helper_libs(nvrtc-builtins)
find_cuda_helper_libs(cudnn)
Any way to specify a minimum version here?
Unfortunately not with this interface. That is only possible if we write a FindCUDNN package. We should think about adding that as an optional path, because there are easy ways to get around this dependency; it would make the build process simpler for those who are not interested in a 300 MB dependency.
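Until such a module exists, one stopgap (not what this PR does) is a compile-time guard in a common header using the version macros that cudnn.h defines; the 7.0 floor below is just an example.

```c
#include <cudnn.h>

// Fail the build early if the installed cuDNN is older than required.
#if CUDNN_MAJOR < 7
#error "cuDNN >= 7.0 is required"
#endif
```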
Could you just confirm that after merging this, we'll just need a local cuDNN SDK (include + lib) in order to compile afcuda?
Yes, you will need to have cuDNN installed on the system. Currently we are installing it directly in the CUDA directory. If there are users who wish to specify the location of cuDNN, please make an issue and we can discuss.
On my side, it's OK to install it in /usr/local/cuda.
Not with this commit. It should be done in the next couple of days. We are dynamically linking against cuDNN, although I don't think it is necessary. We should look into static linking.
Adds dilation to forward convolve2 with cuDNN integration.
Also adds functions to obtain convolve2 backward gradients with respect to the filter or input data.
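A hypothetical usage sketch of the new functionality. The names convolve2NN, convolve2GradientNN, and the AF_CONV_GRADIENT_* enum values are assumptions for illustration; check the diff for the API this PR actually adds.

```cpp
#include <arrayfire.h>

int main() {
    af::array signal = af::randu(32, 32, 3, 1);  // W x H x C x N
    af::array filter = af::randu(3, 3, 3, 8);    // Wf x Hf x C x Cout

    af::dim4 stride(1, 1), padding(2, 2), dilation(2, 2);

    // Forward dilated convolution (names assumed, see above).
    af::array out =
        af::convolve2NN(signal, filter, stride, padding, dilation);

    // Backward gradients w.r.t. the filter and the input data.
    af::array incoming = af::constant(1, out.dims());
    af::array dFilter  = af::convolve2GradientNN(
        incoming, signal, filter, out, stride, padding, dilation,
        AF_CONV_GRADIENT_FILTER);
    af::array dData    = af::convolve2GradientNN(
        incoming, signal, filter, out, stride, padding, dilation,
        AF_CONV_GRADIENT_DATA);
    return 0;
}
```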