Skip to content

Ragged reduction #2786

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 9 commits into from
Apr 7, 2020
Merged

Ragged reduction #2786

merged 9 commits into from
Apr 7, 2020

Conversation

syurkevi
Copy link
Contributor

@syurkevi syurkevi commented Mar 9, 2020

Addresses #2782 .

TODO:

  • tests
  • other ragged functions(min, sum, product, count, any, all)


\note NaN values are ignored
*/
AFAPI void max(array &val, array &idx, const array &in, const int dim, const array &ragged_len);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Opinion: I think ragged_len should come in after the in array.

@@ -804,6 +820,68 @@ af_err af_imax(af_array *val, af_array *idx, const af_array in, const int dim) {
return ireduce_common<af_max_t>(val, idx, in, dim);
}

template<af_op_t op>
static af_err rreduce_common(af_array *val, af_array *idx, const af_array in,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will an overload to reduce_common not work?


template<af_op_t op, typename T>
void rreduce(Array<T> &out, Array<uint> &loc, const Array<T> &in,
const int dim, const Array<uint> &rlen) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this function still needed? It looks like you can combine this with ireduce.

@WilliamTambellini
Copy link
Contributor

WilliamTambellini commented Mar 10, 2020

@syurkevi sounds good:
ArrayFire v3.7.0 (CUDA, 64-bit Linux, build 4206fa3)
Platform: CUDA Runtime 10.2, Driver: 440.33.01
[0] Quadro RTX 3000, 5935 MB, CUDA Compute 7.5
Benchmark max at f32
in max ragmax
4 16384 0.038 0.037
8 16384 0.034 0.036
16 16384 0.035 0.037
32 16384 0.037 0.040
64 16384 0.074 0.081
128 16384 0.148 0.159
256 16384 0.293 0.309

Though f16 not really faster than f32:
Device 0 isHalfAvailable ? yes
ArrayFire v3.7.0 (CUDA, 64-bit Linux, build 4206fa3)
Platform: CUDA Runtime 10.2, Driver: 440.33.01
[0] Quadro RTX 3000, 5935 MB, CUDA Compute 7.5
Benchmark max at f16
in max ragmax
4 16384 0.040 0.037
8 16384 0.035 0.037
16 16384 0.036 0.039
32 16384 0.038 0.040
64 16384 0.067 0.070
128 16384 0.143 0.152
256 16384 0.282 0.298

Could you please just add a minimalist bench like this one:
https://gist.github.com/WilliamTambellini/0f9309ab27b7076369f24c3217af4ffd
?

@WilliamTambellini
Copy link
Contributor

@syurkevi could you please at least rebase for me to see what needs to be finished and advise accordingly ?

@9prady9
Copy link
Member

9prady9 commented Mar 16, 2020

@syurkevi I took care of rebase from latest master. If you are adding more ragged functions and need to touch the ireduce kernel. You can find the kernels in the file src/backend/cuda/kernel/ireduce.cuh and the kernel wrappers inside src/backend/cuda/kernel/ireduce.hpp. The backend source file is src/backend/cuda/ireduce.cpp and not ireduce.cu.

If you face any issues while editing/adding new things to kernels, please ping me. I can guide you through the nvrtc related changes.

@WilliamTambellini
Copy link
Contributor

Thank you @9prady9
@syurkevi I am only interested by the max reduction, as agreed with @umar456 when ordering the support. Could you please limit that PR to the max/raggedmax reduction ?
Kind regards

@syurkevi syurkevi force-pushed the ragged_max branch 2 times, most recently from e502b36 to 6599dd2 Compare March 21, 2020 08:10
@umar456 umar456 added this to the 3.8.0 milestone Mar 24, 2020
@umar456 umar456 changed the title Ragged reduction [WIP] Ragged reduction Mar 27, 2020
@syurkevi syurkevi force-pushed the ragged_max branch 2 times, most recently from aa0a856 to 2307ccf Compare March 31, 2020 21:57
@9prady9 9prady9 merged commit 0a66851 into arrayfire:master Apr 7, 2020
@umar456 umar456 mentioned this pull request Apr 8, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants