Ragged reduction #2786
include/af/algorithm.h
Outdated
\note NaN values are ignored
*/
AFAPI void max(array &val, array &idx, const array &in, const int dim, const array &ragged_len);
Opinion: I think ragged_len should come after the in array.
@@ -804,6 +820,68 @@ af_err af_imax(af_array *val, af_array *idx, const af_array in, const int dim) {
return ireduce_common<af_max_t>(val, idx, in, dim);
}

template<af_op_t op>
static af_err rreduce_common(af_array *val, af_array *idx, const af_array in,
Will an overload to reduce_common not work?
src/backend/cpu/ireduce.cpp
Outdated
template<af_op_t op, typename T>
void rreduce(Array<T> &out, Array<uint> &loc, const Array<T> &in,
const int dim, const Array<uint> &rlen) {
Is this function still needed? It looks like you can combine this with ireduce.
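To illustrate the suggestion above, here is a minimal standalone sketch of how a single indexed max reduction could cover both the regular and the ragged case. It is plain C++, not ArrayFire code: the name imax_columns, the column-major layout, and the optional rlen pointer are all assumptions made for illustration. An optional per-column length limits how many leading elements are considered, and NaN values are skipped as in the \note above.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of ArrayFire): indexed max along each column
// of a column-major rows x cols matrix. If rlen is non-null, only the first
// (*rlen)[c] elements of column c are considered (the ragged case);
// otherwise the whole column is reduced.
void imax_columns(std::vector<float> &val, std::vector<unsigned> &idx,
                  const std::vector<float> &in, std::size_t rows,
                  std::size_t cols,
                  const std::vector<unsigned> *rlen = nullptr) {
    assert(in.size() == rows * cols);
    assert(rlen == nullptr || rlen->size() == cols);

    // Columns with no usable element keep NaN as value and 0 as index.
    val.assign(cols, std::nanf(""));
    idx.assign(cols, 0u);

    for (std::size_t c = 0; c < cols; ++c) {
        // Clamp the ragged length to the real column length.
        std::size_t limit = rows;
        if (rlen != nullptr && (*rlen)[c] < rows) limit = (*rlen)[c];

        bool found = false;
        for (std::size_t r = 0; r < limit; ++r) {
            const float v = in[c * rows + r];
            if (std::isnan(v)) continue;  // NaN values are ignored
            if (!found || v > val[c]) {
                val[c]  = v;
                idx[c]  = static_cast<unsigned>(r);
                found   = true;
            }
        }
    }
}
```

Calling it without rlen behaves like an ordinary indexed max, while passing rlen restricts each column to its own prefix, so under these assumptions a separate ragged code path would not be needed.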
@syurkevi sounds good, though f16 is not really faster than f32. Could you please just add a minimalist bench like this one:
@syurkevi could you please at least rebase so that I can see what needs to be finished and advise accordingly?
@syurkevi I took care of the rebase from latest master. If you are adding more ragged functions and need to touch the ireduce kernel, you can find the kernels in the file If you face any issues while editing/adding new things to kernels, please ping me. I can guide you through the nvrtc-related changes.
e502b36 to 6599dd2
aa0a856 to 2307ccf
Addresses #2782.
TODO: