Description
EDITS following different comments:
- (Feature request: axis and keepdims for gufuncs (and use it in np.median) #8810 (comment)): include `keepdims`. See also previous discussion in Add an axis argument to generalized ufuncs? #5197 and on https://mail.scipy.org/pipermail/numpy-discussion/2014-October/071455.html (which independently suggested more or less the same syntax).
- See ENH: Implement axes keyword argument for gufuncs. #8819 for the implementation of `axes`; adjusted implementation notes accordingly.
Rationale
Following the suggestion that, ideally, `np.median` and similar functions would be gufuncs so that they could be overridden with `__array_ufunc__` (#8247), it was realised that the main show-stopper is the absence of an `axis` argument for gufuncs (also annoying in the context of the new `all_equal` gufuncs, #8528). Furthermore, for functions like `argmax`, etc., a `keepdims` argument would be very useful (#8710).
Specification
In practice, the new `axis` argument would be built on top of a more general `axes` argument, which is a list with one tuple per operand, each giving the axes to use for that operand's core dimensions. For a normal `ufunc`, `axes` obviously has to be empty (or absent); for a `gufunc`, the defaults would be `-1`, `-2`, etc., as needed. For the output, the tuple would be extended by one (or possibly more) axes beyond what the signature implies if `keepdims` is set. Short-cuts may be possible, e.g., for gufuncs with only one index, a single number (or perhaps even `None` for ravelling all dimensions); one would distinguish the general case of `axes` from the simple case of `axis`.
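As a rough illustration of how the `axis` shortcut and `keepdims` could translate into the general `axes` list, here is a minimal sketch in plain Python. The helper names `default_axes` and `expand_axis` are hypothetical and not part of NumPy; the shortcut part assumes every input has a single core dimension and the outputs have none.

```python
def default_axes(core_ndims):
    """Hypothetical helper: default `axes` for a gufunc.

    `core_ndims` lists the number of core dimensions of each operand,
    e.g. [2, 2, 2] for a signature like (i,j),(j,k)->(i,k).
    An operand with n core dimensions defaults to its last n axes.
    """
    return [tuple(range(-n, 0)) for n in core_ndims]


def expand_axis(axis, n_in, n_out, keepdims=False):
    """Hypothetical helper: turn the `axis` shortcut into an `axes` list.

    Assumes each input has exactly one core dimension and each output
    has none, as in a signature like (i),(i)->().
    """
    in_axes = [(axis,)] * n_in
    # With keepdims, the output tuple is extended by one entry so that a
    # length-1 dimension can be placed where the core axis used to be.
    out_axes = [(axis,) if keepdims else ()] * n_out
    return in_axes + out_axes


matmul_like = default_axes([2, 2, 2])
inner_like = expand_axis(0, n_in=2, n_out=1)
reduce_keep = expand_axis(1, n_in=1, n_out=1, keepdims=True)
```

Here `matmul_like` holds the defaults for three operands with two core dimensions each, while `inner_like` and `reduce_keep` show the shortcut without and with `keepdims`.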
Implementation
Looking at the reference, it would seem that the C API would not need to change, but the routine that actually calls the underlying iteration machinery in `umath/ufunc_object.c` would need to do appropriate transposes of the input and output arrays. Going through the call sequence to see places where change would be necessary (not marked with `X`):
- `ufunc_generic_call` (all OK)
- `PyUFunc_CheckOverride` (OK, `axis` should already be in `kwds`)
- `PyUFunc_GenericFunction`: signal that non-empty `axis` should error.
- `PyUFunc_GeneralizedFunction`: do a remap of axes for the checks and before passing on to the iterator;
- `get_ufunc_arguments`: interpret `axes` argument (new return argument); error if not needed. (ENH: Implement axes keyword argument for gufuncs. #8819)
- Implement the simple `axis` one for just a single axis. (ENH: Implement axis for generalized ufuncs. #11018)
- repeat above for `keepdims`. (ENH: Implement axis for generalized ufuncs. #11018)
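The remapping step could look roughly as follows at the Python level. This is only a sketch of the idea, not the C implementation: `call_with_axes` and the stand-in core function `inner1d` are hypothetical names, and only the inputs are remapped here (an output with core dimensions would need the inverse move).

```python
import numpy as np


def inner1d(a, b):
    """Stand-in core loop with signature (i),(i)->(); acts on the last axis."""
    return np.einsum("...i,...i->...", a, b)


def call_with_axes(core, inputs, in_axes):
    """Hypothetical wrapper: move each input's core axes to the end.

    `in_axes` holds one tuple per input, naming the axes that carry its
    core dimensions. After the move, `core` can assume the usual layout
    with core dimensions in the trailing positions.
    """
    moved = []
    for arr, ax in zip(inputs, in_axes):
        arr = np.asarray(arr)
        n = len(ax)
        if n:
            # Transpose so the requested axes become the last n axes.
            arr = np.moveaxis(arr, ax, tuple(range(-n, 0)))
        moved.append(arr)
    return core(*moved)


a = np.arange(6.0).reshape(3, 2)  # core dimension on axis 0 (length 3)
b = np.arange(6.0).reshape(2, 3)  # core dimension on axis 1 (length 3)
result = call_with_axes(inner1d, [a, b], [(0,), (1,)])
```

In this example the loop dimension has length 2, and `result` matches `(a.T * b).sum(-1)`.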
Excluded
In principle, it might be nice to allow `axis` to refer to multiple axes or all of them (`None`), but this is substantially more complicated, and it may well be better to let wrapping code deal with this. Indeed, this is done for `np.median`, which ravels any requested axes as needed before passing the array on to `partition` (which can handle only a single axis).
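A wrapper of this kind might be sketched as below. It is not the actual `np.median` code: `collapse_axes`, `median_last_axis` and `median_over` are hypothetical names, and the single-axis median assumes a non-empty axis without NaNs.

```python
import numpy as np


def collapse_axes(a, axes):
    """Hypothetical helper: merge `axes` into one trailing axis.

    `axes=None` ravels the whole array; otherwise the requested axes are
    moved to the end and flattened together.
    """
    a = np.asarray(a)
    if axes is None:
        return a.reshape(-1)
    n = len(axes)
    moved = np.moveaxis(a, axes, tuple(range(-n, 0)))
    return moved.reshape(moved.shape[:moved.ndim - n] + (-1,))


def median_last_axis(a):
    """Single-axis median built on np.partition (last axis only)."""
    n = a.shape[-1]
    k = n // 2
    if n % 2:
        return np.partition(a, k, axis=-1)[..., k]
    # Even length: average the two middle elements.
    part = np.partition(a, [k - 1, k], axis=-1)
    return 0.5 * (part[..., k - 1] + part[..., k])


def median_over(a, axes):
    """Wrapping code handles multiple axes; the core sees one axis."""
    return median_last_axis(collapse_axes(a, axes))


x = np.arange(24.0).reshape(2, 3, 4)
m = median_over(x, (0, 2))
```

With this split, the core routine only ever needs to support a single axis, which is the simplification argued for above.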