-
-
Notifications
You must be signed in to change notification settings - Fork 11.1k
ENH: Implement axes keyword argument for gufuncs. #8819
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
numpy/core/src/umath/ufunc_object.c
Outdated
@@ -1205,6 +1216,10 @@ get_ufunc_arguments(PyUFuncObject *ufunc, | |||
*out_extobj = NULL; | |||
Py_XDECREF(*out_typetup); | |||
*out_typetup = NULL; | |||
if (out_axis != NULL) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is by analogy of the others, but seems actually wrong, since the reference is borrowed.
numpy/core/src/umath/ufunc_object.c
Outdated
@@ -2013,7 +2030,8 @@ PyUFunc_GeneralizedFunction(PyUFuncObject *ufunc, | |||
/* The sizes of the core dimensions (# entries is ufunc->core_num_dim_ix) */ | |||
npy_intp *core_dim_sizes = inner_dimensions + 1; | |||
int core_dim_ixs_size; | |||
|
|||
/* swapping around of axis (TODO: allocate this when needed) */ | |||
int remap_axis[NPY_MAXARGS][NPY_MAXDIMS]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not sure whether it is better to malloc these 32*32 elements; same holds for op_axes_arrays
above.
03a1271
to
1a44770
Compare
numpy/core/src/umath/ufunc_object.c
Outdated
case 'a': | ||
/* possible axis argument for generalized ufunc */ | ||
if (strcmp(str, "axis") == 0 && ufunc->core_enabled) { | ||
*out_axis = value; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't work if out_axis
is NULL
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That was captured by the ufunc->core_enabled
, but I don't know why I didn't just go for the clarity of checking that out_axis != NULL
(which is also done for where
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, putting the checks in the other order should be faster
numpy/core/src/umath/ufunc_object.c
Outdated
* possibly remap axes. | ||
*/ | ||
int axis_rqst; | ||
int axis_list = PyList_Check(axis); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe axis
on np.sum
actually only allows tuples, not lists -- it would be good to be consistent.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wouldn't the code be a little more readable if this variable was called axis_is_list
?
numpy/core/src/umath/ufunc_object.c
Outdated
if (axis_tuple) { | ||
rqst = (int)PyInt_AsSsize_t(PyTuple_GET_ITEM(axis_item, | ||
iax)); | ||
if (rqst == -1 && PyErr_Occurred()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there's an error_converting(rqst)
function in common.h
that we'd normally use here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, nice. The same pattern is all over the file...
numpy/core/src/umath/ufunc_object.c
Outdated
/* | ||
* possibly remap axes. | ||
*/ | ||
int axis_rqst; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not Py_Ssize_t
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mostly since your check_and_adjust_axis
seems to want an int
. I'm eternally confused about these C types...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ugh, you're right. I did that because that's what we used elsewhere in the public API, not because it was the right decision.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I may be a little influenced by having to follow this in my daily work, but wouldn't things be more readable if this was requested_axis
, or axis_for_all
, or something more descriptive?
@@ -1025,6 +1029,13 @@ get_ufunc_arguments(PyUFuncObject *ufunc, | |||
} | |||
|
|||
switch (str[0]) { | |||
case 'a': |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason we're not using PyArg_ParseTupleAndKeywords here? Out of scope for this PR, but still
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably speed – I haven't checked if this particular case makes a meaningful difference, but the ufunc call path is a place where microseconds count.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Switch statements actually compile as hardwired binary search, so for small lists of possibilities the savings probably don't amount to much. I think the main virtues are compactness, clarity, and fall through, which can be used to select code entry points. The last is probably frowned upon by structured code fanatics...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If speed really matters here, then presumably we could save a little more with strcmp(str+1, "xis")
on the following lines...
Either way, I think this probably needs a comment explaining that we're avoiding PyArg_ParseTupleAndKeywords for speed reasons.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We didn't add the comment, but I'm not going to demand it before merging.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think a comment on the structure is a bit out-of-scope of this PR (and I don't really know what the original reason was anyway...)
@shoyer (commenting in main thread, since a good point): For reduce operations the usage of axis is a bit different: one can tell to go over multiple rather than a single axis. In contrast, here we want to define where the core dimensions of the gufunc are to be found. Now a counterargument is that But perhaps that argues that the items inside the list should also be lists, so that we keep open the option of using tuples to define multiple operand axes as a single core axis. I'll do that... |
How about using Alternatively: I'd be tempted to allow the reducers to be defined as Right now, do gufuncs incur a copy if their non-core dimensions are not contiguous? |
@eric-wieser - A |
numpy/core/src/umath/ufunc_object.c
Outdated
"axis list element %d should be a tuple " | ||
"with %d elements or None", i, num_dims); | ||
"axis item %d should be a list with %d " | ||
"elements or None", i, num_dims); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can be mistakenly read as "a list with no elements". A comma would help, or so would swapping the order
@eric-wieser - I'm not sure I understand what you mean by the "reducer". Would this be in the signature? For the non-core dimensions, an iterator is used, so no copy is made (strides are passed to the inner loop, so no copy is made there either). |
I mean the functions with a signature ending in Yes, |
I think I'd prefer to keep discussion of the I also think you're right about I'm also wondering if, rather than special-case so much, it is better to go with @njsmith's suggestion of having |
My point here is that defining the signature in this way would let us use tuples for the inner axis argume without ambiguity
|
Hmm, I had thought that, really, the |
I'm struggling to think of a case when this is useful and not just cryptic.
Except we don't want to do that, (and I think currently don't?), because |
But
(no surprise now that @njsmith has been bemused by our discussion...) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks a lot for doing this, Marten! This is some of the most forsaken NumPy code, so it's great to see more people figuring it out.
I commented inline on a couple of things I don't fully like of the way axis=
has to be specified. On top of that comment on operands with no core dimensions, and of making axes for output operands optional, I am a little wary of reusing a single integer axis for several operands, and lean more towards only accepting an integer if there is a single operand with core dimensions, and it happens to have a single core dimension, i.e. axis=1
would be ok for (i),()->()
, but not for (i),(i)->()
of for (i)->(i)
. Do you have a use case in mind where reusing that single value would be a good thing with signatures like these last two examples?
I think this whole matter is confusing enough that we shouldn't approve this without explicitly writing out some form of specification and putting it through some formal approval, probably in the mailing list. It would be tragic if we missed some relevant corner case, only find out in a couple of releases, and end up stuck with a suboptimal behavior because of backwards compatibility.
I think it was @shoyer that hashed out a couple of years ago a pretty comprehensive document of how this should work, and even implemented a Python parser? Not sure where to find it, but it would be good to compare it to this and see if there are any major discrepancies.
numpy/core/src/umath/ufunc_object.c
Outdated
* possibly remap axes. | ||
*/ | ||
int axis_rqst; | ||
int axis_list = PyList_Check(axis); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wouldn't the code be a little more readable if this variable was called axis_is_list
?
numpy/core/src/umath/ufunc_object.c
Outdated
*/ | ||
int axis_rqst; | ||
int axis_list = PyList_Check(axis); | ||
int item_list = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similarly here item_is_list
reads easier below for me.
numpy/core/src/umath/ufunc_object.c
Outdated
} else { | ||
rqst = axis_rqst; /* use same axis for all operands */ | ||
} | ||
for (iax = num_dims - 1; iax >= 0; --iax) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this logic is unnecessarily complicated. And unless I am missing something, it doesn't check that there are no repeated items in item
, does it? Rather than shuffling entries in remap_axis
around, you could fill it directly doing something like:
int have_seen_axis[NPY_NUM_DIMS] = {0};
for (j = 0; j < num_dims; ++j) {
if (item_list) {
rqst = (int)PyInt_AsSsize_t(PyList_GET_ITEM(item, j));
if (error_converting(rqst)) {
retval = -1;
goto fail;
}
}
if (check_and_adjust_axis(&rqst, op_ndim) < 0) {
retval = -1;
goto fail;
}
if (have_seen_axis[rqst]) {
PyErr_Format(PyExc_ValueError,
"axis item %d has value %d repeated", i, rqst);
}
have_seen_axis[rqst] = 1;
remap_axis[i][op_ndim - num_dims + j] = rqst;
}
iax = 0;
for (j = 0; j < op_ndim - num_dims; ++j) {
while (have_seen_axis[iax]) {
++iax;
}
remap_axis[i][j] = iax;
}
numpy/core/src/umath/ufunc_object.c
Outdated
} | ||
for (i = 0; i < nop; ++i) { | ||
int op_ndim, iax, rqst, dflt; | ||
int num_dims = ufunc->core_num_dims[i]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again, num_core_dims
or core_ndim
would be easier on my brain below, where I have to keep remembering that these are only the core dimensions...
numpy/core/src/umath/ufunc_object.c
Outdated
op_ndim = broadcast_ndim + ufunc->core_num_dims[i]; | ||
} | ||
/* initialize remap to straight one to one */ | ||
for (j = 0; j < op_ndim; j++) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you follow my suggestion below, this loop is no longer needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just realized this is still needed for when item
is None
...
numpy/core/src/umath/ufunc_object.c
Outdated
/* | ||
* possibly remap axes. | ||
*/ | ||
int axis_rqst; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I may be a little influenced by having to follow this in my daily work, but wouldn't things be more readable if this was requested_axis
, or axis_for_all
, or something more descriptive?
@jaimefrio - thanks for the comments! Your way to fill I also agree with you that this better go past the mailing list! To help that, I've started writing some test cases (now included) and will add documentation. I'm still pondering whether to try to get On the syntax: I thought a single axis still work for a To all those still interested: I went back to a list of tuples, which I find much clearer (perhaps it should even be a tuple of tuples), and I ended up not seeing how we could ever sensibly support turning multiple axes into one anyway. |
Responses to @mhvk's first and latest posts are below:
Instead, I might say: axis should be a list with length equal to the number of inputs with a core dimension, or the total number of inputs and outputs with core dimensions.
I don't see much point in this, unless you have a large number of core dimensions. Writing Alternatively, we could use
Yes, this makes sense.
By one core dimension, do you mean one core dimension on exactly one argument?
Do you still think this works? I'm not sure how. "Broadcasting" an integer axis over multiple arguments seems like a slippery road to go down. I think we have discussed this before, but I wanted to bring up again that one use of an e.g., for
We don't need to support this yet, but we should be sure we don't do anything that precludes it. For reference, the Python parser I wrote for handling gufunc axis argument can be found here: http://nbviewer.jupyter.org/gist/shoyer/7740d32850084261d870 I'm not sure how helpful this is, but probably it's still a good idea to prototype the logic for axis canonicalization in Python. |
I'm not sure - I think it's easier to understand when there's a 1-to-1 mapping between arguments and axes - even if it comes at the cost of having to specify
+1 from me on this. For now, let's just reserve
The former would be something like
Can we just do the canonicalization in |
@shoyer, @eric-wieser - OK, I agree, let's reserve @shoyer - did the examples from @eric-wieser and myself convince you it makes sense to keep the single-integer case as applying to any single core dimension? I very much like the idea of being able to use the axis as a way to select the signature in, e.g., I do agree that, in principle, one could move the translation from integer to single-element tuple (or set of single-element tuples) outside, although this does mean that |
I think I missed those examples -- can you point to them again? One option that I believe @njsmith and I discussed previously is is to have two separate keyword arguments:
I would actually lean toward only making
This allows for the simplifying logic of an The only slightly confusing part about this is that implementors of |
@shoyer - the case for allowing
I can see the appeal of I do wonder a bit about |
numpy/core/src/umath/ufunc_object.c
Outdated
} | ||
else { | ||
Py_INCREF(op_axes_tuple); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See this outdated comment on not doing tuple promotion unless op_ncore == 1
numpy/core/src/umath/ufunc_object.c
Outdated
int nout = ufunc->nout; | ||
int nop = nin + nout; | ||
int iop, list_size; | ||
PyObject *op_axes_tuple; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can also move to the loop scope
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had it there, but since it is decref'd in fail
, I got compilation errors.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fail
can move within the loop too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since I ended up having only two instances of goto fail
, I just removed that altogether, and moved the op_axes_tuple
object in the loop.
44f0ddc
to
02f89d9
Compare
LGTM, though I didn't check super carefully for little bugs. If I understand correctly, a future plan may be to allow collapsing multiple axes to one axis like That's starting to get a bit cluttered, so I was thinking about other syntaxes. What about supporting a dictionary like |
@ahaldane - I quite like the I also feel having keys that are just numbers is slightly odd, but perhaps a related alternative would resolve this: one could look in @eric-wieser, what do you think? |
@mhvk I quite like the keyword arg idea. I also found some other alternatives in this thread. Here's an example of what the different options we've mentioned look like: np.matmul(a, b, out=c, axes=[((-4,-3), -1), (-1,), (-2,)])
np.matmul(a, b, out=c, axes={0: ((-4,-3), -1), 2: (-2,)})
np.matmul(a, b, out=c, axes0=((-4,-3), -1), axes2=(-2,))
np.matmul(a, b, out=c, axes0=((-4,-3), -1), axes_out0=(-2,))
np.matmul(a, b, out=c, axes=[{"m": (-4, -3), "n": -1}, {"n": -1}, {"m": -2}]. I think my favorites right now are the |
The nested tuple syntax feels most natural to me, though I agree that it can get cumbersome for complex expressions. I would be OK with optionally replacing tuples at either the argument or axis level with dictionaries, keyed by either argument number or axis name, e.g., any of
That said, I'm skeptical that we actually need this. If the main argument is readability, then dictionary for axis names is probably the most valuable (e.g., I don't like the |
That sounds fair, let's support the list-of-tuples format in any case. As far as I can see, everyone is happy with this PR and it is ready to merge. So how about we merge this PR after 1.14.1 gets released in the next few days? We can leave any additional syntaxes or options like 'keepdims' for discussion in future PRs, aiming for 1.15. |
Sounds all OK to me. |
Would they scale well to an arbitrary amount if input arguments? It seems My favourite is the axes={0: ((-4,-3), -1), 2: (-2,)} |
Now that 1.14.1 is in, let's go ahead and merge this one. Then we can debate alternate syntaxes. I'll give this one another read-through myself, and then I plan to merge in half a day. Sounds good? If anyone wants more time to check anything, or if anyone who did the more thorough reviews above wants to give the final say, please speak up before then. |
numpy/core/src/umath/ufunc_object.c
Outdated
*/ | ||
for (axis = op_nbroadcast; axis < op_ndim; axis++) { | ||
axis_item = PyTuple_GetItem(op_axes_tuple, axis - op_nbroadcast); | ||
op_axis = (int)PyNumber_AsSsize_t(axis_item, NULL); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nitpick: I think this is partly copied from PyUFunc_GenericReduction
. There, we used PyTuple_GETITEM
and PyArray_PyIntAsInt
instead of these two functions. I don't think it's too important so I'm happy to merge as is, but probably those functions are slightly better: faster and avoids the cast.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm quite happy to change, but think that the GetItem
ended up this way after discussion with @eric-wieser - suggestions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, I see there was debate over this and use of PyTuple_CheckExact
. I'm fine leaving it as you have it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think my point about PyTuple_GetItem
was wrong - there's no reason not to use PyTuple_GETITEM
here, as the former doesn't respect subclasses __getitem__
anyway
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, replaced both.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we do this for the tuple, I assume I should also use GETITEM
for the list?
if (axes) { | ||
remap_axis = PyArray_malloc(sizeof(remap_axis[0]) * nop); | ||
remap_axis_memory = PyArray_malloc(sizeof(remap_axis_memory[0]) * | ||
nop * NPY_MAXDIMS); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be nice to avoid mallocs. In principle we could replace nop
by NPY_MAXARGS
and allocate on the stack, but that would be 32*32
ints, or 8kb (4kb on 32bit), which seems a little large to put on the stack. But I see other places in numpy we have already done this, eg a grep shows:
numpy/core/src/umath/ufunc_object.c:2061: int op_axes_arrays[NPY_MAXARGS][NPY_MAXDIMS];
numpy/core/src/multiarray/nditer_pywrap.c:742: int op_axes_arrays[NPY_MAXARGS][NPY_MAXDIMS];
numpy/core/src/multiarray/einsum.c.src:2610: char op_labels[NPY_MAXARGS][NPY_MAXDIMS];
numpy/core/src/multiarray/einsum.c.src:2617: int op_axes_arrays[NPY_MAXARGS][NPY_MAXDIMS];
I like the strategy in the 3rd line there to use a char
, so it is only 1kb. I also like how those other examples use C's 2d static array syntax, instead of allocating a separate index array.
In summary, I actually think it would be a significant improvement to allocate remap_axis_memory
as a static 2d char array.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had that originally, but thought that it was a waste of memory for what should normally be a rare use case. A yet more complicated alternative would be to allocate a, say, 3x3 2-D array and not use it is there are more arguments. (note that the char
is for op_labels, which are in fact characters.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are a few ways to tidy this along those lines.
Perhaps do this: Get rid of remap_axis
and only have char remap_axis_memory[NPY_MAXARGS][NPY_MAXDIMS];
Then redefine #define REMAP_AXIS(iop, axis) ((axes ? remap_axis_memory[iop][axis] : axis)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh sorry, I didn't see your comment. Yeah it is debatable.
I guess if you already investigated both options, I am happy to go with whichever you thin kis better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any reason not to combine these into one allocation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remap_axis
is of pointers, remap_axis_memory
of int
, so in principle they can have different stride size, correct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Stride size isn't relevant if you're allocating a void*
. I suppose alignment could be an issue?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the end, I think this is an implementation detail - the present code is at least somewhat clear, so maybe best to leave it for later.
All right finished reviewing, and it looks good. I think after your replies, I am still happy to merge as is. We can leave the mallocs. I'll wait until this evening to merge in case there are any more comments. |
02f89d9
to
5dfe2f0
Compare
Pushed up a rebased version with My main question is whether we stick with the list of tuples or a dict of tuples (which I quite like, but only felt it was not "typical", for reasons I am not altogether sure about). We can of course merge and decide later, but that tends to commit us to what is there (and I don't like the idea of having alternatives for a rarely used keyword!). EDIT: the main advantage of a dict remains that one takes care of defaults by not including those operands. |
+1 for putting things into a final form before pushing to master. If we release before a change, it could bind us to what's already there. |
I'm -1 on the dict simply because it's inconsistent with the out_fr, out_exp = np.frexp(x, out=(out_fr, None)) is supported, but out_fr, out_exp = np.frexp(x, out={0: out_fr}) is not. If we want to support dictionaries, I think it should come in a much later PR that also adds support for named ufunc arguments. |
The axes argument allows one to specify the axes on which the gufunc will operate (by default, the trailing ones). It has to be a list with length equal to the number of operands, and each element a tuple of length equal to the number of core dimensions, with each element an axis index. If there is only one core dimension, the tuple can be replaced by a single index, and if none of the outputs have core dimensions, the corresponding empty tuples can be omitted.
This ensures we do not have to guard against any operand having fewer dimensions than required by the ufunc in, e.g., _parse_axes_argument.
5dfe2f0
to
05a420a
Compare
@eric-wieser - agree with you somewhat that the dict is a break of how we pass in other arguments (good to have an actual example!). Though by the comparison with a
|
All right, I think we are close enough to a consensus that I will merge now. We can always change things in future PRs. I think we are all at least OK with the list-of-tuples syntax, I for one happy with it. Thanks @mhvk and all reviewers! |
EDIT (2017-11-07): went with the least controversial solution, which is easy to extend if we wish. To avoid confusion with the normal use of
axis
, I went with anaxes
keyword, which nominally should be a list with length equal to the total number of operands (inputs & outputs), with each entry a tuple with length equal to the number of core dimensions of that operand, containing axis indices. I still allow two short-cuts that did not seem controversial:Previous attempt to resummarize, with above conclusions added in:
In general, axis should be a list with length equal to either the number of inputs or the total number of operands (inputs & outputs). Each entry should be a
listtuple with length equal to the number of core dimensions of that operand, containing axis indices. Some shortcuts may be handy.It is probably easiest to see for a given signature. Assuming this is
(ij),(j)->(i)
:axis=[(-2, -1), (-1,), (-1,)]
; still trueaxis=[(-2, -1), (-1,)]
is the same; not generally any more; only if output has no core dimensions[(-2, -1), -1, -1]
; still true(i),(i)->()
one could just passaxis=-1
(this is not meant to cover(i),(j)->()
). not implemented; maybe eventually haveaxis
argumentCurrently under discussion:
axes
so we can useaxis
as shortcut in wrappers that prepareaxes
. DONEargs
,out
, and/oraxis
should be safe nowimportant mostly as the implementation adopted here should not create ambiguity resolving dispatches in future
matmul
and other ufuncs that right now have a special case for 1d vectors. For example, multiplication:axis=[(-2,-1), (-2,-1)]
→ matrix-matrixaxis=[(-1,), (-2, -1)]
→ vector-matrixaxis=[(-2, -1), (-1,)]
→ matrix-vectorIdeas discarded:
None
as a replacement for the default for a given operand; reasons:None
for possible other useNone
elsewhere means "all axes" - a different meaning would be confusingaxes
could choose.NOTE: needs documentation before merging.