-
-
Notifications
You must be signed in to change notification settings - Fork 11.1k
MAINT: Ensure ufunc override call each class only once, plus cleanup #11338
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
MAINT: Ensure ufunc override call each class only once, plus cleanup #11338
Conversation
^ to avoid needing a rebase |
PyObject *method; | ||
PyObject *obj; | ||
int j; | ||
npy_bool new_class = 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NPY_TRUE
, if you're using npy_bool
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, maybe I shouldn't bother
@@ -106,7 +105,10 @@ PyUFunc_WithOverride(PyObject *args, PyObject *kwds, | |||
} | |||
|
|||
for (i = 0; i < nargs + nout_kwd; ++i) { | |||
PyObject *method; | |||
PyObject *obj; | |||
int j; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can move into the if (with_override ...)
block - no need to declare it here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In a later commit, it ends up in the right place.
*/ | ||
method = get_non_default_array_ufunc(obj); | ||
if (method != NULL) { | ||
if (with_override != NULL) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about when with_override == NULL
? Won't things get called twice? When is it NULL?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, methods
and with_override
are either both given or both NULL. This is partially why I ended up taking the function apart - it just didn't make sense to try to make one function work both for collecting and checking overrides. (It is not that great to come back to one's own code and find it rather poorly thought out!)
Having the move commit be part of this PR at all makes the comments above all appear as outdated... |
Shall I split the PR in two parts? I realize the move indeed makes it very unhandy, also to change things, since I effectively have to do it twice when I rebase. |
I think that would be best. |
de4c5ba
to
f8d034d
Compare
} | ||
} | ||
else { | ||
nout = 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wouldn't it be better to just cast to a 1-tuple here? We already do that when we call __array_wrap__
and __array_ufunc__
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could also do:
PyObject **out_objs;
if (out_kwd_obj == NULL) {
nout = 0;
out_objs = NULL;
}
else if (PyTuple_CheckExact(out_kwd_obj)) {
nout = PySequence_Fast_GET_SIZE(out_kwd_obj)
out_objs = PySequence_Fast_ITEMS(out_kwd_obj);
}
else {
nout = 1
out_objs = &out_kwd_obj;
}
// Cleanup just xdecrefs out_kwd_obj
``
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could wrap this into
int parse_out_args(PyObject *args, PyObject *kwargs, PyObject **out_ref, PyObject ***out_objs) {
// code above
return nout
}
Which would at least prevent you repeating this logic twice in this file. We can solve the repetitions across files later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like that logic very much, and have adapted it; in the end, it should percolate up.
if (kwds == NULL) { | ||
return 0; | ||
} | ||
out_kwd_obj = PyDict_GetItemString(kwds, "out"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not a huge fan of parsing the out argument in yet one more place. Can we use ufunc_full_args
somehow?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd like to leave that to a later PR (in #11372, it ends being done at the very start of ufunc_generic_call
, which I think is most logical, since every part below it needs it).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fine by me - my objection was to introducing one more parser. I'm happy now that the number of parsers in this file hasn't increased :)
@eric-wieser - thanks for the comments; I'll try to look better later. For now, for fun, have a look at where this is leading: #11372 (I do want to proceed in small steps, but the goal informs some choices, such as not using |
f8d034d
to
5cd5590
Compare
OK, adjusted. Next step here would be to move the stuff that is not shared between |
5cd5590
to
b5f4d8a
Compare
Is this going to be split into two pull reqeusts? |
*out_kwd_obj = PyDict_GetItemString(kwds, "out"); | ||
if (*out_kwd_obj == NULL) { | ||
return 0; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: indentation
* Returns -1 if kwds is not a dict, 0 if no outputs found. | ||
*/ | ||
static int | ||
get_out_objects(PyObject *kwds, PyObject **out_kwd_obj, PyObject ***out_objs) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think you need out_kwd_obj
as an argument - you don't seem to use it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But I need a place to store the reference - at least, I think I do, and when I tried initially without it, it definitely had random segfaults (I considered initializing out_objs
with a NULL reference in its first slot, but I felt that it was better not to write a function that expected such initialized array was a bit much). But perhaps I am just missing something? (C remains not one of my strengths...)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I get it now - otherwise you end up returning a reference to a local variable. That's pretty unfortunate.
"Internal Numpy error: call to PyUFunc_HasOverride " | ||
"with non-tuple"); | ||
goto fail; | ||
nin = PyTuple_Size(args); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is misleading - it's not the number of inputs, but the number of positional args, some of which may be outputs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, I'm not sure if that's true - if it's not, can you clarify that PyUFunc_WithOverride
never receives out arguments positionally?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right, it is just the positional arguments. This means that for reductions arguments like axis
get checked too, which is not great, but out of scope here (there is an issue, #9107, and I do hope to get to it...)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Anyway, now clarified this and called things arg
instead of in
.
@mattip: Looks sufficiently split to me, but some more comments |
@mattip - it was already split up; there will be a follow-up PR to move some of the functions to |
b5f4d8a
to
8d19718
Compare
@eric-wieser - addressed your smaller comments; if you have a suggestion for how to deal with |
@eric-wieser - shall we get this in? I'd like to get to the second part of this... |
PyObject **in_objs, **out_objs; | ||
|
||
/* check inputs */ | ||
nin = PyTuple_Size(args); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: PySequence_FAST_SIZE
would be more consistent
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nevermind, what you have behaves a little better in the face of bad input
/* | ||
* Check whether any of a set of input and output args have a non-default | ||
* `__array_ufunc__` method. Return 1 if so, 0 if not. | ||
*/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this comment needs an explanation of why this function needs to exist in addition to PyUFunc_WithOverride
(ie, performance, maybe?)
I'd like to see a little more in-source explanation of why both WithOverride and HasOverride are needed. It wasn't immediately obvious to me that
|
8d19718
to
3d6cc59
Compare
OK, done. Note that I didn't mention the O(nin^2), since the check for a class already having been done is much faster than getting the |
Overall, it likely doesn't matter much for performance, but it is more logical and more consistent with what python does: reverse operators are not called if the forward one of a given class already returned NotImplemented.
3d6cc59
to
52df1f1
Compare
Code was getting too convoluted and both can be optimized in different ways.
52df1f1
to
4b700ae
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
get_out_objects
isn't as nice as I'd hoped, but in future I'd hope that the arguments get parsed earlier anyway, and the need for it goes away
Indeed, eventually I'll merge this so I can try my follow-up. |
Replaces #11320, for which I wrote:
"This started as a simple fix for #11306, ensuring that as we collect arguments and overrides, we skip those that have the same class as ones collected already. But looking at the code (mostly my own...), I found a reference counting bug (first commit), and also realized that for the
ndarray
side, where all that is needed is to know whether there are any overrides, the code was needlessly slow. But splitting that out, there was no need any more for much of the code to be in thesrc/private
directory, so I moved it. As a result, this is quite a lot of shuffling, but I think it makes the whole a lot clearer and cleaner.Left really is the annoyance of not having access to the
PyArray
API - thus, this should be revisited oncemultiarray
andumath
have been merged."This PR follows @eric-wieser's request to make the commits more stand-alone, and in particular ensure that the one that moves functions around does nothing else than the move. This is true here (well, except that the names of the functions were changed).
Note that this is based on #11336 - I'll rebase once that is merged.EDIT: now done.EDIT: also kept out the move for now, for easier review
fixes #11306