-
-
Notifications
You must be signed in to change notification settings - Fork 10.8k
Fix "numpy scalar * array-like == performance horror" #3501
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Seems not quite that simple. The tests fail. It seems that this change as is breaks compatibility for array subclasses (i.e. matrix and masked arrays)? I don't have time to look into it, my best guess right now is that the reflected op, i.e. |
Ok, second attempt. This stays completely out of the |
Can you explain to me why the array is converted currently? As far as I |
First, to answer your question:
That routine calls Second, in my opinion, having Another thing that's broken in the current code and would have to be fixed along with the issue above is that lots of places (such as 1, 2) use My patch above is a valid band aid for #3375. It does not claim to fix the issues I've mentioned. I do claim that it doesn't break anything that was previously working. Moreover, having priority logic "further up" (as you called it) is actually less risky, because it only touches dispatch for binary operators. In other words, fewer code paths go through there than through the ufunc. The issues I mentioned above should probably be filed as bugs of their own. |
Good points, it is true that it doesn't even make sense that ufuncs can possibly return NotImplemented, etc. I agree that this whole thing isn't ideal and likely it is the only reasonable thing to fix it where you fixed it. It is basically only because of how python handles subclassing with operators (i.e. Python knows that the subclass should be asked to handle the operator) that this code works at all. To be honest, I am not quite certain that array priority is necessary at all :). I think the whole thing needs more thinking. I would digg a bit into checking other options before deciding that this perfect for myself, but I don't have the time right now. This creates one change I think, that I don't really mind too much, but should maybe be deprecated? If YourThing does not implement the |
Array priority itself is a reasonable idea--it makes my example possible at all. Without it, If some object declares an |
@@ -209,6 +209,49 @@ | |||
Py_INCREF(Py_NotImplemented); | |||
return Py_NotImplemented; | |||
} | |||
|
|||
if (!PyArray_Check(m2)) { | |||
/* Catch priority inversion and punt, but only if it's guaranteed |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Blank first line in comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The comment is very large, I think you should omit the extended argument and just have a short explanation of what happens. The argument would look better as part of the pull request explanation, but it's late for that ;)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Basically just the first paragraph.
@seberg Are you happy with this? |
It seems like it is the right spot for such a check. I am still a little worried that breaking the case of people defining array priority but not supporting all the operators actually might affect code out there. I don't mind that as such, so maybe just try and see... What I am thinking of is mostly objects defining |
* Catch priority inversion and punt, but only if it's guaranteed | ||
* that we were called through m1 and the other guy is not an array | ||
* at all. Note that some arrays need to pass through here even | ||
* with priorities inverted, for example: float(17) * np.matrix(...) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
float(17)
isn't an array so I guess there's a mistake in the comment?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nope, that's correct. float(17) * np.matrix(...)
first ends up in np.matrix.__rmul__
, which calls np.dot
, which calls PyArray_MatrixProduct2
, which casts float(17)
into an array scalar an then calls this place. :)
Btw, I think I've addressed @charris's concerns in the commit I pushed earlier. |
OK, one more thing. Because this changes behavior it should be mentioned in For next time note the standard prefixes for commit messages in |
Revised, with change comments and better commit message, as requested. |
Fix "numpy scalar * array-like == performance horror"
Thanks for your help getting this merged! |
This pull request fixes #3375. All tests that pass before this change also pass after. (There's a failure that seems unrelated.) The PR also adds a test that asserts that
__array_priority__
gets looked at early enough to make sure no copy is attempted.