-
-
Notifications
You must be signed in to change notification settings - Fork 11.1k
BUG: Minimize iterator size in gufunc allocation of output arrays #4458
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Closes numpy#4442 (or at least mitigates it) Prior to this PR, the iterator created by a gufunc had all core dimensions of all output arguments appended to the broadcasted non-core dimensions. When working with gufuncs with multiple return arguments, this would lead to "iterator is too large" errors. This PR changes that mechanism, and now each distinct core-dimension is appended only the maximum number of times that it comes up in an output argument's signature. As an example, take numpy#4442, which was failing in a call to SVD with signature `(m,n)->(m,n),(n),(n,n)`. The iterator was created with shape `(m,n,n,n,n)` (all output arguments' core dimensions), and then the `op_axes` for each of the three output parameters were set to `[0, 1, -1, -1, -1]`, `[-1, -1, 0, -1, -1]` and `[-1, -1, -1, 0, 1]`. After this PR, in this same case the iterator has shape `(m, n, n)`, and the `op_axes` now are `[0, 1, -1]`, `[-1, 0, -1]` and `[-1, 0, 1]`. The only concern is that a function with e.g. signature `(a,b)->(b,a)` would have had its output created with an iterator of shape `(b, a)` and `op_axes` set to `[0 ,1]`. Now it will be created with shape `(a, b)` and `op_axes` set to `[1, 0]`. Not sure if it will result in a different memory layout of the allocated array.
That seems to be the failure that always pops up these days. I don't think it has to do anything with any of my changes. |
Yes, debug is broken on travis currently. I didn't read the stuff, but I understand it correctly that before a dimension was added for all output core dimension, while now you only add it as many times as the most equivalent ones occures (I bet the sentence doesn't make sense, but think you probably know what I mean ;)). A possible hack should be to add a flag to the iterator something like |
I just checked the linalg gufuncs... and it appears to me that SVD is the only one that should have this problem. But... calculating the SVD of a large matrix doesn't seem that absurd to me that I think we still might want to do a |
Yes, I think you understand the idea, not sure if my description is well formed English either... My biggest concern with something like Of course that I do not know how to go about it does not make it a bad idea! And a hack or not, it would solve the issue entirely. For SVD the iterator now has size |
Heh, I had missed the longer explanation ;), it is clear... Well, there is a RemoveAxis step, which can fully recalculate the size (see also gh-3861 for the place though the size calculation should be stolen from the one setting the current error). So if we add a Just adding the INTERNAL because I am not sure if I like it enough to consider it public api... |
Do you really need a new flag? I have tried to follow the use of The idea would be to move the size calculation (and the check to see if |
Great point. I forgot about |
didn't follow everything, but as a minimal fix wouldn't it be possible to bump the itersize variable to 64 bit on all platforms? |
@juliantaylor, it might be nice anyway for some einsums/reductions. The itersize is exposed, so I am not sure about compatibility (only through function access, so if we don't mind bogus results for things that currently don't work...). The underlying issue would only be mitigated, so if we assume nobody does svd's on huge arrys, we are fine... Otherwise, we may have to swallow the bitter pill and actually consider backporting the larger change (it should not be huge, but...). @jaimefrio do you have time to implement your suggestion (as simple as possible)? Otherwise I will probably do it tomorrow or in two days. |
I thought it was going to be easier than it is, and have a not-yet-working implementation halfway through. I'll try to complete it tonight or over the weekend. But even if I do get it to work, I am going to need very close supervision on this one: I am learning the iterator internals one compiler error at a time, so I have very little confidence that I am not overlooking some minor (or major) detail. I'll open a new PR as soon as I get it to work. When I do, feel free to grab it and complete it yourself if you think that is going to be faster/safer. |
:-) |
Since I was surprised that it should be difficult, I had a stab at it. The result is here: seberg@d7ba7e7 May have missed something, but overall I would guess that it can't fail worse then current 1.8. which lacks any check ;). Is there anything bad with this solution Jaime? |
@charris it is a much more accurate description of my workflow than I'd like to admit :__( @seberg Ah, really nice! Your approach is much cleaner and elegant and concise and overall just plain better than the mess I was trying to put together. The iterator internals are adult material, and kids like myself shouldn't be playing with it! At first I thought you may not be handling setting of |
Where do we sit with this? It would be good to have a solution for 1.8.1, either @jaimefrio or @seberg. |
@seberg's other PR is the proper solution. This PR is just delaying the inevitable, someone will eventually come complaining again if this is all we do. The only concern is how to properly test the changes to the iterator. I think it is relatively safe, but I don't trust my judgement on this too much. So ideally let's do the other, I am keeping this open as a backup in case something nasty comes up in the other one. |
@jaimefrio @seberg Sebastian's version has been merged, do you want to close this one? |
Yes, probably not worth the extra code complication. After the changes to the iterator it would only help pushing errors like this one:
so that it would accept a couple more dimensions. Does not seem like a pressing need... |
Closes #4442 (or at least mitigates it)
Prior to this PR, the iterator created by a gufunc had all core dimensions
of all output arguments appended to the broadcasted non-core dimensions.
When working with gufuncs with multiple return arguments, this could lead
to "iterator is too large" errors with moderately sized inputs.
This PR changes that mechanism, and now each distinct core-dimension is
appended only the maximum number of times that it comes up in an output
argument's signature.
As an example, take #4442, which was failing in a call to SVD with
signature
(m,n)->(m,n),(n),(n,n)
. The iterator was created with shape(m ,n, n, n, n)
(all output arguments' core dimensions), and then theop_axes
for each of the three output parameters were set to[0, 1, -1, -1, -1]
,[-1, -1, 0, -1, -1]
and[-1, -1, -1, 0, 1]
. Afterthis PR, in this same case the iterator has shape
(m, n, n)
, and theop_axes
now are[0, 1, -1]
,[-1, 0, -1]
and[-1, 0, 1]
.The only concern is that a function with e.g. signature
(a,b)->(b,a)
wouldhave had its output created with an iterator of shape
(b, a)
andop_axes
set to
[0, 1]
. Now it will be created with shape(a, b)
andop_axes
setto
[1, 0]
. Not sure if it will result in a different memory layout of theallocated array.