Skip to content

BUG: Fix memory leak for subclass slicing #10068

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Nov 26, 2017
Merged

Conversation

seberg
Copy link
Member

@seberg seberg commented Nov 22, 2017

A slice object was created if getslice/setslice was used
by an inherited subclass (through python or directly) but not
decref'd.

Closes gh-10066

--

I admit, I probably overdid it on that test, heh.

@mhvk mhvk added this to the 1.14.0 release milestone Nov 22, 2017
@mhvk
Copy link
Contributor

mhvk commented Nov 22, 2017

Nice sleuthing - and the fix looks good (and, @f0k, thanks for reporting!)

@charris - I assume there are no 1.13 bug fixes any more, so this will just go in 1.14?

@eric-wieser
Copy link
Member

Careless on my part in #8953

@seberg
Copy link
Member Author

seberg commented Nov 22, 2017

I really do not get why the error paths end up with one extra reference to that object, even happens if I call __getitem__ manually, is that a python bug?

@charris
Copy link
Member

charris commented Nov 22, 2017

@mhvk, If it causes a noticeable problem, as this seems to do, there is no reason not to backport. I'm going to do a 1.13.4 about the same time as 1.14.0 just for the distros who are still on 1.13.

@charris charris added the 09 - Backport-Candidate PRs tagged should be backported label Nov 23, 2017
@seberg
Copy link
Member Author

seberg commented Nov 23, 2017

Too sleepy to figure out why the error paths have a higher ref count. Probably can just rip that part out of the test, or put a 3 until someone gets 2. It did decrease by one correctly of course....

@eric-wieser
Copy link
Member

eric-wieser commented Nov 23, 2017

It's possible that array_getitem also leaks references when passed slices - but you wouldn't see that with this test...

@f0k
Copy link
Contributor

f0k commented Nov 23, 2017

Thanks for the fast reaction! The fix looks good, not sure about the refcount in the test either. When I run the same code as a standalone script, I get 3 only with the leaking numpy 1.13, otherwise I get 2. Does it change anything if you check the refcount before comparing against a slice (i.e., does assert_equal leak a reference)? Otherwise just go with the flow and change it to 3 in the test.

Careless on my part in #8953

Why was it added back in the first place? To support user code that called __getslice__ directly? And can anybody briefly enlighten me why this was a problem for subclasses only, and not for ndarray itself?

It's possible that array_getitem also leaks references when passed slices

No, I just checked that x[:7:] doesn't leak. This forces Python to call __getitem__ with a slice object, instead of calling __getslice__ (as would happen with x[:7]). The workaround of x[:7,] mentioned in #10066 also forces Python to use __getitem__, but constructs an extra tuple.

@seberg
Copy link
Member Author

seberg commented Nov 23, 2017

Yes, it is just to support __getslice__ direct calls. The basic thing is, CPython has two slots for these, one on the C-side and one on the python side. We manually define the Python side now, but explicitly do not define the C-side. This means that normal python arrays do not use this code. (unless you call __getslice__ explicitly.

For subclasses, the automatic wrappers will create the C-side slot if they find __getslice__, etc. so for the subclass the C-side implementation suddenly exists and your code here will use the function.

@mhvk
Copy link
Contributor

mhvk commented Nov 23, 2017

@f0k - just to add what @seberg wrote above: it is quite common for subclasses to override something and then do super(...).__getslice__(...) -- we didn't want to break these. Obviously, this can all go by 2019, when we no longer support python2.

@seberg
Copy link
Member Author

seberg commented Nov 23, 2017

I guess I will remove the test probably. the refcount is bad for any method (and inheriting only object, and moving the def to top level). Also it works correctly on my python 3. I would not be too surprised if it is already fixed in python 2 master as well.

It still seems strange to me, I can't see why this is not a moderately bad memory leak in python 2.

@seberg
Copy link
Member Author

seberg commented Nov 23, 2017

Ah, found it with a bit of googleing. Something leaks into the exception info. So it collected when a new exception gets raised.

@njsmith
Copy link
Member

njsmith commented Nov 24, 2017

One strategy for making refleak tests more robust is to call the function, like, 100 times in a loop, and then assert that the refcount is less than 50.

@seberg
Copy link
Member Author

seberg commented Nov 24, 2017 via email

@f0k
Copy link
Contributor

f0k commented Nov 26, 2017

So it collected when a new exception gets raised.

Interesting. So can you fix the test by manually raising and catching an exception before checking the refcount? It may also be possible to run x[0:5] 100 times and then check for the number of slice instances in what's returned by gc.get_objects() (before and afterwards). It's probably also okay to remove the test, though.

@charris
Copy link
Member

charris commented Nov 26, 2017

So a refcount of 2 should be OK? It would be good to finish this up, I'd even take the fix separate from the test if that will speed things up.

@seberg
Copy link
Member Author

seberg commented Nov 26, 2017

Sorry, am overloaded. Tests should pass now

@f0k
Copy link
Contributor

f0k commented Nov 26, 2017

Tests should pass now

They do! They trigger a DeprecationWarning in the test running with -3, though:

DeprecationWarning: sys.exc_clear() not supported in 3.x; use except clauses

So in addition, you will need a warning.catch_warnings() around sys.exc_clear() or something...

@charris
Copy link
Member

charris commented Nov 26, 2017

You can't win ;) Now Python 2.7 with the -3 flag is complaining that sys.exc_clear() is deprecated in Python 3.

I'll be glad when we can drop 2.7.

A slice object was created if __getslice__/__setslice__ was used
by an inherited subclass (through python or directly) but not
decref'd.

Closes numpygh-10066
@seberg
Copy link
Member Author

seberg commented Nov 26, 2017

Yeah, yeah, I am getting there, heh....

@charris charris merged commit 984bc91 into numpy:master Nov 26, 2017
@charris
Copy link
Member

charris commented Nov 26, 2017

Thanks Sebastian.

@charris charris removed the 09 - Backport-Candidate PRs tagged should be backported label Nov 26, 2017
@f0k
Copy link
Contributor

f0k commented Nov 26, 2017

Thank you for the quick fix and your perseverance :)

@seberg seberg deleted the issue-10066 branch November 27, 2017 16:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants