FIX: close mem leak for repeated draw #11972
Conversation
May be worth checking whether this helps #9141 (unlikely, but also a matter of dead weakref leakage)... |
I believe… |
I can repro the performance issue: the timing goes from ~1.6 s to ~2.0 s (apparently we spend a lot of time fiddling with transforms...), so that's quite significant. Perhaps we can use the… |
Ha, figures. I spent a bit of time trying to figure out why there were so many unique transforms being created, but gave up. That whole machinery is opaque, at least to me. |
I have a grand plan to rewrite the whole thing in C++ at some point :p |
So if I do

```python
for child in children:
    for key in child._parents.keys():
        if child._parents[key] is None:
            child._parents.pop(key)
    child._parents[id(self)] = weakref.ref(self)
```

then I get median dt = 0.814 s. If I do it without, I get the memory leak and dt = 0.764 s, so only a 7.5% increase in execution time, versus the WeakDictionary, which has a 30% increase (I got 1.012 s). I think it's pretty crazy that a bit of transform bookkeeping can add so much to the draw time. But is the above modest increase in bookkeeping time OK? |
Would `del child._parents[key]` be slightly more efficient? It is a bit more explicit about the intent. |
My proposal above was to do something along the lines (untested) of… which should auto-remove the dead weakrefs. |
@WeatherGod Actually, the code above still has the memory leak in it. Needs to be longer:

```python
for child in children:
    badkeys = []
    for key in child._parents.keys():
        if child._parents[key]() is None:
            badkeys += [key]
    for key in badkeys:
        child._parents.pop(key)
    child._parents[id(self)] = weakref.ref(self)
```

Which gives a mildly longer run-time: 0.8348 s.

@anntzer, your solution just gives a key error, and I don't follow it well enough to know how to fix it. |
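As an aside on why the `()` in `child._parents[key]()` matters: a `weakref.ref` object is never itself None; only calling it reveals whether the referent is gone. A minimal standalone sketch (not from the thread):

```python
import weakref

class Node:
    pass

n = Node()
r = weakref.ref(n)
print(r is None)    # False -- the ref object always exists
print(r() is n)     # True  -- calling it returns the live referent
del n               # CPython frees `n` immediately via refcounting
print(r is None)    # still False, so `is None` never detects dead refs
print(r() is None)  # True  -- a dead ref returns None when called
```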
How about:

```python
ref = weakref.ref(self, lambda ref, sid=id(self), target=child._parents: target.pop(sid))
child._parents[id(self)] = ref
```
|
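A self-contained sketch of this pattern (with a hypothetical `Node` stand-in, not matplotlib code): the second argument to `weakref.ref` is a callback invoked when the referent is collected, and the lambda's default arguments bind `sid` and `target` at definition time, so each loop iteration keeps its own values instead of closing over a changing loop variable.

```python
import weakref

class Node:
    def __init__(self):
        self._parents = {}

child = Node()
parent = Node()
# The callback fires when `parent` dies and removes the stale entry.
child._parents[id(parent)] = weakref.ref(
    parent,
    lambda ref, sid=id(parent), target=child._parents: target.pop(sid))
print(len(child._parents))  # 1
del parent                  # collected -> callback pops the dict entry
print(len(child._parents))  # 0
```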
@tacaswell that seems to work, and has dt=0.782, so only imperceptibly slower than without that change. |
This doesn't fix #9141 unfortunately, but maybe someone had a similar misapprehension about what happens to weakrefs that are put into a dictionary... |
This seems reasonable, but I think it leaves pickling/unpickling incomplete, because the new deletion behavior is lost. This would not be the case if a WeakValueDictionary were used throughout. WeakValueDictionary lookups are indeed slow, though. Here is a comparison between a WeakValueDictionary and the plain-dict approach: the numbers are essentially unchanged for 100-entry dictionaries. |
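A rough micro-benchmark sketch of the lookup-cost gap being described (stdlib only; the exact numbers are machine-dependent and these names are illustrative):

```python
import timeit
import weakref

class Obj:
    pass

objs = [Obj() for _ in range(100)]  # keep the referents alive
plain = {id(o): weakref.ref(o) for o in objs}
weakd = weakref.WeakValueDictionary({id(o): o for o in objs})
key = id(objs[50])

# A plain dict returns the stored weakref directly; WeakValueDictionary
# goes through its KeyedRef machinery and a deadness check on each access.
print(timeit.timeit(lambda: plain[key], number=10**6))
print(timeit.timeit(lambda: weakd[key], number=10**6))
```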
Looking at the code for WeakValueDictionary gives me a mild suspicion that doing without it involves a risk of extremely rare errors. Here is an excerpt from the CPython source… |
For my edification, though, why do we keep getting so many child/parent pairs for these transforms? It seems like something we could readily deal with explicitly when the Axes is cleared or destroyed. But maybe I'm being naive... |
I've never understood the transform framework well and I still don't, but here is the way it looks to me right now. First, the "parent-child" terminology is misleading: "parents" are transforms that depend on their "children", so that if a "child" is modified, the immediate parents, their parents, and so on must be marked invalid, because the net result of the chain up to that point will change and must be recalculated.

Second, in the chain of transform steps taking a line from data coordinates to its representation in display coordinates, there is a series of links that does not need to change for each new line; the Axes box and the canvas dimensions aren't changing, for example. Therefore there are links in the transform chain that are reused, and one end of the reused chain becomes a "child" that gets a new "parent" (starting a fresh sequence of links) each time a new line is added or an old line is replaced. Unless the old, no-longer-used "parents" are deleted from the child's `_parents` dictionary, the dead entries accumulate.

Although I have described the transform sequence as a chain, it can be more complicated. In particular, two chains, one for x and one for y, can be combined into a blended transform (a "parent"). Each of the respective "child" ends of the x and y chains then has a `_parents` entry pointing to the blended transform. |
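To make the direction of dependency concrete, here is a toy sketch (a hypothetical `Node`, far simpler than matplotlib's real `TransformNode`): modifying a "child" walks up the `_parents` weakrefs and marks everything built on top of it invalid.

```python
import weakref

class Node:
    def __init__(self):
        self.valid = True
        self._parents = {}

    def set_children(self, *children):
        # Register self as a parent (dependent) of each child.
        for child in children:
            child._parents[id(self)] = weakref.ref(self)

    def invalidate(self):
        # A change here stales every transform that depends on this node.
        self.valid = False
        for ref in self._parents.values():
            parent = ref()
            if parent is not None and parent.valid:
                parent.invalidate()

child, parent, grandparent = Node(), Node(), Node()
parent.set_children(child)
grandparent.set_children(parent)
child.invalidate()
print(grandparent.valid)  # False -- invalidation propagated up the chain
```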
I still think there is a root cause here that I don't understand. If I check the transform for the line, which is the only thing that is changing each draw in the test code, then I get the same transform each time. It's not being invalidated, and it's not changing, so its children never get set. So some other transform keeps resetting its own children in this case (or making a new "parent"). My strong suspicion is that it is the data cursor (or whatever we call the data display in the bottom left corner). I'm not sure where to get its transform info, but it doesn't seem like it should be calling `set_children` on every draw. |
What is the test code that you are using? |
I suspect the problem will turn out to be pervasive, not restricted to one little plot element. I can trigger massive transform node generation, and get a glimpse of one source, with the following procedure. First, on master, edit transforms.py, inserting 2 lines after line 170 so that the body of `set_children` becomes:

```python
for child in children:
    if len(child._parents) > 99:
        raise Exception
    child._parents[id(self)] = weakref.ref(self)
```

Next, in ipython, execute…

That should be enough to raise the exception. Then you can see from the long traceback the chain of function calls that is involved in the transform node generation, and you can use the… The number "99" is obviously arbitrary. If you use a number like 50, you only need one such call.

This is not just a rabbit hole, it is a rabbit metropolis. |
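A gentler variant of that instrumentation, as a standalone sketch (hypothetical `Node`, not a patch to transforms.py): use the stdlib `traceback` module to log the offending call chain and keep running instead of raising.

```python
import traceback
import weakref

class Node:
    def __init__(self):
        self._parents = {}

    def set_children(self, *children):
        for child in children:
            if len(child._parents) > 5:
                # Log who is attaching yet another parent, then continue.
                traceback.print_stack()
            child._parents[id(self)] = weakref.ref(self)

child = Node()
parents = [Node() for _ in range(8)]
for parent in parents:
    parent.set_children(child)  # stack traces appear past the threshold
```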
Re: #11972 (comment) I think you mean the use of `_atomic_removal`? Looks like it went in via python/cpython@e10ca3a, guarding against one thread's removal of dead weakrefs incorrectly killing a new entry set by another thread. |
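For reference, a quick sketch of the auto-removal behavior under discussion (CPython assumed; collection timing differs on other interpreters):

```python
import weakref

class Obj:
    pass

d = weakref.WeakValueDictionary()
o = Obj()
d["k"] = o
print(list(d.keys()))  # ['k']
del o                  # referent collected
print(list(d.keys()))  # [] -- the entry removed itself; internally the
                       # removal is guarded so that a concurrent re-insert
                       # under the same key is not lost
```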
Almost ready--but I think the `_parents` construction in `__setstate__` needs a similar treatment to include the callback.
lib/matplotlib/transforms.py (Outdated):

```python
# pass a weak reference. The second arg is a callback for
# when the weak ref is garbage collected to also
# remove the dictionary element, otherwise child._parents
# keeps growing for multiple draws.
```
Alternative comment: "Use weak references so this dictionary won't keep obsolete nodes alive; the callback deletes the dictionary entry. This is a performance improvement over using WeakValueDictionary."
WRT `__setstate__`: no prob - I'll do tonight...
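What "a similar treatment" could look like, as a minimal sketch (a hypothetical `Node` and state layout; the real change lives in transforms.py's `__setstate__`): rebuild each parent weakref with the cleanup callback when restoring state, so unpickled nodes don't reintroduce the leak.

```python
import weakref

class Node:
    def __init__(self):
        self._parents = {}

    def _link_parent(self, parent):
        # Same callback pattern as in set_children: the entry removes
        # itself when the parent is garbage collected.
        ref = weakref.ref(parent, lambda r, sid=id(parent),
                          target=self._parents: target.pop(sid))
        self._parents[id(parent)] = ref

    def __setstate__(self, state):
        self._parents = {}
        for parent in state["parents"]:
            self._link_parent(parent)

child, parent = Node(), Node()
child.__setstate__({"parents": [parent]})
del parent
print(child._parents)  # {} -- the restored entry cleaned itself up
```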
@efiring agreed that it's a bit of a warren. I understand why

```python
import numpy as np
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
l = ax.plot(np.arange(10))
plt.show(block=False)
for i in range(1, 5000):
    l[0].set_ydata(np.arange(0, 10 * i, i))
    plt.pause(0.0001)
```

does, which is what the OP was doing. It's something with… |
Right, without the… |
If we can track down why repeated drawing is creating new transforms, that may yield some nice performance improvements as well... |
... is there somewhere that all the… |
... actually, sorry, I see now - unless you use blitting and animation, of course the whole image needs to be recomposed on subsequent draws, because otherwise it doesn't know what the "background" looks like and you can't erase the "old" state cleanly. Still not sure why that means new transforms need to be made each draw, so that is still a mystery... |
Stale keeps track of whether there has been a change to the artist that would require re-rendering the whole figure. In principle it could be used for 'auto-blitting' (which requires tracking which bounding boxes are stale, which artists overlap with them and thus enlarge the stale box, figuring out where the updated artists will land, and then blanking and re-drawing just those regions), but currently it is just used to decide if we should call `draw_idle`:

matplotlib/lib/matplotlib/backends/backend_agg.py, lines 392 to 407 in 7c0e760
|
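Schematically (hypothetical classes, not matplotlib's actual implementation), the stale mechanism amounts to artists raising a flag and the canvas coalescing those flags into at most one deferred redraw:

```python
class Canvas:
    def __init__(self):
        self._draw_pending = False

    def draw_idle(self):
        # Coalesce many stale notifications into one pending render.
        self._draw_pending = True

    def flush(self):
        if self._draw_pending:
            print("re-rendering the whole figure")
            self._draw_pending = False

class Artist:
    def __init__(self, canvas):
        self.canvas = canvas
        self.stale = False

    def set_data(self, data):
        self.data = data
        self.stale = True        # the figure now needs re-rendering
        self.canvas.draw_idle()  # request it, but don't render yet

canvas = Canvas()
line = Artist(canvas)
for i in range(3):
    line.set_data(i)
canvas.flush()  # a single render despite three updates
```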
@efiring, I think I fixed the `__setstate__` properly. OTOH, I don't have a lot invested in squashing this bug for folks who pickle their plotting environment (a feature that I think is over the top in requiring us to jump through hoops). |
It is something we support and cannot break. The biggest use case is using multi-processing to build figures where generating (but not drawing) the artists is very expensive. Why is `set_children` being called on every draw, though? |
It's still unclear to me! |
If I add a print to `set_children`:

```python
def set_children(self, *children):
    """
    Set the children of the transform, to let the invalidation
    system know which transforms can invalidate this transform.
    Should be called from the constructor of any transforms that
    depend on other transforms.
    """
    # Parents are stored as weak references, so that if the
    # parents are destroyed, references from the children won't
    # keep them alive.
    print('set children!', id(self))
    for child in children:
        # Use weak references so this dictionary won't keep obsolete nodes
        # alive; the callback deletes the dictionary entry. This is a
        # performance improvement over using WeakValueDictionary.
        ref = weakref.ref(self, lambda ref, sid=id(self),
                          target=child._parents: target.pop(sid))
        child._parents[id(self)] = ref
```

and then run:

```python
import numpy as np
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
l = ax.plot(np.arange(10))
plt.show()
```

and wiggle the cursor around: `set_children` gets called over and over. |
More digging around - if you… |
@jklymak sorry to barge in. Have you tried the fix you gave in the commit? I still get a 0.1-0.3 MB increase in memory use with every call to draw. I made the changes yesterday, left the app running for 15 hours, and memory usage still went up by 2.2 GB (redraw every 30 seconds). Is the fix still in progress? |
@Bibushka I tested w/ the code below and get results like those below for as long as I want. If you are testing w/ different code, or have a different setup, perhaps there is another memory leak, or there is something wrong with the test?

```python
import matplotlib.pyplot as plt
import numpy as np
import os
import linecache
import sys
import tracemalloc
import time


def display_top(snapshot, key_type='lineno', limit=2):
    '''
    function for pretty printing tracemalloc output
    '''
    snapshot = snapshot.filter_traces((
        tracemalloc.Filter(False, "<frozen importlib._bootstrap>"),
        tracemalloc.Filter(False, "<unknown>"),
    ))
    top_stats = snapshot.statistics(key_type)
    total = sum(stat.size for stat in top_stats)
    print("Total allocated size: %.1f KiB" % (total / 1024))


tracemalloc.start()

y = np.random.rand(100)
x = range(len(y))

fig = plt.figure()
ax = fig.add_subplot(111)
ax.plot(x, y, 'b-')
plt.show(block=False)

t0 = time.time()
while True:
    try:
        ax.clear()
        ax.plot(x, np.random.rand(100), 'b-')
        plt.pause(0.0001)
        snapshot = tracemalloc.take_snapshot()
        display_top(snapshot)
        time.sleep(0.05)
    except KeyboardInterrupt:
        break
```
|
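If the running totals aren't conclusive, tracemalloc can also diff two snapshots directly; a small sketch (stdlib only, with a synthetic allocation standing in for a leak):

```python
import tracemalloc

tracemalloc.start()
before = tracemalloc.take_snapshot()
leak = [list(range(1000)) for _ in range(100)]  # stand-in for leaked objects
after = tracemalloc.take_snapshot()
# Top growth sites between the snapshots, grouped by source line.
for stat in after.compare_to(before, 'lineno')[:3]:
    print(stat)
```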
I'm kind of a noob and my setup is not as fancy. I use memory_profiler's `profile` to track the memory changes; see results below:

Results: (memory_profiler output for `update_figure`, repeated over ten calls; only the "Line # Mem usage Increment Line Contents" headers are recoverable.) |
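For reproducibility, a sketch of that kind of instrumentation (this assumes the third-party `memory_profiler` package; `update_figure` is a stand-in for the poster's method):

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend, just for the sketch
import matplotlib.pyplot as plt
from memory_profiler import profile

fig, ax = plt.subplots()

@profile
def update_figure():
    # Redraw with fresh data; before the fix, transform._parents grew
    # a little on every call like this.
    ax.clear()
    ax.plot(np.arange(100), np.random.rand(100), 'b-')
    fig.canvas.draw()

for _ in range(5):
    update_figure()
```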
@Bibushka The…
@jklymak Are you sure that loop is actually causing re-draws? I think you need a draw call in there. |
@tacaswell plt.pause does the redraw, I think. It definitely updates the plot ;-) |
Ah, 🐑 nvm |
@tacaswell I don't think it's the timer, nor the lists, that causes the problem. The problem is that one extra point in the plot shouldn't make the memory jump by 0.1 MB on each call of self.draw(), and that this memory isn't cleared when I use self.axes.cla(). |
I have managed to switch to PyQt5 instead of PySide2 and it seems to have stopped the memory leak; I never would've imagined a library could cause such a huge problem. |
See #12089 and the follow-on PRs. This should be fixed in 3.0 and will be fixed in 2.2.4. |
PR Summary
Closes #11956
See test in #11956: repeated drawing kept growing the size of `transform._parents` ad infinitum with dead weakrefs. These aren't too big, but they add up if you run for quite a while...

PR Checklist