matplotlib shouldn't call gc.collect() #3044
kmike added a commit to kmike/matplotlib that referenced this issue on May 6, 2014
tacaswell added a commit to tacaswell/matplotlib that referenced this issue on Aug 23, 2022
Matplotlib has a large number of circular references (between figure and manager, between axes and figure, axes and artist, figure and canvas, and ...) so when the user drops their last reference to a `Figure` (and clears it from pyplot's state), the objects will not be immediately deleted.

To account for this we have long (going back to e34a333, the "reorganize code" commit in 2004, which is the end of history for much of the code) had a `gc.collect()` in the close logic in order to promptly clean up after ourselves.

However, unconditionally calling `gc.collect` can be a major performance issue (see matplotlib#3044 and matplotlib#3045) because if there are a large number of long-lived user objects, Python will spend a lot of time checking objects that are not going away and are never going away.

Instead of doing a full collection we switched to clearing out the lowest two generations. However, this was both not doing what we want (as most of our objects will actually survive) and, due to clearing out the first generation, opened us up to having unbounded memory usage.

In cases with a very tight loop between creating the figure and destroying it (e.g. `plt.figure(); plt.close()`) the first generation will never grow large enough for Python to consider running the collection on the higher generations. This will lead to unbounded memory usage as the long-lived objects are never reconsidered to look for reference cycles and hence are never deleted because their reference counts will never go to zero.

closes matplotlib#23701
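The point about reference cycles surviving until a collection reaches their generation can be sketched with the standard `gc` and `weakref` modules. This is a minimal illustration, not Matplotlib code: `Node` and `observe_cycle_lifetime` are hypothetical names, and the two linked objects only stand in for something like a figure and its canvas.

```python
import gc
import weakref


class Node:
    """Hypothetical stand-in for objects that point at each other."""

    def __init__(self):
        self.partner = None


def observe_cycle_lifetime():
    """Report whether a reference cycle is still alive at each step."""
    was_enabled = gc.isenabled()
    gc.disable()  # keep automatic collections from interfering
    try:
        gc.collect()  # start from a clean state
        a, b = Node(), Node()
        a.partner, b.partner = b, a  # a <-> b form a reference cycle
        probe = weakref.ref(a)  # lets us check liveness without a strong ref

        gc.collect(0)  # the cycle is still reachable here, so it survives
        del a, b       # now only the cycle itself holds references

        states = {"after_del": probe() is not None}
        gc.collect(0)  # youngest generation only
        states["after_gen0"] = probe() is not None
        gc.collect()   # full collection
        states["after_full"] = probe() is not None
        return states
    finally:
        if was_enabled:
            gc.enable()


if __name__ == "__main__":
    print(observe_cycle_lifetime())
```

Dropping the last external references does not free the cycle, and a full `gc.collect()` does. Whether the cycle is still alive after the second `gc.collect(0)` depends on how the interpreter promotes survivors between generations, which is an implementation detail of CPython; that step is the one corresponding to the "most of our objects will actually survive" remark above.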
melissawm pushed a commit to melissawm/matplotlib that referenced this issue on Dec 19, 2022, with the same commit message as above.
Hi,

I'm working with a big IPython notebook with inline matplotlib charts and a lot of objects in memory. When a chart is drawn IPython closes the matplotlib figure; when this happens matplotlib calls `gc.collect()` here (there is another possible `gc.collect` call here). This causes 5-10s delays, making interactive work uncomfortable.

What do you think about removing these calls, or maybe using less aggressive settings, like generation 0 or 1? In my case `gc.collect(0)` and `gc.collect(1)` are instant, unlike `gc.collect()` or `gc.collect(2)`.

I've filed a related issue on the IPython bug tracker: ipython/ipython#5795
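A rough way to reproduce the comparison between generations is to time `gc.collect(generation)` while a large number of long-lived objects are held in memory. The sketch below is only an assumption about how such a measurement could look: `time_collections` and the object count are hypothetical, and the actual timings depend on the Python version and the workload.

```python
import gc
import time


def time_collections(n_objects=200_000):
    """Time gc.collect() for each generation with many live objects."""
    # Long-lived, GC-tracked containers standing in for
    # "a lot of objects in memory".
    survivors = [[i] for i in range(n_objects)]
    gc.collect()  # let the survivors settle into the oldest generation

    timings = {}
    for generation in (0, 1, 2):
        start = time.perf_counter()
        gc.collect(generation)
        timings[generation] = time.perf_counter() - start
    return timings, len(survivors)


if __name__ == "__main__":
    timings, count = time_collections()
    for generation, seconds in timings.items():
        print(f"gc.collect({generation}) with {count} live objects: {seconds:.6f}s")
```

`gc.collect()` with no argument is equivalent to `gc.collect(2)`, a full collection, so only the last timing has to walk every tracked long-lived object.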