[Bug]: High CPU utilization of the macosx backend #28960

Closed
cinsk opened this issue Oct 9, 2024 · 7 comments · Fixed by #28981

@cinsk

cinsk commented Oct 9, 2024

Bug summary

After showing an interactive figure, the CPU utilization of the Python process goes to 100%.

Code for reproduction

#######################################################
# Case 1: 100% cpu
import matplotlib.pyplot as plt
fig = plt.figure()
plt.plot(range(5))
plt.show()
# after closing the window
import pandas              # starting 100% CPU utilization

#######################################################
# Case 2: 100% cpu
import matplotlib.pyplot as plt
import pandas as pd

# No CPU utilization at the moment

fig = plt.figure()
df = pd.DataFrame(range(5))

plt.plot(df[0])           # with this, CPU utilization is 100%.
plt.show()                 # the same

# after closing the window, still 100%

#######################################################
# Case 3: no issue.
import matplotlib.pyplot as plt
fig = plt.figure()
plt.plot(range(5))
plt.show()              # strangely, this has no problem.

Actual outcome

No problem except that it consumes 100% CPU. The figure is still responsive.

Expected outcome

The matplotlib backend should not consume 100% CPU.

Additional information

  • pandas version 2.2.3 (pip)

  • Certain operations (such as closing interactive figures) cause 100% CPU with the 'macosx' backend.

  • Closing the figure or calling plt.close() does not help (backend: macosx).

  • No problem with qt5agg backend.
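
The qt5agg observation suggests a temporary workaround. Below is a minimal sketch, assuming a Qt binding (for example PyQt5) is installed. To my understanding, matplotlib reads the `MPLBACKEND` environment variable when it is first imported, so the variable has to be set before that import. `request_backend` is a hypothetical helper written for this illustration, not part of matplotlib.

```python
import os
import sys


def request_backend(name):
    """Ask matplotlib to use backend `name` via the MPLBACKEND variable.

    Hypothetical helper: returns False if matplotlib is already imported,
    because the variable is expected to be consulted at import time only.
    """
    if "matplotlib" in sys.modules:
        return False
    os.environ["MPLBACKEND"] = name
    return True


# Assumption: a Qt binding is available so that qt5agg can actually load.
# A later `import matplotlib.pyplot as plt` should then pick up this backend.
backend_requested = request_backend("qt5agg")
```

Setting `MPLBACKEND=qt5agg` in the shell before starting `python` should have the same effect.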

Operating system

Mac (Intel) Ventura 13.6.6

Matplotlib Version

3.9.2

Matplotlib Backend

macosx

Python version

Python 3.12.3

Jupyter version

No response

Installation

pip

@QuLogic
Member

QuLogic commented Oct 10, 2024

# Case 1: 100% cpu
import matplotlib.pyplot as plt
fig = plt.figure()
plt.plot(range(5))
plt.show()
# after closing the window
import pandas              # starting 100% CPU utilization

Does the import finish, or is it stuck there?

@greglucas
Contributor

I can't reproduce this on an ARM Mac.

Are you able to reproduce this without pandas? Every line that you said causes the CPU utilization is pandas-related. How are you running this: as a script, in IPython/terminal, in a Jupyter notebook, ...?

@greglucas greglucas added status: needs clarification Issues that need more information to resolve. GUI: MacOSX labels Oct 10, 2024
@cinsk
Author

cinsk commented Oct 15, 2024

@QuLogic The import finished normally.

All the code I posted was functional; it just caused high CPU utilization.

@greglucas

I have this symptom on a 2019 Intel Mac Pro, although I don't think the symptom is related to the CPU architecture.

Initially, I thought this was something in the pandas code and posted in pandas, but I found no CPU utilization after changing the backend from macosx to qt5agg, so I filed the issue here.

@cinsk
Author

cinsk commented Oct 15, 2024

@QuLogic @greglucas
Apologies, it looks like pandas has nothing to do with this problem. In the first code sample, even without loading pandas, just closing the figure window causes high CPU utilization.

# I ran this in a terminal, in the interactive python interpreter (without ipython).
$ python --version
Python 3.12.3
$ python
Python 3.12.3 (main, Jun  9 2024, 13:36:41) [Clang 15.0.0 (clang-1500.1.0.2.5)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import matplotlib.pyplot as plt
>>> fig = plt.figure()
>>> plt.plot(range(5))
[<matplotlib.lines.Line2D object at 0x118cddfd0>]
>>> plt.show()
>>> # after closing the window, CPU began to reach 100%.
# In another terminal
$ top
Processes: 585 total, 4 running, 581 sleeping, 3460 threads             11:32:22
Load Avg: 2.97, 3.10, 2.63  CPU usage: 10.30% user, 2.83% sys, 86.86% idle
SharedLibs: 379M resident, 68M data, 25M linkedit.
MemRegions: 795960 total, 4657M resident, 99M private, 1748M shared.
PhysMem: 16G used (5029M wired, 2518M compressor), 63M unused.
VM: 68T vsize, 3570M framework vsize, 5509378664(187) swapins, 5542395576(0) swap
Networks: packets: 382827238/277G in, 257840674/99G out.
Disks: 545342433/25T read, 393859951/25T written.

PID    COMMAND      %CPU  TIME     #TH    #WQ   #PORTS  MEM    PURG   CMPRS
97460  python3.12   100.1 04:06.03 10/1   1     226     67M    9824K  0B
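
The `top` reading can also be cross-checked from inside the interpreter using only the standard library. This is a rough sketch, not a matplotlib facility: `sample_cpu` and `report_later` are hypothetical names. It compares process-wide CPU time (`time.process_time`) against wall-clock time, and it measures from a background thread so that the main thread can sit idle at the prompt while the sample is taken.

```python
import threading
import time


def sample_cpu(interval=1.0):
    """Return CPU seconds used by this process per wall-clock second."""
    cpu_start = time.process_time()   # user + system time, all threads
    wall_start = time.perf_counter()
    time.sleep(interval)              # the sampling thread itself stays idle
    cpu_used = time.process_time() - cpu_start
    wall_used = time.perf_counter() - wall_start
    return cpu_used / wall_used


def report_later(interval=5.0):
    """Sample in a daemon thread and print the result when it is ready."""
    def worker():
        share = sample_cpu(interval)
        print(f"CPU share over {interval:.1f}s: {share:.0%}")

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return thread
```

Calling `report_later()` right after closing the figure window and then leaving the prompt untouched should print a value near 100% if the process is spinning, and a value near 0% otherwise.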

@cinsk
Author

cinsk commented Oct 15, 2024

>>> import cProfile, pstats, io
>>> pr = cProfile.Profile()
>>> pr.enable()
>>> import matplotlib.pyplot as plt
>>> fig = plt.figure()
>>> plt.plot(range(5))
[<matplotlib.lines.Line2D object at 0x129b74950>]
>>> plt.show()
>>> # Closing the window, let it consume 100% cpu (about 5 seconds)
>>> pr.disable()
>>> ps = pstats.Stats(pr)
>>> ps.print_stats()

Not sure whether the profiler output will help, but I attached the output of the above (it is too large to paste here):
tmp2.txt
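
If the full dump is too large to share, the report can be trimmed with `pstats` itself. The sketch below sorts by cumulative time and keeps only the first few entries; `top_entries` and `workload` are hypothetical names, and `workload` merely stands in for the plotting steps. One caveat, stated as an assumption: cProfile records Python-level and built-in calls, so time spent in a C-level input hook while the prompt is idle may not show up in the output at all.

```python
import cProfile
import io
import pstats


def top_entries(profile, limit=20):
    """Return the `limit` most expensive entries by cumulative time as text."""
    buffer = io.StringIO()
    stats = pstats.Stats(profile, stream=buffer)
    stats.sort_stats(pstats.SortKey.CUMULATIVE).print_stats(limit)
    return buffer.getvalue()


def workload():
    # Placeholder for the steps being profiled (figure creation, show, ...).
    return sum(i * i for i in range(10_000))


pr = cProfile.Profile()
pr.enable()
workload()
pr.disable()
report = top_entries(pr, limit=10)  # short enough to paste into an issue
```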

@greglucas
Contributor

greglucas commented Oct 15, 2024

Thanks for the detailed information @cinsk, I can now reproduce this. Bisecting it points to this commit: fd43a64

I think this is due to our "wait_for_stdin" function, which checks some things in macOS's native event loop. It works fine while our event loop is running (with a figure shown), but when we close the windows and go back to the Python terminal, we want to defer back to their event loop and remove ours because it is no longer running. So I think we are spinning our wheels looking for events on a loop that is no longer running, checking it indefinitely with enough pings to raise the CPU to 100%.

I think the following fixes the immediate issue for me: we can just exit early. But I'm not sure if we should actually be setting PyOS_InputHook = NULL when we stop our event loops instead?

diff --git a/src/_macosx.m b/src/_macosx.m
index 09838eccaf..5550e4d297 100755
--- a/src/_macosx.m
+++ b/src/_macosx.m
@@ -52,6 +52,10 @@ static void handleSigint(int signal) {
 }
 
 static int wait_for_stdin() {
+    // Check for whether an event loop is running currently first
+    // if it isn't we should exit early and do nothing
+    if (![NSApp isRunning]) return 1;
+
     @autoreleasepool {
         stdin_received = false;
         stdin_sigint = false;
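
The suspected mechanism can be illustrated with a toy model in Python. This is only a sketch of the hypothesis in this comment, not the code in `_macosx.m`: `ToyLoop` and this Python `wait_for_stdin` are hypothetical stand-ins, and a fixed time window replaces "input arrived on stdin". A running loop blocks briefly on each poll, while a stopped loop returns immediately, so the caller spins unless it exits early.

```python
import time


class ToyLoop:
    """Hypothetical stand-in for a GUI event loop."""

    def __init__(self, running):
        self.running = running

    def wait_for_event(self, timeout):
        # A running loop blocks until an event or the timeout;
        # a stopped loop has nothing to wait on and returns at once.
        if self.running:
            time.sleep(timeout)


def wait_for_stdin(loop, window, early_exit):
    """Poll `loop` for `window` seconds and return the number of polls."""
    if early_exit and not loop.running:
        return 0  # nothing to service, hand control straight back
    polls = 0
    end = time.perf_counter() + window
    while time.perf_counter() < end:
        loop.wait_for_event(0.01)
        polls += 1
    return polls


running_polls = wait_for_stdin(ToyLoop(True), 0.05, early_exit=False)
spinning_polls = wait_for_stdin(ToyLoop(False), 0.05, early_exit=False)
fixed_polls = wait_for_stdin(ToyLoop(False), 0.05, early_exit=True)
```

In this model the stopped loop without the guard performs far more polls than the running one within the same window, which corresponds to the busy CPU, and the guarded version performs none.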

@greglucas greglucas added status: confirmed bug and removed status: needs clarification Issues that need more information to resolve. labels Oct 15, 2024
@QuLogic QuLogic added this to the v3.9.3 milestone Oct 15, 2024
@tacaswell
Member

That seems like a reasonable fix to me.

https://github.com/python/cpython/blob/90b1406b881aa0a57ce3a73c402b73b79138b7e0/Parser/myreadline.c#L47-L61 is where it gets called from readline ("old" python shell) and python/cpython#120066 is where it was added for the new shell. I think that in both cases short-circuiting and returning if the event loop is not running will be OK as something on the Python side will then wait for 100ms and then call us again.

I don't think we want to go down the route of installing/uninstalling the input hook, as that means we need a good way to know when to install/remove it (removing it the first time there is no event loop seems reasonable, and I guess we could add it when we create the first figure?). Maybe not impossible, but it is a bunch of complexity that can be handled effectively by having a 10 Hz polling loop.
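
The caller side of that polling arrangement can be sketched in Python. This models the behavior described in this comment rather than CPython's actual C implementation: `read_with_hook` and `cheap_hook` are hypothetical names, a pipe stands in for stdin, and `select` on a pipe assumes a Unix-like system. The hook is called, the caller waits up to 100 ms for input, and the cycle repeats, so a hook that returns immediately is invoked at roughly 10 Hz.

```python
import os
import select
import threading


def read_with_hook(fd, hook, poll_interval=0.1):
    """Call hook(), wait up to poll_interval for input on fd, and repeat."""
    calls = 0
    while True:
        hook()
        calls += 1
        ready, _, _ = select.select([fd], [], [], poll_interval)
        if ready:
            return os.read(fd, 1024), calls


def cheap_hook():
    # Stand-in for a hook that exits early because no event loop is running.
    return 0


read_end, write_end = os.pipe()
# Simulate the user typing a line about 0.35 s after the prompt appears.
threading.Timer(0.35, os.write, args=(write_end, b"print(1)\n")).start()
line, hook_calls = read_with_hook(read_end, cheap_hook)
os.close(read_end)
os.close(write_end)
```

Because the waiting happens inside `select`, the process sleeps between hook calls instead of spinning, which is why an early-returning hook is cheap in this arrangement.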
