
Micro optimization of plotting #26303

Merged
merged 9 commits into from
Jul 16, 2023

Conversation

eendebakpt
Contributor

PR summary

This PR applies some micro-optimizations to matplotlib plotting. The optimizations target methods that show up in profiling (cProfile).
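As a rough illustration of how such hot spots can be found, here is a minimal cProfile sketch. The helper `profile_top` and the stand-in `workload` are hypothetical names introduced for this example; they are not part of the PR. To profile plotting, `workload` would be swapped for a function that calls the plotting code.

```python
import cProfile
import io
import pstats


def profile_top(func, limit=10):
    """Run func under cProfile and return a report of the top entries."""
    profiler = cProfile.Profile()
    profiler.enable()
    try:
        func()
    finally:
        profiler.disable()
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # Sort by cumulative time so callers of slow helpers appear first.
    stats.sort_stats("cumulative").print_stats(limit)
    return buffer.getvalue()


def workload():
    # Hypothetical stand-in for the real plotting call.
    return sorted(range(5000), key=lambda v: -v)


report = profile_top(workload)
print(report)
```

Functions that dominate the cumulative-time column of such a report are natural candidates for micro-optimization.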

Benchmark

Mean +- std dev: [main] 2.58 ms +- 0.23 ms -> [pr] 2.49 ms +- 0.23 ms: 1.04x faster

on script

import matplotlib
# print(matplotlib)
import pyperf

setup = """
import matplotlib.pyplot as plt

x=[1,2,3,4,5]
y=[5,6,3,3,4]
n=6
def go():
    for ii in range(n):
        plt.figure(1)
        plt.plot(x,y)
"""

runner = pyperf.Runner()
runner.timeit(name="mpl", stmt="go()", setup=setup)

@tacaswell
Member

import matplotlib
# print(matplotlib)
import pyperf

setup = """
import matplotlib.figure as mfigure

x=[1,2,3,4,5]
y=[5,6,3,3,4]
n=6
def go():
    for ii in range(n):
        fig = mfigure.Figure()
        ax = fig.subplots()
        ax.plot(x,y)
"""

runner = pyperf.Runner()
runner.timeit(name="mpl", stmt="go()", setup=setup)

This may be a better benchmark script.

I have concerns that the speedup is smaller than the standard deviation...

@tacaswell
Member

Well, did you mean to plot many times to the same figure or to make one plot per figure?

@tacaswell
Member

new (this branch)

mpl: Mean +- std dev: 796 us +- 7 us

old (3.7.2 from wheels)

mpl: Mean +- std dev: 845 us +- 13 us

using

import matplotlib
# print(matplotlib)
import pyperf

setup = """
import matplotlib.figure as mfigure

x=[1,2,3,4,5]
y=[5,6,3,3,4]
n=6
fig = mfigure.Figure()
ax = fig.subplots()


def go():
    for ii in range(n):
        ax.plot(x,y)
"""

runner = pyperf.Runner()
runner.timeit(name="mpl", stmt="go()", setup=setup)

So I think there is a real speedup here, even if it is relatively small compared with the cost of making a new Figure and Axes (that is probably why the std was so high: the first run took an order of magnitude longer!).
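The two sets of timings quoted in this thread can be compared with a few lines of arithmetic. This is only a sketch: the helper `compare` is a hypothetical name, and adding the two standard deviations in quadrature is a rough heuristic chosen for this example, not pyperf's own significance test.

```python
import math


def compare(old_mean, old_std, new_mean, new_std):
    """Return (speedup, time saved, time saved / combined noise)."""
    speedup = old_mean / new_mean
    diff = old_mean - new_mean
    # Rough heuristic (an assumption of this sketch): combine the two
    # standard deviations in quadrature.
    noise = math.hypot(old_std, new_std)
    return speedup, diff, diff / noise


# All values in microseconds, copied from the pyperf output quoted above.
first = compare(2580, 230, 2490, 230)  # pyplot benchmark, main vs. PR
second = compare(845, 13, 796, 7)      # reused Axes, 3.7.2 vs. this branch

for label, (speedup, diff, ratio) in [("pyplot", first), ("reused Axes", second)]:
    print(f"{label}: {speedup:.2f}x faster, {diff} us saved, "
          f"{ratio:.1f}x the combined noise")
```

By this measure the pyplot benchmark saves roughly 0.3 times its combined noise, while the reused-Axes benchmark saves roughly 3.3 times its combined noise, which is consistent with the speedup being real but only visible once Figure creation is moved out of the timed loop.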

@tacaswell tacaswell added this to the v3.8.0 milestone Jul 13, 2023
@tacaswell tacaswell requested a review from oscargus July 13, 2023 20:29
Member

@oscargus oscargus left a comment

Most things make sense for sure. Some things I simply trust are faster.

A few minor comments though.

@eendebakpt eendebakpt changed the title from "Draft: micro optimization of plotting" to "Micro optimization of plotting" Jul 14, 2023
@@ -1634,6 +1634,8 @@ def _safe_first_finite(obj, *, skip_nonfinite=True):
    def safe_isfinite(val):
        if val is None:
            return False
        if isinstance(val, int):
Contributor

@anntzer anntzer Jul 15, 2023

I'm not sure special-casing ints here makes sense -- certainly that'll make a microbenchmark based on plotting ints faster, but at the cost of introducing a branch for all other cases (plotting floats is likely much more common, and even np.int is perhaps a more common case than python ints in real code).

If one really wants to work around the fact that np.isfinite is relatively slow, one can instead use math.isfinite (which on a quick microbenchmark is extremely fast), taking into account the fact that it won't handle certain cases like datetimes (but will handle numpy floats and ints), so something like

try:
    if math.isfinite(val): return True
except TypeError:
    pass
# continue with the np.isfinite check
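To make the type coverage concrete, here is a small standard-library-only sketch of which values math.isfinite accepts and which ones raise TypeError and would therefore fall through to the slower check. The helper name `try_math_isfinite` is hypothetical and exists only for this illustration.

```python
import math
from datetime import datetime


def try_math_isfinite(val):
    """Return math.isfinite(val), or None when math.isfinite rejects val."""
    try:
        return math.isfinite(val)
    except TypeError:
        # Not a real number as far as the math module is concerned;
        # a caller would continue with the slower fallback here.
        return None


samples = [3, 2.5, float("inf"), float("nan"), datetime(2023, 7, 15), None]
results = [try_math_isfinite(v) for v in samples]

for value, result in zip(samples, results):
    print(f"{value!r}: {result}")
```

Python ints and floats are answered directly by the fast path, while the datetime and None entries come back as None, meaning they would need the fallback.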

Contributor Author

This is an excellent suggestion. On my system, np.isfinite(val) if np.isscalar(val) else True takes about 900 ns, while math.isfinite(val) (including the try-except) takes 80 ns. The isinstance(val, int) check takes 50 ns, but it does not handle the important float case.
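A measurement along these lines can be reproduced with timeit. The sketch below is an assumption about how the two variants might be written, not the code that was merged: `isfinite_numpy_only` and `isfinite_math_first` are hypothetical names, and absolute timings will differ between machines.

```python
import math
import timeit
from datetime import datetime

import numpy as np


def isfinite_numpy_only(val):
    """NumPy-only check, following the expression quoted above."""
    if val is None:
        return False
    try:
        return bool(np.isfinite(val)) if np.isscalar(val) else True
    except TypeError:
        # NumPy cannot interpret the value; assume it is finite.
        return True


def isfinite_math_first(val):
    """Try math.isfinite first, then fall back to the NumPy check."""
    if val is None:
        return False
    try:
        if math.isfinite(val):
            return True
    except TypeError:
        pass
    return isfinite_numpy_only(val)


samples = [1, 2.5, float("nan"), float("inf"), None,
           np.float64(1.5), datetime(2023, 7, 15), [1, 2]]

number = 20_000
timings = {}
for func in (isfinite_numpy_only, isfinite_math_first):
    seconds = timeit.timeit(lambda: func(2.5), number=number)
    timings[func.__name__] = seconds / number * 1e9  # ns per call

for name, ns in timings.items():
    print(f"{name}: {ns:.0f} ns per call")
```

Both variants should agree on every entry in `samples`; only the time spent on plain numeric inputs is expected to differ.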

@oscargus oscargus merged commit b0121b6 into matplotlib:main Jul 16, 2023