LaTeX rendering is really slow #4880

Closed
lgeiger opened this issue Aug 7, 2015 · 27 comments
Labels
OS: Apple, Performance, status: closed as inactive, status: inactive, topic: text/usetex

Comments

@lgeiger

lgeiger commented Aug 7, 2015

With usetex=True, plotting is a lot slower than normal.

I generated a simple test plot with some mathtext and timed it with usetex=True and usetex=False, both for saving the plot and for just displaying it:

import matplotlib.pyplot as plt
import matplotlib
import numpy as np
%matplotlib inline
x = np.linspace(0, 20, 500)
y = 3 * x + 0.5 * x**3 + 2

def plot(x, y):
    plt.plot(x, y)
    plt.title('simple testplot')
    plt.xlabel('$x$')
    plt.ylabel(r'$3 x + \frac{1}{2} x^3 + 2$')
    plt.savefig('test.pdf')

%timeit -n 1 -r 1 plot(x, y)

Here are the measurements:

  • save pdf usetex=False: 351 ms
  • save pdf usetex=True: 8.34 s
  • display usetex=False: 39 ms
  • display usetex=True: 38.3 ms but it took approx. 7 s until the plot was displayed

My system is running Mac OS X 10.10.4, matplotlib 1.4.3 and the latest LaTeX version.

@tacaswell
Member

Is it any faster the second time? I suspect much of that delay is because with usetex a new LaTeX process is being spun up, the text processed and rendered, converted to png, and then added back into the figure.

That said, 8s is pretty crazy. I use usetex day-to-day and have never seen anything that bad.

Does it work better outside the notebook, using one of the backends that we support?

@lgeiger
Author

lgeiger commented Aug 7, 2015

Yes, caching works.
Saving the same plot a second time takes about 2 s. But if anything in the plot changes, rendering again takes 4-5 s, even if the plot is not displayed.

Displaying an unchanged plot a second time works fine. If anything is modified, it again takes 3-5 s until the plot is displayed.
The problem also exists with other backends and outside the notebook.

@pwuertz
Contributor

pwuertz commented Aug 10, 2015

Order of seconds is what I'd consider pretty normal whenever latex is involved somewhere.

@jkseppan
Member

TeX gets run separately for each string in the image, including every tick label. One optimisation would be to first gather all the TeX input, then make a big TeX file where each string is on its own page, then turn each output page into a separate png file (or a separate sequence of rendering commands in the case of vector backends).
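
To illustrate the batching idea, here is a minimal Python sketch. It is not code from Matplotlib or from this thread: the function name build_batched_tex and the article preamble are assumptions made for illustration. It only assembles the TeX source with one string per page; running TeX on it and splitting the output pages is left out.

```python
# Hypothetical sketch: collect several strings into a single LaTeX document,
# one string per page, so that TeX only has to be started once.

def build_batched_tex(strings,
                      preamble=r"\documentclass{article}\pagestyle{empty}"):
    """Return LaTeX source in which each input string sits on its own page."""
    # A \newpage between consecutive strings puts each one on a separate page.
    body = "\n\\newpage\n".join(strings)
    return "\n".join([preamble, r"\begin{document}", body, r"\end{document}"])


# Example input: two axis labels and two tick labels.
labels = ["$x$", r"$3 x + \frac{1}{2} x^3 + 2$", "0.0", "2.5"]
source = build_batched_tex(labels)
print(source)
```

With N strings the document contains N pages, so a plot with many tick labels would pay the TeX startup cost once rather than N times.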

@jkseppan
Member

One workaround: if you don't need everything to be rendered with TeX, you can set the usetex property on a subset of text objects.

import matplotlib.pyplot as plt
import matplotlib
matplotlib.rc('text', usetex=False)
import numpy as np
%matplotlib inline
x = np.linspace(0, 20, 500)
y = 3 * x + 0.5 * x**3 + 2

def plot(x, y):
    plt.plot(x, y)
    plt.title('simple testplot')
    plt.xlabel('$x$', usetex=True)
    plt.ylabel(r'$3 x + \frac{1}{2} x^3 + 2$', usetex=True)
    plt.savefig('test.pdf')

%timeit -n 1 -r 1 plot(x, y)

@tdegeus

tdegeus commented Oct 10, 2017

Same here. I should add that on macOS it is significantly slower than on, for example, Linux. What could be the reason for this?

@entaylor

entaylor commented Jan 5, 2018

This is only to bump this as a serious issue on MacOS (having just upgraded to High Sierra, python 3.6, and matplotlib 2.1.1). Depending on the plot, rendering with TeX can take minutes.

For a simple plot with x and y axis labels and just a '$m = 2.2 \pm 1.1$' annotation, the save time is 26 seconds. Yes, caching works, so that subsequent re-saves of the same plot take closer to 1 second. But if I, for example, change the annotation to have a different number of decimal places, then we are back to 8.8 sec.

jkseppan's workaround with usetex=True only for the x/y labels and annotation does not significantly reduce the save time: it's 8.4 sec instead of 8.8 sec.

@anntzer
Contributor

anntzer commented Jan 5, 2018

Just a data point here:
Looks like a significant (less than half, though) part of the time is spent calling not tex, but kpsewhich (which is done by dviread to locate font files).
Possible relevant threads: https://email.esm.psu.edu/pipermail/macosx-tex/2014-October/053020.html https://tug.org/pipermail/tex-k/2015-April/002600.html

I have a longish-term plan to hack into usetex to let the tex subprocess output directly as much relevant information as possible into the log file (with some tex programming)...

@jkseppan
Member

jkseppan commented Jan 5, 2018 via email

@anntzer
Contributor

anntzer commented Jan 5, 2018

kpathsea is actually LGPL so we're fine (but I have no idea whether that'll help)

@tdegeus

tdegeus commented Jan 5, 2018

Possible duplicate: #9653

@jkseppan
Member

jkseppan commented Jan 6, 2018

So I was guessing that the invocation of kpsewhich is the slow part, but I haven't measured it. It could also be that the way it searches the file system is slower on a Mac than on Linux. If that's the case, then linking to kpathsea wouldn't buy anything.

It would be useful to measure what's actually happening on a Mac. I thought the tool of choice for this would be DTrace but apparently that has been broken for several OS versions. Does anyone know how to trace subprocesses and their system calls on macOS?

@anntzer
Contributor

anntzer commented Jan 6, 2018

I just used

  1. a python profiler (pprofile, but there's a bunch of them that would probably work) to see that one of the bottlenecks is the calls to kpsewhich, and
  2. the time shell builtin to time the calls to kpsewhich cmr10.pfb pdftex.map, which is about 5x slower on OSX (0.11s) than on linux (0.02s) (arguably different machines though). Interestingly the duration of the call is essentially independent of the number of files searched, so it is some "initialization" step that is slower.
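
The measurement in step 2 can be approximated from Python as well. The following is only a sketch under stated assumptions: time_command is a hypothetical helper (not part of Matplotlib), and the kpsewhich timings run only if the program is found on the PATH. Timing a trivial process first gives a baseline for process startup on the same machine.

```python
import shutil
import subprocess
import sys
import time


def time_command(argv, repeats=5):
    """Run argv `repeats` times; return the fastest wall-clock time in seconds."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        # Output is discarded; only the elapsed time matters here.
        subprocess.run(argv, stdout=subprocess.DEVNULL,
                       stderr=subprocess.DEVNULL, check=False)
        timings.append(time.perf_counter() - start)
    return min(timings)


# Baseline: cost of starting a trivial process on this machine.
baseline = time_command([sys.executable, "-c", "pass"])
print(f"python -c pass:     {baseline:.3f} s")

# Time kpsewhich only if it is actually installed.
if shutil.which("kpsewhich"):
    one = time_command(["kpsewhich", "cmr10.pfb"])
    two = time_command(["kpsewhich", "cmr10.pfb", "pdftex.map"])
    print(f"kpsewhich, 1 file:  {one:.3f} s")
    print(f"kpsewhich, 2 files: {two:.3f} s")
```

If the one-file and two-file timings are nearly equal, that would be consistent with the observation that a fixed initialization step dominates the cost.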

Apparently kpsewhich has an "-interactive" mode, which may be a way to bypass that initialization (but would require writing our own loop to manage the subprocess' stdout -- a bit of a pain but not that hard, see handling of inkscape for testing).

@jkseppan
Member

jkseppan commented Jan 6, 2018

Could someone who is experiencing slow TeX rendering try out branch jkseppan:kpsewhich-batching and report back whether it's noticeably faster?

@jklymak
Member

jklymak commented Jan 7, 2018

@jkseppan Sorry, I get a ValueError when I use your branch and the code above...

Traceback (most recent call last):
  File "testtex.py", line 17, in <module>
    plot(x, y)
  File "testtex.py", line 15, in plot
    fig.savefig('test.pdf')
  File "/Users/jklymak/matplotlib/lib/matplotlib/figure.py", line 1883, in savefig
    self.canvas.print_figure(fname, **kwargs)
  File "/Users/jklymak/matplotlib/lib/matplotlib/backends/backend_qt5agg.py", line 185, in print_figure
    super(FigureCanvasQTAggBase, self).print_figure(*args, **kwargs)
  File "/Users/jklymak/matplotlib/lib/matplotlib/backend_bases.py", line 2257, in print_figure
    **kwargs)
  File "/Users/jklymak/matplotlib/lib/matplotlib/backends/backend_pdf.py", line 2586, in print_pdf
    self.figure.draw(renderer)
  File "/Users/jklymak/matplotlib/lib/matplotlib/artist.py", line 55, in draw_wrapper
    return draw(artist, renderer, *args, **kwargs)
  File "/Users/jklymak/matplotlib/lib/matplotlib/figure.py", line 1331, in draw
    renderer, self, artists, self.suppressComposite)
  File "/Users/jklymak/matplotlib/lib/matplotlib/image.py", line 138, in _draw_list_compositing_images
    a.draw(renderer)
  File "/Users/jklymak/matplotlib/lib/matplotlib/artist.py", line 55, in draw_wrapper
    return draw(artist, renderer, *args, **kwargs)
  File "/Users/jklymak/matplotlib/lib/matplotlib/axes/_base.py", line 2549, in draw
    mimage._draw_list_compositing_images(renderer, self, artists)
  File "/Users/jklymak/matplotlib/lib/matplotlib/image.py", line 138, in _draw_list_compositing_images
    a.draw(renderer)
  File "/Users/jklymak/matplotlib/lib/matplotlib/artist.py", line 55, in draw_wrapper
    return draw(artist, renderer, *args, **kwargs)
  File "/Users/jklymak/matplotlib/lib/matplotlib/axis.py", line 1140, in draw
    renderer)
  File "/Users/jklymak/matplotlib/lib/matplotlib/axis.py", line 1080, in _get_tick_bboxes
    extent = tick.label1.get_window_extent(renderer)
  File "/Users/jklymak/matplotlib/lib/matplotlib/text.py", line 925, in get_window_extent
    bbox, info, descent = self._get_layout(self._renderer)
  File "/Users/jklymak/matplotlib/lib/matplotlib/text.py", line 301, in _get_layout
    ismath=False)
  File "/Users/jklymak/matplotlib/lib/matplotlib/backends/backend_pdf.py", line 2151, in get_text_width_height_descent
    renderer=self)
  File "/Users/jklymak/matplotlib/lib/matplotlib/texmanager.py", line 596, in get_text_width_height_descent
    page = next(iter(dvi))
  File "/Users/jklymak/matplotlib/lib/matplotlib/dviread.py", line 256, in __iter__
    have_page = self._read()
  File "/Users/jklymak/matplotlib/lib/matplotlib/dviread.py", line 362, in _read
    self._dtable[byte](self, byte)
  File "/Users/jklymak/matplotlib/lib/matplotlib/dviread.py", line 171, in wrapper
    return method(self, *[f(self, byte-min) for f in get_args])
  File "/Users/jklymak/matplotlib/lib/matplotlib/dviread.py", line 507, in _fnt_def
    self._fnt_def_real(k, c, s, d, a, l)
  File "/Users/jklymak/matplotlib/lib/matplotlib/dviread.py", line 522, in _fnt_def_real
    vf = _vffile(fontname)
  File "/Users/jklymak/matplotlib/lib/matplotlib/dviread.py", line 1132, in _fontfile
    return cls(filename) if filename else None
  File "/Users/jklymak/matplotlib/lib/matplotlib/dviread.py", line 674, in __init__
    Dvi.__init__(self, filename, 0)
  File "/Users/jklymak/matplotlib/lib/matplotlib/dviread.py", line 213, in __init__
    fontnames = sorted(set(self._read_postamble()))
  File "/Users/jklymak/matplotlib/lib/matplotlib/dviread.py", line 320, in _read_postamble
    raise ValueError("malformed dvi file: too few 223 bytes")
ValueError: malformed dvi file: too few 223 bytes

@jkseppan
Member

jkseppan commented Jan 7, 2018

@jklymak Right, the code breaks for virtual fonts, and somehow your default TeX configuration uses those (mine doesn't). I pushed a workaround on the branch.

@jkseppan
Member

jkseppan commented Jan 7, 2018

I pushed a better solution on the same branch, so virtual fonts get similar treatment as dvi files.

@jkseppan
Member

It would still be interesting to hear about results on that branch. PR #10236 includes a cleaned-up version of half of that.

@jklymak
Member

jklymak commented Jan 12, 2018

The first run uses usetex=False, the second usetex=True, both on jkseppan/kpsewhich-batching:

$ pythonw testkpse.py
0.397324800491333
(matplotlibdev)
# jklymak @ valdez in ~/matplotlib on git:443ece10c x [8:55:40]
$ pythonw testkpse.py
7.1202168464660645

Ummmm, unless there is C code in there that's changed? I didn't recompile matplotlib...

jkseppan added a commit to jkseppan/matplotlib that referenced this issue Jan 12, 2018 (referenced again Jan 14, Feb 16, and Feb 18, 2018):
This should improve performance if there is a significant startup
cost to running kpsewhich, as reported by some users in matplotlib#4880.

@mattfack

mattfack commented Oct 4, 2018

Hi there,
I am having the same problem with matplotlib 2.2.3 and Python 2.7.15, while with matplotlib 1.5.1 and Python 2.7.15 I am not experiencing the slowdown.

@jkseppan
Member

jkseppan commented Oct 4, 2018

Have you tried with matplotlib 3.0? I think some improvements went in. Other than that, I have some PRs waiting for review (#10236, #10238, #10268) that should help with usetex performance, but nothing has happened on those for a while.

@mattfack

mattfack commented Oct 4, 2018

Just tried it before you replied :) anyway yes: it is way faster. I just decided to move from python 2.7.15 to python 3.7.
I also decided to uninstall both python versions I have on my mac and install anaconda.

@jkseppan
Member

jkseppan commented Oct 6, 2018

We're going to need someone who experiences this issue to help with debugging.

@anntzer asked on one of the PRs: "Calling kpsewhich --debug=-1 ... gives a bit more output on what it does, can someone report what this looks like on (a slow) OSX?"

I'm going to guess that the caching improvements that went into the Matplotlib 3.0 release helped for at least some cases. Does someone still observe this issue with current versions?

@fdeugenio

Sorry to necromance this, but I had a similar problem as OP and solved it as follows. Hope it helps you too.

Inspect the matplotlib cache folder, e.g. du -hc ~/.cache/matplotlib/tex.cache. If the folder size is considerable (>1GB, but actually depends on storage speed) then delete the content.

I think what was happening on my machine was that I made so many diagnostic plots that matplotlib/latex was spending more time rummaging through the cache than it would take to call latex from scratch. If this is indeed the problem, it could be addressed by controlling the cache size, or removing cache items that have not been used for longer than some time interval.
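
As a rough sketch of the age-based cleanup suggested above: the helper below lists (and optionally deletes) cache files that have not been modified for a given number of days. The name prune_old_files and the 30-day default are assumptions for illustration, and this is not something Matplotlib does itself. Modification time stands in for "last used", because many file systems do not reliably update access times. The path is the one from the du command above and may differ on other systems.

```python
import time
from pathlib import Path


def prune_old_files(cache_dir, max_age_days=30, dry_run=True):
    """Return files under cache_dir older than max_age_days.

    The files are deleted only when dry_run is False.
    """
    cutoff = time.time() - max_age_days * 86400
    stale = [p for p in Path(cache_dir).rglob("*")
             if p.is_file() and p.stat().st_mtime < cutoff]
    if not dry_run:
        for p in stale:
            p.unlink()
    return stale


# Dry run against the cache location mentioned above (nothing is deleted).
cache = Path.home() / ".cache" / "matplotlib" / "tex.cache"
if cache.is_dir():
    print(len(prune_old_files(cache, dry_run=True)), "stale files")
```

Keeping dry_run=True by default means the first call only reports what would be removed; pass dry_run=False once the list looks right.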

@ml-utils

ml-utils commented Aug 30, 2022

#4880 (comment)

import matplotlib

This did the trick for me, it's much faster to disable TeX by default and only enable it on the strings that need it.

@tacaswell
Member

"matplotlib/latex was spending more time rummaging through the cache than it would take to call latex from scratch."

We do hash-based lookup so it should be constant time. My guess is that the relevant metric is not size, but number of files and we had pushed that folder past the number of files the file system was happy with.
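
To check which of the two metrics is large on a given machine, both can be measured together. This is a minimal sketch; cache_stats is a hypothetical helper name, and the directory to inspect is passed in by the caller.

```python
import os


def cache_stats(cache_dir):
    """Return (number_of_files, total_bytes) for everything under cache_dir."""
    n_files = 0
    total_bytes = 0
    for root, _dirs, files in os.walk(cache_dir):
        for name in files:
            try:
                total_bytes += os.path.getsize(os.path.join(root, name))
            except OSError:
                # Skip entries that vanish or cannot be read (e.g. broken links).
                continue
            n_files += 1
    return n_files, total_bytes


# Example: inspect the cache location mentioned earlier in the thread.
n, size = cache_stats(os.path.expanduser("~/.cache/matplotlib/tex.cache"))
print(f"{n} files, {size / 1e6:.1f} MB")
```

A directory that does not exist simply reports zero files and zero bytes.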

@github-actions

github-actions bot commented Sep 1, 2023

This issue has been marked "inactive" because it has been 365 days since the last comment. If this issue is still present in recent Matplotlib releases, or the feature request is still wanted, please leave a comment and this label will be removed. If there are no updates in another 30 days, this issue will be automatically closed, but you are free to re-open or create a new issue if needed. We value issue reports, and this procedure is meant to help us resurface and prioritize issues that have not been addressed yet, not make them disappear. Thanks for your help!

github-actions bot added the "status: inactive" label Sep 1, 2023
github-actions bot added the "status: closed as inactive" label Oct 2, 2023
github-actions bot closed this as not planned Oct 2, 2023