Reuse single kpsewhich instance for speed. #19531

anntzer · 2021-02-17T00:09:27Z

On macOS, where spawning kpsewhich instances is rather slow (#4880 (comment)), this appears
to speed up

python -c 'from pylab import *; mpl.use("pdf"); rcParams["text.usetex"] = True; plot(); savefig("/tmp/test.pdf", backend="pdf")'

around two-fold (~4s to ~2s). (There's also a small speedup on Linux,
perhaps ~10%, but the whole thing is already reasonably fast.)

Note that this assumes that the dvi cache has already been built;
the costly subprocess calls here are due to calls to kpsewhich to
resolve the fonts whose names are listed in the dvi file.

Much of the complexity here comes from the need to force unbuffered
stdin/stdout when interacting with kpsewhich (otherwise, things just
hang); this is also the reason why this is not implemented on Windows
(Windows experts are welcome to look into this...; there, the speedup
should be even more significant). (On Linux, another solution, which
does not require a third-party dependency, is to call
stdbuf -oL kpsewhich ... and pass bufsize=0 to Popen(), but
ptyprocess is pure Python, so adding a dependency seems reasonable.)
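
For readers unfamiliar with the buffering problem, here is a minimal sketch of the general pattern (one long-lived child, one line per query), not the code in this PR. The `LineServer` class and the `CHILD` stand-in are hypothetical names for illustration; the stand-in is a small Python process that flushes after every reply, which is exactly the behavior that has to be forced when the child is kpsewhich.

```python
import subprocess
import sys


class LineServer:
    """Keep one child process alive and exchange one line per query."""

    def __init__(self, argv):
        # bufsize=0 makes the pipes unbuffered on the Python side only.
        # The child must also flush after each reply, otherwise the
        # readline() call in query() blocks forever.
        self._proc = subprocess.Popen(
            argv, stdin=subprocess.PIPE, stdout=subprocess.PIPE, bufsize=0)

    def query(self, line):
        """Send one line to the child and return its one-line reply."""
        self._proc.stdin.write(line.encode() + b"\n")
        return self._proc.stdout.readline().decode().rstrip("\r\n")

    def close(self):
        """Close stdin so the child sees EOF, then wait for it to exit."""
        self._proc.stdin.close()
        self._proc.wait()
        self._proc.stdout.close()


# Hypothetical stand-in child: replies with the upper-cased query and
# flushes explicitly after every line.
CHILD = [
    sys.executable, "-u", "-c",
    "import sys\n"
    "for line in iter(sys.stdin.readline, ''):\n"
    "    sys.stdout.write(line.upper())\n"
    "    sys.stdout.flush()\n",
]
```

With a child that does not flush per line, the same `query()` call would hang, which is why a pty (or `stdbuf`) is needed for the real tool.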

The format kwarg to find_tex_file had never been used before, and
cannot be handled in the single-process case, so just deprecate it.

Edit: See #19558 for another approach, which also works on Windows for a large speedup. I'll keep this PR as separate for now to allow comparing the various approaches.

PR Summary

PR Checklist

Has pytest style unit tests (and pytest passes).
Is Flake 8 compliant (run flake8 on changed files to check).
New features are documented, with examples if plot related.
Documentation is sphinx and numpydoc compliant (the docs should build without error).
Conforms to Matplotlib style conventions (install flake8-docstrings and run flake8 --docstring-convention=all).
New features have an entry in doc/users/next_whats_new/ (follow instructions in README.rst there).
API changes documented in doc/api/next_api_changes/ (follow instructions in README.rst there).


timhoffm · 2021-02-17T21:20:29Z

This is heavy machinery. I assume we cannot get away cheaply by collecting the requests first and sending them to kpsewhich in a single batch run?

anntzer · 2021-02-17T21:43:13Z

I think that would require quite a bit of reworking of the innards of dviread :(

timhoffm · 2021-02-17T22:42:12Z

Fair enough. Just wanted to make sure we're not overlooking an easier solution.

anntzer · 2021-02-18T00:06:00Z

I guess the other solution would be to revive @jkseppan's series of PRs #10236, #10238, #10268. From my PoV these PRs (while likely actually implementing a better solution) basically died from the use of sqlite as cache format, which is reputedly something really useful to know but which I know nothing about :-(

Edit: Yet another idea would be to call kpsewhich once to get pdftex.map, and immediately list all the files that are referenced in it, and then just build a gigantic kpsewhich query that locates all the files at once, caching that. From a quick test this is still reasonably fast on Linux, but can run into command-line argument length limits on macOS :( Or possibly start kpsewhich interactively, feed the whole query to it, and terminate it, which should avoid the buffering issues...
On the other hand some systems seem to have a truly enormous pdftex.map, where doing the whole query is actually significantly slower...
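
A rough sketch of how the "one gigantic query" idea could be split to stay under an argument-length limit. The helper names and the `max_bytes` budget are hypothetical; the real limit is OS-dependent and also counts the environment, so a conservative value would be needed in practice. The sketch only builds the command lines and does not run kpsewhich.

```python
def chunk_names(names, max_bytes):
    """Split names into batches whose total byte cost stays <= max_bytes.

    A single name that is larger than the budget still gets a batch of
    its own, so no name is ever dropped.
    """
    batches, current, size = [], [], 0
    for name in names:
        cost = len(name.encode()) + 1  # +1 for the terminating NUL byte
        if current and size + cost > max_bytes:
            batches.append(current)
            current, size = [], 0
        current.append(name)
        size += cost
    if current:
        batches.append(current)
    return batches


def build_commands(names, max_bytes):
    """Return one kpsewhich argv list per batch of file names."""
    return [["kpsewhich", *batch] for batch in chunk_names(names, max_bytes)]
```

For example, `build_commands(["cmr10", "cmmi10", "pncr8r"], 13)` yields two command lines, the first carrying two names and the second carrying one.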

anntzer · 2021-02-20T19:42:20Z

I spent some more time looking at this problem. I can think of two other solutions (which I guess I'm mostly writing for my own reference, but heh :-)):

Solution 1: Plain (e)TeX actually has a way to directly query glyph sizes:

$ latex '\documentclass{article}\begin{document}'
This is pdfTeX, Version 3.14159265-2.6-1.40.21 (TeX Live 2020) (preloaded format=latex)
 restricted \write18 enabled.
entering extended mode
LaTeX2e <2020-02-02> patch level 5
L3 programming layer <2020-08-07>
(/usr/local/texlive/2020/texmf-dist/tex/latex/base/article.cls
Document Class: article 2019/12/20 v1.4l Standard LaTeX document class
(/usr/local/texlive/2020/texmf-dist/tex/latex/base/size10.clo))
(/usr/local/texlive/2020/texmf-dist/tex/latex/l3backend/l3backend-dvips.def)
(./texput.aux)
*\font\1=cmr10\message{\the\fontcharwd\1`X}
7.50002pt

and TeX actually appears to flush output, which makes it controllable via subprocess.Popen. The nice thing here is that this completely gets rid of the need to have a tfm parser, as TeX basically does that work for us. We'll need to spend a bit more time grokking the specs to understand the various scale factors involved, but that should hopefully be not too complicated. On the other hand I don't think this handles vf fonts, so we'd need to figure out 1) whether we can detect the need for a .vf without actually kpsearching it (perhaps they are needed only if a glyph slot is empty?), and 2) if they are indeed often needed (if not, then we can fall back to kpsewhich when they are needed, but we'll still have a big speedup in the common case).

Solution 2: luatex embeds and exposes kpathsea, and can likewise be used interactively

In [1]: import subprocess

In [2]: p = subprocess.Popen(["luatex", "\\relax"], stdin=subprocess.PIPE, stdout=subprocess.PIPE, bufsize=0)

In [3]: p.stdout.readline()
Out[3]: b'This is LuaTeX, Version 1.12.0 (TeX Live 2020) \n'

In [4]: p.stdout.readline()
Out[4]: b' restricted system commands enabled.\n'

In [5]: p.stdout.readline()
Out[5]: b'\n'

In [6]: p.stdin.write(b'\\directlua{print(kpse.find_file("pncr8r", "tfm"))}\n')
Out[6]: 99

In [7]: p.stdout.readline()
Out[7]: b'*/usr/local/texlive/2020/texmf-dist/fonts/tfm/adobe/ncntrsbk/pncr8r.tfm\n'

(the * is just the tex prompt; if the file is not found *nil\n is printed.)
The advantage of this method is that it should also work on Windows (where subprocesses are very slow to start and therefore this technique would bring the largest speedup). On the other hand, this means that luatex would become an optional dependency for performance (it doesn't cost much to keep the old direct kpsewhich invocations if luatex is not present).
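
A small sketch of how the replies in the session above might be interpreted, with a fallback when luatex is missing. Both function names are hypothetical, and decoding the reply as UTF-8 is an assumption (the right encoding for paths is not settled here).

```python
import shutil


def parse_kpse_reply(raw):
    """Turn one reply line from the luatex session into a path or None."""
    line = raw.decode("utf-8", errors="replace").rstrip("\r\n")
    if line.startswith("*"):
        line = line[1:]  # drop the TeX prompt
    # Lua prints "nil" when kpse.find_file() finds nothing.
    return None if line == "nil" else line


def pick_backend():
    """Prefer the interactive luatex session, else plain kpsewhich calls."""
    return "luatex" if shutil.which("luatex") else "kpsewhich"
```

Applied to the output shown in `Out[7]`, `parse_kpse_reply` returns the path without the leading `*`, and it returns `None` for `b'*nil\n'`.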

jklymak · 2021-02-20T21:17:26Z

Is luatex a big burden? I think most major dists have it now, don't they?

anntzer · 2021-02-20T22:07:42Z

See #19551, which implements the luatex-based solution (with some encoding-related wrinkles still left, but it's mostly there).

anntzer · 2021-05-26T22:24:04Z

Mostly superseded by #19558; we can reopen if there's appetite for a solution that specifically doesn't require luatex.

anntzer added OS: Apple topic: text/usetex Performance labels Feb 17, 2021

anntzer force-pushed the single-kpsewhich branch 2 times, most recently from 203ddf6 to 9f3da44 Compare February 17, 2021 00:26

anntzer force-pushed the single-kpsewhich branch from 9f3da44 to 737c80e Compare February 17, 2021 08:19

anntzer marked this pull request as draft February 18, 2021 00:48

anntzer marked this pull request as ready for review February 18, 2021 00:56

anntzer marked this pull request as draft February 22, 2021 13:32

anntzer mentioned this pull request Feb 22, 2021

Use luatex in --luaonly mode to query kpsewhich. #19558

Merged

7 tasks

anntzer closed this May 26, 2021

anntzer deleted the single-kpsewhich branch May 26, 2021 22:24
