
tex cache lockfile retries should be configurable #7776

Closed
gerritholl opened this issue Jan 9, 2017 · 7 comments

@gerritholl

I am doing data processing and generating many plots with many scripts running in parallel. The scripts use texmanager, and many fail when drawing to PNG with the error below. My plots are large, with many subplots, and the filesystem I'm writing to is slow. Therefore, the hardcoded limit of 50 tries with 0.1-second waits is not appropriate.

Those hardcoded values should be configurable so that the user can change them if needed.
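
To make the request concrete, here is a minimal sketch of a lock helper whose retry count and wait time are parameters rather than constants. It is not Matplotlib's implementation: the function name `lock_dir`, the lock file name `.example_lock`, and the `retries`/`wait` parameters are all hypothetical, and only the defaults (50 tries, 0.1 s) come from the values mentioned above.

```python
import os
import time
from contextlib import contextmanager


@contextmanager
def lock_dir(path, retries=50, wait=0.1):
    """Hold an exclusive lock file inside `path`, retrying on contention.

    `retries` and `wait` are hypothetical knobs, not Matplotlib options.
    """
    lock_path = os.path.join(path, ".example_lock")  # hypothetical name
    for _ in range(retries):
        try:
            # O_CREAT | O_EXCL makes creation fail if the file already exists.
            fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
        except FileExistsError:
            time.sleep(wait)  # another holder has the lock; wait and retry
        else:
            os.close(fd)
            break
    else:
        # The loop ran out of attempts without ever creating the file.
        raise TimeoutError(
            f"could not acquire {lock_path} after {retries} tries")
    try:
        yield lock_path
    finally:
        os.remove(lock_path)  # release the lock
```

With something like this, a user on a slow filesystem could call `lock_dir(cache_dir, retries=600, wait=0.5)` instead of being stuck with a five-second budget.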

Traceback (most recent call last):
  File "/home/users/gholl/venv/stable-3.5/bin/plot_hirs_field_timeseries", line 11, in <module>
    load_entry_point('FCDR-HIRS==0.0.4', 'console_scripts', 'plot_hirs_field_timeseries')()
  File "/home/users/gholl/venv/stable-3.5/lib/python3.5/site-packages/FCDR_HIRS/analysis/timeseries.py", line 1225, in main
    "calibpos": p.corr_calibpos})
  File "/home/users/gholl/venv/stable-3.5/lib/python3.5/site-packages/FCDR_HIRS/analysis/timeseries.py", line 592, in plot_noise_with_other
    alltemp='_'.join(temperatures), tb=t[0], te=t[-1]))
  File "/home/users/gholl/venv/stable-3.5/lib/python3.5/site-packages/pyatmlab/graphics.py", line 208, in print_or_show
    fig.canvas.print_figure(str(outf))
  File "/home/users/gholl/venv/stable-3.5/lib/python3.5/site-packages/matplotlib/backend_bases.py", line 2192, in print_figure
    **kwargs)
  File "/home/users/gholl/venv/stable-3.5/lib/python3.5/site-packages/matplotlib/backends/backend_agg.py", line 545, in print_png
    FigureCanvasAgg.draw(self)
  File "/home/users/gholl/venv/stable-3.5/lib/python3.5/site-packages/matplotlib/backends/backend_agg.py", line 464, in draw
    self.figure.draw(self.renderer)
  File "/home/users/gholl/venv/stable-3.5/lib/python3.5/site-packages/matplotlib/artist.py", line 63, in draw_wrapper
    draw(artist, renderer, *args, **kwargs)
  File "/home/users/gholl/venv/stable-3.5/lib/python3.5/site-packages/matplotlib/figure.py", line 1142, in draw
    renderer, self, dsu, self.suppressComposite)
  File "/home/users/gholl/venv/stable-3.5/lib/python3.5/site-packages/matplotlib/image.py", line 139, in _draw_list_compositing_images
    a.draw(renderer)
  File "/home/users/gholl/venv/stable-3.5/lib/python3.5/site-packages/matplotlib/artist.py", line 63, in draw_wrapper
    draw(artist, renderer, *args, **kwargs)
  File "/home/users/gholl/venv/stable-3.5/lib/python3.5/site-packages/matplotlib/axes/_base.py", line 2405, in draw
    mimage._draw_list_compositing_images(renderer, self, dsu)
  File "/home/users/gholl/venv/stable-3.5/lib/python3.5/site-packages/matplotlib/image.py", line 139, in _draw_list_compositing_images
    a.draw(renderer)
  File "/home/users/gholl/venv/stable-3.5/lib/python3.5/site-packages/matplotlib/artist.py", line 63, in draw_wrapper
    draw(artist, renderer, *args, **kwargs)
  File "/home/users/gholl/venv/stable-3.5/lib/python3.5/site-packages/matplotlib/axis.py", line 1138, in draw
    renderer)
  File "/home/users/gholl/venv/stable-3.5/lib/python3.5/site-packages/matplotlib/axis.py", line 1078, in _get_tick_bboxes
    extent = tick.label1.get_window_extent(renderer)
  File "/home/users/gholl/venv/stable-3.5/lib/python3.5/site-packages/matplotlib/text.py", line 967, in get_window_extent
    bbox, info, descent = self._get_layout(self._renderer)
  File "/home/users/gholl/venv/stable-3.5/lib/python3.5/site-packages/matplotlib/text.py", line 362, in _get_layout
    ismath=ismath)
  File "/home/users/gholl/venv/stable-3.5/lib/python3.5/site-packages/matplotlib/backends/backend_agg.py", line 230, in get_text_width_height_descent
    renderer=self)
  File "/home/users/gholl/venv/stable-3.5/lib/python3.5/site-packages/matplotlib/texmanager.py", line 676, in get_text_width_height_descent
    dvifile = self.make_dvi(tex, fontsize)
  File "/home/users/gholl/venv/stable-3.5/lib/python3.5/site-packages/matplotlib/texmanager.py", line 406, in make_dvi
    with Locked(self.texcache):
  File "/home/users/gholl/venv/stable-3.5/lib/python3.5/site-packages/matplotlib/cbook.py", line 2738, in __enter__
    raise self.TimeoutError(err_str)
matplotlib.cbook.TimeoutError: LOCKERROR: matplotlib is trying to acquire the lock
    '/home/users/gholl/.cache/matplotlib/tex.cache/.matplotlib_lock-*'
and has failed.  This maybe due to any other process holding this
lock.  If you are sure no other matplotlib process is running try
removing these folders and trying again.

Unfortunately, the nature of the problem makes it hard to write a minimum code snippet that reproduces it, because it only occurs with big plots when running many processes in parallel.

I'm using Matplotlib 2.0.0rc2 (installed through pip) on Python 3.5.

@anntzer
Contributor

anntzer commented Jan 10, 2017

Rather, the locking should be done per entry instead of globally for the whole folder, since I don't think writing to separate cache entries causes any problems.

The current lock implementation can only lock folders though (because it writes to a file within that folder). We can either move each cache entry into its own subfolder (not too hard?), or change the lock implementation to be able to lock single files (off the top of my head, one should be able to do so by creating lockfiles named $MPLCACHEDIR/locks/$(resolved-path-to-file-or-dir-to-be-locked-encoding-slashes-appropriately)).
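
As a rough illustration of the second option, the sketch below derives one lockfile name per target path by resolving the path and percent-encoding it, so that slashes cannot create nested directories. The helper name `lockfile_for` and the choice of `urllib.parse.quote` for the encoding are assumptions made for this example, not something Matplotlib provides.

```python
from pathlib import Path
from urllib.parse import quote


def lockfile_for(target, cache_dir):
    """Return a per-target lockfile path under `<cache_dir>/locks/`.

    Hypothetical helper: the resolved path of `target` is percent-encoded
    (including path separators) so it becomes a single flat file name.
    """
    resolved = str(Path(target).resolve())
    return Path(cache_dir) / "locks" / quote(resolved, safe="")
```

Two cache entries then map to two different lockfiles, so writers to different entries no longer contend. One caveat of this naming scheme is that very deep paths could exceed the filesystem's file name length limit; hashing the resolved path would avoid that.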

@tacaswell tacaswell added this to the 2.1 (next point release) milestone Jan 10, 2017
@tacaswell
Member

Another option would be to isolate each process's tex cache from the others by launching your Python processes as

MPLCONFIGDIR=/tmp/some_pathN python foo.py

which if your /tmp is a ramdisk might get you a reasonable speed up to boot.
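
The same idea can be driven from Python instead of the shell. The following is only a sketch: `run_isolated` is a hypothetical helper, and it assumes the child script reads `MPLCONFIGDIR` at import time, as in the command above. Each call gets a fresh temporary directory that is removed when the child exits.

```python
import os
import subprocess
import sys
import tempfile


def run_isolated(args):
    """Run `python <args>` with a private, throwaway MPLCONFIGDIR.

    Hypothetical helper; `args` is the argument list passed to the
    Python interpreter, e.g. ["foo.py"].
    """
    with tempfile.TemporaryDirectory(prefix="mplconfig-") as config_dir:
        # Copy the current environment and override only MPLCONFIGDIR.
        env = dict(os.environ, MPLCONFIGDIR=config_dir)
        return subprocess.run(
            [sys.executable, *args],
            env=env, check=True, capture_output=True, text=True)
```

Calling `run_isolated(["foo.py"])` from several threads or scheduler jobs gives every script its own cache. The cost is that nothing is shared between runs, so each process regenerates its own tex cache entries.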

@ck2go

ck2go commented Feb 14, 2017

Isolating each process's tex cache does not work when using e.g. multiprocessing (at least not to my knowledge), so that suggestion does not help. So either the tex cache dir should be configurable from within a running process, or the lock implementation should be changed.

The problem actually gets worse the more cores/processes are used. I have seen this at 32 vs. 16 cores.

@tacaswell tacaswell modified the milestones: 2.1 (next point release), 2.1.1 (next bug fix release) Sep 24, 2017
@tacaswell tacaswell modified the milestones: 2.1.1 (next bug fix release), 2.2 (next feature release) Oct 9, 2017
@naibaf7

naibaf7 commented Feb 25, 2018

This is a dirty hack to work around the issue in the meantime:

import os, tempfile, atexit, shutil
from joblib import Parallel, delayed

# Subprocess function
def subprocess_task():
    # Give this worker its own private config/cache directory
    mpldir = tempfile.mkdtemp()
    atexit.register(shutil.rmtree, mpldir)
    # Read the current umask (os.umask sets a new value and returns the old one)
    umask = os.umask(0)
    os.umask(umask)
    os.chmod(mpldir, 0o777 & ~umask)
    os.environ['HOME'] = mpldir
    os.environ['MPLCONFIGDIR'] = mpldir
    import matplotlib
    import matplotlib.texmanager  # make sure the submodule is loaded
    # Point TexManager at the private cache, even if matplotlib was imported earlier
    class TexManager(matplotlib.texmanager.TexManager):
        texcache = os.path.join(mpldir, 'tex.cache')
    matplotlib.texmanager.TexManager = TexManager
    matplotlib.rcParams['ps.useafm'] = True
    matplotlib.rcParams['pdf.use14corefonts'] = True
    matplotlib.rcParams['text.usetex'] = True

    # From here on, safe to use matplotlib in parallel

# Main process function
def mainprocess_task(n_threads=32):
    with Parallel(n_jobs=n_threads) as parallel:
        parallel(delayed(subprocess_task)() for i in range(256))

Note that if, for some reason, matplotlib is already loaded before a subprocess (in my example, one started by joblib) wants to use it, then setting the environment variables alone is no longer enough, since the TexManager has already established its cache directory.
Therefore we modify the TexManager class per subprocess to point to a different location. Problem solved.

Using different temporary directories for matplotlib will also speed up plotting significantly. For me, it was about 10x faster (AMD Threadripper 1950X, 16C, 32T, plotting 3432 plots to an SSD).

Note, of course, that each subprocess is best kept alive as long as possible. Each process needs to set up matplotlib this way only once, not for every call to the task.
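
One way to get the "only once per process" behaviour is a pool initializer, sketched below with the standard library's `multiprocessing.Pool` rather than joblib. This is an illustration under assumptions, not a tested fix: `init_worker`, `plot_task`, and `run_all` are hypothetical names, the plotting itself is left as a placeholder, and setting `MPLCONFIGDIR` only helps if matplotlib has not yet been imported in the worker (otherwise the TexManager override from the hack above is still needed).

```python
import os
import tempfile
from multiprocessing import Pool


def init_worker(base_dir):
    """Runs once in each worker process, before it receives any task."""
    config_dir = os.path.join(base_dir, f"worker-{os.getpid()}")
    os.makedirs(config_dir, exist_ok=True)
    # Assumption: matplotlib is imported only after this point.
    os.environ["MPLCONFIGDIR"] = config_dir


def plot_task(i):
    # Placeholder for real plotting code, which would import matplotlib
    # here and save a figure; this stub only reports the directory in use.
    return i, os.environ["MPLCONFIGDIR"]


def run_all(n_workers=4, n_tasks=16):
    # The parent owns the base directory and removes it when done,
    # so workers do not need their own cleanup hooks.
    with tempfile.TemporaryDirectory(prefix="mplconfig-") as base_dir:
        with Pool(n_workers, initializer=init_worker,
                  initargs=(base_dir,)) as pool:
            return pool.map(plot_task, range(n_tasks))
```

`run_all()` should be called under an `if __name__ == "__main__":` guard so that it also works with the spawn start method. Cleanup is done in the parent because exit handlers registered inside pool workers are not guaranteed to run.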

@anntzer
Contributor

anntzer commented Mar 6, 2018

Should be mostly closed by #10596. Please request a reopen if that turns out not to be enough.

@anntzer anntzer closed this as completed Mar 6, 2018
@QuLogic QuLogic modified the milestones: needs sorting, v3.0 Mar 6, 2018
@avivajpeyi

Matplotlib version 3.2.1

Summary:

I am trying to create several plots in parallel (using a scheduler to call python several times) with rcParams["text.usetex"] = True, but get a 'Lock error'.

When I try to create the plots sequentially, I am able to do so without any problems. Is there any way I can disable this lock file or generate a new one every time I run my plotting script?

Error:

Short:

TimeoutError: Lock error: Matplotlib failed to acquire the following lock file: ....tex.matplotlib-lock
This maybe due to another process holding this lock file.  If you are sure no
other Matplotlib process is running, remove this file and try again.

Full log:

Traceback (most recent call last):
  ...
  File "/fred/oz117/avajpeyi/.conda/envs/parallel_bilby/lib/python3.8/site-packages/matplotlib/pyplot.py", line 723, in savefig
    res = fig.savefig(*args, **kwargs)
  File "/fred/oz117/avajpeyi/.conda/envs/parallel_bilby/lib/python3.8/site-packages/matplotlib/figure.py", line 2203, in savefig
    self.canvas.print_figure(fname, **kwargs)
  File "/fred/oz117/avajpeyi/.conda/envs/parallel_bilby/lib/python3.8/site-packages/matplotlib/backend_bases.py", line 2098, in print_figure
    result = print_method(
  File "/fred/oz117/avajpeyi/.conda/envs/parallel_bilby/lib/python3.8/site-packages/matplotlib/backends/backend_agg.py", line 514, in print_png
    FigureCanvasAgg.draw(self)
  File "/fred/oz117/avajpeyi/.conda/envs/parallel_bilby/lib/python3.8/site-packages/matplotlib/backends/backend_agg.py", line 393, in draw
    self.figure.draw(self.renderer)
  File "/fred/oz117/avajpeyi/.conda/envs/parallel_bilby/lib/python3.8/site-packages/matplotlib/artist.py", line 38, in draw_wrapper
    return draw(artist, renderer, *args, **kwargs)
  File "/fred/oz117/avajpeyi/.conda/envs/parallel_bilby/lib/python3.8/site-packages/matplotlib/figure.py", line 1735, in draw
    mimage._draw_list_compositing_images(
  File "/fred/oz117/avajpeyi/.conda/envs/parallel_bilby/lib/python3.8/site-packages/matplotlib/image.py", line 137, in _draw_list_compositing_images
    a.draw(renderer)
  File "/fred/oz117/avajpeyi/.conda/envs/parallel_bilby/lib/python3.8/site-packages/matplotlib/artist.py", line 38, in draw_wrapper
    return draw(artist, renderer, *args, **kwargs)
  File "/fred/oz117/avajpeyi/.conda/envs/parallel_bilby/lib/python3.8/site-packages/gwpy/plot/axes.py", line 132, in draw
    super(Axes, self).draw(*args, **kwargs)
  File "/fred/oz117/avajpeyi/.conda/envs/parallel_bilby/lib/python3.8/site-packages/matplotlib/artist.py", line 38, in draw_wrapper
    return draw(artist, renderer, *args, **kwargs)
  File "/fred/oz117/avajpeyi/.conda/envs/parallel_bilby/lib/python3.8/site-packages/matplotlib/axes/_base.py", line 2590, in draw
    self._update_title_position(renderer)
  File "/fred/oz117/avajpeyi/.conda/envs/parallel_bilby/lib/python3.8/site-packages/matplotlib/axes/_base.py", line 2538, in _update_title_position
    if title.get_window_extent(renderer).ymin < top:
  File "/fred/oz117/avajpeyi/.conda/envs/parallel_bilby/lib/python3.8/site-packages/matplotlib/text.py", line 905, in get_window_extent
    bbox, info, descent = self._get_layout(self._renderer)
  File "/fred/oz117/avajpeyi/.conda/envs/parallel_bilby/lib/python3.8/site-packages/matplotlib/text.py", line 299, in _get_layout
    w, h, d = renderer.get_text_width_height_descent(
  File "/fred/oz117/avajpeyi/.conda/envs/parallel_bilby/lib/python3.8/site-packages/matplotlib/backends/backend_agg.py", line 203, in get_text_width_height_descent
    w, h, d = texmanager.get_text_width_height_descent(
  File "/fred/oz117/avajpeyi/.conda/envs/parallel_bilby/lib/python3.8/site-packages/matplotlib/texmanager.py", line 450, in get_text_width_height_descent
    dvifile = self.make_dvi(tex, fontsize)
  File "/fred/oz117/avajpeyi/.conda/envs/parallel_bilby/lib/python3.8/site-packages/matplotlib/texmanager.py", line 336, in make_dvi
    with cbook._lock_path(texfile):
  File "/fred/oz117/avajpeyi/.conda/envs/parallel_bilby/lib/python3.8/contextlib.py", line 113, in __enter__
    return next(self.gen)
  File "/fred/oz117/avajpeyi/.conda/envs/parallel_bilby/lib/python3.8/site-packages/matplotlib/cbook/__init__.py", line 1826, in _lock_path
    raise TimeoutError("""\
TimeoutError: Lock error: Matplotlib failed to acquire the following lock file:
    /home/avajpeyi/.cache/matplotlib/tex.cache/fa043ab626a6a25c604115556c2004ae.tex.matplotlib-lock
This maybe due to another process holding this lock file.  If you are sure no
other Matplotlib process is running, remove this file and try again.

@cwiede

cwiede commented Jul 23, 2021

I suddenly get this error now after updating to matplotlib 3.4.2 from version 3.2.1. Note that it happens when importing matplotlib with

import matplotlib.pyplot as plt

This is on a CI/CD build, so there might be other processes around using matplotlib. Not sure if there are any changes in matplotlib related to this error, but I never got it before.
