Skip to content

Caching figures generated by plot directive #25091

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 4 commits into from

Conversation

dstansby
Copy link
Member

PR Summary

A rebase of #10149 onto current main, with conflicts solved, and the latest feedback on that PR taken into account. For context see the description of #10149.

PR Checklist

Documentation and Tests

  • Has pytest style unit tests (and pytest passes)
  • Documentation is sphinx and numpydoc compliant (the docs should build without error).
  • New plotting related features are documented with examples.

Release Notes

  • New features are marked with a .. versionadded:: directive in the docstring and documented in doc/users/next_whats_new/
  • API changes are marked with a .. versionchanged:: directive in the docstring and documented in doc/api/next_api_changes/
  • Release notes conform with instructions in next_whats_new/README.rst or next_api_changes/README.rst

@dstansby dstansby added this to the v3.8.0 milestone Jan 26, 2023
@tacaswell
Copy link
Member

as a note, you should be able to force-push to that branch @dstansby .

r-barnes and others added 4 commits January 27, 2023 08:25
Fix bug with adding null string to outname_list
Futzing with PEP8 whitespace stuff.

Fix grammer

Examples of using rst need to be code blocks

Use logging instead of print

Satisfy PEP8 checks

FIX: issue with rebase due to re-naming of local variables

STY: fix line length and indentation issues

STY: trim whitespace and add trailing comma

Fix doc style

flake8 fixes

plot_preserve_dir > plot_cache_dir

Copy edit changelog entry
@dstansby dstansby force-pushed the plot_directive_preserve branch from c78c810 to fca5eb3 Compare January 27, 2023 08:31
@dstansby dstansby changed the title Plot directive preserve Caching figures generated by plot directive Jan 27, 2023
@dstansby dstansby marked this pull request as ready for review January 27, 2023 16:12
@ksunden
Copy link
Member

ksunden commented Jan 27, 2023

How does this interact with the changes to the plot directive introduced in #24205?

My initial reaction is that the outname change should be orthogonal, but the plot_cache_dir may conflict and/or be totally ignored.

e.g. :file:`docfile3-4-01.png` or something equally mysterious. With
``:outname:`` the figure generated will instead be named
:file:`stinkbug_plot-01.png` or even :file:`stinkbug_plot.png`. This makes it
easy to understand which output image is which and, more importantly, uniquely
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am slightly skeptical of the first claim (that users can give things good names) and very skeptical of the second as the name is not random, it is based on the name of the file (and the position + function of the directive in that file) so is structurally unique. If users can control the names there is a high chance that they are going to collide.

If we want to make things save to cache like this we should give them names based on the hash of the source....

@@ -613,6 +635,12 @@ def render_figures(code, code_path, output_dir, output_base, context,
for fmt, dpi in formats:
try:
figman.canvas.figure.savefig(img.filename(fmt), dpi=dpi)
if config.plot_cache_dir and outname is not None:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We do not actually use outname for anything other than checking it exists!

Copy link
Member

@tacaswell tacaswell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have some conceptual concerns:

  • name are already (structurally) unique
  • do not trust users to give things unique names (although it does check that)
  • the cache is not invalidated if the code in the directive changes

and an implementation concern (we do not seem to actually use outname in the code!)

I think this is a good idea, but probably should cache the files based on a hash of the source rather than a user supplied name.

@dstansby
Copy link
Member Author

Sorry for doing a 'blind' rebase and not really thinking this through myself, but thanks for re-reviewing! Just thinking about the key for the cache, hasing the source code is potentially no though good because changes in Matplotlib internals (e.g. a change in default style, a bugfix to a plotting method) would lead to a different image with the same source code.

So I think we either have to:

  1. Abandon this
  2. Use a hash of the source code of the key, with some suitable user warning that any cached images will only ever be automatically updated if the source code is updated.

I'm torn between these two options, but am erring towards number 1. due to the potential of number 2. to cause comfusion to users at some point down the line. Thoughts?

@tacaswell
Copy link
Member

tacaswell commented Jan 27, 2023

Sorry for doing a 'blind' rebase and not really thinking this through myself, but thanks for re-reviewing!

I went back to the original PR where I also did a blind rebase and am to blame for at least some of these issues!


For the key we could start the hash with matplotlib.__version__, but for developers that would quickly lead to an ~unbounded cache as we put the git sha in the version and would only move the problem out a level to any plot directive that uses Matplotlib via a library (e.g. seaborn) having exactly the same problem but not participating in the hash (and it seems this would be general to any library used in the code...).

So given the caching problem (nb: there are two hard problems in computer science: naming things, cache invalidation, and off-by-one bugs) I'm inclined to go with option 1 as well.

@dstansby
Copy link
Member Author

In that case I'll close. @r-barnes thanks a lot for opening this PR originally - we decided not to carry it forward because of the difficulties of caching the images. See the comment above this one for more context. I think we would be open to re-visiting this, but that would require a well thought out strategy for caching the images in an issue before opening a pull request.

@dstansby dstansby closed this Jan 30, 2023
@dstansby dstansby mentioned this pull request Jan 30, 2023
6 tasks
Thanushri16 added a commit to Thanushri16/matplotlib that referenced this pull request Nov 10, 2024
…n getter setter for subplot param and figsize
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants