Skip to content

Plot directive: delegate file handling to Sphinx #24205

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Nov 23, 2022

Conversation

MikhailRyazanov
Copy link
Contributor

@MikhailRyazanov MikhailRyazanov commented Oct 17, 2022

PR Summary

This PR solves multiple problems with file handling in matplotlib.sphinxext.plot_directive described in issue #24005 (and #13858) by delegating all files copying and links generation to the Sphinx :download: role instead of trying to do so explicitly (which apparently cannot be done properly with the current approach).

PR Checklist

Tests and Styling

  • Has pytest style unit tests (and pytest passes).
  • Is Flake 8 compliant (install flake8-docstrings and run flake8 --docstring-convention=all).

Documentation

  • [N/A] New features are documented, with examples if plot related.
  • [N/A] New features have an entry in doc/users/next_whats_new/ (follow instructions in README.rst there).
  • [N/A] API changes documented in doc/api/next_api_changes/ (follow instructions in README.rst there).
  • [N/A] Documentation is sphinx and numpydoc compliant (the docs should build without error).

Copy link

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for opening your first PR into Matplotlib!

If you have not heard from us in a while, please feel free to ping @matplotlib/developers or anyone who has commented on the PR. Most of our reviewers are volunteers and sometimes things fall through the cracks.

You can also join us on gitter for real-time discussion.

For details on testing, writing docs, and our review process, please see the developer guide

We strive to be a welcoming and open project. Please follow our Code of Conduct.

@tacaswell
Copy link
Member

I think this is going to move some files around in the final build. Do we expect that is going to cause trouble for anyone?

@MikhailRyazanov
Copy link
Contributor Author

MikhailRyazanov commented Oct 18, 2022

this is going to move some files around in the final build

Yes. Moreover, the paths will be unpredictable (they are generated by hashing file contents). I've already expressed this concern in #24005. However, at least the docs built for this PR look fine (I don't see any obvious problems).

If somebody has a deeper understanding of Sphinx API, maybe an alternative solution can be found. Keep in mind, although, that it must:

  1. Work correctly for singlehtml and dirhtml builders.
  2. Work correctly for building multiple targets.
  3. Avoid copying unneeded files.

@timhoffm
Copy link
Member

@MikhailRyazanov Thanks for looking into this. Unfortunately, we might lack sphinx expertise to properly understand and solve this issue. - I have to admit, that the download-approach feels a bit hackish, but I don't have a better suggestion. So it might be that we'll have trouble coming up with a decision here.

If somebody has a deeper understanding of Sphinx API, maybe an alternative solution can be found. Keep in mind, although, that it must:

  1. Work correctly for singlehtml and dirhtml builders.
  2. Work correctly for building multiple targets.
  3. Avoid copying unneeded files.

While this all is reasonable would be nice, we do not necessarily need a 100% solution. Maybe that makes solving some parts easier.

We could decide that singlehtml is not supported by the directive. And/or that multiple targets are not supported and instead must be build with separate calls.

@MikhailRyazanov
Copy link
Contributor Author

MikhailRyazanov commented Oct 21, 2022

@timhoffm, IMHO hackish is how this extension currently circumvents Sphinx mechanisms by copying files directly to the output tree (at a wrong build phase) and trying to generate links to them (based on incorrect assumptions).

I would say that :download: is the correct approach at least for the script files — to compare with, this is what the Sphinx-Gallery extension does (check “Examples” in the Matplotlib docs). And, conceptually, for PDF plots as well (although modern browsers can display PDF files, these are mainly intended for downloading, since SVG is a more appropriate alternative for viewing in a browser).

Basically, what do you think is wrong with using :download: for additional files? Except that some other extensions might depend on the files written by plot_directive (this is really hackish, since such behavior is not documented anywhere), although we don't know that any such third-party extensions exist...

Regarding singlehtml, not supporting it would be a very strange and unjustified decision, especially since Read the Docs uses it for building downloadable documentation (called “Zipped HTML” there).

@tacaswell
Copy link
Member

We have been doing it his way from at least 2011, so this is not a something new we have started doing, and it is unclear to me if there was a better way to do this in sphinx at time. To Tim's point, it is very unclear if using :download: is actually less hackish or just differently hackish nor is it clear to me that there is a "right" phase to do this sort of thing.

That sphinx-gallery uses :download: is a point in its favor, however it was also my understanding that all of sphinx-gallery is "hackish" to work around sphinx limitations.

I agree with Tim those three things are "nice to have".

It is also not clear to be that read the docs should be using singlehtml for this purpose. While I 'm sure it make sense for many projects, if you build our docs that way you get (leaving broken links aside) a 77MB html file which I could not get a browser to even show me from disk...

This also makes a superficial change of adding download icons
image which I think is a slight improvement over just text.

I am not worried about further extensions relying on files being there, but on people who are relying on urls that have been stable for 10+ years.

Despite all of this, I am weakly in favor of taking this PR. It reduces the amount of code we have and will let is push future issues with this back upstream and singlehtml is likely useful to many of our users.

Copy link
Member

@tacaswell tacaswell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Weakly in favor of taking this. Fixing singlehtml worth the risk of changed URLs.

@MikhailRyazanov
Copy link
Contributor Author

@tacaswell, thanks for approving this!

To Tim's point, it is very unclear if using :download: is actually less hackish or just differently hackish

As I wrote, it at least lets Sphinx itself to handle all the file management and thus avoid all the mentioned problems (when unneeded files are copied, needed files are not copied, and wrong links are generated). The only “hackish” thing about it is that formally :download: is intended for files from the source tree rather than from the build directory, but in reality in accepts any paths, so this should not be a problem.

nor is it clear to me that there is a "right" phase to do this sort of thing.

According to Sphinx documentation, there is “Phase 1: Reading”, when the doctree is prepared and cached on the disk (to <build>/plot_directive in our case), and then a separate “Phase 4: Writing”, when all the actual output files are written (plot directive currently ignores this phase and thus could not copy the files when building multiple targets). So the proper way would be either to insert references to the desired files into the doctree and let Sphinx to copy them or, if this can't be done, to add a hook that will copy the necessary files at the writing phase. In either case, this will need better understanding of Sphinx than apparently anybody here has and will result in writing more code than simply using :download:.

This also makes a superficial change of adding download icons
which I think is a slight improvement over just text.

The visual appearance is entirely controlled by the selected Sphinx theme (through CSS). Matplotlib docs use mpl-sphinx-theme, inherited from pydata-sphinx-theme, which adds these icons. The default Sphinx theme has no icon and calmer link decorations, and sphinx_rtd_theme has a simpler icon but otherwise no special link formatting. Those who wish the old appearance, in which links to raw files are not distinct from links to normal web pages, can also achieve that easily by a simple custom CSS...

@timhoffm
Copy link
Member

@story645 thanks for pinging. However, I don’t feel competent to review this topic and I don’t have the time to dig into this.

@timhoffm timhoffm removed their request for review November 11, 2022 18:57
@QuLogic
Copy link
Member

QuLogic commented Nov 17, 2022

Looks like this needs a rebase now.

@MikhailRyazanov
Copy link
Contributor Author

MikhailRyazanov commented Nov 20, 2022

OK, “rebased” and updated the tests introduced in PR #24215.

No idea why the AppVeyor test is shown as failed, with cryptic messages ”Canceling since a higher priority waiting request for 'Tests-24205-' exists” / “More recent PR #24205 build has been started.” Should it be restarted manually?

@QuLogic QuLogic merged commit 7413bf0 into matplotlib:main Nov 23, 2022
@QuLogic
Copy link
Member

QuLogic commented Nov 23, 2022

Thanks @MikhailRyazanov! Congratulations on your first PR to Matplotlib 🎉 We hope to hear from you again.

@MikhailRyazanov
Copy link
Contributor Author

Thanks @QuLogic! Will it be released in 3.7.0 or 3.6.3? (And when this can be expected?)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants