[Doc]: Support of notebook format for docs creation #25016
Comments
I'm not following. Do you see an inconvenience for the author or for the consumer of the example? You can use multiple sections (https://sphinx-gallery.github.io/stable/syntax.html). For the author in particular:
For a consumer: We have "Download jupyter notebook" buttons at the bottom of the example. |
The biggest issue with our docs right now is that the tutorials/examples are built a different way than the narrative docs, and that strongly affects the structure of the docs, in bad ways I would argue. #24987 is an attempt to mitigate that somewhat by including tutorials linked in the narrative sections. Looking at nbsphinx, it seems to me that it can intersperse rst with Notebooks. If it's true, that means a PR like #24979 that was converting gallery So, I think adding nbsphinx would be a relatively intriguing option, and may indeed allow people to put more documentation in the narrative sections of the docs, particularly if the notebooks can be interspersed with rst and cross linking etc all work seamlessly. |
OTOH, a quick test with nbsphinx didn't render in-place for me, so I guess it will take some configuration to get working with our doc build. |
I am very much against adding .ipynb files to the docs as they pose significant version control problems (you have to remember to strip the output and diff tools do not understand json). If we do go that route we would have to use the jupytext format, but as @timhoffm points out sphinx-gallery already support If we are going to go through another disruptive change to how we encode the docs it will probably be a move to myst (but I think we should let that settle a bit longer). |
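To illustrate the "strip the output" step mentioned above, here is a minimal sketch using only the standard library. It assumes the nbformat 4 layout, where code cells carry `outputs` and `execution_count` keys. The function name `strip_outputs` and the sample notebook are hypothetical, made up for illustration. Dedicated tools such as nbstripout handle more cases and are likely a better choice in practice.

```python
import json


def strip_outputs(notebook_json: str) -> str:
    """Return notebook JSON with code-cell outputs and execution counts removed."""
    nb = json.loads(notebook_json)
    for cell in nb.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    # Stable key order and indentation keep later diffs small.
    return json.dumps(nb, indent=1, sort_keys=True) + "\n"


# A tiny hand-written notebook (assumed nbformat 4 layout), for illustration only.
example = json.dumps({
    "cells": [
        {
            "cell_type": "code",
            "execution_count": 3,
            "metadata": {},
            "outputs": [{"output_type": "stream", "name": "stdout", "text": ["1\n"]}],
            "source": ["print(1)"],
        },
        {"cell_type": "markdown", "metadata": {}, "source": ["Some text"]},
    ],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 5,
})

cleaned = json.loads(strip_outputs(example))
```

Even with outputs stripped, the file is still JSON, so the objection about diff tools not understanding the format remains.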
Good point about diffs, and needing to use jupytext or similar. I wouldn't personally be in favour if this had to be a wholesale change. But mixing in a different format with the existing rst files (not gallery!) isn't out of the question. |
#25021 does the bulk switch to |
For the consumer, there is no inconvenience, and the "Download Jupyter notebook" button is amazing! I see the inconvenience for the author and the reviewer. For the author, because it's difficult to iterate over an example that is at the end of the example script, as the whole script has to be run for each iteration. I think that the For the reviewer, it's not easy to see how the changes are affecting the example image outputs: This inconvenience could be solved by another advantage of Here is what a GitHub pull request changes tab will look like for The main problem with this: The git history would be filled with image data. This could be overcome with the following strategy: All PRs that iterate on examples can go to a new branch called |
Can you mix rst with nbsphinx? I didn't get that to work in my 5 minutes of playing with it. |
This is a fatal problem. Our repo is already extremely large because we include the test images (which we need to do something about) so including the example images is a non-starter. Even if the examples were on a separate branch, they would still be in the repository and included in history unless we regularly force-pushed that branch out of existence (which would then lead to chaos on any outstanding PRs against that branch when we did so). We would also need a checker to make sure no changes came in via any method but that GHA (which is doable but annoying). The proposed scheme might work with a separate repo, but at that point we are growing a tremendous amount of complexity to solve a problem which is easy to not make (by sticking with .py, .rst, or (if we must) .md files). I am also slightly concerned about integrating GHA that closely into our development process. We are getting those free CPU cycles essentially on an advertising budget line and they can go away at any time (see what happened with travis.org). Using it for nice-to-haves (like the new contributor greeter) and CI is a reasonable risk (because if they went away tomorrow we could do without or replace it "easily" (as we already use a diverse set of CI providers)), but for something like "this needs to run to update the examples in our docs" I would be worried. In our history we have moved hosting platforms once (from sourceforge to github which also included a VCS change from svn->git) so I also think we should keep an eye towards that being a possibility in the future.
That GitHub has integrated tools is good (even if they are in beta), but it does not solve the problem when using other git tools on local machines. Being able to chase through the history of changes to a file is very valuable to understand why things are the way they are. I am aware of things like nbdime, however I am not sure I have ever managed to get it set up correctly. I think that any improvement to the editing workflow for some contributors (and a major degradation for others (well, me)) would be greatly outweighed by the added complexity for all developers of having to get notebook diffing tools installed.
As someone who has written a bunch of those longer examples, my tactic was to copy out the section I was working on into its own file (well, emacs buffer), iterate there and then copy back when I was happy. Further, most IDEs also have a "run region interactively" functionality even absent the automatic "cell" detection. From my verbosity @kolibril13 I hope you can tell I do not like to say "no", but I am pretty firm that we should not start using |
I'll agree strongly with this - I wouldn't do this just for the convenience of the author. The constrained_layout tutorial is the slowest tutorial we have, and it takes 4 s on my machine to make all the figures, which is hardly a barrier to iterating. Or if I'm doing something really fussy, I'll do what Tom suggests above and make a test file. For me, the only advantage here would be organizational, where long explanations that would be better in the narrative docs but are easier to write in code would have their home where they belong in the narrative. |
FWIW I'm strongly against ipynb here as well, because...
... I still have no idea of how to set up the matplotlib/notebook interaction at all. (I haven't tried very hard, but I have tried a couple of times.) |
I think the problem with section wise code execution is solved by the PR #25021 so
It would be really lovely to have all images version controlled, close together with their generating code. I just want to mention that this can be done for smaller Matplotlib-related projects. |
Thank you for being understanding @kolibril13 .
That is still on GitHub, what does it look like with |
Hello, folks - I am just now seeing this. I'd like to point out that it is not necessary to use the .ipynb format for executable docs. In fact, using Jupytext it is possible to have executable .md files (using the MyST Markdown syntax) that will only be executed upon docs build, meaning they don't contain any outputs and so would be straightforward to diff. I have a PR up to SciPy to do that but unfortunately we are still hanging on a minor detail. Please check scipy/scipy#17322 Happy to answer questions or provide a minimal example if folks are generally in favor of this idea. |
After the bulk switch to Also, I want to note that the best workflow I could find to work
Currently, it's also not possible to change these settings in the Jupytext plugin, but maybe it would be worth opening an issue at https://github.com/notebookPowerTools/vscode-jupytext |
It does look like you have a different setup for the .py files to make them more "narrative" that I'm not familiar with. Are these modified .py files also opened as notebooks in Jupyter classic or JupyterLab? Or just VSCode? The advantage of the myst/jupytext approach is that these markdown files are opened as jupyter notebooks in a plain jupyter setup as well. |
Also - this header you are mentioning is exactly what makes a regular markdown file executable. Remove it, and it is back to a regular static markdown file. |
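As a rough illustration of the point about the header, the sketch below checks whether a Markdown file starts with YAML front matter containing a top-level `jupytext:` key. The header fields shown are an assumption based on a typical Jupytext MyST header, not copied from this thread or from the SciPy PR, and `has_jupytext_header` is a hypothetical helper. Jupytext's own detection logic may well differ.

```python
def has_jupytext_header(markdown_text: str) -> bool:
    """Return True if the text opens with front matter that has a 'jupytext:' key."""
    lines = markdown_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return False
    for line in lines[1:]:
        if line.strip() == "---":
            # Front matter ended without a top-level jupytext key.
            return False
        if line.startswith("jupytext:"):
            return True
    return False


# Assumed shape of an executable MyST Markdown file (illustrative only).
executable_md = """---
jupytext:
  text_representation:
    extension: .md
    format_name: myst
kernelspec:
  name: python3
---

# A title
"""

# The same document without the header is plain static Markdown.
static_md = "# A title\n\nJust text.\n"

header_present = has_jupytext_header(executable_md)
header_absent = has_jupytext_header(static_md)
```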
Considering this example:
# %%
a = 1
# %%
b = 2
It's possible to open this in VS Code as a notebook when the vscode-jupytext plugin is installed.
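To make the `# %%` convention concrete, here is a minimal sketch of how a script could be split at those markers and a single cell run on its own. It is not how VS Code or Jupytext implement this. `split_cells` is a hypothetical helper, and treating any line that starts with `# %%` as a cell boundary is a simplifying assumption.

```python
def split_cells(source: str) -> list[str]:
    """Split script text into cells at lines that start with '# %%'."""
    cells, current = [], []
    for line in source.splitlines():
        if line.startswith("# %%"):
            # A marker closes the previous cell, if it had any content.
            if current:
                cells.append("\n".join(current))
            current = []
        else:
            current.append(line)
    if current:
        cells.append("\n".join(current))
    return cells


script = "# %%\na = 1\n# %%\nb = 2\n"
cells = split_cells(script)

# Run only the second cell in a fresh namespace, without executing the first.
namespace = {}
exec(cells[1], namespace)
```

Running a later cell in isolation only works when it does not depend on names defined in earlier cells, which is one reason small, independent sub-examples are easier to iterate on.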
If that works, that would be really nice! I've just tested that with my JupyterLab installation (Version 3.4.8) with the notebook from your pull request: https://github.com/scipy/scipy/blob/b55d32eee1a18db5fd3f06a423b62bb19bc2501b/doc/source/notebooks/interp_transition_guide.md |
You need to install jupytext, and then you can right click this md file in Jupyterlab and "Open as notebook". You can see examples here: https://github.com/numpy/numpy-tutorials (a more comprehensive explanation is here) |
Thanks! |
@kolibril13 I am pretty sure you can get that to work by using a devcontainer.json that includes an initial configuration for your codespace - check out numpy/numpy#23076 |
I'll have a look! |
Problem
As soon as one example has more than one sub-example, e.g. here
https://matplotlib.org/stable/gallery/text_labels_and_annotations/font_family_rc.html#sphx-glr-gallery-text-labels-and-annotations-font-family-rc-py
it is inconvenient to iterate on that example with Python files (*.py), because it's not straightforward to run a Python file in sections.
Suggested improvement
A good format to support section-wise code execution would be .ipynb.
What do you think about supporting .ipynb for creating docs examples?
I think that would make iterating over examples easier, and it would encourage contributors to create examples of shorter code blocks with multiple sub-examples. This could also be very useful in the tutorial section.
In my opinion, a bunch of small examples each showing only one feature is much better than one big example.
Matplotlib currently has many examples with long and convoluted code (e.g. 1, 2, 3, 4), and I think that might be the case because it's difficult to split examples into smaller chunks with the current infrastructure.