-
Notifications
You must be signed in to change notification settings - Fork 208
Add optional pypandoc rst parsing #220
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Please use "Fix #issueNumber" this way the associated issue gets closed automatically when the PR is merged. For more details, look at this. I have edited your description but please remember to do it next time. |
I think we would need tests for this and probably a new entry in the build matrix with pypandoc and pandoc installed. The latter can be installed through anaconda. |
Also I am wondering what happens if you have |
If pypandoc is installed but not pandoc, OSError is raised. try:
import pypandoc
print('Using pandoc for converting rst to md in notebooks')
except (ImportError, OSError) as e:
print('Pandoc not found. Error: {}. Falling back to regex based parser'.format(e)) Please show me where I should add the above and tests. |
I don't like having our rst2md code in the except block. I would rather write a new wrapper function like def rst2md(args):
try:
import pypandoc
return pypandoc.rst2md(args)
except:
return sphx_glr_rst2md(args) Then we can document this function explaining the pypandoc use and the default option of a regexp based conversion in our library. Also our function documentation remains available. I'm fine not throwing a message if pandoc is not installed, is an optional dependecy and just as we don't complain about mayavi or seaborn being installed here it shall not be the case. Optional dependencies remain user responsibility. We do need to add this case to the test matrix this dependency. I don't like the anaconda install because it has to go trough conda-forge(but it is just another line in the .travis.yml, and travis time is for free so include it at will). I would do the install trough pip and then use pypandoc itself to download the official binary. $ python -c "import pypandoc;pypandoc.download_pandoc()" This command takes care of it. Finally we have to include this in the changes and put some documentation in the advanced_configuration.rst |
pandoc is accessible through conda without conda-forge. pypandoc is accessible through pip. This is what I would do for simplicity. |
Moved existing code to a new function as you suggested.
Do I have to create a new environment here? or Use existing one? I'm sorry but I'm a noob when it comes to travis. |
For testing with travis and also I would like to see in the circle ci artifacts this output you need this diff --git a/.travis.yml b/.travis.yml
index 438e17ec..ed79e259 100644
--- a/.travis.yml
+++ b/.travis.yml
@@ -32,6 +32,8 @@ matrix:
env: DISTRIB="conda" PYTHON_VERSION="3.4"
- os: linux
env: DISTRIB="conda" PYTHON_VERSION="3.5"
+ - os: linux
+ env: DISTRIB="conda" PYTHON_VERSION="3.5" PYPANDOC="True"
before_install:
- if [ "$DISTRIB" == "conda" ]; then
wget http://repo.continuum.io/miniconda/Miniconda-latest-Linux-x86_64.sh -O miniconda.sh;
@@ -51,6 +53,9 @@ install:
if [ "$PYTHON_VERSION" == "2.7" ]; then
conda install --yes mayavi;
fi;
+ if [ "$PYPANDOC" == "True"]; then
+ python -c "import pypandoc;pypandoc.download_pandoc()";
+ fi;
fi;
- pip install -r requirements.txt
- if [ "$DISTRIB" == "ubuntu 14" ]; then pip install seaborn sphinx==1.4.9; fi;
diff --git a/circle.yml b/circle.yml
index dc3a807f..b153aebd 100644
--- a/circle.yml
+++ b/circle.yml
@@ -16,6 +16,7 @@ dependencies:
- sed -i "s/ENABLE_USER_SITE = .*/ENABLE_USER_SITE = False/g" /home/ubuntu/miniconda/envs/circleenv/lib/python2.7/site.py
- conda install -n circleenv --yes numpy scipy seaborn mayavi setuptools matplotlib pillow sphinx nose sphinx_rtd_theme
- pip install -r requirements.txt
+ - python -c "import pypandoc;pypandoc.download_pandoc()"
override:
- python setup.py develop
|
doc/advanced_configuration.rst
Outdated
Pandoc for converting rst to markdown for notebook downloads | ||
============================================================ | ||
|
||
Sphinx uses restructured text while Jupyter notebooks uses markdown for |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
notebook use markdown to declare the format of the rendered text.
doc/advanced_configuration.rst
Outdated
============================================================ | ||
|
||
Sphinx uses restructured text while Jupyter notebooks uses markdown for | ||
markup of the text. Therefore, restructured text need to be translated to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Due to this mismatch in input formats, the generated Jupyter notebooks may not display correctly if you have heavily used the formatting syntax of restructured text. Sphinx-Gallery provides a minimal parser to transform between this two formats suitable for basic use cases and is the default parse. Sphinx-Gallery is nevertheless aware of pypandoc and if found in your system it will be used to convert between this formats. To install pypandoc in your system you can use the following commands:
I'm not so good with writing. These paragraphs might need some extra reviews to give a clear and easy to understand message.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, here's what I think we should write:
Sphinx uses restructured text while Jupyter notebooks uses markdown for formatting the text. Due to this mismatch in markup formats, the generated Jupyter notebooks may not display the text correctly if you have formatted your text heavily with restructured text.
Sphinx-Gallery provides a minimal parser to convert from restructured text to markdown. It is suitable for basic use cases and is the default parser. Sphinx-Gallery is nevertheless aware of pypandoc and if it is found in your system, it will be used to conversion. To install pypandoc
in your system you can use the following commands:
What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
reminding @Titan-C
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are getting to a clearer text:
Sphinx uses reStructuredText while Jupyter notebooks use Markdown for
formatting the text blocks. Due to this mismatch in markup formats, the
generated Jupyter notebooks may not render the text cells correctly if
you make heavy use of the syntax features in reStructuredText
Sphinx-Gallery provides a minimal parser to convert from
reStructuredText to Markdown. It is suitable for most basic use cases
and is the default parser. Sphinx-Gallery is nevertheless aware of
pypandoc and when it is found in
your system, it will be used instead to do the conversion. To install
pypandoc
in your system you can use the following commands:
sphinx_gallery/notebook.py
Outdated
@@ -98,6 +99,18 @@ def rst2md(text): | |||
|
|||
return text | |||
|
|||
def rst2md(text): | |||
"""Converts the RST text from the examples docstrigs and comments | |||
into markdown text for the Jupyter notebooks""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add:
Delegates the conversion to pypandoc if found in the system, otherwise uses our custom converter
I think we would need tests for this and probably a new entry in the build
matrix with pypandoc and pandoc installed.
anaconda.
+1
|
@Titan-C I think this is ready to be merged. |
@chsasank Thank you so much for keeping up with this. The code looks great and I'm happy with the documentation. But just now with the Circle CI in place I checked the output of pypandoc and comparied it against what Sphinx-gallery supports. I'm disappointed on the output of Pandoc as it misses conversions we already support, we certainly need more calibration for it beyond the default. Have a look at:
|
We have test for the latex conversion. Travis being green and pandoc not having good latex conversion means travis was not setup correctly. (My error) The if clause needs to be - if [ "$PYPANDOC" == "True"]; then
+ if [ "$PYPANDOC" == "True" ]; then The white-space is relevant |
Updated. Build fails because
Sasank. |
Yes build should fail on Pandoc because it is not converting everything as we would want. We expect warning blocks and multiline Latex to work. (The inline Latex, first fail is indeed no big deal) |
@chsasank I think we won't merge this because pypandoc, at this moment,
does not provide the support for some features we currently use. We do
need to improve on our RST parser as I mention in
#232 (comment),
it is just not an easy problem to solve.
|
Is this the feature you want but pandoc doesn't support? |
Yes, I want the notes and warning boxes to be parsed.
|
I agree that it was a great idea to explore. It's a pity that we cannot
normalize enough the two ouptuts. I think that it boils down to that.
|
So, is there no way to present titles, or h3 and h4 headers in .ipynb that are converted from .py files? Seems like a big limitation. |
Rather than commenting in a closed PR, please open an issue with more details and ideally a way to reproduce what you are seeing. |
Please see #219 for discussion on this. Fix #219.
Can somebody point to me where I should add optional requirements.