[WIP] Add examples to samples_generator #6924

Closed
wants to merge 1 commit into from
MartinThoma
Contributor

What does this implement/fix?

It adds graphical examples directly to some of the data generators

Comments

Currently, make fails, but the failure does not seem to be related to my changes:

Doctest: unsupervised_learning.rst ... ok
Doctest: working_with_text_data.rst ... /home/moose/GitHub/scikit-learn/doc/tutorial/text_analytics/working_with_text_data.rst:1: VisibleDeprecationWarning: converting an array with ndim > 0 to an index will result in an error in the future
  .. _text_data_tutorial:
FAIL

======================================================================
FAIL: Doctest: working_with_text_data.rst
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/lib/python2.7/doctest.py", line 2226, in runTest
    raise self.failureException(self.format_failure(new.getvalue()))
AssertionError: Failed doctest test for working_with_text_data.rst
  File "/home/moose/GitHub/scikit-learn/doc/tutorial/text_analytics/working_with_text_data.rst", line 0

----------------------------------------------------------------------
File "/home/moose/GitHub/scikit-learn/doc/tutorial/text_analytics/working_with_text_data.rst", line 99, in working_with_text_data.rst
Failed example:
    twenty_train = fetch_20newsgroups(subset='train',
        categories=categories, shuffle=True, random_state=42)
Expected nothing
Got:
    Downloading 20news dataset. This may take a few minutes.
----------------------------------------------------------------------
File "/home/moose/GitHub/scikit-learn/doc/tutorial/text_analytics/working_with_text_data.rst", line 452, in working_with_text_data.rst
Failed example:
    gs_clf.best_score_
Expected:
    0.900...
Got:
    0.90000000000000002

>>  raise self.failureException(self.format_failure(<StringIO.StringIO instance at 0x7fb149ea1d88>.getvalue()))

-------------------- >> begin captured logging << --------------------
sklearn.datasets.twenty_newsgroups: WARNING: Downloading dataset from http://people.csail.mit.edu/jrennie/20Newsgroups/20news-bydate.tar.gz (14 MB)
sklearn.datasets.twenty_newsgroups: INFO: Decompressing /home/moose/scikit_learn_data/20news_home/20news-bydate.tar.gz
--------------------- >> end captured logging << ---------------------

----------------------------------------------------------------------
Ran 38 tests in 43.950s

FAILED (SKIP=3, failures=1)
Makefile:40: recipe for target 'test-doc' failed

Also, the plots open in a window during the test. I'm not sure whether this is a problem.
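
One plausible reading of the second failure in the log above (expected `0.900...`, got `0.90000000000000002`) is that the doctest `ELLIPSIS` option was not active in that run, so `...` was compared literally. This is an inference from the log, not something confirmed in the thread. A minimal sketch using only the standard library, with a hypothetical snippet standing in for the tutorial line:

```python
import doctest

# Hypothetical snippet mimicking the failing tutorial example.
SNIPPET = """
>>> print("0.90000000000000002")
0.900...
"""


def run(optionflags=0):
    """Run SNIPPET as a doctest and return (failed, attempted)."""
    parser = doctest.DocTestParser()
    test = parser.get_doctest(SNIPPET, {}, "snippet", None, 0)
    runner = doctest.DocTestRunner(verbose=False, optionflags=optionflags)
    # Discard the failure report; only the counts matter here.
    result = runner.run(test, out=lambda text: None)
    return result.failed, result.attempted


print(run())                  # "..." compared literally
print(run(doctest.ELLIPSIS))  # "..." matches any substring
```

Without the flag the example fails; with `doctest.ELLIPSIS` it passes. The first failure in the log (the unexpected "Downloading 20news dataset" line) is a separate matter: the download message is printed on the first fetch and the doctest expects no output.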

@jnothman
Member

Thanks for informing us of that test-doc failure. test-doc should be run more often than it is. But the make target you needed was doc (or doc-noplot) not test-doc.

@jnothman
Member

But I don't know if sphinx handles that plotting as you expect; nor do we have matplotlib as a general or testing dependency, so these examples will fail in continuous integration.

I think a better solution would be to explicitly reference examples where one can see an illustration (in 2d).
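
The two concerns raised so far (matplotlib not being installed in continuous integration, and plot windows opening during the test run) can be illustrated with a small guard. This is only a sketch under stated assumptions: `matplotlib_available` and `plot_points` are hypothetical helper names, not part of scikit-learn, and it is not the approach the maintainers propose.

```python
import importlib.util


def matplotlib_available():
    """Return True if matplotlib can be imported in this environment."""
    return importlib.util.find_spec("matplotlib") is not None


def plot_points(points, path):
    """Save a scatter plot of (x, y) pairs to `path`.

    Returns False, without raising, when matplotlib is not installed.
    """
    if not matplotlib_available():
        return False  # optional dependency missing: skip instead of failing

    import matplotlib
    matplotlib.use("Agg")  # non-interactive backend, so no window opens
    import matplotlib.pyplot as plt

    xs, ys = zip(*points)
    fig, ax = plt.subplots()
    ax.scatter(xs, ys)
    fig.savefig(path)
    plt.close(fig)
    return True
```

A guard like this keeps a test run from failing or blocking on a window, but it does not make the plot appear in the rendered documentation, which is the point of linking to gallery examples instead.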

@MartinThoma
Contributor Author

MartinThoma commented Jun 23, 2016

@jnothman What reasons speak against adding matplotlib as a testing dependency? I think having example plots adds great value (e.g. scipy/scipy#6253 - http://scipy.github.io/devdocs/generated/scipy.ndimage.gaussian_laplace.html#scipy.ndimage.gaussian_laplace). On the one hand, one can more easily and intuitively understand how some algorithms work. On the other hand, one can also see the code that generated the image.

@jnothman
Member

Well, apart from the dependency question, they didn't seem to render when I tried compiling your changes.

@MartinThoma
Contributor Author

@rgommers Do you have an idea what is different in scikit-learn from scipy, so that the code above doesn't render? A very similar example worked just fine in scipy.ndimage.

@jnothman
Member

Is it possible there have been extensions to numpydoc? As per #4077, we're not in sync with that project.

@rgommers
Contributor

Is it possible there have been extensions to numpydoc? As per #4077, we're not in sync with that project.

Looks like the issue is that scikit-learn doesn't use matplotlib.sphinxext.plot_directive. Instead, in conf.py here it imports gen_rst.py. I'm not sure that just adding plot_directive to the mix won't give conflicts with gen_rst, but it's worth a try.
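
For reference, enabling the plot directive in a Sphinx project usually amounts to a few lines in `conf.py`. The fragment below is a generic sketch, not scikit-learn's actual configuration: the other entries in `extensions` are assumptions, and whether the directive coexists with `gen_rst.py` is exactly the open question raised above.

```python
# conf.py (sketch): generic Sphinx settings, not scikit-learn's real file.
extensions = [
    "sphinx.ext.autodoc",      # assumed: pulls docstrings into the docs
    "sphinx.ext.autosummary",  # assumed
    "numpydoc",                # assumed: an installed, upstream numpydoc
    "matplotlib.sphinxext.plot_directive",
]

# Let numpydoc wrap Examples sections that import matplotlib in a
# `plot::` directive, so the figure is rendered next to the code.
numpydoc_use_plots = True

# Options read by matplotlib's plot directive.
plot_include_source = True          # show the code that made the figure
plot_html_show_source_link = False  # no separate link to a source file
plot_formats = ["png"]
```

With settings like these, a docstring whose Examples section imports matplotlib and ends in `plt.show()` is rendered as code plus image, which matches the scipy page linked earlier in the thread.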

@GaelVaroquaux
Member

GaelVaroquaux commented Jun 26, 2016 via email

@amueller
Member

I think these should be added to the user guide, and the docstring should link to the user guide.

@amueller amueller added Easy Well-defined and straightforward way to resolve Sprint Stalled help wanted labels Sep 27, 2018
@rth rth removed the Sprint label Jun 27, 2019
@adrinjalali adrinjalali deleted the branch scikit-learn:master January 22, 2021 10:54
Labels
Easy Well-defined and straightforward way to resolve help wanted module:datasets Stalled
7 participants