display='diagram' needs more prominence in documentation #18305

jnothman · 2020-08-31T02:03:42Z

The sklearn.set_config(display='diagram') feature highlighted here needs more prominence. At least, it should be easy to find it when googling scikit-learn diagram display. It isn't. scikit-learn display=diagram is much the same, and scikit-learn "display=diagram" is on the mark but the results are not easy to parse as being relevant.

We probably need an example, or a user guide page, showing how different estimators are diagrammed, and perhaps also instructing users to use IPython.display's display function when needed.
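
For readers landing here from a search, a minimal sketch of the feature under discussion, assuming scikit-learn 0.23 or later. The pipeline steps are illustrative only, and `estimator_html_repr` is used here so the generated HTML can be inspected outside a notebook:

```python
from sklearn import set_config
from sklearn.decomposition import PCA
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC
from sklearn.utils import estimator_html_repr

# Ask scikit-learn to render estimators as HTML diagrams in notebooks.
set_config(display="diagram")

# Illustrative pipeline; any estimator can be rendered the same way.
pipe = Pipeline([("reduce_dim", PCA()), ("clf", SVC())])

# Build the HTML that a notebook would show for this estimator.
html = estimator_html_repr(pipe)
```

In a Jupyter notebook, leaving `pipe` as the last expression of a cell should render the diagram directly; the `html` string is only a convenience for saving or embedding it elsewhere.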


dishak331 · 2020-08-31T13:57:43Z

So you need examples, such as diagrams, alongside the guide page to give the feature more prominence?

tash149 · 2020-10-06T08:48:39Z

Hey @jnothman, can you please help me get started with open source contributions to machine learning projects? I have worked on research projects, though I have never contributed here. The issues all seem so overwhelming that I can't decide where to begin. It would be wonderful if you could guide me through. Thanks 😁✨

reshamas · 2020-11-04T18:19:46Z

@thomasjpfan What is the best place to put an example of the pipeline diagram call?
cc: @Mariam-ke

a) Here?
https://github.com/scikit-learn/scikit-learn/blob/647fcb1ac13abd8c2eb2554d526f4ad41fee6778/doc/modules/compose.rst

b) Release highlights
https://scikit-learn.org/stable/auto_examples/release_highlights/plot_release_highlights_0_23_0.html#sphx-glr-download-auto-examples-release-highlights-plot-release-highlights-0-23-0-py

c) some other file?

NicolasHug · 2020-11-04T21:45:22Z

I'm not sure the right way to go is to introduce diagram rendering in the UG at random places. In most cases this will distract from the original purpose of the code snippets.

I would prefer to have a dedicated example illustrating the diagrams of different estimators, as Joel proposed above. Note that we already have https://scikit-learn.org/dev/auto_examples/compose/plot_column_transformer_mixed_types.html#html-representation-of-pipeline . Maybe we can add the keyword "Diagram" to this subsection for SEO.

We can also add a small and short note at the end of this section of the getting started guide: https://scikit-learn.org/dev/getting_started.html

jnothman · 2020-11-05T13:02:00Z

Or a section (in Getting Started or otherwise) on how to configure Scikit-learn for a Jupyter Notebook.

reshamas · 2020-11-05T14:03:56Z

How about,

1. In this file: HTML representation of Pipeline,
   we change the title from "HTML representation of Pipeline" to "HTML Representation of Pipeline (Display Diagram)"
2. In this file: Getting Started, we can add:

Pipelines: displaying diagrams in Jupyter notebook
---------------------------------------------------

The default configuration for displaying a pipeline is 'text': `set_config(display='text')`. To visualize the diagram in Jupyter Notebook, use `set_config(display='diagram')` and then display the pipeline object.

  >>> from sklearn.pipeline import Pipeline
  >>> from sklearn.svm import SVC
  >>> from sklearn.decomposition import PCA
  >>> estimators = [('reduce_dim', PCA()), ('clf', SVC())]
  >>> pipe = Pipeline(estimators)
  >>> pipe
  Pipeline(steps=[('reduce_dim', PCA()), ('clf', SVC())])

  >>> from sklearn import set_config
  >>> set_config(display='diagram')
  >>> pipe

NicolasHug · 2020-11-05T14:09:11Z

OK for 1

For 2: I would rather not make a new section in the getting started guide. I think a note would be enough and to keep it short, we can just re-use the already defined pipeline in the corresponding section, and use the config_context manager (We can also mention set_config)

reshamas · 2020-11-05T14:22:13Z

In getting_started.rst, can make the following edit, at the end of the section "Pipelines: chaining pre-processors and estimators", can add:

The default configuration for displaying a pipeline is 'text', i.e. `set_config(display='text')`. To visualize the diagram in Jupyter Notebook, use `set_config(display='diagram')` and then display the pipeline object.

NicolasHug · 2020-11-05T15:07:48Z

I think users would benefit from seeing the diagram in action:

To render estimators as diagrams in notebooks, use the `display='diagram'` option:

>>> from sklearn import config_context
>>> with config_context(display='diagram'):
...     pipe
<diagram renders here>

You may also call `set_config` (link https://scikit-learn.org/stable/modules/generated/sklearn.set_config.html) at the top of the notebook.

This would ideally be within a .. note:: Diagram rendering of estimators
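
As a rough sketch of how the context manager scopes the setting (the pipeline is illustrative, and the fallback to `print` when IPython is missing is an assumption made only so the snippet runs anywhere):

```python
from sklearn import config_context, get_config
from sklearn.decomposition import PCA
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC

try:
    # Available inside Jupyter; used to show an object explicitly.
    from IPython.display import display
except ImportError:
    # Fallback for plain Python sessions (prints the text repr only).
    display = print

pipe = Pipeline([("reduce_dim", PCA()), ("clf", SVC())])

# Remember the global setting so we can check it is restored afterwards.
display_before = get_config()["display"]

with config_context(display="diagram"):
    display_inside = get_config()["display"]
    # A bare `pipe` inside a `with` block is not a cell's last top-level
    # expression, so a notebook would not show it; call display() instead.
    display(pipe)

# The previous setting is back in effect once the block exits.
display_after = get_config()["display"]
```

`config_context` only changes the setting inside the block, whereas `set_config` changes it for the rest of the session, which is why the latter suits the top of a notebook.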

NicolasHug · 2020-11-13T13:39:59Z

I didn't realize that the code in the UG is not executed by sphinx, and thus we can't see diagrams in the UG. It seems that we should create a dedicated example then, with various pipelines / estimators of different complexity. Would you be interested in doing that @reshamas ? @hongshaoyang also showed interest

Some examples of estimators worth rendering (feel free to deviate):

single estimator like SVC
basic pipeline
complex pipeline with ColumnTransformer (as in the other example linked above)
GridSearch within a pipeline, or GridSearch over a pipeline
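
A possible starting point for such an example, sketched under assumptions: the column names (`age`, `fare`, `sex`, `embarked`), step names, and parameter grid are hypothetical placeholders, and none of the estimators is fitted since the diagram does not require it.

```python
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler
from sklearn.svm import SVC
from sklearn.utils import estimator_html_repr

# 1. Single estimator.
single = SVC()

# 2. Basic pipeline.
basic = Pipeline([("scale", StandardScaler()), ("clf", SVC())])

# 3. Complex pipeline with a ColumnTransformer (hypothetical column names).
numeric = Pipeline(
    [("impute", SimpleImputer(strategy="median")), ("scale", StandardScaler())]
)
preprocess = ColumnTransformer(
    [
        ("num", numeric, ["age", "fare"]),
        ("cat", OneHotEncoder(handle_unknown="ignore"), ["sex", "embarked"]),
    ]
)
complex_pipe = Pipeline([("preprocess", preprocess), ("clf", LogisticRegression())])

# 4. Grid search over a pipeline (illustrative parameter grid).
search = GridSearchCV(basic, param_grid={"clf__C": [0.1, 1.0, 10.0]})

estimators = {
    "single": single,
    "basic_pipeline": basic,
    "column_transformer": complex_pipe,
    "grid_search": search,
}

# Render each estimator to the HTML a notebook would display.
diagrams = {name: estimator_html_repr(est) for name, est in estimators.items()}
```

In a gallery example, each estimator would more likely be left as the last expression of its own cell so the diagram renders inline; collecting the HTML strings in a dict is just a compact way to show all four at once.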

reshamas · 2020-11-13T16:12:25Z

@NicolasHug
Would the examples:

be created here: https://github.com/scikit-learn/scikit-learn/tree/master/examples
Would we add a new sub-folder for "examples/pipelines" OR

put SVC example under "/svm"
put ColumnTransformer under "/preprocessing"
put basic pipeline --> ?
put GridSearch --> ?

NicolasHug · 2020-11-13T16:13:45Z

I think we can just put it in examples/miscellaneous (we only need 1 example)

reshamas · 2020-11-13T16:21:15Z

You mean we need a few examples, but in one file called /examples/miscellaneous/pipeline.py?

NicolasHug · 2020-11-13T16:24:41Z

yes, though the file needs to be named plot_xyz.py so I'd suggest plot_display.py

By example I mean one entry in https://scikit-learn.org/stable/auto_examples/index.html. Each of these is one file. In the example, we can illustrate more than one estimator (all of them, actually)

jnothman · 2020-11-14T12:49:20Z

I think we might also explicitly have "Tip(s) for scikit-learn in jupyter" to mention this.

NicolasHug · 2020-11-14T12:51:48Z

where? should this just be the example? I'm not sure we have enough material about scikit-learn in jupyter to have a whole new UG chapter

jnothman · 2020-11-14T22:03:21Z

I don't really know. In a tutorial?

NicolasHug · 2020-11-15T08:00:38Z

I wouldn't recommend a tutorial because our tutorials page needs some work: #18257. We use the examples nowadays for tutorial-like entries, e.g. this or this

I'm also curious about what else our recommendations are regarding scikit-learn + jupyter?

reshamas · 2020-12-16T14:23:15Z

@NicolasHug @jnothman
Could we have a folder with example notebooks, that people could easily download and use?

reshamas · 2021-03-18T13:46:31Z

@NicolasHug @cmarmo @jnothman
The Modin project has a "jupyter" folder in their /examples/. Here it is:
https://github.com/modin-project/modin/tree/master/examples/jupyter

Is this something we could consider doing for scikit-learn? It seems a bit more flexible.

NicolasHug · 2021-03-18T13:53:51Z

@reshamas why not create a regular example as suggested in #18305 (comment)? The examples from the gallery can be downloaded as .ipynb files if users want to use them as notebooks.

reshamas · 2021-03-18T13:59:55Z

@NicolasHug I recall the problem with creating a regular example was that sphinx did not produce the diagram in the documentation. But, in a Jupyter notebook, the diagram is produced. That way, users can see the code with the diagram.

NicolasHug · 2021-03-18T14:02:49Z

The examples are able to generate the HTML display properly, if that's what you mean: https://scikit-learn.org/stable/auto_examples/compose/plot_column_transformer_mixed_types.html#sphx-glr-auto-examples-compose-plot-column-transformer-mixed-types-py

The issue with the HTML rendering and Sphinx, IIRC, was only in the UG because the code isn't executed there

reshamas · 2021-03-31T14:43:28Z

@cmarmo
I haven't had time to work on this, so if someone wants to take it over, it's available.

jnothman added the Documentation, Easy, and help wanted labels Aug 31, 2020

reshamas mentioned this issue Nov 4, 2020

DOC Display diagram to pipeline example #18758

Merged

hongshaoyang mentioned this issue Nov 10, 2020

[MRG] DOC: Add display='diagram' guides to visualize estimators #18806

Closed

NicolasHug removed the Easy and help wanted labels Nov 13, 2020

cmarmo added the help wanted label Mar 31, 2021

This was referenced Oct 8, 2021

Add a demo of the HTML repr to "Compact estimator representations" example #21289

Closed

HTML representation of estimators: center the diagrams #21290

Closed

ogrisel closed this as completed in #18758 Oct 13, 2021
