Skip to content

ENH: Display parameters in HTML representation #30763

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 145 commits into from
May 21, 2025

Conversation

DeaMariaLeon
Copy link
Contributor

@DeaMariaLeon DeaMariaLeon commented Feb 3, 2025

Reference Issues/PRs

Working on first point on #26595
Fixes: #21266

What does this implement/fix? Explain your changes.

  • This code allows to visualise the estimator's parameter values. It adds an interactive dropdown hidden by default.
  • Copy paste button added. When clicking on it, the parameter's name is saved to the clipboard.

Any other comments?

@glemaitre : WDYT?

Copy link

github-actions bot commented Feb 3, 2025

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: 3d5804f. Link to the linter CI: here

@DeaMariaLeon DeaMariaLeon changed the title WIP: get_params() work - not ready for review WIP: get_params() work Feb 3, 2025
@glemaitre glemaitre self-requested a review February 6, 2025 15:28
@betatim
Copy link
Member

betatim commented Feb 7, 2025

This looks super cool! What about a more descriptive title for the PR? Every time I see a notification related to it pop up I think: what get_params is broken??!! To one second later realise what this PR is really about :D

Maybe something like "Display parameters in HTML representation"?

@DeaMariaLeon DeaMariaLeon changed the title WIP: get_params() work WIP: Display parameters in HTML representation Feb 7, 2025
@trallard
Copy link

Hey folks, I am so happy to see more accessibility discussions in the wild ✨💜 Thank you for being thoughtful and striving to serve our community better.
There was also another talk at PyCon DE this year by Smera, our fabulous designer!

If you wanted to make the style more in line with the PyData Sphinx Theme:

The reason I bring these resources is per @betatim 's comment:

There was a great talk at PyCon Germany about the work that went into pydata-sphinx-theme to make it more accessible and one of my main takeaways from it was that accessibility is a bit like cryptography: so many things to think about that it is worth not rolling your own :D

We have spent much time thinking about accessibility and design for PyData Sphinx Theme and documenting our guidelines so our community can reuse and build on them without reinventing the wheel.
So, you might find some of our guidelines and styles helpful for you moving forward.
Alternatively, if you need or would like accessibility-related feedback or brainstorming, feel free to contact me, and I will be more than happy to help.

@DeaMariaLeon
Copy link
Contributor Author

Here is the example again: Column Transformer with Mixed Types

@ArturoAmorQ
Copy link
Member

ArturoAmorQ commented May 13, 2025

Sorry for jumping in. I don't want to bring noise to a PR that already has 6 participants. I just wanted to share that I had a look on the render linked in #30763 (comment) and I noticed a change in behavior.

  • On main:
    image
  • This PR:
    image

Personally I find it more useful to know the column names than knowing they were selected using a make_column_selector object. Is there a way to retrieve the actual names of the selected columns from the object?

@DeaMariaLeon
Copy link
Contributor Author

DeaMariaLeon commented May 13, 2025

@ArturoAmorQ thanks for bringing it up!.. it's a bug..
The names of the columns are shown correctly before make_column_selector is used (the first diagram in the document is fine).

@glemaitre
Copy link
Member

I don't think that there is a bug here. @ArturoAmorQ I think that you click on the first pipeline in main and the second pipeline in this PR. In main, it is the rendering of the the second pipeline:

image

I was a bit surprise because the code never execute any function when it comes to diagram. However, it could be a further improvement potentially by recording the column name once fit is called and show them once the pipeline is fitted. Before fit, we cannot do better than the signature of the object. We could also the representation of the make_column_selector function to have a better string reprenstation.

@ArturoAmorQ
Copy link
Member

Actually it's me who read the example too fast. Sorry for that. There are 2 pipelines in the example and the second one uses the make_column_selector. Your PR has the same behavior as before.

Still I wonder if we can somehow retrieve the column names from the make_column_selector 🤔 In any case that would be for a different PR.

Sorry again for the noise.

@glemaitre
Copy link
Member

Thanks @trallard for the guideline. We definitely need some visual improvements with those diagrams. Basically, this PR was a good starting point to find out a couple of limitations: inconsistency between IDEs/doc rendering and dark/light theme management (among others).

We should probably limit the scope of this PR but we need to revisit specifically the look of those diagrams and those guidelines will be useful for sure. We can hopefully also reuse knowledge that we gained when developing a couple of add-ons in skrub.

@trallard
Copy link

+1 on limiting scope.
If there is anything I can help with just give me a ping whenever.

@GaelVaroquaux
Copy link
Member

@trallard , happy to have you around. Thanks a lot!

Copy link
Member

@GaelVaroquaux GaelVaroquaux left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The rendering looks fine to me (I no longer have issues with the contrast).

I do have one question on the code (inlined in the review).

In addition, I just wanted to check: this refactor does not change the protocol for subclasses in other packages compared to the one that we had before? If it does, we should document this clearly.

return self._repr_html_inner

def _repr_html_inner(self):
return self._html_repr()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need this method? It seems to me that we could simply call _html_repr istead of this

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was taken from the original base.py code (that's the way it is in main). But I forgot to add its docstring:

This function is returned by the @property `_repr_html_` to make
`hasattr(estimator, "_repr_html_") return `True` or `False` depending
on `get_config()["display"]`. 

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I should do it differently of course I'll do it, but in the meantime I added the docstring.

@DeaMariaLeon
Copy link
Contributor Author

this refactor does not change the protocol for subclasses in other packages compared to the one that we had before?

No, it doesn't.

Copy link
Member

@glemaitre glemaitre left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @GaelVaroquaux for the review. Looks good on my side. Let's merge.

Thanks @DeaMariaLeon it is a nice improvement.

@glemaitre glemaitre merged commit d077f82 into scikit-learn:main May 21, 2025
36 checks passed
@DeaMariaLeon
Copy link
Contributor Author

Thank you All!

@DeaMariaLeon DeaMariaLeon deleted the get-params branch May 21, 2025 08:36
@GaelVaroquaux
Copy link
Member

GaelVaroquaux commented May 21, 2025 via email

@betatim
Copy link
Member

betatim commented May 21, 2025

Whoop whoop 🥳

@GaelVaroquaux
Copy link
Member

GaelVaroquaux commented May 21, 2025 via email

@betatim
Copy link
Member

betatim commented May 21, 2025

I tried this out (for the first time) just now. One thing I noticed is that there is a "copy" button for each parameter. Not having paid attention to the discussion.design of this I approached it like an uneducated user. My first thought was that maybe it will copy the argument and value (n_neighbors=20) but it copies only the parameter name (n_neighbors). Overall I was wondering what the idea behind the feature was?

Screenshot 2025-05-21 at 13 01 31

@glemaitre
Copy link
Member

glemaitre commented May 21, 2025

@betatim The feature is particularly useful when you have nested pipeline and column transformer to get the fully qualified name, e.g. histgradientboostinclasssifier__max_depth. With a single estimator, it is a bit less pertinent. You can see it as an alternative to call pipeline.get_params to find the fully qualified name to copy it and paste it in pipeline.set_params (e.g. when doing the parameter grid in GridSearchCV.

We started only by copying the parameter name but I don't mind extending to the parameter value if we think it is useful.

@betatim
Copy link
Member

betatim commented May 21, 2025

That makes sense. Maybe something to show off in the release highlight to help others realise why copying the name is useful

jeremiedbb pushed a commit to jeremiedbb/scikit-learn that referenced this pull request May 30, 2025
Co-authored-by: Guillaume Lemaitre <guillaume@probabl.ai>
jeremiedbb pushed a commit that referenced this pull request Jun 5, 2025
Co-authored-by: Guillaume Lemaitre <guillaume@probabl.ai>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Display all parameter values in a tabular for a tab of the notebook HTML repr of estimators
7 participants