-
-
Notifications
You must be signed in to change notification settings - Fork 26k
ENH Allow multiple scorers input to permutation_importance #19411
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ENH Allow multiple scorers input to permutation_importance #19411
Conversation
@simonamaggio told me on a private channel that this is still work in progress: new tests are needed before reviewing. Also, it would be great to check the computational benefit of this approach on a small benchmark snippet to check that the caching mechanism works as expected. |
Edit: The above is probably wrong as discussed below. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We will need a test to check that we can provide multiple scorers. We should test all the supported combination and check the output. This would be a matter of parametrization using pytest
.
Then, we need to update the User Guide as well to illustrate how to use this feature. I don't think that we have an example (in examples/
folder) where we could use this feature.
Please add an entry to the change log at doc/whats_new/v*.rst
. Like the other entries there, please reference this pull request with :pr:
and credit yourself (and other contributors if applicable) with :user:
.
…nces FIX check that the code works with callable returning a dict with multiple metrics
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am happy with the change in the documentation. LGTM
@thomasjpfan @NicolasHug It could be nice to have your thoughts regarding the way to output the feature importances with multiple metrics. We went for a |
Sounds like the most sensible way, from a quick glance. What else did you have in mind? |
In |
Fair point. I would sacrifice a bit of consistency for the sake of usability and keep the current |
I would agree with the dict of bunches. On the topic of |
Maybe I'm missing something, but using a |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some more comments below:
Not sure it brings that much benefits as the list of keys is not static, so attribute based look-ups is not necessarily that natural. But not a strong opinion.
You can also access those results using the usual |
Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>
All CIs are green. Merging. Thank you @simonamaggio |
Scratch that, we did. My ctrl-f did not work as intended... |
Hehehe. This could have happen thought, it is quite common that I forgot it during the review :) |
I understand why my ctrl f was not working anymore in firefox: I had the match "Whole Words" option enabled without realizing it... |
Reference Issues/PRs
Fixes #18701
What does this implement/fix? Explain your changes.
Input argument
scoring
can be a list, tuple or dict with multiple metrics.The permutation feature importance is computed iteratively for the different metrics and
the result is a dict with
metric_name:Bunch
containing the feature importances, mean and std for each of the requested input metrics.