ENH Add CalibrationDisplay plotting class #17443
Conversation
Thanks @lucyleeow, made a quick pass but looks good.
Regarding the histograms: since it's just a simple call to `plt.hist`, I think we should let users call that themselves and rely on the `prob_pred` attribute of the visualizer. Since this would be illustrated in the examples, it's fine IMO.
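For illustration, the user-side histogram could look like the sketch below. Only the `prob_pred` attribute discussed above is assumed; `FakeDisplay` is a made-up stand-in for the eventual display object:

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this runs in scripts
import matplotlib.pyplot as plt

# Hypothetical stand-in for the display object: the only assumption
# is that it stores the mean predicted probabilities in `prob_pred`.
class FakeDisplay:
    prob_pred = np.random.RandomState(0).uniform(size=100)

disp = FakeDisplay()

# The user calls plt.hist themselves, reusing the stored probabilities.
fig, ax = plt.subplots()
counts, bins, _ = ax.hist(disp.prob_pred, bins=10, range=(0, 1))
ax.set_xlabel("Mean predicted probability")
ax.set_ylabel("Count")
```

This keeps the display class small while still letting users style the histogram however they like.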
I'll post these comments now; I see this is still WIP.
```python
self.prob_pred = prob_pred
self.estimator_name = estimator_name

@_deprecate_positional_args
```
Do we need this deprecation, given that this is a new class and a new function?
Should I move the `*` to not allow any positional args? (A bit confused about this.)
Specifically here: `def plot(self, ax=None, *, name=None, ref_line=True, **kwargs):`, as `ax` arguably should/could be keyword-only.
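A minimal illustration of what the `*` marker does (toy function, not the actual method): everything after `*` must be passed by keyword, while parameters before it can still be positional.

```python
def plot(ax=None, *, name=None, ref_line=True, **kwargs):
    """Toy stand-in: parameters after `*` are keyword-only."""
    return ax, name, ref_line

# `ax` may still be passed positionally:
result = plot("my_ax", name="model")
assert result == ("my_ax", "model", True)

# ...but passing `name` positionally raises TypeError:
try:
    plot("my_ax", "model")
    rejected = False
except TypeError:
    rejected = True
assert rejected
```

So moving `ax` after the `*` would force callers to write `plot(ax=...)` explicitly.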
Thanks for the reviews guys! I would like to update the examples. Can I do the non-
I'm fine with doing this here so we can see it in action. Regarding the module, I'm not sure
Good point, it's not really a metric.
I'll wait a bit to see if there are any objections and if there are none, I'll move it there.
@NicolasHug would it be appropriate to put this inside
ping @glemaitre changes made, thanks! (and the CIs are greeeeen!)
LGTM apart from the small comment where I would check what failing message we get.
I will not merge right now. I would like to know if @ogrisel's +1 is still standing after the changes.
ping @ogrisel...?
@ogrisel Can we merge?
@adrinjalali I added this PR to the 1.0 milestone as I think this should really be in. Already +2 and just waiting for a final OK from @ogrisel, as there have been some changes since his approval.
Yes please!
There is now a failing test, not sure why. In `sklearn/calibration.py` at line 650:

```
    if y_pred.ndim != 1:  # `predict_proba`
        if y_pred.shape[1] != 2:
>           raise ValueError(classification_error)
E   ValueError: Expected 'estimator' to be a binary classifier, but got DecisionTreeClassifier
```

Beyond fixing the test failure, I think the error message should be improved to report the observed shape of `y_pred`.
The failure is due to the fact that we already improved the previous error message in another PR.
Another detail to fix below.
Other than that and the failing test, I confirm that this PR looks good to me. Thanks again @lucyleeow.
```python
f"{classification_error} fit on multiclass ({y_pred_shape} classes)"
" data"
```
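To see the message this f-string produces, here is a small hypothetical harness around the same check; the `check_binary_proba` helper, the error text, and the sample array are all made up for illustration:

```python
import numpy as np

classification_error = (
    "Expected 'estimator' to be a binary classifier, but got "
    "DecisionTreeClassifier"
)

def check_binary_proba(y_pred):
    # Hypothetical helper wrapping the check quoted above: for a 2-D
    # predict_proba output, require exactly two columns and report the
    # observed number of classes otherwise.
    if y_pred.ndim != 1:  # output of `predict_proba`
        if y_pred.shape[1] != 2:
            y_pred_shape = y_pred.shape[1]
            raise ValueError(
                f"{classification_error} fit on multiclass ({y_pred_shape} classes)"
                " data"
            )
    return y_pred

y_multi = np.zeros((5, 3))  # e.g. a 3-class predict_proba output
try:
    check_binary_proba(y_multi)
    message = ""
except ValueError as exc:
    message = str(exc)
```

Including the observed class count gives the user something concrete to act on.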
Happy to change this message if you had something different in mind @ogrisel.
Thanks, the message looks good!
Thank you @ogrisel @glemaitre, changes made!
Merged! Thank you very much @lucyleeow!
Thanks! I'm so happy! :D
Nice. Thanks @lucyleeow
Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>
Reference Issues/PRs
closes #8425
What does this implement/fix? Explain your changes.
Adds `CalibrationDisplay` for binary classifiers with visualization API.

Any other comments?
Plot currently looks like this:

Yet to add tests and update examples as I am unsure about the API:

Should a histogram be added? If so, should the histogram be a separate plot (e.g., here) or on the same plot (as suggested by @amueller: #8425 (comment))?
If we put it on the same plot I'm worried it will be too crowded. If we want 2 separate plots, the API becomes difficult, as we would need 2 different `**kwargs` parameters, one for each plot, so people can amend them separately. You also couldn't use `ax = plt.gca()` to get the current axes when you want to add lines to an existing plot (like for the current plots, e.g., here). I think you could use `CalibrationDisplay.ax_` though.
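A sketch of that last point, assuming the final class exposes its axes as an `ax_` attribute; `FakeDisplay` is hypothetical and only mimics that convention:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend
import matplotlib.pyplot as plt

class FakeDisplay:
    """Hypothetical stand-in that stores its axes on `ax_`."""

    def plot(self, ax=None):
        # Fall back to the current axes only at plot time...
        self.ax_ = ax if ax is not None else plt.gca()
        self.ax_.plot([0, 1], [0, 1], "k:", label="Perfectly calibrated")
        return self

disp = FakeDisplay().plot()

# ...so later additions can go through the stored axes, no plt.gca() needed:
disp.ax_.plot([0, 0.5, 1], [0, 0.4, 1], label="model")
n_lines = len(disp.ax_.lines)
```

Storing the axes on the display object means users can keep composing onto the same figure even after the current axes have changed.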