-
-
Notifications
You must be signed in to change notification settings - Fork 25.8k
[MRG] EHN Provide a pos_label parameter in plot_roc_curve #17651
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Having the refactoring is nice. I would like to have the other 2 PRs in first thought.
Could you add en entry in whats new in 0.24
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the refactor of _get_target_scores
is nice.
I am +0 on the refactor into a new base class. Is there another visualization we would like to add that would use this new base class?
sklearn/metrics/_plot/base.py
Outdated
estimator_name : str, default=None | ||
Name of estimator. If None, then the estimator name is not shown. | ||
|
||
pos_label : str or int, default=None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Having pos_label
here makes this base class less generic. We would only be able to use this to plot curves related to classification.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
are there any plot classes you believe should use the CurveDisplay
class?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@thomasjpfan Do we have curves for regression?
I assume that we could make a trick to find what type of data we handle since we have y
and to know if this is a classification problem. If this is not, we can discard pos_label
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll be happy with renaming it to ClassificationCurveDisplay
I forgot I had this comment still here.
Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com>
Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com>
Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com>
@claramatos The 2 other PRs have been merged so we can go forward here. You should resolve the conflict and once done, I will look at the tests. |
Sorry I had to revert #17594 that I merged too quickly (see the discussion at the end for the details). |
Since #17569 is closed can this be closed to? (I've already merged the most recent master version) @glemaitre |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think PR is doing three things.
- Refactor into
CurveDisplay
, which I am concerned with. I am thinking of how this plays out when we want to support multiclass with many curves. I didn't do this type of refactor at first because I wanted the ROC and PR curve to evolve independently of each other. - Refactor into
_get_target_scores
, which I am happy with. - Add
pos_label
intoplot_roc_curve
, which I am happy with.
@claramatos Can we make this PR only about 2 and 3 and have 1 for a follow up PR?
sklearn/metrics/_plot/base.py
Outdated
estimator_name : str, default=None | ||
Name of estimator. If None, then the estimator name is not shown. | ||
|
||
pos_label : str or int, default=None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll be happy with renaming it to ClassificationCurveDisplay
I forgot I had this comment still here.
So your suggestion would be to remove the |
Sorry for being unclear. I am suggesting to remove |
@thomasjpfan I followed your suggestion and removed |
sklearn/metrics/_plot/base.py
Outdated
@@ -38,3 +43,72 @@ def _check_classifier_response_method(estimator, response_method): | |||
estimator.__class__.__name__)) | |||
|
|||
return prediction_method | |||
|
|||
|
|||
def _get_target_scores(X, estimator, response_method, pos_label=None): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using the word "scores" here may get confused with metrics and scorers. Maybe _get_response
is more clear?
Co-authored-by: Thomas J. Fan <thomasjpfan@gmail.com>
Co-authored-by: Thomas J. Fan <thomasjpfan@gmail.com>
Co-authored-by: Thomas J. Fan <thomasjpfan@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM Thank you @claramatos !
Actually this is fine since we don't compute the ROC-AUC via Thanks @claramatos |
Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com> Co-authored-by: Thomas J. Fan <thomasjpfan@gmail.com>
Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com> Co-authored-by: Thomas J. Fan <thomasjpfan@gmail.com>
closes #15573
relates with #17569
Add a pos_label parameter to specify which class to be the positive class when
estimator.classes_[1]
is not the right choice when usingplot_roc_curve
.