-
-
Notifications
You must be signed in to change notification settings - Fork 25.8k
MNT Improve error message with implicit pos_label in brier_score_loss #15412
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
MNT Improve error message with implicit pos_label in brier_score_loss #15412
Conversation
@@ -2180,6 +2180,16 @@ def test_brier_score_loss(): | |||
assert_almost_equal( | |||
brier_score_loss(['foo'], [0.4], pos_label='foo'), 0.36) | |||
|
|||
# correctly infer pos_label |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be tested in test_common.py?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess not, because we only infer pos_label in this way in brier_score_loss.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this need what's new
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should modify the documentation to be more specific with string regarding the inference.
sklearn/metrics/_classification.py
Outdated
@@ -2486,6 +2486,6 @@ def brier_score_loss(y_true, y_prob, sample_weight=None, pos_label=None): | |||
np.array_equal(labels, [-1])): | |||
pos_label = 1 | |||
else: | |||
pos_label = y_true.max() | |||
pos_label = labels[-1] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to update comment on the top of the if statemtent.
We need also to improve the documentation to be more specific in the docstring.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the comment is still valid and we already noted down the expected behaviour: Defaults to the greater label unless y_true is all 0 or all -1 in which case pos_label defaults to 1.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What I meant is to move this comment in the first branch (if np.array_equal ...
) of the if and add the behavior in the second one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the comment is OK since "otherwise pos_label is set to the greater label" is still true here . But yeah the docstring could be clarified to say that strings are also ordered by lexicographic order (which makes absolutely no sense to me)
Co-Authored-By: Guillaume Lemaitre <g.lemaitre58@gmail.com>
should we deprecate assert_almost_equal? |
This is still a undeprecated numpy function so I would say no. |
This seems confusing, which one should we encourage contributors to use? |
We encourage to use pytest. Only when we want to check something on array,
we us `assert_***`.
Deprecating will be annoying because we need to change our tests. This is a
similar problem to the PEP8 issue.
…On Thu, 7 Nov 2019 at 15:32, Hanmin Qin ***@***.***> wrote:
This is still a undeprecated numpy function so I would say no.
This seems confusing, which one should we encourage contributors to use?
—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub
<#15412?email_source=notifications&email_token=ABY32P7YEQJVV5UN2X7GSELQSQRJRA5CNFSM4JHD2DOKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEDMS4BQ#issuecomment-551104006>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/ABY32P73J4L6KGFV3DAMXG3QSQRJRANCNFSM4JHD2DOA>
.
--
Guillaume Lemaitre
Scikit-learn @ Inria Foundation
https://glemaitre.github.io/
|
Actually we've already deprecated lots of things, e.g., assert_equal, assert_not_equal. |
`assert_array_equal`, `assert_allclose`, ... these functions would work on
arrays while we do not use the numpy assert for scaler `assert_equal`,
`assert_greater`, ...
…On Thu, 7 Nov 2019 at 15:45, Hanmin Qin ***@***.***> wrote:
We encourage to use pytest. Only when we want to check something on array,
we us assert_***.
Actually we've already deprecated lots of things, e.g., assert_equal,
assert_not_equal.
And I can't understand what do you mean by "check something on array"
—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub
<#15412?email_source=notifications&email_token=ABY32P457CM2VGH5LD42NFDQSQSXHA5CNFSM4JHD2DOKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEDMUDSY#issuecomment-551109067>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/ABY32P4FSRAGQ43T6WVW26TQSQSXHANCNFSM4JHD2DOA>
.
--
Guillaume Lemaitre
Scikit-learn @ Inria Foundation
https://glemaitre.github.io/
|
thanks, that's strange but I'm now able to understand. Is it the final decision in scikit-learn? |
I think that #14222 was
going towards this direction.
…On Thu, 7 Nov 2019 at 16:03, Hanmin Qin ***@***.***> wrote:
assert_array_equal, assert_allclose, ... these functions would work on
arrays while we do not use the numpy assert for scaler assert_equal,
assert_greater, ...
thanks, that's strange but I'm now able to understand. Is it the final
decision in scikit-learn?
—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub
<#15412?email_source=notifications&email_token=ABY32P43OH4M2RT26DMVDWTQSQU2JA5CNFSM4JHD2DOKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEDMWAFY#issuecomment-551116823>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/ABY32P5CZQS4A66X2PDOYJTQSQU2JANCNFSM4JHD2DOA>
.
--
Guillaume Lemaitre
Scikit-learn @ Inria Foundation
https://glemaitre.github.io/
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry if I'm misunderstanding something, but I think we should stop treating strings as something that can be ordered.
I think that 'pos_label' should be specified if strings are passed. There is no sensible way to infer the positive labels when targets are strings (the lexicographic order is, IMHO, completely arbitrary).
doc/whats_new/v0.22.rst
Outdated
@@ -604,6 +604,10 @@ Changelog | |||
used as the :term:`scoring` parameter of model-selection tools. | |||
:pr:`14417` by `Thomas Fan`_. | |||
|
|||
- |Fix| Fixed a bug where :func:`metrics.brier_score_loss` will raise an error |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- |Fix| Fixed a bug where :func:`metrics.brier_score_loss` will raise an error | |
- |Fix| Fixed a bug where :func:`metrics.brier_score_loss` would raise an error |
sklearn/metrics/_classification.py
Outdated
@@ -2486,6 +2486,6 @@ def brier_score_loss(y_true, y_prob, sample_weight=None, pos_label=None): | |||
np.array_equal(labels, [-1])): | |||
pos_label = 1 | |||
else: | |||
pos_label = y_true.max() | |||
pos_label = labels[-1] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the comment is OK since "otherwise pos_label is set to the greater label" is still true here . But yeah the docstring could be clarified to say that strings are also ordered by lexicographic order (which makes absolutely no sense to me)
Yes, we've already deprecated things like assert_equal because of pytest, so I want to ask whether we should also deprecate assert_almost_equal.
Actually we do so in plot_roc_curve and plot_precision_recall_curve :) |
pos_label=1. is going to be the cause of some weird issue when |
I agree. I am not sure we should merge this PR as it is because we don't have a consensus. For now I think it's fine to raise an error when the users use string labels and the brier score together without explicit pos_label, as long as the error message is explicit enough. |
@ogrisel the aim of this PR is to fix bugs of current behaviour, not to introduce new bahaviour. |
I agree with @ogrisel suggestion. This is not a new behavior this is a bug fix. |
But "This is not a new behavior this is a bug fix." is actually my suggestion? |
No the bugfix is to raise a proper error, which is also what I proposed |
Although it's dealing with scorers rather than underlying metrics, the usability issues could be partially addressed by a solution to #12385, since this "scorer builder" could construct a Brier scorer for each class, and only consider a single one positive if requested explicitly by the user.... That tool would explicitly have the job of "describe your task, and I'll give you appropriate scorers" |
Yes it fixes the incorrect inference of pos_label for strings. It made no sense before, now we error with a proper error message. This is a changed behavior (as a bug fix) and it's worth a what's new. |
We also raise an error before, this PR only improves the error message, do you still want a what's new? @NicolasHug |
right, sorry no need for a whatsnew |
ping @NicolasHug @ogrisel let's merge? The doc is wrong so prehaps we should put this into 0.22. |
Why put back the 'O' check? |
@NicolasHug |
@ogrisel what about #15412 (comment) |
removing from the milestone, happy for it to be back when we get back to it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Otherwise lgtm
sklearn/metrics/_classification.py
Outdated
if pos_label is None: | ||
if (np.array_equal(labels, [0]) or | ||
if labels.dtype.kind in ('O', 'U', 'S'): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if labels.dtype.kind in ('O', 'U', 'S'): | |
if any(isinstance(label, str) for label in labels): |
assert score1 == pytest.approx(score2) | ||
|
||
# positive class if correctly inferred an object array with all ints | ||
y_pred_num_obj = np.array([0, 1, 1, 0], dtype=object) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test case was added which was enabled by #15412 (comment) and with this fb199bd diff
@lucyleeow @glemaitre, another interesting PR about |
Seems that people often specify pos_label when y_true is str :)