[MRG] Fixed brier_score_loss function when y_true only has one label #8459

Closed · jorkro wants to merge 1 commit

Conversation


jorkro commented Feb 25, 2017

For issue #6980 a fix was introduced that allows the brier_score_loss function to accept y_true with a single label. Test cases were also added, but only for the scenario where the predicted probability is 0.5. That value masks the bug: a probability of 0.5 yields the same Brier score whether the outcome is 0 or 1, while any other probability produces an incorrect score in this scenario. Because brier_score_loss indirectly calls label_binarize, a y_true with a single label is converted to an all-zero array, no matter which positive label is set:

brier_score_loss([1], [0.4], pos_label=1) produces 0.16, because [1] is converted to [0]. The correct answer should have been 0.36.
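For reference, the Brier score is the mean squared difference between the predicted probability and the 0/1-encoded outcome, so both values can be checked by hand:

```python
import numpy as np

y_prob = np.array([0.4])

# Outcome correctly encoded as the positive class (1): (1 - 0.4) ** 2
print(np.mean((1 - y_prob) ** 2))  # 0.36, the correct score

# After label_binarize collapses [1] to [0]: (0 - 0.4) ** 2
print(np.mean((0 - y_prob) ** 2))  # 0.16, the buggy score
```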

The fix contains a small modification that allows binary arrays to pass through (so it skips label_binarize). Test cases have also been added.
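Roughly, the pass-through amounts to something like this (a sketch of the idea only, not the exact diff; the helper name is made up for illustration):

```python
import numpy as np
from sklearn.preprocessing import label_binarize

def _binarize_unless_already_binary(y_true):
    # Sketch (binary path only): if y_true already contains only 0s and
    # 1s, leave it untouched; running label_binarize on a single-label
    # y_true would otherwise collapse everything to 0.
    labels = np.unique(y_true)
    if set(labels) <= {0, 1}:
        return np.asarray(y_true)
    return label_binarize(y_true, classes=labels)[:, 0]
```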

Let me know if anything else is required.

Commit: Fixed brier_score_loss function when y_true only has one label by not letting label_binarize convert 1s to 0s.

codecov bot commented Feb 25, 2017

Codecov Report

Merging #8459 into master will decrease coverage by 0.01%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master    #8459      +/-   ##
==========================================
- Coverage   95.47%   95.47%   -0.01%     
==========================================
  Files         342      342              
  Lines       60907    60911       +4     
==========================================
+ Hits        58154    58157       +3     
- Misses       2753     2754       +1
Impacted Files                                  Coverage Δ
sklearn/metrics/classification.py               97.51% <100%> (-0.27%)
sklearn/metrics/tests/test_classification.py    100% <100%> (ø)

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data


jorkro commented Feb 26, 2017

Hello,

I see that code coverage decreased slightly with this fix. The reason I made the pass-through conditional is that _check_binary_probabilistic_predictions is also used by calibration_curve. brier_score_loss guarantees that all data is binary, via np.array(y_true == pos_label, int); calibration_curve does not.

I see 3 approaches here:

  1. I leave it as-is.
  2. I add non-binary test cases for calibration_curve so coverage is 100% again. However, I suspect that's not its intended use.
  3. I modify _check_binary_probabilistic_predictions so that it raises an error when the data is not binary, and skips the label_binarize step when it is (a rough sketch follows below).

Let me know which approach you prefer.
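For reference, a minimal sketch of what option 3 might look like (the exact check and error message are assumptions, not merged code):

```python
import numpy as np

def _check_binary_probabilistic_predictions(y_true, y_prob):
    # Sketch of option 3: validate rather than binarize. Callers such as
    # calibration_curve would get an explicit error for non-binary input
    # instead of a silent label_binarize conversion.
    labels = np.unique(y_true)
    if not set(labels) <= {0, 1}:
        raise ValueError(
            "Only binary classification is supported, got labels %r"
            % list(labels))
    return np.asarray(y_true)
```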

@@ -1463,5 +1463,7 @@ def test_brier_score_loss():
assert_raises(ValueError, brier_score_loss, y_true, y_pred + 1.)
assert_raises(ValueError, brier_score_loss, y_true, y_pred - 1.)
# calculate even if only single class in y_true (#6980)
assert_almost_equal(brier_score_loss([0], [0.5]), 0.25)
nelson-liu (Contributor) commented:

Sorry if I'm being slow, but is there any reason to remove these?

jorkro (Author) replied:

@nelson-liu Considering the other tests that I had added, I felt these were redundant. I can add them back if you want.

jnothman (Member) left a comment:

@qinhanmin2014 could you double check that this is the right solution?

qinhanmin2014 (Member) commented:

@jnothman
It's not completely right, since it does not consider -1 (it only considers 0 and 1).
I might be -1 on merging this. We lack some basic input validation in brier_score_loss (see #9562). Once we add those checks, I'm afraid we will have to rewrite this code.
I think the current problem is how to define pos_label=None (see #10010). Once we've made that decision, I'd prefer something like #9562 that solves all the known problems in brier_score_loss, to avoid unnecessary work.

jnothman (Member) commented:

@qinhanmin2014 wrote an age ago:

    It's not completely right since it does not consider -1 (it only considers 0 and 1).

That's not actually true, since y_true is binarized in brier_score_loss before being passed to _check_binary_probabilistic_predictions, which is modified here.

We should add tests with pos_label=1 where the negative class is represented as -1, and with pos_label='yes' where the negative class is represented as 'no', just to be sure.
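Something along these lines, for instance (a sketch of what such tests might look like; the expected values follow from the Brier definition):

```python
from numpy.testing import assert_almost_equal
from sklearn.metrics import brier_score_loss

# Negative class encoded as -1, positive as 1:
# ((0 - 0.4)**2 + (1 - 0.6)**2) / 2 = 0.16
assert_almost_equal(
    brier_score_loss([-1, 1], [0.4, 0.6], pos_label=1), 0.16)

# String labels: negative 'no', positive 'yes'.
assert_almost_equal(
    brier_score_loss(['no', 'yes'], [0.4, 0.6], pos_label='yes'), 0.16)
```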

Otherwise this LGTM

jnothman added this to the 0.21 milestone Apr 10, 2019
qinhanmin2014 (Member) commented:

    @qinhanmin2014 wrote an age ago:

        It's not completely right since it does not consider -1 (it only considers 0 and 1).

    That's not actually true, since y_true is binarized in brier_score_loss before being passed to _check_binary_probabilistic_predictions, which is modified here.

Sigh, I feel that I've again wasted a whole year.
@jnothman I mean that when y_true only contains -1, all of the samples will be treated as the positive class (they should be treated as the negative class).
I think a reasonable solution here would be to copy the definition of pos_label=None from precision_recall_curve/roc_curve and deprecate pos_label=None in 0.22. WDYT?
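Concretely, the -1-only case can be reproduced like this (a hypothetical illustration, assuming the fallback pos_label = y_true.max() that brier_score_loss used when pos_label is None):

```python
import numpy as np

# With pos_label=None, the fallback picks pos_label = y_true.max(),
# so a y_true containing only -1 makes -1 the positive class.
y_true = np.array([-1])
pos_label = y_true.max()                 # -1: the only label becomes "positive"
y_bin = (y_true == pos_label).astype(int)
print(y_bin)                             # [1], but [0] was intended
print(np.mean((y_bin - 0.4) ** 2))       # 0.36 instead of the intended 0.16
```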


jnothman (Member) commented Apr 10, 2019 via email:

    What do you mean by changing the definition but also deprecating? I would be happy to just deprecate the current behaviour towards doing the same as ROC.

qinhanmin2014 (Member) commented:

    What do you mean by changing the definition but also deprecating? I would be happy to just deprecate the current behaviour towards doing the same as ROC.

I mean first change the definition of the default in 0.21 (this will be a small PR which might be easier to review), then deprecate pos_label=None in 0.22.
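For concreteness, the convention being referenced is the one in _binary_clf_curve (the helper behind precision_recall_curve); a rough sketch, with names simplified for illustration:

```python
import numpy as np

def _resolve_pos_label(y_true, pos_label=None):
    # Rough sketch of the precision_recall_curve convention: the default
    # pos_label may only be inferred (as 1) when the labels are drawn
    # from {0, 1} or {-1, 1}; otherwise pos_label must be given.
    classes = np.unique(y_true)
    allowed = ([0, 1], [-1, 1], [0], [-1], [1])
    if pos_label is None:
        if not any(np.array_equal(classes, a) for a in allowed):
            raise ValueError("y_true takes values in %r; pos_label must "
                             "be specified explicitly" % list(classes))
        pos_label = 1
    return pos_label
```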


jnothman (Member) commented Apr 11, 2019 via email:

    I don't know if we should deprecate pos_label=None, but we could warn and then raise an error if y_true is constant...?

qinhanmin2014 (Member) commented:

    I don't know if we should deprecate pos_label=None

I'm aware that this might be controversial, which is why I propose to do it in another PR.

    but we could warn and then raise an error if y_true is constant...?

Why? A constant y_true is still a valid option.

I'll open a PR for you to review tomorrow evening.
