
[MRG] Improvement and bug fix for brier_score_loss #9562


Closed
wants to merge 12 commits into from

Conversation

qinhanmin2014
Member

@qinhanmin2014 qinhanmin2014 commented Aug 16, 2017

Reference Issue

Finish up #9521

What does this implement/fix? Explain your changes.

(1) raise an error for multiclass y_true
(2) raise an error for invalid pos_label
(3) fix an old bug when y_true only has one label (closes #8459, closes #9300, closes #9301)

Any other comments?

@qinhanmin2014 qinhanmin2014 changed the title [WIP] Improvement and bug fix for brier_score_loss [MRG] Improvement and bug fix for brier_score_loss Aug 16, 2017
if (np.array_equal(classes, [0]) or
        np.array_equal(classes, [-1]) or
        np.array_equal(classes, [1])):
    pos_label = 1.
Member

float?

Member

Is this treatment from log_loss? It seems a bit magical.

Member Author

@amueller Thanks.
Currently, log_loss doesn't support pos_label, so the code is taken from _binary_clf_curve. But in the tests we have to support situations where y_pred = ['egg', 'spam'] without a pos_label (see test_invariance_string_vs_numbers_labels). In order not to remove that test, I went with this awkward solution.
Btw, it might be OK not to use a float like _binary_clf_curve does, because np.array_equal([1.0], [1]) is True.
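
A quick check of that claim (just an illustration, not part of the PR):

```python
import numpy as np

# np.array_equal compares element values, not dtypes, so the float and
# integer representations of the same label compare equal.
print(np.array_equal([1.0], [1]))  # True
print(np.array_equal([0.0], [0]))  # True
```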

Member

I don't understand. That test uses pos_label, right?

@qinhanmin2014
Member Author

ping @jnothman @amueller Thanks. :)

@qinhanmin2014
Member Author

ping @jnothman @amueller This pull request is based on your suggestions in #9521; could you please review it? Thanks :)

@jnothman
Member

Patience... I'm mostly doing this on volunteer time, and Andy's had some presentations to give. Thanks @qinhanmin2014 for your very good work.

@qinhanmin2014
Member Author

@jnothman Thanks. Kindly give me some suggestions when you have time and I'll improve accordingly as soon as possible.

if len(classes) > 2:
    raise ValueError("Only binary classification is supported.")

# ensure valid y_true if pos_label is not specified
if pos_label is None:
Member

Why don't you instead put this logic in _check_binary_probabilistic_predictions? Surely it (if not pos_label, at least the handling of a single class) applies there too.

@qinhanmin2014
Member Author

@jnothman Thanks. _check_binary_probabilistic_predictions already has enough logic, so I think you're asking me to use the function directly. The previous implementation used the function after y_true was binarized (y_true = np.array(y_true == pos_label, int)). This is why we previously accepted multiclass classification. Now the PR fixes two bugs along with an improvement. I've updated the main description to provide more information.
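
A minimal sketch of why binarizing before the check silently accepted multiclass y_true (values are illustrative, not from the PR):

```python
import numpy as np

# Old order of operations: y_true is collapsed to {0, 1} *before* any
# validation, so a three-class target never reaches the binary check.
y_true = np.array([0, 1, 2, 2])
pos_label = 2
y_true_binarized = np.array(y_true == pos_label, int)
print(y_true_binarized)  # [0 0 1 1] -- looks binary to the downstream check
```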

_check_binary_probabilistic_predictions(y_true, y_prob)

# ensure valid y_true if pos_label is not specified
classes = np.unique(y_true)
Member

I meant that this logic could move to the helper, so that other binary probabilistic metrics would similarly identify -1 alone as the negative class, and to avoid running unique multiple times.

Member Author

@jnothman Sorry, but I don't quite understand. This part is about giving pos_label the right value and raising an error when needed. Currently, _check_binary_probabilistic_predictions doesn't accept pos_label as a parameter, so it might be hard to move the logic into _check_binary_probabilistic_predictions.

Member

After determining pos_label's appropriate value, it is used once: to binarise y_true into {0,1}. This is also what _check_binary_probabilistic_predictions does, only without the ability to configure pos_label. If _check_binary_probabilistic_predictions took an optional pos_label you could handle all cases.

Member Author

@jnothman Sorry to disturb again, but I have some questions:
(1) It seems that _check_binary_probabilistic_predictions is only used in brier_score_loss (metrics.classification) and calibration_curve (calibration). The latter does not support pos_label.
(2) I am wondering how we should take the optional pos_label, since even if we support pos_label, there are cases where pos_label is not provided by the user (it defaults to None). It seems that I cannot use def(..., pos_label=None), and it seems ugly to use something like def(..., **dict).
(3) From my perspective, since _check_binary_probabilistic_predictions is only used in two functions, and in each of them some part of it is meaningless (e.g., label_binarize in brier_score_loss and the check of y_prob in calibration_curve), how about removing the function (no test uses it directly)?

Member

And why can't you use def _check_binary_probabilistic_predictions(y_true, y_prob, pos_label=None)?

Member Author

@jnothman For example, in the function, if pos_label=None, there are two situations:
(1) I don't pass pos_label (used in calibration)
(2) I pass pos_label=None (used in brier_score_loss when pos_label is not provided)
How can I distinguish them? Thanks. :)
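
For reference, the usual Python idiom for telling "argument omitted" apart from "explicitly passed None" is a sentinel default. This is purely illustrative: the signature below is hypothetical, and the reply that follows questions whether the distinction is needed at all.

```python
# Hypothetical sketch, not the PR's code: a module-level sentinel lets a
# helper distinguish "caller never passed pos_label" from "pos_label=None".
_NOT_GIVEN = object()

def _check_binary_probabilistic_predictions(y_true, y_prob,
                                            pos_label=_NOT_GIVEN):
    if pos_label is _NOT_GIVEN:
        pass  # caller (e.g. calibration_curve) does not expose pos_label
    elif pos_label is None:
        pass  # caller exposes pos_label but the user left it at the default
```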

Member

Why do you need to distinguish them? Should they not be treated the same?

@qinhanmin2014 qinhanmin2014 changed the title [MRG] Improvement and bug fix for brier_score_loss [MRG] Raise an error for multiclass y_true in brier_score_loss Aug 30, 2017
@qinhanmin2014
Member Author

@jnothman @amueller
Sorry for wasting your time, but I think fixing too many bugs in one pull request made things a bit confusing. I searched scikit-learn and found other pull requests (#8459, #9301) addressing the bug related to a single class in brier_score_loss, so I have made this pull request more focused. If needed, I'll open more pull requests for the rest of my work.

@@ -1912,8 +1912,10 @@ def brier_score_loss(y_true, y_prob, sample_weight=None, pos_label=None):
    assert_all_finite(y_true)
    assert_all_finite(y_prob)

    # currently, we only support binary classification
    _check_binary_probabilistic_predictions(y_true, y_prob)

    if pos_label is None:
Member
@jnothman jnothman Sep 4, 2017

I think these two lines are unnecessary if you just set y_true to the output of _check_binary_probabilistic_predictions

Member Author

@jnothman Sorry, but I don't understand what you mean. Which two lines? The core issue I want to solve here is that the previous implementation used the function after y_true was binarized. Could you provide more details please? Thanks.

Member

Right, I realise again that you're not using _check_binary_probabilistic_predictions because it does not support pos_label. But it probably should.

My issue is that _check_binary_probabilistic_predictions does processing and returns something. It's weird not to use that return value.

I think I would add to _check_binary_probabilistic_predictions:

out = label_binarize...
if pos_label is not None:
    if pos_label != labels[-1]:
        if pos_label != labels[0]:
            raise ValueError...
        out = 1 - out
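
For concreteness, a fleshed-out reading of that sketch (assuming labels = np.unique(y_true) and scikit-learn's label_binarize; single-class handling and the y_prob range checks are omitted, and this is illustrative rather than the code the PR adopted):

```python
import numpy as np
from sklearn.preprocessing import label_binarize

def _check_binary_probabilistic_predictions(y_true, y_prob, pos_label=None):
    labels = np.unique(y_true)
    if len(labels) > 2:
        raise ValueError("Only binary classification is supported. "
                         "Provided labels %s." % labels)
    # label_binarize maps labels[0] to 0 and labels[-1] to 1
    out = label_binarize(y_true, classes=labels)[:, 0]
    if pos_label is not None:
        if pos_label != labels[-1]:
            if pos_label != labels[0]:
                raise ValueError("pos_label=%r is not a valid label: %s"
                                 % (pos_label, labels))
            # the requested positive class is labels[0]: flip the encoding
            out = 1 - out
    return out
```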

@qinhanmin2014
Member Author

@jnothman Thanks a lot for your detailed suggestions :)
Now the pull request
(1) raise an error for multiclass y_true
(2) raise an error for invalid pos_label
(3) fix an old bug when y_true only has one label (#8459, #9300, #9301)

@qinhanmin2014
Member Author

ping @jnothman only for the existence of the old PR


@qinhanmin2014 qinhanmin2014 changed the title [MRG] Raise an error for multiclass y_true in brier_score_loss [MRG] Improve and bug fix for brier_score_loss Oct 15, 2017
@qinhanmin2014 qinhanmin2014 changed the title [MRG] Improve and bug fix for brier_score_loss [MRG] Improvement and bug fix for brier_score_loss Oct 15, 2017
@qinhanmin2014
Member Author

ping @jnothman, it's been over a month since the last reply. Could you please give it a further review? Thanks a lot :)

@jnothman
Member

jnothman commented Oct 17, 2017 via email

@qinhanmin2014
Member Author

@jnothman Thanks a lot :) I'll wait.

@qinhanmin2014
Member Author

Temporarily closing to consider the solution. Will split/reopen the PR if necessary.

amueller pushed a commit that referenced this pull request Jul 16, 2018
…ision_score (#9980)


#### Reference Issues/PRs
part of #9829

#### What does this implement/fix? Explain your changes.
(1) add a pos_label parameter to average_precision_score (although we finally decided not to introduce pos_label in roc_auc_score, I think we need pos_label here, because there is no relationship between the results if we reverse the true labels; also, precision/recall all support pos_label)
(2) fix a bug where average_precision_score sometimes returns nan when sample_weight contains 0
```python
y_true = np.array([0, 0, 0, 1, 1, 1])
y_score = np.array([0.1, 0.4, 0.85, 0.35, 0.8, 0.9])
average_precision_score(y_true, y_score, sample_weight=[1, 1, 0, 1, 1, 0])
# output: nan
```
I fix it here because of (3).
(3) move the average_precision scores out of METRIC_UNDEFINED_BINARY (this should contain the regression tests for (1) and (2))

Some comments:
(1) For the underlying function (precision_recall_curve), the default value of pos_label is None, but I chose to set the default value of pos_label to 1 because this is what P/R/F does. What's more, the meaning of pos_label=None is not clear even in scikit-learn itself (see #10010); see the short illustration after these comments.
(2) I slightly modified the common test. Currently, the part I modified is only designed for brier_score_loss (I'm doing the same thing in #9562). I think this is right because, as a common test, it seems wrong to force metrics to accept str y_true without pos_label.
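
For context, a small usage illustration of the pos_label parameter this commit adds to average_precision_score (illustrative values, not code from the PR):

```python
import numpy as np
from sklearn.metrics import average_precision_score

y_true = np.array(['egg', 'spam', 'egg', 'spam'])
y_score = np.array([0.2, 0.9, 0.4, 0.6])

# With string labels the positive class has to be named explicitly,
# since the default pos_label=1 does not appear in y_true.
print(average_precision_score(y_true, y_score, pos_label='spam'))
```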

#### Any other comments?
cc @jnothman Could you please take some time to review or at least judge whether this is the right way to go? Thanks a lot :) 

@qinhanmin2014 qinhanmin2014 deleted the my-feature-2 branch May 5, 2019 05:08
Development

Successfully merging this pull request may close these issues.

brier_score_loss returns incorrect value when all y_true values are True/1
3 participants