brier_score_loss returns incorrect value when all y_true values are True/1 #9300

gnsiva · 2017-07-08T11:17:54Z

In [7]: brier_score_loss(np.array([1, 1, 1]), np.array([1, 1, 1]))
Out[7]: 1.0

In [8]: brier_score_loss(np.array([0, 0, 0]), np.array([0, 0, 0]))
Out[8]: 0.0

In [9]: brier_score_loss(np.array([True, True, True]), np.array([1, 1, 1]))
Out[9]: 1.0

In [10]: brier_score_loss(np.array(["foo", "foo", "foo"]), np.array([1, 1, 1]), pos_label="foo")
Out[10]: 1.0

In every one of these cases the output should be 0, since y_pred predicts y_true exactly. Only the all-zeros call (Out[8]) returns the correct value.
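For reference, the expected values can be checked against a direct NumPy computation of the Brier score (the mean squared difference between the 0/1 outcomes and the predicted probabilities). This is only a minimal sketch: `reference_brier_score` is a hypothetical helper written for this illustration, not scikit-learn's implementation, and it assumes labels are mapped to 0/1 by a plain comparison with `pos_label`.

```python
import numpy as np


def reference_brier_score(y_true, y_prob, pos_label=1):
    """Mean squared difference between 0/1 outcomes and predicted probabilities.

    Hypothetical helper for illustration only; not part of scikit-learn.
    """
    # Map labels to 1.0 where they equal pos_label, else 0.0.
    # No label_binarize call is involved, so a single-class y_true is fine.
    outcomes = (np.asarray(y_true) == pos_label).astype(float)
    probs = np.asarray(y_prob, dtype=float)
    return float(np.mean((outcomes - probs) ** 2))


# The four calls from the report, all of which should score 0.
all_ones = reference_brier_score([1, 1, 1], [1, 1, 1])
all_zeros = reference_brier_score([0, 0, 0], [0, 0, 0])
all_true = reference_brier_score([True, True, True], [1, 1, 1])
all_foo = reference_brier_score(["foo", "foo", "foo"], [1, 1, 1], pos_label="foo")
print(all_ones, all_zeros, all_true, all_foo)
```

Under these assumptions each call evaluates to 0.0, which is the value the report expects from brier_score_loss.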

The function calls are brier_score_loss -> _check_binary_probabilistic_predictions -> label_binarize, with the issue starting in label_binarize.

brier_score_loss has a pos_label parameter, but setting it to 1 does not fix the issue. The function overwrites y_true with True and False based on pos_label, then calls _check_binary_probabilistic_predictions to check the y_true and y_pred values. That helper returns the output of label_binarize(y_true, labels), where labels are the unique values in y_true. Manually setting labels to [0, 1] for the case where y_true is all 1s does not fix the issue.

In label_binarize, if there is only one class, the returned array is all zeros plus the negative label:

    if y_type == "binary":
        if n_classes == 1:
            if sparse_output:
                return sp.csr_matrix((n_samples, 1), dtype=int)
            else:
                Y = np.zeros((len(y), 1), dtype=np.int)
                Y += neg_label
                return Y

The resulting comparison in brier_score_loss, np.average((y_true - y_prob) ** 2, weights=sample_weight), then sees y_true as all 0s instead of all 1s, hence the incorrect score.

Operating system: Ubuntu 17.04 64-bit. The problem is present in master, 0.18.1 and 0.18.2.
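The mechanism can be reproduced without scikit-learn. The sketch below is an illustration only: `binarize_single_class` is a hypothetical stand-in that mimics the dense `n_classes == 1` branch quoted above, and the rest is the same `np.average` expression used for the score.

```python
import numpy as np


def binarize_single_class(y, neg_label=0):
    """Hypothetical stand-in for the quoted n_classes == 1 branch.

    Every sample is mapped to neg_label, regardless of its actual label.
    """
    Y = np.zeros((len(y), 1), dtype=int)
    Y += neg_label
    return Y


y_true = np.array([1, 1, 1])
y_prob = np.array([1.0, 1.0, 1.0])

# What the score computation receives after binarization: all zeros.
binarized = binarize_single_class(y_true)[:, 0]

# Score computed from the binarized labels versus the original labels.
buggy_score = np.average((binarized - y_prob) ** 2)
expected_score = np.average((y_true - y_prob) ** 2)
print(buggy_score, expected_score)
```

With the labels collapsed to 0, every squared difference is (0 - 1) ** 2 = 1, so the average is 1.0 rather than the expected 0.0.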


gnsiva · 2017-07-08T11:42:34Z

I have made a tentative pull request here.

Erotemic · 2018-04-05T19:17:59Z

I'm still experiencing this issue in 0.19.1. PR #9980 seems to address this, but #9300 and #9980 do not reference each other. This comment should fix that.

gnsiva changed the title ~~brier_score_loss returns incorrect value when all y_true values are True/1~~ brier_score_loss returns incorrect value when all y_true values are True/1 Jul 8, 2017

This was referenced Jul 8, 2017

Brier score loss bug gnsiva/scikit-learn#1

Closed

[MRG + 1] brier_score_loss returns incorrect value when all y_true values are True/1 #9301

Closed

jnothman added the Bug label Jul 8, 2017

qinhanmin2014 mentioned this issue Sep 10, 2017

[MRG] Improvement and bug fix for brier_score_loss #9562

Closed

jnothman mentioned this issue Jun 12, 2018

brier_score_loss error #11245

Closed

qinhanmin2014 mentioned this issue Apr 12, 2019

[MRG] FIX Correct brier_score_loss when there's only one class in y_true #13628

Merged

glemaitre closed this as completed in #13628 Apr 26, 2019
