[MRG + 1] brier_score_loss returns incorrect value when all y_true values are True/1 #9301
Conversation
looks good. Can you please add an entry to the changelog in whatsnew?
Sure, thanks for reviewing it @amueller
doc/whats_new/v0.20.rst
Outdated
```diff
@@ -434,6 +434,10 @@ Metrics
     :issue:`9515` by :user:`Alan Liddell <aliddell>` and
     :user:`Manh Dao <manhdao>`.

+- Fixed a bug in: :func:`metrics.brier_score_loss`, when all `y_true` values are 1,
```
single backticks don't do anything. please use double backticks! (they are fine for issue and user below, but not on their own)
oops, will fix now!
sklearn/metrics/classification.py
Outdated
```diff
@@ -1951,7 +1952,7 @@ def brier_score_loss(y_true, y_prob, sample_weight=None, pos_label=None):

     pos_label : int or str, default=None
         Label of the positive class. If None, the maximum label is used as
-        positive class
+        positive class. If all values are 0/False, then 1 is used as pos_label.
```
maybe say "0 or False"
sure
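For context, here is a minimal sketch of the fallback being documented (an illustration, not code from the PR; the expected value follows directly from the Brier formula):

```python
from sklearn.metrics import brier_score_loss

# All-negative targets: the maximum label is 0, which would otherwise be
# treated as the positive class. With the documented fallback, pos_label
# becomes 1, so the probabilities are scored against class 1.
loss = brier_score_loss([0, 0, 0], [0.1, 0.2, 0.0])
# ((0 - 0.1)**2 + (0 - 0.2)**2 + (0 - 0.0)**2) / 3 ≈ 0.0167
```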
Sounds good, I'll look into it
```python
@ignore_warnings
def test_all_true_pos_label():
```
Thanks for this.
Should we be explicitly passing pos_label=1?
And I'm not sure about what we're saying about metrics where the second arg is a score or probability here...
Changed it to explicitly pass that.
I thought the test would be like the equivalent of check_classifiers_one_label but for those metrics.
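For illustration (this is not the PR's exact test body), the degenerate all-positive case should score perfectly once the fix is in and pos_label=1 is passed explicitly:

```python
import numpy as np
from sklearn.metrics import brier_score_loss, f1_score

all_ones = np.array([1, 1, 1])

# A perfect prediction of an all-positive target should be as good as on
# mixed labels: 1.0 for f1_score and a loss of 0.0 for brier_score_loss.
assert f1_score(all_ones, all_ones, pos_label=1) == 1.0
assert brier_score_loss(all_ones, np.ones(3), pos_label=1) == 0.0
```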
doc/whats_new/v0.20.rst
Outdated
```diff
@@ -444,6 +444,10 @@ Metrics
     :issue:`9515` by :user:`Alan Liddell <aliddell>` and
     :user:`Manh Dao <manhdao>`.

+- Fixed a bug in: :func:`metrics.brier_score_loss`, when all ``y_true`` values are 1,
```
Drop the first ":". Change "1," to "1." and start a new sentence.
will do
@jnothman thanks for taking a look before. I've made the changes suggested; please let me know if there is anything else you would like me to do.
Otherwise LGTM
sklearn/metrics/tests/test_common.py
Outdated
```python
all_ones = np.array([1, 1, 1])

for name in METRICS_WITH_POS_LABEL:
    if not name == 'roc_curve':
```
This should be !=
sklearn/metrics/tests/test_common.py
Outdated
```python
for name in METRICS_WITH_POS_LABEL:
    if not name == 'roc_curve':
        metric = ALL_METRICS[name]
        perfect_score = metric(examples, examples, pos_label=1)
```
metric(examples, examples) is not necessarily a perfect prediction score if the second argument is meant to be a score rather than a label. You've already seen that in ROC AUC. Is there some better way to characterise metrics for which a 0 score does not necessarily mean "predicting the negative class", so that we can give the excepted metrics some name (like how THRESHOLDED_METRICS is used elsewhere in this file)?
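A quick sketch of the roc_curve exception referred to above (an illustration under assumed behaviour, not code from the PR):

```python
import numpy as np
from sklearn.metrics import roc_curve

all_ones = np.array([1, 1, 1])

# roc_curve's second argument is a continuous score, and it returns three
# arrays rather than a scalar, so there is no single "perfect score" to
# compare against; with no negative samples, fpr is undefined as well.
fpr, tpr, thresholds = roc_curve(all_ones, all_ones, pos_label=1)
print(fpr)  # NaNs, with an UndefinedMetricWarning (no negative samples)
```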
Having a tough time thinking of a snappy name, but would something like METRICS_TARGET_VS_PREDICTION_WITH_POS_LABEL work? It would then encompass all the functions in METRICS_WITH_POS_LABEL except for roc_curve.
POSITIVE_SCORE_MEANS_POSITIVE_CLASS?
@jnothman thanks, have made that change
Now I'm just trying to work out how to express it in the negative. Perhaps just MAY_NOT_BE…
To have a separate list containing only…
I don't intuitively understand what those names mean. Could you please describe them?
Yes sure, so the second argument of … So in the case of …
Except that brier score takes a score?
oops, good point
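A rough illustration of the label-versus-score distinction being discussed (assumed behaviour, not code from the PR):

```python
import numpy as np
from sklearn.metrics import brier_score_loss, f1_score

y = np.array([2, 2, 2])

# For a label metric, y_pred == y_true is perfect whatever the labels are:
f1_score(y, y, pos_label=2)          # 1.0

# brier_score_loss's second argument is a probability, so reusing the
# labels only happens to work when they are valid probabilities:
brier_score_loss(y, y, pos_label=2)  # raises ValueError: y_prob contains
                                     # values greater than 1.
```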
what's the status of this?
I think the main issue here is the definition of …
Fixes #9300. Fixes #8459.

This is in reference to this bug report.

Some of the added tests would not pass using the current version of brier_score_loss. I have added a potential fix for the issue. The solution isn't too pretty, but the only alternative I can see is to change the label_binarizer function in a way that is probably undesirable.
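For reference, a minimal reproduction of the reported behaviour (as described in #9300; the before/after values assume this fix):

```python
import numpy as np
from sklearn.metrics import brier_score_loss

y_true = np.array([1, 1, 1, 1])
y_prob = np.array([1.0, 1.0, 1.0, 1.0])

# With only a single class present, the old code binarized y_true to all
# zeros, so a perfect prediction scored as the worst possible loss.
print(brier_score_loss(y_true, y_prob))  # 1.0 before the fix, 0.0 after
```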