Description
In the documentation, section 3.3.1.1, "Common cases: predefined values", includes the remark:
"All scorer objects follow the convention that higher return values are better than lower return values."
As far as I can tell, this is true for all of the listed metrics except brier_score_loss. In the case of brier_score_loss, a lower loss value is better. This is because brier_score_loss measures the mean squared difference between a predicted probability and a categorical outcome. The Brier score is minimized at 0.0 when the model makes perfect predictions, because every summand is either (0 - 0)^2 = 0 or (1 - 1)^2 = 0. On the other hand, the Brier score is maximized at 1.0 when all predictions are opposite the correct label, as every summand is either (0 - 1)^2 = 1 or (1 - 0)^2 = 1.
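The two extremes can be checked with a small hand-written sketch. The helper `brier_score` below is a hypothetical plain-Python stand-in written for illustration, not scikit-learn's implementation, and it assumes binary 0/1 labels.

```python
def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probabilities and 0/1 outcomes.

    Hypothetical helper for illustration only; assumes binary labels.
    """
    if len(y_true) != len(y_prob):
        raise ValueError("y_true and y_prob must have the same length")
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


y_true = [0, 1, 1, 0]

# Perfect predictions: every summand is (0 - 0)^2 or (1 - 1)^2.
perfect = brier_score(y_true, [0.0, 1.0, 1.0, 0.0])

# Predictions opposite the correct label: every summand is (0 - 1)^2 or (1 - 0)^2.
worst = brier_score(y_true, [1.0, 0.0, 0.0, 1.0])

# Constant 0.5 predictions sit in between: every summand is 0.25.
uninformed = brier_score(y_true, [0.5, 0.5, 0.5, 0.5])

print(perfect, uninformed, worst)
```

The better the predictions, the smaller the value, which is the opposite of the "higher is better" convention.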
Therefore, the definition of brier_score_loss is not consistent with the quotation from section 3.3.1.1.
I suggest making 2 changes to relieve this confusion.
- Implement a function neg_brier_score_loss which simply negates the value of brier_score_loss; this is directly analogous to what is done in the case of neg_log_loss. A better model has a lower value of log-loss (categorical cross-entropy loss), therefore a larger value of the negative log-loss implies a better model. Naturally, the same is true for the Brier score, where a better model is also assigned a lower loss.
- Remove the reference to brier_score_loss from section 3.3.1.1. The Brier score is useful in lots of ways; however, because it does not have the property that a larger value implies a better model, it seems confusing to mention it in the context of section 3.3.1.1. References to brier_score_loss can be replaced with neg_brier_score_loss, which has the property that better models have larger values, just like accuracy, ROC AUC and the rest of the listed metrics.
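A minimal sketch of the first suggestion, assuming binary 0/1 labels. Both functions here are hypothetical plain-Python stand-ins: `brier_loss` is a hand-written substitute for the library's loss, and `neg_brier_score_loss` uses the name proposed in this issue, which may not match whatever ends up in the library.

```python
def brier_loss(y_true, y_prob):
    """Hand-written stand-in for the Brier loss (lower is better)."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def neg_brier_score_loss(y_true, y_prob):
    """Proposed scorer (hypothetical): negated Brier loss, so higher is better."""
    return -brier_loss(y_true, y_prob)


y_true = [0, 1, 1, 0]

# Made-up predicted probabilities from three imaginary models.
candidates = {
    "well_calibrated": [0.1, 0.9, 0.8, 0.2],
    "uninformed": [0.5, 0.5, 0.5, 0.5],
    "inverted": [0.9, 0.1, 0.2, 0.8],
}

scores = {
    name: neg_brier_score_loss(y_true, probs)
    for name, probs in candidates.items()
}

# With the negated score, picking the maximum selects the best model,
# consistent with the "higher return values are better" convention.
best = max(scores, key=scores.get)
print(best, scores)
```

Because the negated score is always at most 0, model selection code that maximizes a scorer can treat it the same way as accuracy or ROC AUC.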