Description
P/R/F are well known for handling class imbalance in binary classification. Correct me if I'm wrong (@arjoly?), but imbalance against a majority negative class should also be handled in the multiclass case. In particular, while the documentation currently states that micro-averaged P = R = F, this does not hold when a negative class is ignored; it should nonetheless be possible to ignore a negative class under any of the `average` settings.
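To make the micro-averaging point concrete, here is a minimal pure-Python sketch. The helper `micro_prf` and the toy labels are made up for illustration and are not scikit-learn API; class `0` is assumed to be the majority negative class.

```python
def micro_prf(y_true, y_pred, labels):
    """Micro-averaged precision, recall and F1, pooled over `labels` only."""
    tp = fp = fn = 0
    for label in labels:
        for t, p in zip(y_true, y_pred):
            if p == label and t == label:
                tp += 1  # correctly predicted this label
            elif p == label:
                fp += 1  # predicted this label, truth was something else
            elif t == label:
                fn += 1  # truth was this label, predicted something else
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Toy data (assumption): 0 is the negative class, 1 and 2 are the classes of interest.
y_true = [0, 0, 0, 1, 2, 2]
y_pred = [0, 1, 1, 1, 0, 2]

with_negative = micro_prf(y_true, y_pred, labels=[0, 1, 2])
without_negative = micro_prf(y_true, y_pred, labels=[1, 2])

print(with_negative)     # P, R and F all equal 0.5 (plain accuracy)
print(without_negative)  # P = 0.5, R = 2/3, F = 4/7
```

With all three classes pooled, every misclassification counts once as a false positive and once as a false negative, so P = R = F. Once class `0` is left out, an error involving the negative class is counted on one side only, and the three values diverge.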
Indeed, I think the `pos_label` argument is a mistake (except that a default value can be provided more reliably for it than for `neg_label`): it only applies to the binary case and overrides the `average` setting, whereas `neg_label` would apply to all multiclass averaging methods.
It should be easy to implement: treat the problem as multilabel and delete the `neg_label` column from the label indicator matrix. That is, it is the case where each instance is assigned either 0 or 1 labels.
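A rough sketch of that idea, in pure Python so it does not depend on any existing scikit-learn internals. The helpers `to_indicator`, `column_counts` and `ratio` are hypothetical names, and `neg_label` here is the parameter proposed in this issue, not something that exists today.

```python
def to_indicator(y, classes, neg_label):
    """Binarize labels into indicator rows, dropping the neg_label column."""
    kept = [c for c in classes if c != neg_label]
    return [[int(label == c) for c in kept] for label in y], kept


def column_counts(Y_true, Y_pred):
    """Per-column (tp, fp, fn) counts from two indicator matrices."""
    counts = []
    for j in range(len(Y_true[0])):
        pairs = [(t[j], p[j]) for t, p in zip(Y_true, Y_pred)]
        tp = sum(1 for t, p in pairs if t and p)
        fp = sum(1 for t, p in pairs if p and not t)
        fn = sum(1 for t, p in pairs if t and not p)
        counts.append((tp, fp, fn))
    return counts


def ratio(num, den):
    """Safe division that returns 0.0 for an empty denominator."""
    return num / den if den else 0.0


# Toy data (assumption): 0 is the negative class.
y_true = [0, 0, 0, 1, 2, 2]
y_pred = [0, 1, 1, 1, 0, 2]

Y_true, kept = to_indicator(y_true, classes=[0, 1, 2], neg_label=0)
Y_pred, _kept = to_indicator(y_pred, classes=[0, 1, 2], neg_label=0)

# Negative instances become all-zero rows, so every row has 0 or 1 labels.
assert all(sum(row) <= 1 for row in Y_true + Y_pred)

counts = column_counts(Y_true, Y_pred)

# Micro average: pool the counts over the remaining columns.
micro_precision = ratio(sum(tp for tp, fp, fn in counts),
                        sum(tp + fp for tp, fp, fn in counts))
micro_recall = ratio(sum(tp for tp, fp, fn in counts),
                     sum(tp + fn for tp, fp, fn in counts))

# Macro average: unweighted mean of the per-column scores.
macro_precision = sum(ratio(tp, tp + fp) for tp, fp, fn in counts) / len(counts)
macro_recall = sum(ratio(tp, tp + fn) for tp, fp, fn in counts) / len(counts)
```

Because the averaging only ever sees the remaining columns, the same step would work for micro, macro or any other averaging scheme without special-casing the negative class.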
The tricky part is the interface: should `pos_label` be deprecated? Deprecation makes sense, as `pos_label` and `neg_label` should never both be needed. But if so, how do we ensure the binary case works by default?