ENH P/R/F should be able to ignore a majority class in the multiclass case #1983


Closed
jnothman opened this issue May 21, 2013 · 8 comments

@jnothman
Member

P/R/F are famous for handling class imbalance in the binary classification case. Correct me if I'm wrong (@arjoly?), but imbalance against a majority negative class should also be handled in the multiclass case. In particular, while the documentation currently states that micro-averaged P = R = F, this is not true of the case where a negative class is ignored; but it should be possible to ignore a negative class for any of the average settings.

Indeed, I think the pos_label argument is a mistake (except in that you can more reliably provide a default value than for neg_label): it only applies to the binary case and overrides the average setting; neg_label would apply to all multiclass averaging methods.

It should be easy to implement: treat the problem as multilabel and delete the neg_label column from the label indicator matrix. I.e. it is the case where each instance is assigned 0 or 1 label.

The tricky part is the interface: should pos_label be deprecated? Deprecation makes sense as pos_label and neg_label should not be necessary together. But if so, how do we ensure the binary case works by default?
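The implementation idea above (treat the problem as multilabel and delete the neg_label column from the label indicator matrix) can be sketched in plain NumPy. The function name and signature here are illustrative only, not sklearn API:

```python
import numpy as np

def prf_micro_ignoring(y_true, y_pred, neg_label=0):
    """Micro-averaged P/R/F for multiclass input, ignoring neg_label.

    Sketch of the proposal: binarize to a label indicator matrix, drop
    the negative class's column, then count tp/fp/fn over the rest.
    """
    labels = sorted(set(y_true) | set(y_pred))
    labels = [l for l in labels if l != neg_label]  # delete neg_label column
    t = np.array([[yt == l for l in labels] for yt in y_true])
    p = np.array([[yp == l for l in labels] for yp in y_pred])
    tp = np.sum(t & p)
    fp = np.sum(~t & p)
    fn = np.sum(t & ~p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1
```

With the negative column removed, a sample whose true and predicted labels are both neg_label contributes nothing, so micro P, R and F can differ.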

@jnothman
Member Author

Say we add neg_label and deprecate pos_label. The default value of neg_label has to act like that of pos_label:

  • in non-binary classification it does nothing
  • in binary classification if average is not None it guesses and ignores the negative label
  • in binary classification if average is None, its value is ignored

and if neg_label is set to None, averaging is performed even in binary classification.

So it needs some special values like:

  • None: all labels are positive classes
  • 'auto': guesses the negative label even in multi-class classification
  • 'binary' (default): guesses the negative label in binary classification only if average is not None
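These defaulting rules could be resolved with a small dispatcher. A minimal sketch, assuming a hypothetical `resolve_neg_label` helper (not sklearn API); `min(labels)` stands in for whatever heuristic would actually guess the negative label:

```python
def resolve_neg_label(neg_label, labels, average):
    """Return the label to exclude from averaging, or None to keep all.

    Hypothetical helper illustrating the proposed special values;
    min(labels) is only a placeholder for the real guessing heuristic.
    """
    if neg_label is None:
        return None                    # all labels are positive classes
    if neg_label == 'auto':
        return min(labels)             # guess even in multiclass
    if neg_label == 'binary':
        if len(labels) == 2 and average is not None:
            return min(labels)         # guess only in the binary case
        return None                    # multiclass or average=None: no-op
    return neg_label                   # an explicit label to ignore
```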

WDYT?

@jnothman
Member Author

Also, @arjoly, can we not assume as a rule that a label indicator matrix consists of 0s and 1s, or Falses and Trues? Do we really need to use pos_label there?

@arjoly
Member

arjoly commented May 23, 2013

Also, @arjoly, can we not assume as a rule that a label indicator matrix consists of 0s and 1s, or Falses and Trues? Do we really need to use pos_label there?

This feature could be removed.

P/R/F are famous for handling class imbalance in the binary classification case. Correct me if I'm wrong (@arjoly?), but imbalance against a majority negative class should also be handled in the multiclass case.

Do you have some references about this?

@jnothman
Member Author

This feature could be removed.

Even if we're not assured the others are zeros, I think pos_label should always be assumed to be 1. Certainly, the positive label indicator should not be confused with the positive class, which is its meaning in precision_recall_fscore_support and other metrics.

Do you have some references about this?

I haven't gone looking for them, but for example, see the last comment at http://metaoptimize.com/qa/questions/8284/does-precision-equal-to-recall-for-micro-averaging which suggests the case where you "have some samples which are not classified to belong to any of known classes" in an otherwise single-label multiclass task. It's easy to come up with such classification tasks, such as classifying Wikipedia articles into non-overlapping named entity categories: the vast majority of articles are non-entities, and a micro-average or a macro-average would be a good measure of performance as long as that non-entity class is not taken into account.

There's nothing special about the binary case: it's just interpreted as if it's multilabel with a single class. We should also be able to treat multiclass as if it's zero-or-one label.

@arjoly
Member

arjoly commented May 24, 2013

Sorry for the dumb questions, but what is a named entity category? And what is a non-entity in this context?

http://metaoptimize.com/qa/questions/8284/does-precision-equal-to-recall-for-micro-averaging

The last comment simply states that, in the multilabel case, having at least one sample with no true label and no predicted label yields different micro-precision, micro-recall and micro-F1 scores.

while the documentation currently states that micro-averaged P = R = F, this is not true of the case where a negative class is ignored

Before you rewrote the average description, there was a comment about this in the docstring
(see d33634d#sklearn).

but it should be possible to ignore a negative class for any of the average settings.

What if your classifier is not able to identify the negative class?

I am not against the idea of implementing this functionality, but I wouldn't want to see this without some good references.

@jnothman
Member Author

but what is a named entity category? And what is a non-entity in this context?

I'm using terms from Named Entity Recognition. Basically, let's say we want to decide if a Wikipedia topic is a Person, Location or Organisation, and the rest is noise. That is a multiclass classification problem with a vast negative class. I have published work on an expanded version of this classification task, but do not explicitly describe my calculation of micro-F. I in fact report micro-F excluding multiple negative classes output by my classifier. Perhaps I should support that too.

there was a comment about this in the docstring (see d33634d#sklearn).

It said "In multilabel classification, this is true only if every sample has a label." I didn't find this very precise language. For example, I consider the following to be a case where every sample has a label:

>>> import sklearn.metrics
>>> print(sklearn.metrics.precision_recall_fscore_support([[0, 1]], [[0]], average='micro'))
(1.0, 0.5, 0.667, None)

[Actually, that's not what it outputs at master. Apparently there's a bug in your implementation -- one I haven't yet investigated but we need to test -- which returns (0.0, 0.0, 0.0, 1). But the above is what it outputs in my rewrite, as it should, because tp, fp, fn = 1, 0, 1.]
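The expected output above can be checked by hand with a set-based micro computation, independent of sklearn (a quick sketch, not library code):

```python
# One multilabel sample: true labels {0, 1}, predicted only {0}.
y_true = [{0, 1}]
y_pred = [{0}]

tp = sum(len(t & p) for t, p in zip(y_true, y_pred))  # label 0 matched
fp = sum(len(p - t) for t, p in zip(y_true, y_pred))  # no spurious labels
fn = sum(len(t - p) for t, p in zip(y_true, y_pred))  # label 1 missed

precision = tp / (tp + fp)   # 1.0
recall = tp / (tp + fn)      # 0.5
f1 = 2 * precision * recall / (precision + recall)
```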

And often people wouldn't consider multiclass classification with a negative class as multilabel classification, just as binary classification isn't considered multilabel classification. Multilabel implies the system may output [0 .. n_labels] outputs per sample. These cases are {0, 1}.

What if your classifier is not able to identify the negative class?

I don't know what this means.

I still haven't got my hands on any references. However, for the multiclass case that you speak of, where every sample is assigned one meaningful class, you get fn == fp. This is why P == R == F. But you also get tp + fn == n_samples in such a case, so P == R == F == accuracy, if I'm not mistaken. So why bother calculating it at all in such a case? And there is no doubt that considering one class as negative but still otherwise requiring at most one classification decision per sample corresponds to real-world tasks; and that in such a task micro P, R and F will differ...
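The fn == fp observation above can be verified with a few lines of plain Python (a sketch, not sklearn code): in single-label multiclass with every class counted, each error is simultaneously one false negative for the true class and one false positive for the predicted class, so micro P, R, F and accuracy all coincide.

```python
def micro_counts(y_true, y_pred):
    """Micro tp/fp/fn for single-label multiclass, counting every class."""
    tp = sum(t == p for t, p in zip(y_true, y_pred))
    # each error is one fn (true class missed) and one fp (wrong class given)
    errors = sum(t != p for t, p in zip(y_true, y_pred))
    return tp, errors, errors  # tp, fp, fn

y_true = [0, 1, 2, 2, 1]
y_pred = [0, 2, 2, 1, 1]
tp, fp, fn = micro_counts(y_true, y_pred)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
accuracy = tp / len(y_true)
```

Here precision == recall == accuracy, which is why micro-averaging adds nothing over accuracy unless some class is excluded.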

@jnothman
Member Author

(And the bug in your implementation is that you treat [[0, 1]], [[0]] as a binary classification task, when it's multilabel...)

@jnothman
Member Author

Fixed in #4287
