ENH P/R/F should be able to ignore a majority class in the multiclass case #1983

Closed
@jnothman

Description

P/R/F are well known for handling class imbalance in binary classification. Correct me if I'm wrong (@arjoly?), but imbalance against a majority negative class should also be handled in the multiclass case. In particular, the documentation currently states that micro-averaged P = R = F, but this no longer holds once a negative class is ignored. It should be possible to ignore a negative class under any of the average settings.
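To make the micro-averaging point concrete, here is a minimal pure-Python sketch. The toy labels, the "O" majority class, and the helper `micro_prf` with its `ignore` parameter are all invented for illustration and are not part of the library's API.

```python
def micro_prf(y_true, y_pred, ignore=None):
    """Micro-averaged precision, recall and F1 over every label except `ignore`.

    Hypothetical helper for illustration only.
    """
    labels = sorted((set(y_true) | set(y_pred)) - {ignore})
    tp = fp = fn = 0
    # Pool true positives, false positives and false negatives across labels.
    for label in labels:
        for t, p in zip(y_true, y_pred):
            tp += (t == label and p == label)
            fp += (t != label and p == label)
            fn += (t == label and p != label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * tp / (2 * tp + fp + fn) if tp else 0.0
    return precision, recall, f1


# Toy data: "O" plays the role of the majority negative class.
y_true = ["O", "O", "O", "A", "B", "A", "O", "O"]
y_pred = ["O", "A", "O", "A", "O", "B", "O", "B"]

all_classes = micro_prf(y_true, y_pred)               # P == R == F == accuracy
without_o = micro_prf(y_true, y_pred, ignore="O")     # P, R and F now differ
print(all_classes)
print(without_o)
```

With every class counted, each error is one false positive and one false negative, so the three scores coincide. Once "O" is dropped, errors involving "O" contribute only one of the two, and the scores diverge.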

Indeed, I think the pos_label argument is a mistake (its one advantage is that a default value can be chosen more reliably than for neg_label): it only applies to the binary case and overrides the average setting, whereas neg_label would apply to all multiclass averaging methods.

It should be easy to implement: treat the problem as multilabel and delete the neg_label column from the label indicator matrix. That is, it becomes the case where each instance is assigned either zero or one label.
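The approach described above might look roughly like the following pure-Python sketch. It is not the library's implementation: `indicator_matrix`, `drop_column`, `class_counts` and `averaged_f1` are hypothetical helpers, and `neg_label` is the parameter proposed in this issue, not an existing argument.

```python
def indicator_matrix(y, classes):
    """One row per instance, one 0/1 column per class."""
    return [[int(label == c) for c in classes] for label in y]


def drop_column(matrix, index):
    """Delete one column; rows for that class become all zeros (no label)."""
    return [row[:index] + row[index + 1:] for row in matrix]


def class_counts(Y_true, Y_pred):
    """Per-column (tp, fp, fn) counts from two indicator matrices."""
    counts = []
    for j in range(len(Y_true[0])):
        tp = sum(t[j] * p[j] for t, p in zip(Y_true, Y_pred))
        fp = sum((1 - t[j]) * p[j] for t, p in zip(Y_true, Y_pred))
        fn = sum(t[j] * (1 - p[j]) for t, p in zip(Y_true, Y_pred))
        counts.append((tp, fp, fn))
    return counts


def f1_score(tp, fp, fn):
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def averaged_f1(y_true, y_pred, classes, neg_label=None, average="micro"):
    """F1 with an optional ignored class (hypothetical `neg_label` parameter)."""
    Y_true = indicator_matrix(y_true, classes)
    Y_pred = indicator_matrix(y_pred, classes)
    if neg_label is not None:
        j = classes.index(neg_label)
        Y_true, Y_pred = drop_column(Y_true, j), drop_column(Y_pred, j)
    counts = class_counts(Y_true, Y_pred)
    if average == "micro":
        # Pool counts over the remaining columns before scoring.
        return f1_score(*(sum(c) for c in zip(*counts)))
    if average == "macro":
        # Score each remaining column, then take the unweighted mean.
        return sum(f1_score(*c) for c in counts) / len(counts)
    raise ValueError("average must be 'micro' or 'macro'")


classes = ["A", "B", "O"]
y_true = ["O", "O", "O", "A", "B", "A", "O", "O"]
y_pred = ["O", "A", "O", "A", "O", "B", "O", "B"]
print(averaged_f1(y_true, y_pred, classes, neg_label="O", average="micro"))
print(averaged_f1(y_true, y_pred, classes, neg_label="O", average="macro"))
```

Because the column is removed before any averaging happens, the same step serves every average setting, which is the property argued for here.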

The tricky part is the interface: should pos_label be deprecated? Deprecation makes sense as pos_label and neg_label should not be necessary together. But if so, how do we ensure the binary case works by default?
