Description
Where a dataset is split up and not all evaluated at once, some classes may be missing from a given evaluation. Metric implementations get around problems with classes not appearing in both `y_true` and `y_pred` by considering the union of their labels. However, this is insufficient if a label that existed in the training set for a fold is absent from both the predicted and true test targets.
This is at least a problem for the P/R/F family of metrics with `average='macro'` and `labels` unspecified, and it should be documented (though a user shouldn't be using `'macro'` if there are infrequent labels). I haven't thought yet about whether it is an issue elsewhere, or whether it can be reasonably tested.
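A minimal sketch of the situation (the toy arrays and label set are only illustrative):

```python
import numpy as np
from sklearn.metrics import f1_score

# Suppose the training data for this fold contained three classes {0, 1, 2},
# but class 2 happens to be absent from both the true and predicted test targets.
y_true = np.array([0, 1, 0, 1])
y_pred = np.array([0, 1, 1, 1])

# With `labels` unspecified, the macro average is taken over the union of
# labels present in y_true and y_pred (here {0, 1}), so the missing class 2
# silently does not contribute.
print(f1_score(y_true, y_pred, average='macro'))  # ~0.733

# Passing the full label set from the training data gives a different result:
# class 2 is included with an F-score of 0.
print(f1_score(y_true, y_pred, average='macro', labels=[0, 1, 2]))  # ~0.489
```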