thomasjpfan
Member

Reference Issues/PRs

Fixes #12309 for multilabel and multioutput

What does this implement/fix? Explain your changes.

Adds a test to check that multilabel and multioutput metrics are invariant under label permutations.
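The invariance described above can be illustrated with a small, dependency-free sketch. It is not the test added by this PR: `hamming_loss` and `permute_columns` here are hand-written stand-ins (assumptions for illustration), whereas the real test runs scikit-learn metrics on NumPy arrays. The idea is that reordering the label columns of both inputs in the same way should leave a scalar multilabel score unchanged.

```python
import random
from itertools import permutations


def hamming_loss(y_true, y_pred):
    """Fraction of (sample, label) cells where the two indicator matrices differ.

    Hand-written stand-in for illustration, not scikit-learn's implementation.
    """
    n_cells = len(y_true) * len(y_true[0])
    mismatches = sum(
        t != p
        for row_t, row_p in zip(y_true, y_pred)
        for t, p in zip(row_t, row_p)
    )
    return mismatches / n_cells


def permute_columns(matrix, perm):
    """Reorder label columns so that new column i holds old column perm[i]."""
    return [[row[j] for j in perm] for row in matrix]


rng = random.Random(0)
n_samples, n_classes = 20, 4
y_true = [[rng.randint(0, 1) for _ in range(n_classes)] for _ in range(n_samples)]
y_pred = [[rng.randint(0, 1) for _ in range(n_classes)] for _ in range(n_samples)]

score = hamming_loss(y_true, y_pred)

# Apply every permutation to both inputs and compare against the baseline.
for perm in permutations(range(n_classes)):
    permuted_score = hamming_loss(
        permute_columns(y_true, perm), permute_columns(y_pred, perm)
    )
    assert permuted_score == score
```

Exact equality is safe in this sketch because the mismatch count is an integer that does not depend on column order; for metrics that accumulate floating-point values, a tolerance-based comparison would likely be more appropriate.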

score = metric(y_true, y_score)

for perm in permutations(range(n_classes), n_classes):
    inv_perm = np.zeros(n_classes, dtype=int)
Member

Awesome! Any reason you first compute inv_perm instead of using the permutations directly? Would it change the test at all, given that you're going through all permutations?

Member Author

Good point! This PR was updated to use the permutations directly.

Member

adrinjalali left a comment

LGTM, thanks @thomasjpfan!

adrinjalali changed the title from "[MRG] Adds multilabels permutation tests to metrics/test_common" to "[MRG+1] Adds multilabels permutation tests to metrics/test_common" on Dec 18, 2018


@pytest.mark.parametrize(
    'name', MULTILABELS_METRICS - NOT_SYMMETRIC_METRICS)
Member

Why do we need symmetry?

Member Author

It does not need it. This was a little too restrictive.

This PR was updated to remove the "unnormalized_multilabel_confusion_matrix" from this test. It returns a matrix as a metric, which makes the result dependent on the order of the labels.
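To illustrate why a matrix-valued metric cannot pass a plain equality check under label permutation, here is a minimal sketch. `per_label_confusion` is a hypothetical helper written for this example (it is not scikit-learn's function); it returns one `(tn, fp, fn, tp)` tuple per label, so reordering the label columns reorders the output as well.

```python
def per_label_confusion(y_true, y_pred):
    """Return one (tn, fp, fn, tp) tuple per label column.

    Hypothetical helper for illustration only.
    """
    n_labels = len(y_true[0])
    counts = []
    for j in range(n_labels):
        tn = fp = fn = tp = 0
        for row_t, row_p in zip(y_true, y_pred):
            t, p = row_t[j], row_p[j]
            if t and p:
                tp += 1
            elif t:
                fn += 1
            elif p:
                fp += 1
            else:
                tn += 1
        counts.append((tn, fp, fn, tp))
    return counts


def permute_columns(matrix, perm):
    """Reorder label columns so that new column i holds old column perm[i]."""
    return [[row[j] for j in perm] for row in matrix]


y_true = [[1, 0, 0], [1, 1, 0], [0, 1, 0], [1, 0, 1]]
y_pred = [[1, 0, 0], [0, 1, 0], [0, 0, 0], [1, 1, 1]]
perm = (2, 0, 1)

original = per_label_confusion(y_true, y_pred)
permuted = per_label_confusion(
    permute_columns(y_true, perm), permute_columns(y_pred, perm)
)

# The per-label entries are the same, but they appear in permuted order,
# so a direct comparison against the unpermuted result fails.
assert permuted != original
assert permuted == [original[j] for j in perm]
```

A test for such a metric would need to apply the same permutation to the expected output before comparing, which is a different check from the scalar invariance tested here.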

y_true = random_state.randint(0, 2, size=(n_samples, n_classes))
y_score = random_state.normal(size=y_true.shape)

# Makes sure all samples have multiple classes. This works around errors
Member

"multiple classes" -> at least one label
?
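For context, one possible way to guarantee "at least one label" per sample is sketched below. This is an assumption for illustration and does not reproduce the PR's actual lines, which are not shown in this thread; `ensure_at_least_one_label` is a hypothetical helper operating on plain nested lists rather than NumPy arrays.

```python
import random


def ensure_at_least_one_label(y):
    """Set the first label on any row that has no positive labels (in place)."""
    for row in y:
        if sum(row) == 0:
            row[0] = 1
    return y


rng = random.Random(0)
n_samples, n_classes = 20, 4
y_true = [[rng.randint(0, 1) for _ in range(n_classes)] for _ in range(n_samples)]

ensure_at_least_one_label(y_true)

# Every sample now carries at least one positive label.
assert all(sum(row) >= 1 for row in y_true)
```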

@jnothman
Member

Otherwise LGTM.

jnothman merged commit e6cbd26 into scikit-learn:master on Dec 20, 2018
@jnothman
Member

Thanks for all your great work, @thomasjpfan!

Successfully merging this pull request may close these issues.

add multiclass and multilabel metric tests for label permutations