[MRG + 1] Sort labels in precision_recall_fscore_support #4147

Merged

Conversation

amueller
Member

Fixes #3670.
The code already stores and restores the original label ordering, so we can discard it here.
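To illustrate the symptom from #3670, a minimal sketch (hypothetical data, not code from this PR): the two calls below should agree, but without this fix the per-label results for the unsorted labels were misaligned.

import numpy as np
from sklearn.metrics import precision_recall_fscore_support

y_true = [1, 1, 2, 2, 3, 3]
y_pred = [1, 2, 2, 3, 3, 1]

# Per-label scores with the labels given in sorted order.
p1, r1, f1, s1 = precision_recall_fscore_support(
    y_true, y_pred, labels=[1, 2, 3], average=None)

# The same labels in a different order: with the fix they are sorted
# internally, so the results match the call above.
p2, r2, f2, s2 = precision_recall_fscore_support(
    y_true, y_pred, labels=[3, 1, 2], average=None)

np.testing.assert_array_equal(p1, p2)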

@amueller
Member Author

Possibly of interest to @jnothman and @arjoly

@amueller
Member Author

I guess this is also fixed by #2610, but I'm not sure what the state of that is... (actually I'm not sure it's fixed there, but it's definitely related)

@amueller force-pushed the precision_recall_unsorted_indices branch from 4e5221b to b2ffd3e on January 22, 2015 21:23
@jnothman
Member

I've not looked at the fix, but I think you should probably do so, even if I get #2610 up...

@@ -303,6 +303,18 @@ def test_precision_recall_f1_score_multiclass():
     assert_array_equal(s, [24, 20, 31])
 
 
+def test_precision_refcall_f1_score_multilabel_unordered_labels():
Member
Might as well test this in the multiclass case and for different averaging strategies, too, no?
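Something along these lines, perhaps (a sketch of the suggestion with made-up data, not the test that was eventually committed):

import numpy as np
from numpy.testing import assert_array_almost_equal
from sklearn.metrics import precision_recall_fscore_support

# Multiclass data; the scores should not depend on the order of `labels`.
y_true = [1, 3, 3, 2]
y_pred = [1, 1, 3, 2]

for average in [None, 'micro', 'macro', 'weighted']:
    sorted_result = precision_recall_fscore_support(
        y_true, y_pred, labels=[1, 2, 3], average=average)
    unsorted_result = precision_recall_fscore_support(
        y_true, y_pred, labels=[3, 1, 2], average=average)
    for a, b in zip(sorted_result, unsorted_result):
        if a is None:  # support is None whenever an average is requested
            assert b is None
        else:
            assert_array_almost_equal(a, b)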

Member Author
Sure, will add.

Member
This could be made into an invariance test.

Member Author
I'll look into it.

Member Author
If I use average="samples" then support is None. Is that expected?

Member Author
I guess it is...

Member
Yes, it is.
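For reference, the behavior in question (made-up data): support is only meaningful per label, so every averaging strategy returns None in its place.

import numpy as np
from sklearn.metrics import precision_recall_fscore_support

y_true = np.array([[1, 0, 1], [0, 1, 1]])
y_pred = np.array([[1, 0, 0], [0, 1, 1]])

# Averaged scores: there is no single aggregate support, so it is None.
p, r, f, s = precision_recall_fscore_support(y_true, y_pred, average='samples')
print(s)  # None

# Per-label scores: support is the number of true instances per label.
p, r, f, s = precision_recall_fscore_support(y_true, y_pred, average=None)
print(s)  # [1 1 2]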

@arjoly
Member

arjoly commented Jan 23, 2015

Thanks @amueller

@amueller
Member Author

Was that a 👍 @arjoly ? ;)

@@ -847,7 +847,7 @@ def precision_recall_fscore_support(y_true, y_pred, beta=1.0, labels=None,
     if labels is None:
         labels = unique_labels(y_true, y_pred)
     else:
-        labels = np.asarray(labels)
+        labels = np.sort(labels)
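As an aside, np.sort accepts any array-like and returns a sorted ndarray copy, which is why the separate np.asarray call becomes redundant:

import numpy as np

labels = [4, 1, 2, 3]   # a plain list works too
print(np.sort(labels))  # [1 2 3 4], returned as an ndarray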
Member
Is there a regression test for this?

Member
Don't take this comment into account. :-)

@arjoly
Member

arjoly commented Jan 26, 2015

Besides Joel's comment and my small addendum, +1

@amueller
Member Author

@arjoly I tried

import numpy as np
# assert_equal and the metric registries come from scikit-learn's test
# utilities (sklearn/metrics/tests/test_common.py at the time)
from sklearn.utils.testing import assert_equal
from sklearn.metrics.tests.test_common import (ALL_METRICS,
                                               MULTILABELS_METRICS,
                                               METRICS_WITH_LABELS)

y_true = np.array([[1, 1, 0, 0], [1, 1, 0, 0]])
y_pred = np.array([[0, 0, 1, 1], [0, 1, 1, 0]])
labels = np.array([4, 1, 2, 3])

for name in set(MULTILABELS_METRICS).intersection(METRICS_WITH_LABELS):
    metric = ALL_METRICS[name]
    score_labels = metric(y_true, y_pred, labels=labels)
    score = metric(y_true, y_pred)
    assert_equal(score_labels, score)

but that already passes on master; I'm not sure why :-/

@amueller
Member Author

Shouldn't the same problem pop up in precision_score and so on? They use the same code path, right?
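For context (a sketch with made-up data): precision_score, recall_score and f1_score all delegate to precision_recall_fscore_support, so they share its labels handling.

from sklearn.metrics import precision_score

y_true = [1, 3, 3, 2]
y_pred = [1, 1, 3, 2]

# This call reaches the same labels-sorting code path inside
# precision_recall_fscore_support.
print(precision_score(y_true, y_pred, labels=[3, 1, 2], average=None))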

@amueller force-pushed the precision_recall_unsorted_indices branch 2 times, most recently from e2abb81 to bfea61c on January 26, 2015 21:11
@amueller
Member Author

OK, never mind, I forgot to add average=None. Added the test now.
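The reason average=None matters, in a single-metric illustration (made-up data, distinct from the test): an averaged score collapses the per-label results into one number, so label order cannot affect it; only average=None exposes the alignment between scores and labels.

import numpy as np
from sklearn.metrics import precision_score

y_true = np.array([[1, 0, 1, 0], [0, 1, 1, 0], [1, 0, 0, 1]])
y_pred = np.array([[1, 0, 0, 0], [0, 1, 1, 1], [1, 1, 0, 1]])

# Averaged: insensitive to label order, which is why the earlier loop
# passed even on master.
print(precision_score(y_true, y_pred, average='macro'))

# Per label: this is the array whose ordering the regression test checks.
print(precision_score(y_true, y_pred, average=None))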

@coveralls

Coverage Status

Coverage increased (+0.0%) to 94.78% when pulling bfea61c on amueller:precision_recall_unsorted_indices into 50b83ae on scikit-learn:master.

@amueller force-pushed the precision_recall_unsorted_indices branch from bfea61c to 3566167 on January 26, 2015 21:46
@amueller changed the title from "[MRG] Sort labels in precision_recall_fscore_support" to "[MRG + 1] Sort labels in precision_recall_fscore_support" on Jan 26, 2015
@amueller force-pushed the precision_recall_unsorted_indices branch from 3566167 to fd9b03a on January 26, 2015 22:07
larsmans added a commit that referenced this pull request on Jan 27, 2015:
FIX Sort labels in precision_recall_fscore_support
@larsmans merged commit 92e1e39 into scikit-learn:master on Jan 27, 2015
@arjoly
Member

arjoly commented Jan 27, 2015

Great thanks @amueller !

@amueller deleted the precision_recall_unsorted_indices branch on January 27, 2015 21:59
@amueller
Member Author

Thanks @larsmans and @arjoly for the reviews :)

def test_no_averaging_labels():
    # test labels argument when not using averaging
    # in multi-class and multi-label cases
    y_true_multilabel = np.array([[1, 1, 0, 0], [1, 1, 0, 0]])
Member
Why are you referencing label 4 when the only labels available are 0, 1, 2, 3? It's a wonder that this works... Or rather, it's no surprise this breaks #4287...
