[MRG] Add unicode support to sklearn.metrics.classification_report #2462
Conversation
… and NOT_SYMMETRIC_METRICS dicts
… to be safely used in finally statement
'{0}'.format(arg) doesn't promote the whole string to unicode if arg is unicode - it tries to encode arg using sys.getdefaultencoding() instead. "%s" doesn't have this gotcha.
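A minimal Python 2 session illustrating the difference (the label value here is just an example):

    >>> label = u'pr\xe9cision'      # unicode with a non-ASCII character
    >>> "%s" % label                 # '%' promotes the result to unicode
    u'pr\xe9cision'
    >>> '{0}'.format(label)          # tries to encode label to ASCII
    Traceback (most recent call last):
        ...
    UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' ...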
Looks good to me. Merging. Thanks a lot.
Jenkins seems to be failing since this PR was merged.
    @@ -1138,6 +1159,10 @@ def test_invariance_string_vs_numbers_labels():
        labels_str = ["eggs", "spam"]

        for name, metric in CLASSIFICATION_METRICS.items():
            if isinstance(metric, partial) and 'labels' in metric.keywords:
By the way, why do that?
There are 3 tests that run on ALL_METRICS; the two other tests support classification metrics without the 'labels' argument, but this one doesn't.
Am I wrong? I wrote some code to support this; see line 1168 and beyond. But it is true that your solution is smarter.
What are you proposing? The check that fails is:

    assert_array_equal(measure_with_number, measure_with_str,
                       err_msg="{0} failed string vs number invariance "
                               "test".format(name))
Smarter test code is usually worse test code :)
At the moment, we assume that the user put the appropriate labels in the labels argument. If labels is None, we try to infer it as best we can. (This could lead to some problems; see #2029 and the discussion in #2094.) So "I'm not sure what is the correct behavior here: y values are "egg" and "spam" and labels are [1, 2, 3]" is undetermined behavior at the moment.
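For illustration, a sketch of the two cases with confusion_matrix (the y values are arbitrary):

    from sklearn.metrics import confusion_matrix

    y_true = ["egg", "spam", "spam"]
    y_pred = ["egg", "egg", "spam"]

    # labels=None: the label set is inferred from y_true and y_pred
    confusion_matrix(y_true, y_pred)

    # explicit labels: the user is trusted to pass an appropriate set,
    # so for string-valued y the labels must be consistent strings
    confusion_matrix(y_true, y_pred, labels=["egg", "spam"])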
So it was correct to skip that test, at least for now?
The correct action would be to delete "confusion_matrix_with_labels" from CLASSIFICATION_METRICS. By adding the line if isinstance(metric, partial) and 'labels' in metric.keywords:, you skip any metric defined in CLASSIFICATION_METRICS with a partial and a labels keyword. It means that you also skip the tests for macro/micro/weighted precision/recall/f-score.
I don't think it skips those tests: keywords are the keyword arguments passed to functools.partial, not arguments of the wrapped function.
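A quick sketch of the distinction (the metric signature below is hypothetical, for illustration only):

    from functools import partial

    def metric(y_true, y_pred, labels=None, average=None):
        """Hypothetical metric signature, for illustration only."""

    # .keywords holds only the arguments bound by partial itself,
    # not every argument the wrapped function accepts
    averaged = partial(metric, average='weighted')
    'labels' in averaged.keywords      # False -> not skipped by the test

    with_labels = partial(metric, labels=[0, 1, 2])
    'labels' in with_labels.keywords   # True  -> skipped by the test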
You are right on this. Sorry.
Interesting: this is not just a formatting issue, the numbers in the classification report are different. I've just copied the first part of test_classification_report_multiclass_with_string_label into that test and performed one seemingly innocent tweak. It seems that the only difference that can cause this failure is
vs
    @@ -710,9 +711,9 @@ def test_precision_recall_f1_score_multiclass_pos_label_none():
     def test_zero_precision_recall():
         """Check that pathological cases do not bring NaNs"""

    -    try:
    -        old_error_settings = np.seterr(all='raise')
    +    old_error_settings = np.seterr(all='raise')
Here, we could have used with np.errstate.
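For reference, a sketch of the two equivalent patterns:

    import numpy as np

    # the pattern used in the test: save, raise, restore by hand
    old_error_settings = np.seterr(all='raise')
    try:
        pass  # code that must raise FloatingPointError on FP errors
    finally:
        np.seterr(**old_error_settings)

    # np.errstate restores the previous settings automatically on exit
    with np.errstate(all='raise'):
        pass  # same effect, no try/finally needed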
I think that what causes the failure is the bytes vs unicode change itself - it turns out that LabelEncoder works incorrectly for unicode values under Python 2.6 + numpy 1.3, and this error propagates up to classification_report.
Thanks @kmike for investigating!!!
sklearn.metrics.classification_report with unicode labels was broken in Python 2.x because of how '{0}'.format works.
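A minimal reproduction sketch under Python 2 (the label values are chosen arbitrarily):

    # -*- coding: utf-8 -*-
    from sklearn.metrics import classification_report

    y_true = [u'caf\xe9', u'th\xe9', u'caf\xe9']
    y_pred = [u'caf\xe9', u'caf\xe9', u'th\xe9']

    # before this fix: UnicodeEncodeError from '{0}'.format on Python 2;
    # after the switch to '%s'-style formatting, the report prints fine
    print(classification_report(y_true, y_pred))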