[MRG] Add unicode support to sklearn.metrics.classification_report #2462

Merged
3 commits merged into scikit-learn:master on Sep 22, 2013

Conversation

@kmike (Contributor) commented Sep 20, 2013

sklearn.metrics.classification_report with unicode labels was broken in Python 2.x because of how '{0}'.format works.

'{0}'.format(arg) doesn't promote the whole string to unicode if arg is unicode - it tries to encode arg using sys.getdefaultencoding() instead.
"%s" doesn't have this gotcha.
@GaelVaroquaux (Member):

Looks good to me. Merging. Thanks a lot.

GaelVaroquaux added a commit that referenced this pull request Sep 22, 2013
[MRG] Add unicode support to sklearn.metrics.classification_report
GaelVaroquaux merged commit 3b8c0a1 into scikit-learn:master on Sep 22, 2013
@glouppe (Contributor) commented Sep 25, 2013

Jenkins has been failing since this PR was merged.

FAIL: sklearn.metrics.tests.test_metrics.test_classification_report_multiclass_with_unicode_label
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/slave/virtualenvs/cpython-2.6/lib/python2.6/site-packages/nose/case.py", line 197, in runTest
    self.test(*self.arg)
  File "<https://jenkins.shiningpanda-ci.com/scikit-learn/job/python-2.6-numpy-1.3.0-scipy-0.7.2/ws/sklearn/metrics/tests/test_metrics.py",> line 856, in test_classification_report_multiclass_with_unicode_label
    assert_equal(report, expected_report)
AssertionError: u'             precision    recall  f1-score   support\n\n      blue\xa2       0.78      0.44      0.56        16\n     green\xa2       0.52      0.31      0.39        39\n       red\xa2       0.42      0.90      0.57        20\n\navg / total       0.55      0.49      0.47        75\n' != u'             precision    recall  f1-score   support\n\n      blue\xa2       0.83      0.79      0.81        24\n     green\xa2       0.33      0.10      0.15        31\n       red\xa2       0.42      0.90      0.57        20\n\navg / total       0.51      0.53      0.47        75\n'
>>  raise self.failureException, \
          (None or '%r != %r' % (u'             precision    recall  f1-score   support\n\n      blue\xa2       0.78      0.44      0.56        16\n     green\xa2       0.52      0.31      0.39        39\n       red\xa2       0.42      0.90      0.57        20\n\navg / total       0.55      0.49      0.47        75\n', u'             precision    recall  f1-score   support\n\n      blue\xa2       0.83      0.79      0.81        24\n     green\xa2       0.33      0.10      0.15        31\n       red\xa2       0.42      0.90      0.57        20\n\navg / total       0.51      0.53      0.47        75\n'))

Any idea? CC: @kmike @arjoly @larsmans @jnothman

@@ -1138,6 +1159,10 @@ def test_invariance_string_vs_numbers_labels():
    labels_str = ["eggs", "spam"]

    for name, metric in CLASSIFICATION_METRICS.items():
        if isinstance(metric, partial) and 'labels' in metric.keywords:
Member:

By the way, why do that?

@kmike (Author):

There are 3 tests that run on ALL_METRICS; the two other tests support classification metrics without the 'labels' argument, but this one doesn't.

Member:

Am I wrong? I wrote some code to support this; see line 1168 and beyond. But it is true that your solution is smarter.

@kmike (Author):

What are you proposing? The check that fails is:

        assert_array_equal(measure_with_number, measure_with_str,
                           err_msg="{0} failed string vs number invariance "
                                   "test".format(name))

@kmike (Author):

Smarter test code is usually worse test code :)

Member:

At the moment, we assume that the user puts the appropriate labels in the labels argument.
If labels is None, we try to infer them as best we can.
(This can lead to some problems; see #2029 and the discussion in #2094.)

So "I'm not sure what is correct behavior here. y values are "egg" and "spam" and labels are [1, 2, 3]"
is undefined behavior at the moment.

@kmike (Author):

So it was correct to skip that test, at least for now?

Member:

The correct action would be to delete "confusion_matrix_with_labels" from CLASSIFICATION_METRICS.

By adding the line if isinstance(metric, partial) and 'labels' in metric.keywords:, you skip any metric defined
in CLASSIFICATION_METRICS with a partial and a labels keyword. That means you also skip the tests for macro/micro/weighted precision/recall/f-score.

@kmike (Author):

I don't think it skips those tests: keywords are the keyword arguments bound in the functools.partial call, not the arguments of the wrapped function.
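
A small sketch of that point (the metric entries here are illustrative, not the actual CLASSIFICATION_METRICS contents): partial.keywords only holds the keyword arguments bound when the partial was created, not every keyword the wrapped function accepts.

    from functools import partial
    from sklearn.metrics import f1_score

    macro_f1 = partial(f1_score, average="macro")
    print(macro_f1.keywords)                     # {'average': 'macro'}
    print('labels' in macro_f1.keywords)         # False -> not skipped by the check

    f1_with_labels = partial(f1_score, labels=[0, 1, 2])
    print('labels' in f1_with_labels.keywords)   # True -> skipped by the check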

Member:

You are right on this. Sorry.

@kmike (Contributor, Author) commented Sep 25, 2013

Interesting: this is not just a formatting issue; the numbers in the classification report are different. And I had just copied the first part of test_classification_report_multiclass_with_string_label into that test and made one seemingly innocent tweak.

It seems that the only difference that can cause this failure is

    y_true = np.array(["blue", "green", "red"])[y_true]
    y_pred = np.array(["blue", "green", "red"])[y_pred]

vs

    labels = np.array([u("blue\xa2"), u("green\xa2"), u("red\xa2")])
    y_true = labels[y_true]
    y_pred = labels[y_pred]

@@ -710,9 +711,9 @@ def test_precision_recall_f1_score_multiclass_pos_label_none():
def test_zero_precision_recall():
    """Check that pathological cases do not bring NaNs"""

    try:
        old_error_settings = np.seterr(all='raise')
Member:

Here, we could have used a with np.errstate(...) block instead.
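
A minimal sketch of that suggestion: np.errstate used as a context manager restores the previous floating-point error settings on exit, so the explicit try/finally around np.seterr is not needed.

    import numpy as np

    with np.errstate(all='raise'):
        try:
            np.zeros(1) / np.zeros(1)   # 0/0 raises FloatingPointError inside the block
        except FloatingPointError:
            pass
    # outside the block, the previous error settings are back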

@kmike (Contributor, Author) commented Sep 28, 2013

I think what causes the failure is the bytes vs. unicode change itself: it turns out that LabelEncoder works incorrectly for unicode values under Python 2.6 + numpy 1.3, and this error propagates up to classification_report.
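
A hypothetical reproduction sketch of that diagnosis (the label values are taken from the test above; on the affected setup the round trip below reportedly does not hold):

    import numpy as np
    from sklearn.preprocessing import LabelEncoder

    labels = np.array([u"blue\xa2", u"green\xa2", u"red\xa2"])
    y = labels[[0, 1, 2, 0, 1]]

    le = LabelEncoder()
    encoded = le.fit_transform(y)
    # expected to round-trip; reported to fail under Python 2.6 + numpy 1.3
    assert list(le.inverse_transform(encoded)) == list(y)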

@arjoly (Member) commented Sep 28, 2013

Thanks @kmike for investigating!!!
