Testing log_loss and hinge_loss under THRESHOLDED_METRICS #2717
Conversation
@arjoly, would you be able to review this?
for name, metrics in THRESHOLDED_METRICS.items():
# Hinge_loss raises TypeError
THRESHOLDED_METRICS_ITEMS = THRESHOLDED_METRICS.items()
THRESHOLDED_METRICS_ITEMS.remove(('hinge_loss', hinge_loss))
If it doesn't raise a ValueError, then it should give proper results with string input. Can you add a test for that?
edit: see below
A comment like that ("raises TypeError") needs to be a bit more explanatory.
Also, exceptions in invariance tests are usually coded as:

for ...:
    if exceptional condition:
        continue
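
For concreteness, a minimal sketch of that skip-via-continue pattern, assuming fixtures analogous to the THRESHOLDED_METRICS dict and the y1_str / y2_str label arrays in test_metrics.py (the stand-ins below are illustrative, not the actual test fixtures):

import numpy as np
from numpy.testing import assert_raises
from sklearn.metrics import hinge_loss, log_loss

# Illustrative stand-ins for the fixtures defined in test_metrics.py.
THRESHOLDED_METRICS = {"log_loss": log_loss, "hinge_loss": hinge_loss}
y1_str = np.array(["spam", "ham", "ham", "spam"])
y2_str = np.array(["ham", "spam", "ham", "spam"])

for name, metric in THRESHOLDED_METRICS.items():
    if name == "hinge_loss":
        # Exceptional condition: hinge_loss multiplies y_true by the
        # decision values, so string scores fail with a TypeError
        # instead of the ValueError checked below.
        continue
    assert_raises(ValueError, metric, y1_str, y2_str)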
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add a test to check that it raises TypeError? It's strange, I thought that log_loss supported string labels.
Thanks @jnothman
Can you add a test to check that it raises TypeError? It's strange, I thought that log_loss supported string labels.
The docstring of log_loss has the following example:
>>> log_loss(["spam", "ham", "ham", "spam"], # doctest: +ELLIPSIS
... [[.1, .9], [.9, .1], [.8, .2], [.35, .65]])
Thus this test should pass for log_loss.
Note that a bug has slipped into the test: it shouldn't test with y2_str, but with y2.
Since y2_str has non-float/int values, computing y_true * pred_decision raises a TypeError.
EDIT: something similar to this:
>>> hinge_loss(["spam", "ham", "ham", "spam"], ["ham", "spam", "ham", "spam"])
TypeError: unsupported operand type(s) for *: 'numpy.ndarray' and 'numpy.ndarray'
It is hinge_loss, by the way; log_loss does raise a ValueError.
Why not? It's consistent with the documentation. This feature can be considered not implemented (not tested means not implemented, in my opinion).
I'm confused about what you are asking for, @arjoly. I think the test is correct at master, and hinge_loss should also raise a ValueError.
What I mean is that

for name, metrics in THRESHOLDED_METRICS_ITEMS:
    assert_raises(ValueError, metrics, y1_str, y2_str)

should be (or at least the following lines of code should be added)

for name, metrics in THRESHOLDED_METRICS_ITEMS:
    assert_raises(ValueError, metrics, y1_str, y2)

because y2_str can't be a valid input for thresholded metrics. The goal of test_invariance_string_vs_numbers_labels, in my opinion, is to check that metrics can handle string labels correctly.
If log_loss supports string input for labels, it should also be tested with y1_str.astype('O'). Either the invariance test could be enhanced, or one more assertion could be added in test_log_loss.
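
A rough sketch of what that extra assertion in test_log_loss could look like (variable names are illustrative; the expected value is the one from the log_loss docstring example quoted above):

import numpy as np
from sklearn.metrics import log_loss

# String labels as an object array, probabilities as floats.
y_true_str = np.array(["spam", "ham", "ham", "spam"], dtype=object)
y_prob = np.array([[.1, .9], [.9, .1], [.8, .2], [.35, .65]])

loss = log_loss(y_true_str, y_prob)
np.testing.assert_almost_equal(loss, 0.21616, decimal=5)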
Is it clearer?
I think the test is correct at master, and hinge_loss should also raise a ValueError.

Sorry, I wasn't clear. I wasn't talking about which exception to raise.
Yes, that's clearer. Though I don't understand whether or not you're suggesting it should actually raise an error when the first argument is strings and the second is thresholds.
Though I don't understand whether or not you're suggesting it should actually raise an error when the first argument is strings and the second is thresholds.

If it doesn't work with strings, it should raise a helpful error message, as in #2723. Otherwise, it should be tested that it gives the correct answer. :-)
@@ -265,7 +271,12 @@ def hinge_loss(y_true, pred_decision, pos_label=None, neg_label=None):
    else:
        y_true = LabelBinarizer(neg_label=-1).fit_transform(y_true)[:, 0]

    margin = y_true * np.asarray(pred_decision)
    # Done to be homogeneous with the output of LabelBinarizer.
I don't think this comment is clear, and the code speaks well enough for itself, given the error message.
@jnothman This was actually meant for the line

pred_decision = np.ravel(pred_decision)

below. I wrote this comment because y_true is a 1-D array after

y_true = LabelBinarizer(neg_label=-1).fit_transform(y_true)[:, 0]

so pred_decision should also be 1-D for the multiplication to be correct.
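
To illustrate the point (this is just a sketch, not the library code): without the ravel, a column-shaped pred_decision would broadcast against the 1-D y_true into a square matrix instead of an element-wise product.

import numpy as np
from sklearn.preprocessing import LabelBinarizer

y_true = LabelBinarizer(neg_label=-1).fit_transform(
    ["spam", "ham", "ham", "spam"])[:, 0]                  # shape (4,)
pred_decision = np.array([[2.0], [-1.5], [-0.5], [1.0]])   # shape (4, 1)

print((y_true * pred_decision).shape)             # (4, 4): wrong broadcast
print((y_true * np.ravel(pred_decision)).shape)   # (4,): intended margins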
+1 for merge, preferably with the two comments about
Changes Unknown when pulling 9d1276b on Manoj-Kumar-S:test_log_loss into scikit-learn:master.
@jnothman: Any more comments, or are you +1 with this?
Yes, LGTM.
    T, Y = check_arrays(T, Y)
    if T.shape[1] != Y.shape[1]:
        raise ValueError("y_true and y_pred have different number of classes "
                         "%d, %d" % (T.shape[1], Y.shape[1]))
This line is missed in the line coverage.
@arjoly You mean a test for this?
Yes, it would be nice :-) to have an invariance test for that!
I would also add a test with a string object array for the log_loss function.
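
As a sketch, the requested check for the new branch could look something like this (the values below are illustrative: the predictions have three probability columns while y_true only has two classes, so the shape check is hit):

from numpy.testing import assert_raises
from sklearn.metrics import log_loss

y_true = ["spam", "ham", "ham", "spam"]            # two classes
y_bad = [[.2, .7, .1], [.6, .2, .2],               # three probability columns
         [.6, .1, .3], [.4, .3, .3]]

# The mismatch between 2 classes and 3 columns should raise the new ValueError.
assert_raises(ValueError, log_loss, y_true, y_bad)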
Except for the minor comment, LGTM.
@arjoly: A string object array is already tested.
Sorry for the noise, can you point to the test to refresh my memory?
@arjoly Doesn't L1500-1501 in test_metrics.py (in master) test the case when y_true and y_pred are both strings? Also, the new invariance test can be added under
If you remember my previous comment, the assertion assert_raises(ValueError, metrics, y1_str, y2_str) will always raise an exception since y_score must be an array of floats. I think this is a bug / mistake that I made. It should instead be assert_raises(ValueError, metrics, y1_str, y2), where y2 corresponds to numerical scores.
Yes, if you plan to test only log_loss.
Ah okay, sorry, I misinterpreted.
@arjoly: Please check the latest commit, and can you merge if tests pass?
Can you replace lines 1501-1503 in test_invariance_string_vs_numbers_labels?
Ah okay, now I understand what you mean.
@arjoly: Done.
@arjoly: Tests pass.
Cool, thanks!
Testing log_loss and hinge_loss under THRESHOLDED_METRICS
Fixes Issue #2637