Skip to content

Testing log_loss and hinge_loss under THRESHOLDED_METRICS #2717

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Jan 8, 2014

Conversation

MechCoder
Copy link
Member

Fixes Issue #2637

@MechCoder
Copy link
Member Author

@arjoly , Would you be able to review this?

@coveralls
Copy link

Coverage Status

Coverage remained the same when pulling 6b474aa on Manoj-Kumar-S:test_log_loss into bb743dd on scikit-learn:master.

for name, metrics in THRESHOLDED_METRICS.items():
# Hinge_loss raises TypeError
THRESHOLDED_METRICS_ITEMS = THRESHOLDED_METRICS.items()
THRESHOLDED_METRICS_ITEMS.remove(('hinge_loss', hinge_loss))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it doesn't raise a ValueError, then it should give proper results with string input. Can you add a test for that?

edit: see below

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A comment like that ("raises TypeError") needs to be a bit more explanatory.

Also, exceptions in invariance tests are usually coded as:

for ... :
    if exceptional condition:
        continue

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add test to check that it raises TypeError? It's strange, I though that log_loss support string label.

Thanks @jnothman

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add test to check that it raises TypeError? It's strange, I though that log_loss support string label.

The docstring of los_loss have the following example

    >>> log_loss(["spam", "ham", "ham", "spam"],  # doctest: +ELLIPSIS
    ...          [[.1, .9], [.9, .1], [.8, .2], [.35, .65]])

Thus this test should pass for the log_loss.

Note that a bug has glimpsed into the test and it shouldn't try to test with y2_str, but with y2.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since y2_str has non float/int values doing y_true*pred_decision raises TypeError.

EDIT:
Something similar to this.

>>> hinge_loss(["spam", "ham", "ham", "spam"], ["ham", "spam", "ham", "spam"] )
        TypeError: unsupported operand type(s) for *: 'numpy.ndarray' and 'numpy.ndarray            

It is hinge_loss by the way. log_loss does raise a ValueError

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not, it's consistent with the documentation. This features can be considered as not implemented (not tested means not implemented in my opinion).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm confused what you are asking for, @arjoly. I think the test is correct at master, and hinge_loss should also raise ValueError

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I mean is

    for name, metrics in THRESHOLDED_METRICS_ITEMS:
          assert_raises(ValueError, metrics, y1_str, y2_str)

should be (or at least the following lines of code added)

    for name, metrics in THRESHOLDED_METRICS_ITEMS:
          assert_raises(ValueError, metrics, y1_str, y2)

Because y2_str can't be a valid input for thresholded metrics. The goal of the test_invariance_string_vs_numbers_labelsin my opinion is to check that metrics can handle string labels correctly.

If log_loss supports string input for labels, it should also be tested with y1_str.astype('O'). Either the invariance test could be enhanced, either one more assertion could be added in test_log_loss.

Is it clearer?

I think the test is correct at master, and hinge_loss should also raise ValueError

Sorry, I wasn't clear. I wasn't talking about the exception to return.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it's clearer. Though I don't understand whether or not you're suggesting it should actually raise an error when the first arg is strings and the second thresholds.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Though I don't understand whether or not you're suggesting it should actually raise an error when the first arg is strings and the second thresholds.

If it doesn't work with string, it should raise an helpful error message as in #2723. Otherwise, It should be tested that it give the correct answer. :-)

@MechCoder
Copy link
Member Author

@arjoly , @jnothman . Thanks for reviewing. Do you have any other comments before I push in another commit.

@@ -265,7 +271,12 @@ def hinge_loss(y_true, pred_decision, pos_label=None, neg_label=None):
else:
y_true = LabelBinarizer(neg_label=-1).fit_transform(y_true)[:, 0]

margin = y_true * np.asarray(pred_decision)
# Done to be homogeneous with the output of LabelBinarizer.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this comment is clear, and the code speaks well enough for itself, given the error message.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jnothman This was actually meant for the line

pred_decision = np.ravel(pred_decision)

below. I wrote this comment because, y_true is a 1-D array after

y_true = LabelBinarizer(neg_label=-1).fit_transform(y_true)[:, 0]

So pred_decision should also be 1-D for correct multiplication

@amueller
Copy link
Member

amueller commented Jan 6, 2014

+1 for merge, preferably with the two comments about check_arrays addressed.

@coveralls
Copy link

Coverage Status

Changes Unknown when pulling 9d1276b on Manoj-Kumar-S:test_log_loss into * on scikit-learn:master*.

@MechCoder
Copy link
Member Author

@arjoly , @jnothman : Good for merge now?

@coveralls
Copy link

Coverage Status

Coverage remained the same when pulling b1bf71e on Manoj-Kumar-S:test_log_loss into a19bbe3 on scikit-learn:master.

@coveralls
Copy link

Coverage Status

Coverage remained the same when pulling b1bf71e on Manoj-Kumar-S:test_log_loss into a19bbe3 on scikit-learn:master.

@MechCoder
Copy link
Member Author

@arjoly and @jnothman : I fixed up a few wrong commits by rebasing against master. This should be good to go now.

@MechCoder
Copy link
Member Author

@jnothman : Any more comments or are you +1 with this?

@jnothman
Copy link
Member

jnothman commented Jan 8, 2014

Yes, LGTM.

T, Y = check_arrays(T, Y)
if T.shape[1] != Y.shape[1]:
raise ValueError("y_true and y_pred have different number of classes "
"%d, %d" % (T.shape[1], Y.shape[1]))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line is missed in the line coverage.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@arjoly You mean a test for this?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it would be nice :-) to have an invariance test for that !!

@arjoly
Copy link
Member

arjoly commented Jan 8, 2014

I would also add a test with string object array for the log_loss function.

@arjoly
Copy link
Member

arjoly commented Jan 8, 2014

except from the minor comment, LGTM

@MechCoder
Copy link
Member Author

@arjoly: A string object array is already tested.

On Wed, Jan 8, 2014 at 6:14 PM, Arnaud Joly notifications@github.comwrote:

I would also add a test with string object array for the log_loss function.

@arjoly
Copy link
Member

arjoly commented Jan 8, 2014

@arjoly: A string object array is already tested.

Sorry for the noise, can you point the testing to refresh my memory ?

@MechCoder
Copy link
Member Author

@arjoly Doesn't L1500-1501 in test_metrics.py(in master) test the case when y_true and y_pred are both strings.

Also the new test for invariance test can be added under test_log_loss right?

@arjoly
Copy link
Member

arjoly commented Jan 8, 2014

@arjoly Doesn't L1500-1501 in test_metrics.py(in master) test the case when y_true and y_pred are both strings.

If you remember my previous comment,

    for name, metrics in THRESHOLDED_METRICS.items():
        assert_raises(ValueError, metrics, y1_str, y2_str)

will alway raise an exception since y_score must be an array of float. I think this a bug / misstake that I made. It should be instead

    for name, metrics in THRESHOLDED_METRICS.items():
        assert_raises(ValueError, metrics, y1_str, y2)

where y2 correspond to numerical scores.

Also the new test for invariance test can be added under test_log_loss right?

Yes, if you plan to test only log_loss.

@MechCoder
Copy link
Member Author

Ah okay, Sorry I misinterpreted string object array

@MechCoder
Copy link
Member Author

@arjoly : Please check the latest commit and can you merge if tests pass?

@coveralls
Copy link

Coverage Status

Coverage remained the same when pulling 1775fc8 on Manoj-Kumar-S:test_log_loss into a19bbe3 on scikit-learn:master.

@arjoly
Copy link
Member

arjoly commented Jan 8, 2014

Can you replace lines 1501-1503 by

    # y2_str should be a score
    for name, metrics in THRESHOLDED_METRICS.items():
        assert_raises(ValueError, metrics, y1_str, y2_str)

    for name, metric in THRESHOLDED_METRICS.items():
        if name in ("log_loss", "hinge_loss"):
            measure_with_number = metric(y1, y2)
            measure_with_str = metric(y1_str, y2)
            assert_array_equal(measure_with_number, measure_with_str,
                               err_msg="{0} failed string vs number invariance "
                                       "test".format(name))

            measure_with_strobj = metric(y1_str.astype('O'), y2)
            assert_array_equal(measure_with_number, measure_with_strobj,
                               err_msg="{0} failed string object vs number "
                                       "invariance test".format(name))
        else:
            # TODO those metrics doesn't support string label yet
            assert_raises(ValueError, metrics, y1_str, y2)
            assert_raises(ValueError, metrics, y1_str.astype('O'), y2)

in test_invariance_string_vs_numbers_labels`?

@MechCoder
Copy link
Member Author

Ah okay, now I understand what you mean.

@MechCoder
Copy link
Member Author

@arjoly : Done.

@coveralls
Copy link

Coverage Status

Coverage remained the same when pulling c9714b5 on Manoj-Kumar-S:test_log_loss into 834b375 on scikit-learn:master.

@coveralls
Copy link

Coverage Status

Coverage remained the same when pulling c9714b5 on Manoj-Kumar-S:test_log_loss into 834b375 on scikit-learn:master.

@MechCoder
Copy link
Member Author

@arjoly : Tests pass

@arjoly
Copy link
Member

arjoly commented Jan 8, 2014

Cool thanks !!

arjoly added a commit that referenced this pull request Jan 8, 2014
Testing log_loss and hinge_loss under THRESHOLDED_METRICS
@arjoly arjoly merged commit 8f2d8b9 into scikit-learn:master Jan 8, 2014
@MechCoder MechCoder deleted the test_log_loss branch January 8, 2014 19:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants