
Issue 1527: normalize option for zero-one loss #1534


Closed
wants to merge 5 commits into from

Conversation

kyleabeauchamp
Contributor

I think this is the easiest way to address this issue.

- Positive integer (number of misclassifications). The best performance
-   is 0.
+ Positive integer (number of misclassifications) or float (fraction of
+   misclassifications). The best performance is 0.
Member

or float (fraction of misclassifications) when normalize=True. Other than that, +1 for merge.

@amueller
Member

amueller commented Jan 8, 2013

Thanks for the pull request.
That looks good.
+1 for merge apart from @mblondel's comment.

For the future, I think it is good if the title of the pull request is a bit more descriptive than just the issue number. These are a bit hard to remember ;)

@amueller
Member

amueller commented Jan 8, 2013

Oh, and can you please also add a test for the new functionality?

    Returns
    -------
    loss : float

    """
    y_true, y_pred = check_arrays(y_true, y_pred)
    if normalize == False:
        return np.sum(y_pred != y_true)
Member

if not normalize: is better

@arjoly
Member

arjoly commented Jan 8, 2013

Is it possible to wait a little before merging?
PR #1512 is almost finished, which would give you the chance to update the narrative doc as well.

By the way, an example would be nice in the docstring.

@jaquesgrobler
Member

Apart from the above comments, 👍 for merge

@arjoly
Member

arjoly commented Jan 8, 2013

@kyleabeauchamp As we discussed in #1512, don't wait for my PR. I'll update the narrative doc.

@arjoly
Member

arjoly commented Jan 8, 2013

For the example, you can use:

Examples
--------
>>> from sklearn.metrics import zero_one
>>> y_pred = [0, 2, 1, 3, 4]
>>> y_true = [0, 1, 2, 3, 4]
>>> zero_one(y_true, y_pred)
2
>>> zero_one(y_true, y_pred, normalize=True)
0.4
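Putting the review threads together, here is a minimal sketch of what the patched function might look like. This is an assumption for illustration, not the merged scikit-learn code: the check_arrays input validation is omitted, and the int/float casts are only there so the printed output matches the doctest above.

```python
import numpy as np

def zero_one(y_true, y_pred, normalize=False):
    """Zero-one classification loss (sketch).

    Returns the number of misclassifications by default, or the
    fraction of misclassifications when normalize=True.
    """
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    if not normalize:
        return int(np.sum(y_pred != y_true))
    return float(np.mean(y_pred != y_true))

y_pred = [0, 2, 1, 3, 4]
y_true = [0, 1, 2, 3, 4]
print(zero_one(y_true, y_pred))                  # 2
print(zero_one(y_true, y_pred, normalize=True))  # 0.4
```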

@arjoly
Member

arjoly commented Jan 8, 2013

For the testing part, you can add and adapt existing tests in test_metrics.py.

@arjoly
Member

arjoly commented Jan 8, 2013

When you have finished, add your contribution in doc/whats_new.rst.

Thanks for the pr. :-)

@kyleabeauchamp
Contributor Author

OK, I think I've integrated everyone's comments. I think it's ready to go.

One note: the following doctest didn't work as written because of floating-point representation:

zero_one(y_true, y_pred, normalize=True)
0.4

I did some rounding to avoid FP ambiguity.

@amueller
Member

amueller commented Jan 8, 2013

Instead of rounding, you should use ... in doctests and add the ELLIPSIS flag (git grep DOCTESTS or something; I can never remember the exact syntax).
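For reference, the mechanism @amueller is pointing at is the standard library doctest module's ELLIPSIS option. A minimal self-contained illustration (not scikit-learn code; the example expression is made up), which parses and runs a doctest string where "..." matches the tail of the float repr:

```python
import doctest

# A doctest where the expected output is truncated; the +ELLIPSIS
# directive lets "0.333..." match the full repr of 1.0 / 3.0.
docstring = """
>>> 1.0 / 3.0  # doctest: +ELLIPSIS
0.333...
"""

# Parse the string into a DocTest and run it programmatically.
parser = doctest.DocTestParser()
dt = parser.get_doctest(docstring, {}, "ellipsis_demo", None, 0)
runner = doctest.DocTestRunner(verbose=False)
results = runner.run(dt)
print(results.failed)  # 0
```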

@kyleabeauchamp
Contributor Author

Are you sure this is the best approach? The Python docs (http://docs.python.org/2/library/doctest.html) seem to suggest that rounding is the best way to go. My concern with ELLIPSIS is that it's a general regex capability, while rounding allows us to directly control the level of precision required.

@vene
Member

vene commented Jan 8, 2013

I would use rounding for tests and ellipsis for doctests, since in doctests the prime concern is for the example to be clear to the user; using round might be confusing, and someone unfamiliar might think you somehow need to round every time, or something like that. Just my 2p.

@amueller
Member

amueller commented Jan 8, 2013

This is the usual way it is done in sklearn.
We don't really rely on doctests for tests; they are just examples that we know work.
And examples are easier to read without a call to round, imho.
Btw, you can still directly control the required precision.

>>> y_true = [0, 1, 2, 3, 4]
>>> zero_one(y_true, y_pred)
2
>>> round(zero_one(y_true, y_pred, normalize=True),5)
Member

If we decide on sticking with round here, please add a space between the comma and the 5; this is not PEP8-compliant.

@vene
Member

vene commented Jan 8, 2013

@amueller we cross-posted... at least we agreed :)

@amueller
Member

amueller commented Jan 8, 2013

@vene yeah, saw that. I take that as a good sign ;)

@kyleabeauchamp
Contributor Author

OK, I agree that ELLIPSIS is prettier for users.

@arjoly
Member

arjoly commented Jan 8, 2013

Instead of having round or ellipsis, it might be better to change the example.
Just a quick ipython session:

In [1]: import numpy as np

In [2]: y_true = np.array([1, 2, 3, 4])

In [3]: y_pred = np.array([1, 3, 2, 4])

In [4]: np.mean(y_true != y_pred)
Out[4]: 0.5

In [5]: np.sum(y_true != y_pred)
Out[5]: 2
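The point of swapping the example: 2 mismatches out of 4 gives 0.5, which, unlike 0.4, has an exact binary floating-point representation (0.5 == 2**-1), so the doctest output is stable with no need for round or ELLIPSIS. A quick check of the session above:

```python
import numpy as np

y_true = np.array([1, 2, 3, 4])
y_pred = np.array([1, 3, 2, 4])

# 2 mismatches out of 4 elements.
frac = float(np.mean(y_true != y_pred))
print(frac)         # 0.5
print(frac == 0.5)  # True: 0.5 is exactly representable in binary
```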

@@ -517,6 +519,10 @@ def test_symmetry():
# symmetric
assert_equal(zero_one(y_true, y_pred),
zero_one(y_pred, y_true))

assert_almost_equal(zero_one(y_true, y_pred, normalize=True),
zero_one(y_pred, y_true, normalize=True), 2)
Member

pep8

@amueller
Member

amueller commented Jan 8, 2013

sure that also works ;)

@arjoly
Member

arjoly commented Jan 8, 2013

I tend to agree with @vene's comment on the sentence formulation.
But if @vene is ok, then it is ok for me. :-) Ok for you, @vene?

Apart from the above comment, +1 for merge.

Thanks for your quick responses @kyleabeauchamp !!!

@ogrisel
Member

ogrisel commented Jan 8, 2013

Looks good. +1 for merging as well.

@amueller
Member

amueller commented Jan 8, 2013

Crazy number of reviewers out there tonight. Sweet :) Will merge by rebase! (Pretty sure @vene won't mind ;)

@amueller
Member

amueller commented Jan 8, 2013

merged

@amueller amueller closed this Jan 8, 2013
@amueller
Member

amueller commented Jan 8, 2013

Thanks a lot @kyleabeauchamp, in particular for the immediate responses! 👍

@amueller
Member

@kyleabeauchamp When you added yourself to whatsnew.rst, you made your name a link, but didn't add a website. I removed the link for now. Give me a shout if you want to add your website.

@kyleabeauchamp
Contributor Author

Oops, sorry about that: I copy-pasted a previous line in whatsnew.rst and copied my name in there. I didn't fully realize what the formatting was doing. Thanks.


@amueller
Member

I guessed it was something like that ;)
No problem. Just wanted to give you the opportunity to add a link if you
like.
