
Issue 1527: normalize option for zero-one loss #1534


Closed
wants to merge 5 commits into from

Conversation

kyleabeauchamp
Contributor

I think this is the easiest way to address this issue.

- Positive integer (number of misclassifications). The best performance
-   is 0.
+ Positive integer (number of misclassifications) or float (fraction of
+   misclassifications). The best performance is 0.
Member

or float (fraction of misclassifications) when normalize=True. Other than that, +1 for merge.

@amueller
Member

amueller commented Jan 8, 2013

Thanks for the pull request.
That looks good.
+1 for merge apart from @mblondel's comment.

For the future, I think it is good if the title of the pull request is a bit more descriptive than just the issue number. These are a bit hard to remember ;)

@amueller
Member

amueller commented Jan 8, 2013

Oh, and can you please also add a test for the new functionality?

    Returns
    -------
    loss : float

    """
    y_true, y_pred = check_arrays(y_true, y_pred)
    if normalize == False:
        return np.sum(y_pred != y_true)
Member

if not normalize: is better

@arjoly
Member

arjoly commented Jan 8, 2013

Is it possible to wait a little before merging?
PR #1512 is almost finished, which would give you the chance to update the narrative doc as well.

By the way, an example would be nice in the docstring.

@jaquesgrobler
Member

Apart from the above comments, 👍 for merge

@arjoly
Member

arjoly commented Jan 8, 2013

@kyleabeauchamp As we discussed in #1512, don't wait for my PR. I'll update the narrative doc.

@arjoly
Member

arjoly commented Jan 8, 2013

For the example, you can use:

Examples
--------
>>> from sklearn.metrics import zero_one
>>> y_pred = [0, 2, 1, 3, 4]
>>> y_true = [0, 1, 2, 3, 4]
>>> zero_one(y_true, y_pred)
2
>>> zero_one(y_true, y_pred, normalize=True)
0.4
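Putting the review threads together, here is a minimal sketch of what the patched function might look like. This is an assumption for illustration, not the merged scikit-learn code: the check_arrays input validation is omitted, and the int/float casts are only there so the printed output matches the doctest above.

```python
import numpy as np

def zero_one(y_true, y_pred, normalize=False):
    """Zero-one classification loss (sketch).

    Returns the number of misclassifications by default, or the
    fraction of misclassifications when normalize=True.
    """
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    if not normalize:
        return int(np.sum(y_pred != y_true))
    return float(np.mean(y_pred != y_true))

y_pred = [0, 2, 1, 3, 4]
y_true = [0, 1, 2, 3, 4]
print(zero_one(y_true, y_pred))                  # 2
print(zero_one(y_true, y_pred, normalize=True))  # 0.4
```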

@arjoly
Member

arjoly commented Jan 8, 2013

For the testing part, you can add and adapt existing tests in test_metrics.py.

@arjoly
Member

arjoly commented Jan 8, 2013

When you have finished, add your contribution in doc/whats_new.rst.

Thanks for the pr. :-)

@kyleabeauchamp
Contributor Author

OK, I think I've integrated everyone's comments. I think it's ready to go.

One note: the following doctest didn't work as written because of floating-point representation:

zero_one(y_true, y_pred, normalize=True)
0.4

I did some rounding to avoid FP ambiguity.

@amueller
Member

amueller commented Jan 8, 2013

Instead of rounding, you should use ... in doctests and add the ELLIPSIS flag (git grep DOCTESTS or something; I can never remember the exact syntax).
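For reference, the mechanism @amueller is pointing at is the standard library doctest module's ELLIPSIS option. A minimal self-contained illustration (not scikit-learn code; the example expression is made up), which parses and runs a doctest string where "..." matches the tail of the float repr:

```python
import doctest

# A doctest where the expected output is truncated; the +ELLIPSIS
# directive lets "0.333..." match the full repr of 1.0 / 3.0.
docstring = """
>>> 1.0 / 3.0  # doctest: +ELLIPSIS
0.333...
"""

# Parse the string into a DocTest and run it programmatically.
parser = doctest.DocTestParser()
dt = parser.get_doctest(docstring, {}, "ellipsis_demo", None, 0)
runner = doctest.DocTestRunner(verbose=False)
results = runner.run(dt)
print(results.failed)  # 0
```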

@kyleabeauchamp
Contributor Author

Are you sure this is the best approach? The Python docs (http://docs.python.org/2/library/doctest.html) seem to suggest that rounding is the best way to go. My concern with ELLIPSIS is that it's a general regex capability, while rounding allows us to directly control the level of precision required.

@vene
Member

vene commented Jan 8, 2013

I would use rounding for tests and ellipsis for doctests, since in doctests the prime concern is for the example to be clear to the user; using round might be confusing, and someone unfamiliar might think you somehow need to round every time, or something like that. Just my 2p.

@amueller
Member

amueller commented Jan 8, 2013

This is the usual way it is done in sklearn.
We don't really rely on doctests for tests; they are just examples that we know work.
And examples are easier to read without a call to round, imho.
Btw, you can still directly control the required precision.

>>> y_true = [0, 1, 2, 3, 4]
>>> zero_one(y_true, y_pred)
2
>>> round(zero_one(y_true, y_pred, normalize=True),5)
Member

If we decide on sticking with round here, please add a space between the comma and the 5; this is not PEP8-compliant.

@vene
Member

vene commented Jan 8, 2013

@amueller we cross-posted... at least we agreed :)

@amueller
Member

amueller commented Jan 8, 2013

@vene yeah, saw that. I take that as a good sign ;)

@kyleabeauchamp
Contributor Author

OK, I agree that ELLIPSIS is prettier for users.

@arjoly
Member

arjoly commented Jan 8, 2013

Instead of having round or ellipsis, it might be better to change the example.
Just a quick ipython session:

In [1]: import numpy as np

In [2]: y_true = np.array([1, 2, 3, 4])

In [3]: y_pred = np.array([1, 3, 2, 4])

In [4]: np.mean(y_true != y_pred)
Out[4]: 0.5

In [5]: np.sum(y_true != y_pred)
Out[5]: 2
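The point of swapping the example: 2 mismatches out of 4 gives 0.5, which, unlike 0.4, has an exact binary floating-point representation (0.5 == 2**-1), so the doctest output is stable with no need for round or ELLIPSIS. A quick check of the session above:

```python
import numpy as np

y_true = np.array([1, 2, 3, 4])
y_pred = np.array([1, 3, 2, 4])

# 2 mismatches out of 4 elements.
frac = float(np.mean(y_true != y_pred))
print(frac)         # 0.5
print(frac == 0.5)  # True: 0.5 is exactly representable in binary
```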

@@ -517,6 +519,10 @@ def test_symmetry():
# symmetric
assert_equal(zero_one(y_true, y_pred),
zero_one(y_pred, y_true))

assert_almost_equal(zero_one(y_true, y_pred, normalize=True),
zero_one(y_pred, y_true, normalize=True), 2)
Member

pep8

@amueller
Member

amueller commented Jan 8, 2013

sure that also works ;)

@arjoly
Member

arjoly commented Jan 8, 2013

I tend to agree with @vene's comment on the sentence formulation.
But if @vene is ok, then it is ok for me. :-) Ok for you, @vene?

Apart from the above comment, +1 for merge.

Thanks for your quick responses @kyleabeauchamp !!!

@ogrisel
Member

ogrisel commented Jan 8, 2013

Looks good. +1 for merging as well.

@amueller
Member

amueller commented Jan 8, 2013

Crazy number of reviewers out there tonight. Sweet :) Will merge by rebase! (Pretty sure @vene won't mind ;)

@amueller
Member

amueller commented Jan 8, 2013

merged

@amueller amueller closed this Jan 8, 2013
@amueller
Member

amueller commented Jan 8, 2013

Thanks a lot @kyleabeauchamp, in particular for the immediate responses! 👍

@amueller
Member

@kyleabeauchamp When you added yourself to whatsnew.rst, you made your name a link, but didn't add a website. I removed the link for now. Give me a shout if you want to add your website.

@kyleabeauchamp
Contributor Author

Oops, sorry about that: I copy-pasted a previous line in whatsnew.rst and copied my name in there. I didn't fully realize what the formatting was doing. Thanks.


@amueller
Member

I guessed it was something like that ;)
No problem. Just wanted to give you the opportunity to add a link if you
like.
