[DOC] clarified hamming loss docstrings #13760
Conversation
sklearn/metrics/classification.py (Outdated)
@@ -1989,16 +1989,16 @@ def hamming_loss(y_true, y_pred, labels=None, sample_weight=None):
     -----
     In multiclass classification, the Hamming loss corresponds to the Hamming
     distance between ``y_true`` and ``y_pred`` which is equivalent to the
-    subset ``zero_one_loss`` function.
+    subset ``zero_one_loss`` function, when zero-one loss is normalized.
I don't understand what you mean by normalized. The zero-one loss is either 0 or 1. Maybe you mean averaged over samples?
Hi!
Please correct me if I'm wrong. I think the zero-one loss is either 0 or 1 for a single sample, but here we are talking about a subset (so a batch) of samples. In that case one has two choices: sum the losses of the samples, or take what I would call a "normalized sum", i.e. the sum rescaled to be 0 if all samples are right and 1 if they are all wrong.
I use the name "normalized sum" because in sklearn.metrics.zero_one_loss you get the second behaviour by setting the `normalize` parameter to True; it is a simple sum if `normalize` is False.
To my understanding, the fact that this "normalized sum" equals the average over samples is a simple consequence of the zero-one loss being 0 or 1 (so it is a consequence, not a defining property, which is why I am not so comfortable with the phrase "average over samples").
The clarification I want to make in the documentation is that, in multiclass classification, the Hamming loss is equivalent to the zero-one loss if and only if the zero-one loss is in its "normalized" form.
A minimal sketch to make this concrete (the sample values below are mine, chosen only for illustration): in the multiclass case, `hamming_loss` agrees with `zero_one_loss` exactly when `normalize=True`.
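```python
import numpy as np
from sklearn.metrics import hamming_loss, zero_one_loss

# Multiclass predictions: 2 of 4 samples are misclassified.
y_true = np.array([0, 1, 2, 3])
y_pred = np.array([0, 2, 2, 1])

# Hamming loss is the fraction of misclassified samples: 0.5.
print(hamming_loss(y_true, y_pred))                    # 0.5
# With normalize=True (the default), zero-one loss agrees.
print(zero_one_loss(y_true, y_pred, normalize=True))   # 0.5
# With normalize=False it is the raw count of errors instead.
print(zero_one_loss(y_true, y_pred, normalize=False))  # 2
```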
Looking simply at the diff, it was not clear that you were referring to the `normalize` parameter of the function. Can you clarify in the docstring that it refers to the `normalize` parameter? Then, looking at that parameter, it is clear what the sentence means. Thanks.
Thanks @XavierSATTLER
Reference Issues/PRs
Fixes #13734
What does this implement/fix? Explain your changes.
This issue started because of a typo in the hamming loss docstring: it made it seem that the Hamming loss had to be normalized. In fact, it is the zero-one loss that has to be normalized in order to be comparable with the Hamming loss.
I then saw another typo in the multilabel classification paragraph and added a few words to make the documentation clearer. As a side note, here is a small sketch (the example values are mine, not taken from the PR) of the multilabel case, where the Hamming loss and the subset zero-one loss genuinely differ.
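```python
import numpy as np
from sklearn.metrics import hamming_loss, zero_one_loss

# Multilabel indicators: only one label out of four is wrong.
y_true = np.array([[0, 1], [1, 1]])
y_pred = np.ones((2, 2), dtype=int)

# Hamming loss counts wrong labels: 1 of 4 -> 0.25.
print(hamming_loss(y_true, y_pred))   # 0.25
# Subset zero-one loss counts whole samples that are not an
# exact match: 1 of 2 -> 0.5, even in its normalized form.
print(zero_one_loss(y_true, y_pred))  # 0.5
```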