[MRG] Multi-label metrics: accuracy, hamming loss and zero-one loss #1606
Conversation
Parameters
----------
y_true : array-like or list of labels or label binary matrix
Very happy that you decided to support both list of labels and label binary matrix. Regarding the name of the latter, maybe label indicator matrix or class membership matrix would be more explicit?
Thanks "label indicator matrix" is better name than "label binary matrix".
I think I'm +1 for moving |
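(For readers skimming this thread: the two multilabel target representations being discussed can be sketched as follows; the variable names are purely illustrative.)

```python
import numpy as np

# Three samples, four possible labels (0..3).
# "List of labels" representation: each sample lists the labels it carries.
y_as_list_of_labels = [[0, 2], [3], [1, 2, 3]]

# The same targets as a "label indicator matrix": one row per sample,
# one column per label, with a 1 wherever the sample carries that label.
y_as_indicator = np.array([[1, 0, 1, 0],
                           [0, 0, 0, 1],
                           [0, 1, 1, 1]])
```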
Since those functions don't assess the performance of an estimator, I am not sure that the metrics module is the best place. I was thinking about a
Let's put them in
+1 for multiclass
---------
- :func:`metrics.accuracy_score` and :func:`metrics.zero_one_loss` support
  multi-label classification. A new metric :func:`metrics.hamming_loss` is
  added with multi-label support.
Add your name here: credit where it belongs!
+1 for multiclass too.
When I put
Another possible place would be in the utils.
What is the circle?
In preprocessing,
Maybe the best place is in
So it would be logical to find functions that check or analyze raw data.
OK, I think that tells me that we need to move things into utils.
All right! I will create a new utils module.
There is now a new utils module. I have pulled the unique_labels functionality from
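(Side note for context: in present-day scikit-learn this helper ended up in sklearn.utils.multiclass; the snippet below is just a small sketch of its behaviour, not the code from this PR.)

```python
from sklearn.utils.multiclass import unique_labels

# Ordered union of the labels seen across several target arrays.
unique_labels([3, 5, 5, 7], [0, 0, 1])
# array([0, 1, 3, 5, 7])
```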
Could you add multilabel support to precision / recall / f1 score? Once this is done, the multilabel tests in the multiclass module can be updated to use the metrics directly.
I intended to do that in my next pull request.
+1 for a separate PR.
The voice of reason: small and reviewable PRs. Don't worry @mblondel, I intend to add another PR with precision, recall and F-score. Perhaps one thing that could change is the name of the new utils module:
I rebased on top of master.
No worries. Could you discuss the relationship between hamming loss and zero-one loss in the docstring? Thanks.
@mblondel I think that I have taken your remarks into account. By the way, I added some more invariance tests.
In the multiclass (not multilabel) case, they are the same, right?
No, they differ. In the hamming loss, you divide each error by the number of labels. One small example:

In [22]: y2 = np.random.randint(0, 4, size=(5, ))
In [23]: y1 = np.random.randint(0, 4, size=(5, ))
In [24]: y1
Out[24]: array([2, 0, 3, 2, 2])
In [25]: y2
Out[25]: array([3, 1, 2, 1, 2])
In [26]: hamming_loss(y1, y2)
Out[26]: 0.40000000000000002
In [27]: zero_one_loss(y1, y2)
Out[27]: 0.80000000000000004

But thinking about it, the hamming loss is never larger than the zero-one loss. I will correct this.
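(To make the relationship concrete in the multilabel case, here is a small sketch using the present-day sklearn.metrics API on a label indicator matrix; the data is purely illustrative.)

```python
import numpy as np
from sklearn.metrics import accuracy_score, hamming_loss, zero_one_loss

# Two samples, three labels, in label indicator format.
y_true = np.array([[1, 0, 1],
                   [0, 1, 1]])
y_pred = np.array([[1, 0, 0],   # one of the three labels is wrong
                   [0, 1, 1]])  # all labels correct

hamming_loss(y_true, y_pred)    # 1 wrong label out of 6 entries -> 0.1666...
zero_one_loss(y_true, y_pred)   # 1 imperfect sample out of 2    -> 0.5
accuracy_score(y_true, y_pred)  # subset accuracy                -> 0.5
```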
Did you decide to add this normalization or is it always implemented like this in multilabel papers? http://en.wikipedia.org/wiki/Hamming_distance uses the unnormalized count.
I'm asking because it is important that our implementation of the metrics is as standard as possible. We could add a
The following papers agree on the normalization by the number of labels:
Great. Maybe you can cite the first one then.
Done
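(For reference, the normalization being discussed divides the label-wise error count by both the number of samples and the number of labels; a sketch of that definition, in my own notation:)

```latex
\mathrm{HammingLoss}(y, \hat{y}) =
  \frac{1}{n_{\text{samples}} \, n_{\text{labels}}}
  \sum_{i=1}^{n_{\text{samples}}} \sum_{j=1}^{n_{\text{labels}}}
  \mathbf{1}\left[ y_{ij} \neq \hat{y}_{ij} \right]
```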
I will have time this week to work on the precision, recall and F-measure metrics to support the multi-label format. Furthermore, I would like to add the Jaccard similarity measure (an example-based accuracy measure). What do you advise? I will need some of the functions in
Maybe do a PR on top of this one? We really should try to get this one in :-/
If I do a PR on top of this one, will I have problems if I rebase this one on top of master?
This one merges cleanly, no reason to rebase (though I'd like to ...). You can branch off current master, merge this branch into your new branch, then add the functionality you want.
I will do as you suggest! Thanks!
Btw, it's easier if you first squash the commits using a
Or someone can give this one a second +1 and we merge it ;)
I'll try to review the PR this afternoon. I'll merge it if I think it's ready.
Awesome, thanks :)
We've got an inconsistency in the documentation. The dev docs say
@@ -599,13 +667,16 @@
classification loss (:math:`L_{0-1}`) over :math:`n_{\text{samples}}`. By
default, the function normalizes over the samples. To get the sum of the
:math:`L_{0-1}`, set ``normalize`` to ``False``.

In multilabel classification, the :func:`zero_one_loss` function corresponds
to the subset zero-one loss: the subset of labels must be correctly predicted.
I don't get this sentence.
Ok, pushed to master after squashing. Thanks @arjoly for tackling this important problem: evaluation can be dull and it can make your head hurt, but it's crucial for a machine learning toolkit.
Thanks to all reviewers!!!
We have a failure on the Numpy 1.3/Scipy 0.7 build bot:
Can anyone have a look at the failing doctest? https://jenkins.shiningpanda-ci.com/scikit-learn/job/python-2.6-numpy-1.3.0-scipy-0.7.2/1656/console https://jenkins.shiningpanda-ci.com/scikit-learn/job/python-2.6-numpy-1.3.0-scipy-0.7.2/
I will have a look this afternoon.
I am not able to install numpy 1.3 with python 2.6. :-$
I think you can reproduce it with numpy 1.3 on python 2.7. I don't see why it would be specific to 2.6.
No, I suspect it's Numpy-specific.
I am working on it.
I bet SciPy has very little to do with this, so you can try a later version first to see if you get the failures. (Otherwise, try finding an old version of a Linux distro that has these versions and install it in a VM.)
I haven't been able to install python 2.7 with numpy 1.3 and scipy 0.7.
Same with scipy 0.6 or any scipy 0.7.x version... With python 2.6, I am not able to install numpy 1.3 due to a problem with unicode characters (the ucs2/ucs4 issue). Lastly, scipy 0.8 needs at least numpy 1.4... I suppose that a simple
Any suggestion for a Linux distro with the required packages (and, if possible, easy installation of) python 2.6, numpy 1.3 and scipy 0.7?
Ubuntu Lucid should do.
This pull request intends to bring 3 new features:
- a unique_labels function;
- multi-label support for the accuracy_score and zero_one_loss functions;
- a new metric (hamming_loss) with multi-label support.

Before merging, I would like to suggest adding a new module where multi-label utilities such as unique_labels and _is_label_indicator_matrix are collected. Furthermore, I have to re-organise (cosmit) some of the functions into a multi-label category in metrics.py, but I will wait until the reviews are done.

This pull request also tackles issue #558. Reviews and comments are welcome! :-)