[MRG] average parameter for jaccard_similarity_score #10083
Conversation
sklearn/metrics/classification.py (outdated)

    score = y_true == y_pred
    C = confusion_matrix(y_true, y_pred, sample_weight=sample_weight)
    den = C.sum(0) + C.sum(1) - C.diagonal()
    score = C.diagonal()/den
Leave spaces around / please
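As a side note on what this formulation computes, here is a minimal sketch of my own (not part of the diff): the per-class Jaccard score is the confusion-matrix diagonal divided by each class's union, i.e. column total plus row total minus the diagonal.

    import numpy as np
    from sklearn.metrics import confusion_matrix

    # Toy multiclass example (illustrative values only).
    y_true = [0, 1, 2, 2, 1]
    y_pred = [0, 2, 2, 2, 1]

    C = confusion_matrix(y_true, y_pred)
    # For class k: TP = C[k, k]; union = (predicted as k) + (truly k) - TP,
    # i.e. column sum + row sum - diagonal.
    den = C.sum(axis=0) + C.sum(axis=1) - C.diagonal()
    per_class_jaccard = C.diagonal() / den
    print(per_class_jaccard)  # one value per class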
sklearn/metrics/classification.py (outdated)

    score = y_true == y_pred
    C = confusion_matrix(y_true, y_pred, sample_weight=sample_weight)
    den = C.sum(0) + C.sum(1) - C.diagonal()
    score = C.diagonal()/den

    return _weighted_sum(score, sample_weight, normalize)
Can this still work??
I don't quite understand why this might be a problem.
Well, in the previous code, score was an array of length n_samples. Now score is an array of length n_classes. It clearly cannot be weighted by sample_weight, which is an array of length n_samples.
I can see that I am getting an error like:

    ======================================================================
    ERROR: /home/gxyd/foss/scikit-learn/sklearn/metrics/tests/test_common.py.test_sample_weight_invariance:check_sample_weight_invariance(jaccard_similarity_score)
    ----------------------------------------------------------------------
    Traceback (most recent call last):
      File "/home/gxyd/anaconda3/lib/python3.6/site-packages/nose/case.py", line 197, in runTest
        self.test(*self.arg)
      File "/home/gxyd/foss/scikit-learn/sklearn/utils/testing.py", line 774, in __call__
        return self.check(*args, **kwargs)
      File "/home/gxyd/foss/scikit-learn/sklearn/utils/testing.py", line 309, in wrapper
        return fn(*args, **kwargs)
      File "/home/gxyd/foss/scikit-learn/sklearn/metrics/tests/test_common.py", line 957, in check_sample_weight_invariance
        metric(y1, y2, sample_weight=np.ones(shape=len(y1))),
      File "/home/gxyd/foss/scikit-learn/sklearn/metrics/classification.py", line 468, in jaccard_similarity_score
        return _weighted_sum(score, sample_weight, normalize)
      File "/home/gxyd/foss/scikit-learn/sklearn/metrics/classification.py", line 108, in _weighted_sum
        return np.average(sample_score, weights=sample_weight)
      File "/home/gxyd/anaconda3/lib/python3.6/site-packages/numpy/lib/function_base.py", line 944, in average
        "Axis must be specified when shapes of a and weights "
    TypeError: Axis must be specified when shapes of a and weights differ.
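A minimal standalone reproduction of that failure mode (hypothetical shapes, not taken from the PR): np.average refuses to pair a length-n_classes score array with length-n_samples weights.

    import numpy as np

    # Hypothetical shapes: 3 classes, 5 samples.
    per_class_score = np.array([1.0, 0.5, 0.75])  # length n_classes
    sample_weight = np.ones(5)                    # length n_samples

    try:
        np.average(per_class_score, weights=sample_weight)
    except TypeError as exc:
        # "Axis must be specified when shapes of a and weights differ."
        print(exc)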
I hope it wouldn't be a problem if I get back to this after my exams are over; I'll have more time then.
Sure, just let us know if time runs away and you'd like someone else to complete it.
I had some time and tried to fix the issue, but there still seem to be some errors:

    FAIL: sklearn.metrics.tests.test_common.test_normalize_option_binary_classification
    ----------------------------------------------------------------------
    Traceback (most recent call last):
      File "/home/gxyd/anaconda3/lib/python3.6/site-packages/nose/case.py", line 197, in runTest
        self.test(*self.arg)
      File "/home/gxyd/foss/scikit-learn/sklearn/metrics/tests/test_common.py", line 776, in test_normalize_option_binary_classification
        / n_samples, measure)
      File "/home/gxyd/anaconda3/lib/python3.6/site-packages/numpy/testing/utils.py", line 539, in assert_almost_equal
        raise AssertionError(_build_err_msg())
    AssertionError:
    Arrays are not almost equal to 7 decimals
     ACTUAL: 0.037259615384615384
     DESIRED: 0.37259615384615385

    ======================================================================
    FAIL: sklearn.metrics.tests.test_common.test_normalize_option_multiclass_classification
    ----------------------------------------------------------------------
    Traceback (most recent call last):
      File "/home/gxyd/anaconda3/lib/python3.6/site-packages/nose/case.py", line 197, in runTest
        self.test(*self.arg)
      File "/home/gxyd/foss/scikit-learn/sklearn/metrics/tests/test_common.py", line 792, in test_normalize_option_multiclass_classification
        / n_samples, measure)
      File "/home/gxyd/anaconda3/lib/python3.6/site-packages/numpy/testing/utils.py", line 539, in assert_almost_equal
        raise AssertionError(_build_err_msg())
    AssertionError:
    Arrays are not almost equal to 7 decimals
     ACTUAL: 0.016666666666666666
     DESIRED: 0.083333333333333329

I am guessing I need to modify the 'other' tests. I'll give it some more time and see.
There shouldn't be a normalize option on jaccard, as far as I'm concerned... But I think you will need to change test_common to make sure these tests only run for metrics that normalize over the sample size.
So I should remove `normalize` from the API of `jaccard_similarity_score`? If so, I have seen that most of the metric scores do have `normalize` in their API.
I think we need to reconsider how we're going about this. Even in the multilabel case it's appropriate to do this averaging across classes (as discussed in the issue), rather than the sample-wise measure we currently report. This is equivalent to the average=macro/samples distinction in P/R/F. The current implementation is arguably faithfully average=samples, even in the multiclass case. As in P/R/F with average=micro, it is only an interesting metric if a subset of classes is selected with a labels parameter, and as there, the binary case, where one class is important and the other disregarded, needs to be distinguished from the all-classes average. Normalize in the usual sense is only applicable if average=samples.

Basically we need to make the interface the same as for precision_recall_fscore_support, but perhaps retaining normalize in the average=samples case. Does that all make sense? Sorry it makes the task much larger. I hope you are able to nonetheless complete it!
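To make the macro vs. samples distinction concrete, here is a rough sketch of my own (illustrative arrays, not the PR's code) for multilabel indicator input: macro averages the per-class (column-wise) Jaccard scores, while samples averages the per-sample (row-wise) Jaccard scores.

    import numpy as np

    # Hypothetical multilabel indicator matrices: rows are samples, columns are classes.
    y_true = np.array([[1, 1, 0],
                       [1, 0, 0],
                       [0, 1, 1]])
    y_pred = np.array([[1, 0, 0],
                       [1, 0, 0],
                       [0, 1, 0]])

    tp = y_true & y_pred      # intersection per cell
    union = y_true | y_pred   # union per cell

    # average='macro': Jaccard per class (column), then unweighted mean over classes.
    macro = np.mean(tp.sum(axis=0) / union.sum(axis=0))

    # average='samples': Jaccard per sample (row), then mean over samples.
    samples = np.mean(tp.sum(axis=1) / union.sum(axis=1))

    print(macro, samples)  # 0.5 vs. ~0.667 for these toy arrays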
BTW, are you using P/R/F for 'Precision/Recall/F1'? Also, is this generally used this way in machine learning? (I searched on the web but still couldn't find any reference to the usage of P/R/F.)
Yes, that is my abbreviation, I suppose. But you will often see results reported with columns named P, R and F, at least in NLP...
There is probably an issue with a docstring example (it is not printed correctly, maybe not even tested), here: http://scikit-learn.org/stable/modules/generated/sklearn.metrics.precision_recall_fscore_support.html (look at the bottom). I tried building docs locally using
Okay. For the last message I got some help on gitter, and I have addressed my own last comment. I'll now try to address your comment.
I've tried to move in that direction. I'll add tests and make documentation changes; it's getting a little late for today.
This pull request introduces 1 alert - view on lgtm.com new alerts:

Comment posted by lgtm.com
I'll need to think about the multi-label case, about whether we can have all three types of 'averages' (i.e. macro, micro, weighted). Am I doing things right?
Should I mention any references to point to the current definitions of the changes made in the Jaccard index? I remember reading on the issue that the current references don't suffice for the 'correct definition' of the Jaccard index.
I think your doc building issue is just that you've not got the dev version of scikit-learn on your path (e.g. pip install --editable ~/path/to/scikit-learn).
I fixed the issue of doc building, though. I don't seem to be having a problem with it now; I got help on gitter.
@jnothman I am not able to figure out how to handle the multi-label case. Things like, what should happen if the user passes
Micro is easy in the multilabel case. I'm not sure it makes much sense in the multiclass case. In the multilabel case, it's identical to treating it all as one binary problem: you divide total true positives over total (tp + fp + fn).
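A minimal sketch of that micro computation for multilabel indicator input (illustrative arrays of my own, not the PR's implementation): pool true positives, false positives and false negatives over all classes before dividing.

    import numpy as np

    # Hypothetical multilabel indicator matrices (samples x classes).
    y_true = np.array([[1, 0, 1],
                       [0, 1, 0],
                       [1, 1, 0]])
    y_pred = np.array([[1, 0, 0],
                       [0, 1, 1],
                       [1, 0, 0]])

    tp = np.sum(y_true & y_pred)        # total true positives over all cells
    fp = np.sum((1 - y_true) & y_pred)  # predicted 1 but truly 0
    fn = np.sum(y_true & (1 - y_pred))  # truly 1 but predicted 0

    micro_jaccard = tp / (tp + fp + fn)
    print(micro_jaccard)  # 0.5 for these toy arrays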
Can you please explain the meaning of a multi-label problem in scikit-learn? I am looking at an example, for which I don't understand the meaning of. I tried searching but haven't found an understandable explanation for this in the docs. If you could direct me to a docs link, that would be great.
Think of each column as a category or a set that the instance is either in or not. See, for instance, the Reuters rcv1 data.
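For illustration (a toy example of my own, not from the thread), a multilabel indicator target looks like this: each row is one sample, and each column says whether that sample belongs to the corresponding category.

    import numpy as np

    # Columns could be, say, "sports", "politics", "tech" for news documents.
    # A document may belong to any number of categories at once.
    y = np.array([
        [1, 0, 1],   # sample 0 is in categories 0 and 2
        [0, 1, 0],   # sample 1 is only in category 1
        [0, 0, 0],   # sample 2 is in no category
    ])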
Doesn't seem ready, and not a regression from what I can tell. Untagging for 0.20. Complain if you disagree.
This PR is ready, and it fixes a bad implementation (i.e. not a regression, but against expectations), but it is not pretty. Using the multilabel confusion matrix in #11179 is a much cleaner way of coding the same.
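For reference, a rough sketch of how per-class Jaccard could be derived from multilabel_confusion_matrix (available in scikit-learn 0.21 and later); this is my own illustration, not the code from #11179 or from this PR.

    import numpy as np
    from sklearn.metrics import multilabel_confusion_matrix

    # Hypothetical multilabel indicator input (samples x classes).
    y_true = np.array([[1, 0, 1],
                       [0, 1, 0],
                       [1, 1, 0]])
    y_pred = np.array([[1, 0, 0],
                       [0, 1, 1],
                       [1, 0, 0]])

    # One 2x2 confusion matrix per class: [[tn, fp], [fn, tp]].
    mcm = multilabel_confusion_matrix(y_true, y_pred)
    tp = mcm[:, 1, 1]
    fp = mcm[:, 0, 1]
    fn = mcm[:, 1, 0]

    per_class_jaccard = tp / (tp + fp + fn)
    macro_jaccard = per_class_jaccard.mean()
    print(per_class_jaccard, macro_jaccard)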
We can now make use of

(Also, the changelog entry will need to move to
Yay! Tests pass. I might yet make a few tweaks myself, but I'd be keen on others' review here. I consider this a bug fix as well as a substantial enhancement to Jaccard.
Is this related to the cron failure in master? I hadn't seen that one before :-/
I don't see how this can relate to that travis failure. That's jaccard as a distance metric, this is jaccard as an evaluation metric. Seeing as that failure is on Cron, and it's related to Cython code, perhaps it's a change there.
I'm going to pull this into a new PR, just to clear out cobwebs and make it clear that I'm championing it and awaiting review.
Superseded by #13092
Reference Issues/PRs
Fixes #7332

What does this implement/fix? Explain your changes.
Fixes the issue of 'jaccard similarity' being the same as accuracy score for multi-class classification problems.

Any other comments?