[MRG] Adds multiclass ROC AUC #12789

thomasjpfan · 2018-12-14T20:54:02Z

Reference Issues/PRs

Resolves #3298
Continues #7663, #10481, #12311

Adds common test for multiclass label permutations, addressing some of #12309

What does this implement/fix? Explain your changes.

The One-vs-One implementation is based entirely on Hand and Till: https://link.springer.com/article/10.1023/A:1010920819831. The weighting by prevalence does not seem to appear in the literature, but it looks like a simple extension of Hand and Till.
The One-vs-Rest implementation is functionally the same as the multi-label case.
The plot_roc.py example has been refactored to bring the descriptions closer to the code, allowing a user to read through it without losing too much context.
y_score in test_multiclass_sample_weight_invariance was normalized, because roc_auc_score checks for this condition.
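To make the One-vs-One averaging described above concrete, here is a minimal pure-Python sketch. It is not the code from this PR: the helper names `binary_auc` and `ovo_auc` and the toy data are assumptions made for illustration, and the binary AUC is computed by brute force over positive/negative pairs rather than from a ROC curve.

```python
from itertools import combinations


def binary_auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t]
    neg = [s for t, s in zip(y_true, scores) if not t]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def ovo_auc(y_true, y_score, labels, weighted=False):
    """Average the pairwise AUC over all unordered class pairs.

    y_score[i][k] is the score of sample i for class labels[k].
    """
    pair_scores, pair_weights = [], []
    for a, b in combinations(range(len(labels)), 2):
        # Keep only the samples whose true class is one of the two classes.
        rows = [i for i, t in enumerate(y_true) if t in (labels[a], labels[b])]
        a_true = [y_true[i] == labels[a] for i in rows]
        b_true = [y_true[i] == labels[b] for i in rows]
        # Score the pair twice, once with each class as the positive class.
        auc_a = binary_auc(a_true, [y_score[i][a] for i in rows])
        auc_b = binary_auc(b_true, [y_score[i][b] for i in rows])
        pair_scores.append((auc_a + auc_b) / 2)
        # Prevalence weighting: fraction of all samples in this pair.
        pair_weights.append(len(rows) / len(y_true) if weighted else 1.0)
    total = sum(s * w for s, w in zip(pair_scores, pair_weights))
    return total / sum(pair_weights)


# Toy data (assumed): three classes, each row of scores sums to 1.
y_true = ["a", "b", "c", "a", "b", "c"]
y_score = [
    [0.7, 0.2, 0.1],
    [0.3, 0.5, 0.2],
    [0.2, 0.2, 0.6],
    [0.4, 0.5, 0.1],
    [0.1, 0.6, 0.3],
    [0.3, 0.3, 0.4],
]
macro = ovo_auc(y_true, y_score, labels=["a", "b", "c"])
weighted = ovo_auc(y_true, y_score, labels=["a", "b", "c"], weighted=True)
```

With balanced classes, as in this toy data, every pair has the same prevalence, so the weighted and unweighted averages coincide. They differ only when some pairs cover more samples than others.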

Any other comments?

Currently, roc_auc_score does not support sample_weight with multiclass="ovo". This comes from the fact that the binary_metric calls in _average_multiclass_ovo_score become dependent on each other when the masked sample weights are passed to binary_metric.

jnothman

Apart from that API decision, I'm quite happy with where this has ended up, especially the example, and would be happy to approve.

jnothman · 2019-06-20T08:23:08Z

sklearn/metrics/ranking.py

+        should be either equal to ``None`` or ``1.0`` as AUC ROC partial
+        computation currently is not supported for multiclass.
+
+    multiclass : string, 'ovr' or 'ovo', optional(default='ovr')


Suggested change

multiclass : string, 'ovr' or 'ovo', optional(default='ovr')

multiclass : string, 'ovr' or 'ovo', optional (default='ovr')

I do wonder if we should make the user intentionally specify ovo or ovr. Unlike say the average parameter of P/R/F, we don't need the user to tell us that the data is to be treated as multiclass (we can use the number of columns in y_score for that) but I suspect we should be trying to encourage literacy in the idea that there are several ways to extend roc_auc to multiclass.

I am okay with this. We can set the default to None and raise an error before going down the multi-class code path.

We can set the default to None and raise an error before going down the multi-class code path.

Do others have opinions on this? @amueller?

Since we are moving to strings as keywords more, multiclass would most likely be better as 'error' for raising an exception, or 'warn' if we decide on just warning.

Also to be consistent with the rest of sklearn, we should use multi_class.

Let's just do it. We can always change it later to have a default value.
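The "no silent default" idea discussed above can be sketched as follows. This is an illustration only: the helper `resolve_multi_class` is hypothetical, and the `'error'` sentinel is taken from the suggestion earlier in this thread rather than from the merged code.

```python
def resolve_multi_class(n_classes, multi_class="error"):
    """Hypothetical helper: pick the averaging strategy for n_classes.

    Binary input needs no strategy. Multiclass input must name one
    explicitly, otherwise an error is raised before any scoring happens.
    """
    if n_classes <= 2:
        return "binary"
    if multi_class == "error":
        raise ValueError(
            "multiclass input requires multi_class to be 'ovo' or 'ovr'"
        )
    if multi_class not in ("ovo", "ovr"):
        raise ValueError(f"unknown multi_class value: {multi_class!r}")
    return multi_class


strategy = resolve_multi_class(3, multi_class="ovo")
```

Raising before the multiclass code path forces the caller to choose between the two extensions, which is the literacy argument made above. A default can still be introduced later without breaking callers who already pass the argument.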

thomasjpfan · 2019-06-24T18:05:56Z

How did we pick the default again? I usually feel OVR is easier to grasp and it would make the default multi-class behavior be consistent with the multi-label behavior.
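For comparison with One-vs-One, a One-vs-Rest score treats each class in turn as the positive class against all others, exactly as one would score each column of a multi-label indicator matrix. The sketch below is a pure-Python illustration under assumed names (`binary_auc`, `ovr_auc`) and toy data, not the implementation in this PR.

```python
def binary_auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for t, s in zip(y_true, scores) if t]
    neg = [s for t, s in zip(y_true, scores) if not t]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def ovr_auc(y_true, y_score, labels, average="macro"):
    """Score each class against the rest, then average the per-class AUCs."""
    per_class, support = [], []
    for k, label in enumerate(labels):
        # Binarize: this class is positive, every other class is negative.
        is_label = [t == label for t in y_true]
        per_class.append(binary_auc(is_label, [row[k] for row in y_score]))
        support.append(sum(is_label))
    if average == "macro":
        return sum(per_class) / len(per_class)
    if average == "weighted":
        # Weight each class by its number of true samples.
        return sum(a * n for a, n in zip(per_class, support)) / sum(support)
    raise ValueError(f"unsupported average: {average!r}")


# Toy data (assumed): three classes, each row of scores sums to 1.
y_true = ["a", "b", "c", "a", "b", "c"]
y_score = [
    [0.7, 0.2, 0.1],
    [0.3, 0.5, 0.2],
    [0.2, 0.2, 0.6],
    [0.4, 0.5, 0.1],
    [0.1, 0.6, 0.3],
    [0.3, 0.3, 0.4],
]
ovr_macro = ovr_auc(y_true, y_score, labels=["a", "b", "c"])
ovr_weighted = ovr_auc(y_true, y_score, labels=["a", "b", "c"], average="weighted")
```

Unlike One-vs-One, every binary problem here uses all samples, so the negatives for one class mix all the other classes together.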

@amueller The default is ovr in roc_auc_score. This PR defines both ovo and ovr in SCORERS, do you want to remove ovo from SCORERS?

jnothman · 2019-07-02T23:41:36Z

Change of heart? CI is unhappy.

thomasjpfan · 2019-07-02T23:59:22Z

Change of heart? CI is unhappy.

It should be happy now. Had to add roc_auc_score to METRIC_UNDEFINED_MULTICLASS, since it does not support multiclass by default. (Also added ovr_roc_auc and weighted_ovr_roc_auc to the common test)

jnothman · 2019-07-03T01:59:51Z

Makes sense!

thomasjpfan · 2019-07-04T16:32:59Z

The test_explained_variance_components_10_20 failure is unrelated.

amueller · 2019-07-17T17:34:05Z

is there an issue for the test failures?

amueller

looks good apart from nitpicks

amueller · 2019-07-17T18:45:19Z

sklearn/metrics/ranking.py

+        return _average_binary_score(
+            _binary_roc_auc_score, y_true, y_score, average,
+            sample_weight=sample_weight)
+    else:


when is this else active? can you add a comment maybe (unless I'm just being very slow)

multilabel-indicator

amueller · 2019-07-17T18:47:03Z

sklearn/metrics/ranking.py

+            sample_weight=sample_weight)
+
+
+def _multiclass_roc_auc_score(binary_metric, y_true, y_score, labels,


isn't binary_metric always _binary_roc_auc_score?

_binary_roc_auc_score is only defined in the scope of roc_auc_score

amueller · 2019-07-17T18:49:35Z

sklearn/metrics/ranking.py

+
+    labels : array, shape = [n_classes] or None, optional (default=None)
+        List of labels to index ``y_score`` used for multiclass. If ``None``,
+        the lexicon order of ``y_true`` is used to index ``y_score``.


Suggested change

the lexicon order of ``y_true`` is used to index ``y_score``.

the lexical order of ``y_true`` is used to index ``y_score``.
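As a small illustration of what "lexical order" means for indexing `y_score`, the sketch below maps each label to a column. The helper `label_columns` is hypothetical and uses plain Python sorting as a stand-in for the sorted unique labels the library derives from `y_true`.

```python
def label_columns(y_true, labels=None):
    """Hypothetical helper: map each label to its column index in y_score.

    Without explicit labels, the sorted unique values of y_true are used,
    so column 0 belongs to the smallest label.
    """
    if labels is None:
        labels = sorted(set(y_true))
    return {label: column for column, label in enumerate(labels)}


y_true = ["cat", "ant", "bird", "ant"]

# Default: columns follow sorted order, i.e. ant, bird, cat.
default_columns = label_columns(y_true)

# Explicit labels: columns follow the order the caller supplies.
custom_columns = label_columns(y_true, labels=["cat", "ant", "bird"])
```

If the columns of `y_score` were produced in any other order, passing `labels` explicitly is what keeps each score aligned with its class.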

GaelVaroquaux · 2019-07-17T20:52:46Z

Very nice! I had been wanting that for quite a while. (drawback is that I'm going to have to revise my tutorials :D :D)

maskani-moh and others added 30 commits January 16, 2018 11:23

Add Hand & Till (OvO) and Provost & Domingos (OvR) implementations

a666180

Add multi-class implementation in roc_auc_score method

118a700

Add tests for multi-class settings OvO and OvR

3371b1d

Fix binary case roc computation

d74ce16

Make scores add up to 1.0

805d804

Fix typo

2bd693e

Differenciate binary case explicitly to avoid error when multilabel-i…

fc54dde

…ndicator format

Fix prediciton scores

133a09a

Merge remote-tracking branch 'upstream/master' into multiclass-roc-au…

bc40110

…c-new

Merge remote-tracking branch 'upstream/master' into multiclass-roc-au…

0d035e3

…c-new

Fix test error by setting param dtype=None

d08f084

Quick fix

4c7a656

Raise error for partial computation in multiclass

4723b00

Fix pep8

aa6dd49

Merge branch 'master' into multiclass_roc_auc

5af924b

try adding ovo multiclass scores

5c094cd

allow roc_auc and macro_roc_auc for multiclass in test_common

d0393d7

add multiclass roc_auc metrics to scores, more common tests

4a0ded6

ovr is same as multilabel

d599552

remove non-existant import

2cc343a

Merge remote-tracking branch 'upstream/master' into multiclass_roc_auc_2

74bef0d

RFC: Removes unrelated diffs

c91a9bd

ENH: Optimizes ovo

e4d2443

WIP: Adds tests back

0f5a088

WIP: ovr supports sample_weigth

1de4333

RFC: Rename with weighted prefix

e169e0d

RFC: Moves permutation test to common

95a117c

RFC: Uses pytest parameters

01ba344

Merge remote-tracking branch 'upstream/master' into multiclass_roc_auc_3

0517eae

RFC: Minimizes diffs

67f2376

jnothman reviewed Jun 20, 2019

thomasjpfan added 2 commits June 24, 2019 14:06

Merge remote-tracking branch 'upstream/master' into multiclass_roc_auc_3

3b2b436

DOC Spacing

89de04f

thomasjpfan mentioned this pull request Jun 27, 2019

Support for multi-class roc_auc scores #3298

Closed

thomasjpfan added 2 commits July 2, 2019 13:27

Merge remote-tracking branch 'upstream/master' into multiclass_roc_auc_3

7999125

ENH Raises when multi_class is not specified

11e87bb

jnothman approved these changes Jul 2, 2019

thomasjpfan added 2 commits July 2, 2019 19:16

REV Defaults to ovr

df7efe0

STY Minor

0646612

TST roc_auc_score defaults to not support multiclass

bfc73c9

thomasjpfan added 2 commits July 2, 2019 23:45

Merge remote-tracking branch 'upstream/master' into multiclass_roc_auc_3

eaf979b

ENH Adds weighted scorers

65fea8e

Merge remote-tracking branch 'upstream/master' into multiclass_roc_auc_3

2085b4d

amueller approved these changes Jul 17, 2019

thomasjpfan added 2 commits July 17, 2019 15:12

CLN Address comments

c5101e2

CLN Uses partial

1399ddd

amueller merged commit dc9955b into scikit-learn:master Jul 17, 2019

This was referenced Jul 17, 2019

WIP Multiclass roc auc #12311

Closed

[MRG] Multi-class roc_auc_score #10481

Closed

[MRG] Support for multi-class roc_auc scores #7663

Closed

TomDLT pushed a commit to TomDLT/scikit-learn that referenced this pull request Jul 25, 2019

[MRG] Adds multiclass ROC AUC (scikit-learn#12789)

1d9f033

oulenz mentioned this pull request Oct 18, 2019

clarify doc-string of roc_auc_score, add references #15293

Merged

efiegel mentioned this pull request Nov 25, 2020

TST add binary and multiclass test for scorers #18904

Merged
