-
-
Notifications
You must be signed in to change notification settings - Fork 25.8k
[MRG] Adds multiclass ROC AUC #12789
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[MRG] Adds multiclass ROC AUC #12789
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Apart from that API decision, I'm quite happy with where this, especially the example, has come, and would be happy to Approve.
sklearn/metrics/ranking.py
Outdated
should be either equal to ``None`` or ``1.0`` as AUC ROC partial | ||
computation currently is not supported for multiclass. | ||
|
||
multiclass : string, 'ovr' or 'ovo', optional(default='ovr') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
multiclass : string, 'ovr' or 'ovo', optional(default='ovr') | |
multiclass : string, 'ovr' or 'ovo', optional (default='ovr') |
I do wonder if we should make the user intentionally specify ovo or ovr. Unlike say the average parameter of P/R/F, we don't need the user to tell us that the data is to be treated as multiclass (we can use the number of columns in y_score
for that) but I suspect we should be trying to encourage literacy in the idea that there are several ways to extend roc_auc to multiclass.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am okay with this. We can set the default to None
and raising an error before going down the multi-class code path.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can set the default to None and raising an error before going down the multi-class code path.
Do others have opinions on this? @amueller?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we are moving to strings as keywords more, multiclass
would most likely be better as 'error'
for raising an exception, or 'warn'
if we decide on just warning.
Also to be consistent with the rest of sklearn, we should use multi_class
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's just do it. We can always change it later to have a default value.
@amueller The default is |
Change of heart? CI is unhappy. |
It should be happy now. Had to add |
Makes sense!
|
|
is there an issue for the test failures? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks good apart from nitpicks
sklearn/metrics/ranking.py
Outdated
return _average_binary_score( | ||
_binary_roc_auc_score, y_true, y_score, average, | ||
sample_weight=sample_weight) | ||
else: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
when is this else active? can you add a comment maybe (unless I'm just being very slow)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
multilabel-indicator
sklearn/metrics/ranking.py
Outdated
sample_weight=sample_weight) | ||
|
||
|
||
def _multiclass_roc_auc_score(binary_metric, y_true, y_score, labels, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
isn't binary_metric
alwasy _binary_roc_auc_score
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
_binary_roc_auc_score
is only defined in the scope of roc_auc_score
sklearn/metrics/ranking.py
Outdated
|
||
labels : array, shape = [n_classes] or None, optional (default=None) | ||
List of labels to index ``y_score`` used for multiclass. If ``None``, | ||
the lexicon order of ``y_true`` is used to index ``y_score``. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the lexicon order of ``y_true`` is used to index ``y_score``. | |
the lexical order of ``y_true`` is used to index ``y_score``. |
Very nice! I had been wanting that for quite a while.
(drawback is that I'm going to have to revise my tutorials :D :D)
|
Reference Issues/PRs
Resolves #3298
Continues #7663, #10481, #12311
Adds common test for multiclass label permutations, addressing some of #12309
What does this implement/fix? Explain your changes.
The One-vs-One implementation is 100% based on Hand and Till: https://link.springer.com/article/10.1023/A:1010920819831. The weighting by prevalence does not seem to be in literature, but it looks like a simple extension of Hand and Till.
The One-vs-Rest implementation is functionally the same as the multi-label case.
The
plot_roc.py
example has been refactored to bring the descriptions closer to the code, allowing a user to read through it without losing too much context.y_score
intest_multiclass_sample_weight_invariance
was normalized, becauseroc_auc_score
checks for this condition.Any other comments?
Currently,
roc_auc_score
does not supportsample_weight
withmulticlass="ovo"
. This comes from the fact that thebinary_metric
calls in_average_multiclass_ovo_score
become dependent on each other when the masked sample weights are passed tobinary_metric
.