[MRG] Multilabel-indicator roc auc and average precision #2460
@@ -176,17 +176,17 @@ Classification metrics

The :mod:`sklearn.metrics` implements several losses, scores and utility
functions to measure classification performance.
Some metrics might require probability estimates of the positive class,
confidence values or binary decisions value.
binary decisions value => binary decision values?
Thanks @ogrisel for reviewing!!!
@@ -2045,9 +2151,6 @@ def r2_score(y_true, y_pred):
    """
    y_type, y_true, y_pred = _check_reg_targets(y_true, y_pred)

    if len(y_true) == 1:
        raise ValueError("r2_score can only be computed given more than one"
                         " sample.")
The message was wrong, but there is still a zero division error (undefined metric problem) if there is only one sample, or if `y_true` is constant, or if one element of `y_true` is exactly equal to its mean, isn't there?
This case is already treated in the function. If the denominator is zero and the numerator is zero, then the score is 1. If the denominator is zero and the numerator is non-zero, then the score is 0. This makes `r2_score` behave like `explained_variance_score`.
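For illustration, here is a minimal, self-contained sketch of that degenerate-case rule (assumed names and a simplified single-output formula, not the actual scikit-learn implementation):

```python
import numpy as np

def r2_like_score(y_true, y_pred):
    """Sketch of the zero-denominator handling described above."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    numerator = ((y_true - y_pred) ** 2).sum()
    denominator = ((y_true - y_true.mean()) ** 2).sum()
    if denominator == 0.0:
        # Constant y_true (e.g. a single sample): return 1 when the
        # predictions are perfect, 0 otherwise, instead of dividing by zero.
        return 1.0 if numerator == 0.0 else 0.0
    return 1.0 - numerator / denominator
```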
Good, I could not see it from the diff view on GitHub.
I just had a quick look; I don't have time to review it in more depth now. Could you please put a PNG rendering of the new plots in the PR description?
        _check_averaging(metric, y_true, y_score, y_true_binarize,
                         y_score, is_multilabel)
    else:
        ValueError("Metric is not recorded has having an average option")
`raise ValueError(...)`
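In other words, the error presumably needs to be raised explicitly rather than just constructed. A minimal sketch of the intended behavior (hypothetical helper, not the actual test code from this PR):

```python
def check_has_average_option(metric_name, metrics_with_average):
    """Hypothetical helper: fail loudly for metrics without an average option."""
    if metric_name in metrics_with_average:
        return True
    # Without `raise`, the ValueError would be created and silently
    # discarded, so the unexpected branch would pass unnoticed.
    raise ValueError("Metric is not recorded as having an average option")
```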
Which new plot? There is no new plot at the moment.
I thought the ROC example was updated to demonstrate averaging. I think it should :)
Could you please add a couple of tests for the various averaging cases on very tiny (minimalist) inline-defined multi-label datasets that could be checked by computing the expected output manually?
Good point!
@ogrisel I have updated the examples on ROC curves and precision-recall curves. Here are the generated plots:
I have added some tests on toy data for multilabel-indicator data.
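For context, such a toy multilabel-indicator check could look roughly like the sketch below (illustrative values only, not the actual test cases added in this PR):

```python
import numpy as np
from sklearn.metrics import average_precision_score, roc_auc_score

# Tiny multilabel-indicator problem, small enough to verify by hand.
y_true = np.array([[1, 0],
                   [0, 1],
                   [1, 1],
                   [0, 0]])
y_score = np.array([[0.9, 0.2],
                    [0.1, 0.8],
                    [0.8, 0.7],
                    [0.3, 0.4]])

# "macro" averages the per-label scores, while "micro" pools every
# (sample, label) pair before computing a single score.
print(roc_auc_score(y_true, y_score, average="macro"))
print(roc_auc_score(y_true, y_score, average="micro"))
print(average_precision_score(y_true, y_score, average="macro"))
```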
@@ -57,27 +57,39 @@

recall. See the corner at recall = .59, precision = .8 for an example of this
phenomenon.

Precision-recall curves are typically use in binary classification to study
use => used
Thanks @glouppe!!!
Just rebased on top of master!
I had a quick look through; all seems well to me. Nice work.
@jaquesgrobler Thanks for reviewing!!!
Could you please run a test coverage report and paste the relevant lines here? (And also add more tests if this report highlights uncovered options / blocks / exceptions...) :)
Current code coverage:
All missing lines are in …
Now I have 100% coverage for the code related to this PR.
# first a specific test for the given metric and then add a general test for
# all metrics that have the same behavior.
#
# Two type of datastructures are used in order to implement this system:
types
Rebased on top of master. I will update the what's new entry when it's merged.
Merging!
Thanks, I am working on fixing the Jenkins build.
I think I fixed the Python 3 issue. No idea about the NumPy 1.3.1 issue.
The goal of this PR is to add multilabel-indicator support with various types of averaging for `roc_auc_score` and `average_precision_score`.

Still to do:

- `roc_auc_score`
- `average_precision_score`

A priori, I won't add a ranking-based `average_precision_score`. I don't want to add support for the `multilabel-sequence` format.