
[MRG] Multilabel-indicator roc auc and average precision #2460


Merged
merged 35 commits on Dec 2, 2013

Conversation


@arjoly arjoly commented Sep 19, 2013

The goal of this PR is to add multilabel-indicator support, with various types of averaging,
for roc_auc_score and average_precision_score.

Still to do:

A priori, I won't add ranking-based average_precision_score.
I don't want to add support for the multilabel-sequence format.
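
A minimal usage sketch of what this PR enables; the toy arrays below are made up for illustration and are not taken from the PR itself:

```python
import numpy as np
from sklearn.metrics import roc_auc_score, average_precision_score

# Multilabel-indicator target (n_samples x n_labels) and continuous scores.
y_true = np.array([[1, 0, 1],
                   [0, 1, 0],
                   [1, 1, 0]])
y_score = np.array([[0.9, 0.2, 0.8],
                    [0.1, 0.7, 0.3],
                    [0.8, 0.6, 0.4]])

# The averaging options supported for multilabel-indicator input.
for average in ("micro", "macro", "weighted", "samples"):
    print(average,
          roc_auc_score(y_true, y_score, average=average),
          average_precision_score(y_true, y_score, average=average))
```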

@@ -176,17 +176,17 @@ Classification metrics

The :mod:`sklearn.metrics` implements several losses, scores and utility
functions to measure classification performance.
Some metrics might require probability estimates of the positive class,
confidence values or binary decisions value.
Member

binary decisions value => binary decision values?


arjoly commented Sep 20, 2013

Thanks @ogrisel for reviewing !!!

@@ -2045,9 +2151,6 @@ def r2_score(y_true, y_pred):
"""
y_type, y_true, y_pred = _check_reg_targets(y_true, y_pred)

if len(y_true) == 1:
raise ValueError("r2_score can only be computed given more than one"
" sample.")
Member

The message was wrong, but there is still a zero-division error (the undefined-metric problem) if there is only one sample, or if y_true is constant, or if one element of y_true is exactly equal to its mean, isn't there?

Member Author

This case is already treated in the function. If the denominator is zero and the numerator is zero, then the score is 1. If the denominator is zero and the numerator is non-zero, then the score is 0.

This makes r2_score behave as the explained_variance_score.
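
For illustration, a small sketch of the convention just described (hypothetical values; the expected outputs assume r2_score behaves exactly as stated above):

```python
import numpy as np
from sklearn.metrics import r2_score

# Constant y_true: the denominator sum((y_true - mean(y_true))**2) is zero.
y_true = np.array([2.0, 2.0, 2.0])

# Numerator is also zero (perfect prediction) -> score is 1 per the convention above.
print(r2_score(y_true, np.array([2.0, 2.0, 2.0])))

# Numerator is non-zero -> score is 0 per the convention above.
print(r2_score(y_true, np.array([2.0, 2.0, 3.0])))
```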

Member

Good, I could not see it from the diff view in github.


ogrisel commented Sep 20, 2013

> Thanks @ogrisel for reviewing !!!

I just had a quick look. I don't have time to review it deeper now. Could you please put a png rendering of the new plots in the PR description?

_check_averaging(metric, y_true, y_score, y_true_binarize,
y_score, is_multilabel)
else:
ValueError("Metric is not recorded has having an average option")
Member Author

raise ValueError...


arjoly commented Sep 20, 2013

> Thanks @ogrisel for reviewing !!!
>
> I just had a quick look. I don't have time to review it deeper now. Could you please put a png rendering of the new plots in the PR description?

Which new plot? There is no new plot at the moment.


ogrisel commented Sep 20, 2013

I thought the ROC example was updated to demonstrate averaging. I think it should :)


ogrisel commented Sep 20, 2013

Could you please add a couple of tests for the various averaging cases on very tiny (minimalist) inline-defined multi-label datasets that could be checked by computing the expected output manually?


arjoly commented Sep 20, 2013

> I thought the ROC example was updated to demonstrate averaging. I think it should :)

Good point!


arjoly commented Sep 20, 2013

@ogrisel I have updated the examples on ROC curves and precision-recall curves. Here are the generated plots:

[Plots: roc_binary, roc_multiclass, pr_binary, pr_multiclass]
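
Roughly, the micro-averaged curve shown in the updated examples can be obtained by pooling all label/score pairs before computing the curve; a sketch with placeholder data, not the example's actual code:

```python
import numpy as np
from sklearn.metrics import roc_curve, auc

# Placeholder binarized labels and decision scores, shape (n_samples, n_classes).
y_test_bin = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 0, 0]])
y_score = np.array([[0.8, 0.1, 0.1],
                    [0.2, 0.6, 0.2],
                    [0.1, 0.3, 0.6],
                    [0.4, 0.4, 0.2]])

# Micro-average: ravel everything into one large binary problem.
fpr, tpr, _ = roc_curve(y_test_bin.ravel(), y_score.ravel())
print("micro-average AUC:", auc(fpr, tpr))
```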


arjoly commented Sep 20, 2013

> Could you please add a couple of tests for the various averaging cases on very tiny (minimalist) inline-defined multi-label datasets that could be checked by computing the expected output manually?

I have added some tests on toy multilabel-indicator data.
And I have also added some tests on binary inputs, since there were none.
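
As a sketch of what such a toy test can look like (hypothetical values, with per-label AUCs that can be checked by hand; this is not the PR's actual test code):

```python
import numpy as np
from sklearn.metrics import roc_auc_score

# Two labels, four samples; each per-label AUC can be verified by hand.
y_true = np.array([[1, 0], [0, 1], [1, 1], [0, 0]])
y_score = np.array([[0.9, 0.1],
                    [0.2, 0.8],
                    [0.7, 0.6],
                    [0.3, 0.4]])

# Label 0: positives scored {0.9, 0.7} > negatives {0.2, 0.3} -> AUC = 1.0
# Label 1: positives scored {0.8, 0.6} > negatives {0.1, 0.4} -> AUC = 1.0
print(roc_auc_score(y_true, y_score, average="macro"))  # expected 1.0
```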

@@ -57,27 +57,39 @@
recall. See the corner at recall = .59, precision = .8 for an example of this
phenomenon.

Precision-recall curves are typically use in binary classification to study
Contributor

use => used


arjoly commented Sep 20, 2013

Thanks @glouppe !!!


arjoly commented Sep 24, 2013

Just rebased on top of master!
More reviews are welcome :-)

@jaquesgrobler
Member

I had a quick look through; all seems well to me. Nice work.
I also built the branch's docs; everything renders fine and looks good on my side.
I'll try to have a more in-depth look later, but as far as I can see, everything has been covered.
👍


arjoly commented Sep 24, 2013

@jaquesgrobler Thanks for reviewing !!!


ogrisel commented Sep 25, 2013

Could you please run a test coverage report and paste the relevant lines here? (And also add more tests if this report highlights uncovered options / blocks / exceptions...) :)


arjoly commented Sep 25, 2013

Current code coverage

$ nosetests sklearn/metrics --with-coverage

Name                                                  Stmts   Miss  Cover   Missing
-----------------------------------------------------------------------------------
sklearn.metrics                                           9      0   100%   
sklearn.metrics.metrics                                 409      1    99%   459
sklearn.metrics.scorer                                   85      8    91%   42, 173-181, 184, 187

None of the missing lines in sklearn.metrics.scorer concern this pull request.
I will add a test for line 459 in sklearn.metrics.metrics.


arjoly commented Sep 25, 2013

Now I have 100% coverage for the code related to this PR.

$ nosetests sklearn/metrics --with-coverage

Name                                                  Stmts   Miss  Cover   Missing
-----------------------------------------------------------------------------------
sklearn.metrics                                           9      0   100%   
sklearn.metrics.metrics                                 409      0   100%   
sklearn.metrics.scorer                                   85      8    91%   42, 173-181, 184, 187

# first a specific test for the given metric and then add a general test for
# all metrics that have the same behavior.
#
# Two type of datastructures are used in order to implement this system:
Member Author

types


arjoly commented Dec 2, 2013

Rebased on top of master.

I will update the what's new entry once it's merged.

@coveralls

Coverage Status

Coverage remained the same when pulling 8c2c1c5 on arjoly:auc-multilabel into 391c913 on scikit-learn:master.


ogrisel commented Dec 2, 2013

Merging!

ogrisel added a commit that referenced this pull request Dec 2, 2013
[MRG] Multilabel-indicator roc auc and average precision
@ogrisel ogrisel merged commit a82c8ac into scikit-learn:master Dec 2, 2013

arjoly commented Dec 2, 2013

Thanks, I am working on fixing the Jenkins build.


ogrisel commented Dec 2, 2013

I think I fixed the Python 3 issue. No idea about the numpy 1.3.1 issue.

@arjoly arjoly deleted the auc-multilabel branch December 2, 2013 16:27
8 participants