[MRG] Adding pos_label parameter to roc_auc_score (#6873) #6874

Closed · tswr wants to merge 2 commits

Conversation

@tswr commented Jun 9, 2016

Reference Issue

Fixes #6873

What does this implement/fix? Explain your changes.

This PR implements the pos_label parameter in the roc_auc_score function. All it actually does is forward this parameter to the roc_curve call.
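
For readers following along, a minimal sketch of the forwarding (the helper name _binary_roc_auc_score is from the existing module; the body here is illustrative, not the exact PR diff):

```python
from sklearn.metrics import auc, roc_curve

def _binary_roc_auc_score(y_true, y_score, pos_label=None, sample_weight=None):
    # Forward pos_label (and sample_weight) straight to roc_curve, then
    # integrate the resulting curve with the trapezoidal rule.
    fpr, tpr, _ = roc_curve(y_true, y_score, pos_label=pos_label,
                            sample_weight=sample_weight)
    return auc(fpr, tpr)
```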

Any other comments?

Added tests for pos_label param in roc_curve and roc_auc_score.

Results of make: all tests ok.
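
A sketch of the kind of test this adds (toy data; it assumes pos_label is simply forwarded to roc_curve as described above):

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def test_roc_auc_score_pos_label():
    y_true = np.array([1, 1, 2, 2])
    y_score = np.array([0.1, 0.2, 0.8, 0.9])  # higher score -> class 2

    # With class 2 as the positive class these scores rank perfectly...
    assert roc_auc_score(y_true, y_score, pos_label=2) == 1.0
    # ...and flipping the positive class to 1 gives the complement.
    assert roc_auc_score(y_true, y_score, pos_label=1) == 0.0
```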

@@ -250,7 +254,8 @@ def _binary_roc_auc_score(y_true, y_score, sample_weight=None):
        raise ValueError("Only one class present in y_true. ROC AUC score "
                         "is not defined in that case.")

    fpr, tpr, thresholds = roc_curve(y_true, y_score,
                                     pos_label=pos_label,
                                     sample_weight=sample_weight)
    # use closure for pos_label parameter

Member:
It would be better to make this explicit and not use a closure. Just add the pos_label arg to _binary_roc_auc_score and pass it explicitly...

@tswr (Author):

There is a reason why I used a closure: _binary_roc_auc_score will be called from _average_binary_score as binary_metric(y_true, y_score, sample_weight=sample_weight), and it doesn't look like _average_binary_score should know anything about pos_label.

Member:

Ah, I see. Then you could do partial(_binary_roc_auc_score, pos_label=pos_label) and pass that as the binary_metric.
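
A sketch of that wiring (the _average_binary_score signature is assumed from the discussion above; this is illustrative, not the final patch):

```python
from functools import partial

def roc_auc_score(y_true, y_score, average="macro", sample_weight=None,
                  pos_label=None):
    # Bind pos_label up front so that _average_binary_score can keep calling
    # binary_metric(y_true, y_score, sample_weight=sample_weight)
    # without ever knowing about pos_label.
    binary_metric = partial(_binary_roc_auc_score, pos_label=pos_label)
    return _average_binary_score(binary_metric, y_true, y_score,
                                 average, sample_weight=sample_weight)
```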

@raghavrv (Member)

LGTM pending comments.

@agramfort (Member)

There is no test.

Set the PR to MRG when ready.

@tswr (Author) commented Jun 12, 2016

@agramfort, where is the proper place for the tests? Is it test_grid_search.py? It seems to be the only place in sklearn/tests where roc_auc_score is mentioned.

@agramfort (Member)

test_ranking.py

@tswr (Author) commented Jun 12, 2016

Ok, will do.

@tswr tswr changed the title Adding pos_label parameter to roc_auc_score (#6873) [MRG] Adding pos_label parameter to roc_auc_score (#6873) Jun 12, 2016
@tswr (Author) commented Jun 14, 2016

Is there anything else I should do for this issue?

@raghavrv (Member)

(AppVeyor failure unrelated)

@jnothman (Member) commented Jun 18, 2016

Please add, or at least try adding, this metric to METRICS_WITH_POS_LABEL in test_common.py.

@jnothman (Member)

I'd like to clarify the semantics. Say we're using this measure on a binary problem with labels [1, 2], where 1 is the positive class. AFAIK, the predict_proba and decision_function of our classifiers will output a 1-d array representing a greater-is-better score for the higher-valued class, i.e. 2. So roc_auc_score(y_true, y_score, pos_label=1) is correctly implemented w.r.t. scikit-learn classifiers only if it treats 1 as the positive class in y_true while interpreting a lower y_score value as meaning an instance is more likely positive... Is that what's going on here? (I think not.)
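
A concrete illustration of the concern (toy data; LogisticRegression stands in for any scikit-learn classifier):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([1, 1, 2, 2])   # labels [1, 2]; say 1 is "positive"

clf = LogisticRegression().fit(X, y)
# The 1-d score conventions (decision_function, predict_proba[:, 1]) are
# greater-is-better for the higher-valued label, clf.classes_[1] == 2.
y_score = clf.decision_function(X)

# To score class 1 as the positive class with these outputs, the scores
# must be negated; passing pos_label=1 alone would invert the ranking.
print(roc_auc_score(y == 1, -y_score))  # 1.0 on this separable toy data
```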

@amueller amueller added this to the 0.19 milestone Oct 25, 2016
@amueller (Member)

@tswr, are you still interested in working on this? Maybe we should also add a labels parameter?

Hm, roc_curve already has a pos_label parameter, but it assumes you give it the "right" scores so that higher means more positive; here we should take care of that ourselves (see the sketch below).

I could swear the logic for treating the binary special case is already somewhere in metrics, but I couldn't find it (or is it still in a PR?). We should also do the same for average_precision.
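
For reference, the contract roc_curve already enforces (toy values chosen for illustration):

```python
from sklearn.metrics import auc, roc_curve

y_true = [1, 1, 2, 2]

# roc_curve assumes y_score is greater-is-better for whichever class
# pos_label names:
fpr, tpr, _ = roc_curve(y_true, [0.9, 0.8, 0.2, 0.1], pos_label=1)
assert auc(fpr, tpr) == 1.0

# Hand it scores oriented toward class 2 instead and the curve inverts:
fpr, tpr, _ = roc_curve(y_true, [0.1, 0.2, 0.8, 0.9], pos_label=1)
assert auc(fpr, tpr) == 0.0
```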

@amueller (Member) commented Dec 7, 2016

Tagging as "needs contributor", as there was no reply from @tswr.

@aradhyamathur
Hello, I am quite new to scikit-learn and would like to contribute.

@amueller (Member)

@aradhyamathur I think this is not a good issue for a beginner; it might be tricky to get right.

@aradhyamathur
@amueller OK, I understand. Thanks for your insight :)

@qinhanmin2014 (Member)

This work is continued in #9567. Could you please have a look? Thanks.

@jnothman (Member)

Closing given #2723 (comment)

Merging this pull request would close: roc_auc_score doesn't have a pos_class attribute (#6873)