
FIX select the probability estimates or transform the decision values when pos_label is provided #18114


Merged
merged 32 commits into scikit-learn:master on Oct 9, 2020

Conversation

Member

@glemaitre glemaitre commented Aug 7, 2020

Solve a bug where the appropriate column of the probability estimates was not selected, or the decision values were not inverted, when pos_label is passed and does not correspond to clf.classes_[1].
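A minimal sketch of the intended behavior (illustrative only; the helper name _select_scores is hypothetical, not the actual scorer code):

import numpy as np

def _select_scores(clf, X, pos_label, response_method):
    """Return the scores of the class `pos_label` for a fitted binary clf."""
    if response_method == "predict_proba":
        y_pred = clf.predict_proba(X)
        # pick the column of predict_proba that matches pos_label
        col_idx = np.flatnonzero(clf.classes_ == pos_label)[0]
        return y_pred[:, col_idx]
    # decision_function scores are relative to clf.classes_[1],
    # so they must be negated when pos_label is clf.classes_[0]
    y_pred = clf.decision_function(X)
    if pos_label == clf.classes_[0]:
        y_pred *= -1
    return y_pred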

@glemaitre
Member Author

@jnothman @thomasjpfan OK so I think these are the only fixes required.
The tests fail on master. It means that someone creating a scorer based on average precision with pos_label and string labels, with an ordering where the positive label ends up as clf.classes_[0], would have gotten something wrong.
In a GridSearchCV, you would potentially have optimized the opposite of what you expected.

Most probably, nobody encountered this issue :)
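For reference, a hedged sketch of the failure mode described above (dataset and label names made up for illustration): with string labels ordered so that the positive label is clf.classes_[0], a threshold-based scorer built with pos_label would previously have used the scores of the wrong class.

import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import average_precision_score, make_scorer

X, y = make_classification(random_state=0)
# "cancer" sorts before "not cancer", so clf.classes_[0] == "cancer"
y = np.array(["cancer", "not cancer"])[y]
clf = LogisticRegression().fit(X, y)

scorer = make_scorer(average_precision_score, needs_threshold=True,
                     pos_label="cancer")
# with the fix, the scorer uses the decision values of the "cancer" class
scorer(clf, X, y)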

@glemaitre
Member Author

Once this PR is merged, I plan to make another one with either an entry in the user guide or an example showing how to pass pos_label and what the drawbacks can be if you don't.

@glemaitre
Member Author

ping @ogrisel as well.

Member

@thomasjpfan thomasjpfan left a comment

This PR would handle the case where there is a mismatch between pos_label and estimator.classes_[1].

I keep on wondering if this behavior of selecting the column would be surprising to a user.

Member

@ogrisel ogrisel left a comment

We might also have a similar problem with non-symmetric scorers that take hard class predictions such as f1 score for instance, no?

@glemaitre
Member Author

We might also have a similar problem with non-symmetric scorers that take hard class predictions such as f1 score for instance, no?

Since we don't need to select a column or invert the decision values, we should be safe. But I will add a test.
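For instance (a quick sanity check with made-up labels): pos_label goes straight into the metric computed on hard predictions, so there is no column or sign handling involved.

from sklearn.metrics import f1_score

y_true = ["cancer", "not cancer", "cancer", "cancer"]
y_pred = ["cancer", "cancer", "not cancer", "cancer"]
# pos_label is consumed by the metric itself; nothing to select or invert
f1_score(y_true, y_pred, pos_label="cancer")
# 0.666...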

Member

@ogrisel ogrisel left a comment

Some more review comments:

    # provided.
    raise ValueError(err_msg)
elif not support_multi_class:
    raise ValueError(err_msg)
Member

I do not understand this: both branches raise the same message.

Also, what happens when support_multi_class is True and y_pred.shape[1] != 1?

Member Author

This is what was intended in the past: https://github.com/scikit-learn/scikit-learn/pull/18114/files#diff-e907207584273858caf088b432e38d04L243-L247

Also, what happens when support_multi_class is True and y_pred.shape[1] != 1?

It is a case where y_true is encoded as binary but y_pred is multiclass. It is apparently a case supported by ROC-AUC, and I needed to keep it like this for backward compatibility.

I am not sure that there is a better way of inferring the exact encoding in this case.

Member Author

I added an example to be more specific. I think that we should investigate whether we can pass labels to type_of_target, as an optional argument provided when we use it in the metrics.
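To illustrate the ambiguity (note that the labels argument discussed above is hypothetical; type_of_target does not currently accept it):

from sklearn.utils.multiclass import type_of_target

# looks binary...
type_of_target(["cancer", "not cancer"])  # 'binary'
# ...but the problem may really be multiclass if a third label such as
# "benign" simply does not occur in this particular y_true. Passing the
# full set of labels would remove the guesswork.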

Member

@thomasjpfan thomasjpfan left a comment

Another pass

Member

@thomasjpfan thomasjpfan left a comment

LGTM

Member

@ogrisel ogrisel left a comment

I am still very much confused by the multiclass case. Could you please add tests for a multiclass classification problem with proba and threshold scorers with string labels? Maybe that would clear up the confusion and help me complete this review.

# [0. , 0.2 , 0.8 ]])
# roc_auc_score(
#     y_true, y_score, labels=[0, 1, 2], multi_class='ovo'
# )
Member

I still do not understand this condition. This comment refers to the metric function outside of the scorer API.

I assume that this is related to roc_auc_ovo_scorer, which is a _ProbaScorer instance, and that this condition is about raising a ValueError when y_pred.shape[1] == 1 for some reason. But I really do not see how this relates to the example you give here: y_pred has 3 columns in this example, so it does not match the case of the condition.

Member

I think this is trying to keep the logic from:

elif y_pred.shape[1] == 1: # not multiclass

which I added in #15274. This was added because we can have a binary y_true with an estimator trained on > 2 classes.

y_pred.shape[1] == 1 was used to mean that y_pred came from a classifier with only one class. The check for the shape of y_pred was added here: 94db3d9
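A sketch of that situation (synthetic data for illustration): an estimator trained on three classes but evaluated on a subset of the data containing only two of them, so y_pred has three columns while y_true is binary.

from sklearn.datasets import make_blobs
from sklearn.tree import DecisionTreeClassifier

X, y = make_blobs(random_state=0, centers=3)
clf = DecisionTreeClassifier().fit(X, y)

mask = y != 2                        # keep only classes 0 and 1
clf.predict_proba(X[mask]).shape     # (n_samples, 3), yet y[mask] is binary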

Member Author

But in scikit-learn we cannot have a classifier trained with a single class, can we?

Member

It is strange, but it can happen:

from sklearn.tree import DecisionTreeClassifier
from sklearn.datasets import make_blobs
import numpy as np

X, y = make_blobs(random_state=0, centers=2)
clf = DecisionTreeClassifier()
# fit on a constant target: only one class is ever seen
clf.fit(X, np.zeros_like(y))

clf.classes_
# array([0])

Member Author

Uhm. Then, I am confused by check_classifiers_one_label :) I have to check. Anyway, I think that the last change makes everything more explicit.

Member Author

Oh I see, a must_pass in the raises. It would be much more explicit to have a tag for that I think: accept_single_label for instance.

@rth rth self-requested a review September 23, 2020 09:45
Member

@thomasjpfan thomasjpfan left a comment

Otherwise LGTM

Member

@ogrisel ogrisel left a comment

The code is much better organized than previous versions of this PR and the tests are great. Thanks very much for the fix @glemaitre.

@ogrisel ogrisel merged commit 193670c into scikit-learn:master Oct 9, 2020
amrcode pushed a commit to amrcode/scikit-learn that referenced this pull request Oct 19, 2020
jayzed82 pushed a commit to jayzed82/scikit-learn that referenced this pull request Oct 22, 2020