Different (wrong?) meaning of pos_label=None #10010
Well, this implicit choice of "which is positive" is made also by On the basis of "explicit is better than implicit", we should perhaps deprecate its effect on scoring. On the other hand, raising an error when a model is being scored is not a very friendly reaction to a user changing their classification problem from a 1/-1 one to a 'yes'/'no' one. In the case of the PR curve, there is rotational symmetry, so there's not much problem if you get it the wrong way around, as long as you understand which axis is which. I think overall I'm in support of deprecating
@jnothman Thanks a lot for the reply :)
…ision_score (#9980)

#### Reference Issues/PRs

Part of #9829

#### What does this implement/fix? Explain your changes.

(1) Add a pos_label parameter to average_precision_score. (Although we finally decided not to introduce pos_label in roc_auc_score, I think we need pos_label here, because there is no relationship between the results if we reverse the true labels; also, precision/recall both support pos_label.)

(2) Fix a bug where average_precision_score sometimes returns nan when sample_weight contains 0:

```python
y_true = np.array([0, 0, 0, 1, 1, 1])
y_score = np.array([0.1, 0.4, 0.85, 0.35, 0.8, 0.9])
average_precision_score(y_true, y_score, sample_weight=[1, 1, 0, 1, 1, 0])
# output: nan
```

I fix it here because of (3).

(3) Move the average precision scores out of METRIC_UNDEFINED_BINARY (this should contain the regression tests for (1) and (2)).

Some comments:

(1) For the underlying function (precision_recall_curve), the default value of pos_label is None, but I chose to set the default value of pos_label to 1 because this is what P/R/F does. What's more, the meaning of pos_label=None is not clear even within scikit-learn itself (see #10010).

(2) I slightly modified the common test. Currently, the part I modified is only exercised by brier_score_loss (I'm doing the same thing in #9562). I think this is right because, as a common test, it does not seem right to force metrics to accept str y_true without a pos_label.

#### Any other comments?
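To illustrate why pos_label matters for average_precision_score, here is a minimal sketch with string labels, where the positive class cannot be guessed and must be named explicitly (exact behaviour may vary by scikit-learn version):

```python
import numpy as np
from sklearn.metrics import average_precision_score

# Binary problem with string labels: there is no natural "positive" class,
# so pos_label must be provided explicitly.
y_true = np.array(["no", "no", "yes", "yes"])
y_score = np.array([0.1, 0.4, 0.35, 0.8])

ap = average_precision_score(y_true, y_score, pos_label="yes")
# Ranking by score: yes (0.8), no (0.4), yes (0.35), no (0.1)
# AP = 0.5 * 1 + 0.5 * (2/3) = 5/6
```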
cc @jnothman Could you please take some time to review, or at least judge whether this is the right way to go? Thanks a lot :)
I recently feel that maybe the user doing

I used to like pos_label, but now I've changed my mind.
What I think we would benefit from is a tool that constructed a bunch of
scorers for a task, returning a mapping that could be used directly in
scoring. It gets around the user having to specify things like labels and
pos_label while giving other advantages.
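The tool described above does not exist in scikit-learn; a hypothetical sketch of it, built on the real make_scorer helper, might look like this (the name binary_scorers and its signature are assumptions for illustration):

```python
from sklearn.metrics import make_scorer, precision_score, recall_score, f1_score

def binary_scorers(pos_label):
    """Hypothetical sketch: build a mapping of scorers pre-configured with
    task metadata (here just pos_label), so the user never has to pass
    pos_label again at scoring time."""
    return {
        "precision": make_scorer(precision_score, pos_label=pos_label),
        "recall": make_scorer(recall_score, pos_label=pos_label),
        "f1": make_scorer(f1_score, pos_label=pos_label),
    }
```

The returned dict could be passed directly as `scoring=` to `cross_validate` or `GridSearchCV`, which already accept a mapping of name to scorer.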
The issue of pos_label=None is related to that of passing around
classes/labels. It generally relates to the need to pass around metadata,
or the problems of inferring things about the task from the data.
Should this go on the roadmap? I feel like it's a major API issue that's not reflected there too clearly.
Let's deprecate. I think it's impossible to define pos_label=None; see my previous comment:
> I feel like it's a major API issue that's not reflected there too clearly.

Please clarify what you think the API issue is, @amueller.
I think we should only do For instance the labels in In particular
Moving to the 2.0 milestone. Pinging in case you want to get this in faster.
Updated summary of
It would be good to make (2) - the meaning does sort of make sense, as the user needs to set

cc @glemaitre @ogrisel?
Yes. I think that we should make the parameter consistent whenever possible. I also think that we don't have
I propose:
@glemaitre WDYT? I'm happy to implement, or we could wait for another dev to weigh in if you want.
For completeness, here is a summary of private
Hopefully I didn't miss anything.
After discussion with @glemaitre we have a plan forward (referring to numbered items in above comment): For (2):
For (3):
For (4):
With regard to @ogrisel's comment:
For
And for completeness, the PR that adds

Thoughts welcome; happy to amend the plan.
More detailed summary of above: #10010 (comment)

**Binary input only**

The following accept binary input only:

**Binary and multi input**

The following accept multiclass and multilabel input. The default for the functions below is
From #26839 (comment)
Looking at the summary above, @ogrisel, what about defaulting to

cc also @glemaitre
To be clear, do you mean doing the same as what we do with
No, I don't think so; I was not clear. When input is binary only:

When input is multi or binary:
I think I like #10010 (comment). Not sure about what to do for methods that only accept binary and multilabel inputs but not multiclass inputs (are there any?).
AFAICT these functions all extend to multilabel and multiclass by doing one-vs-rest (ref: user guide):
And for

So I think all will accept multilabel and multiclass.
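The one-vs-rest extension mentioned above is what roc_auc_score already does for multiclass input: each class is treated in turn as the positive class, so no single pos_label is needed. A small sketch (exact behaviour may vary by scikit-learn version):

```python
import numpy as np
from sklearn.metrics import roc_auc_score

# Multiclass problem: y_prob rows are per-class probabilities summing to 1.
y_true = np.array([0, 1, 2, 2, 1, 0])
y_prob = np.array([
    [0.8, 0.1, 0.1],
    [0.2, 0.6, 0.2],
    [0.1, 0.2, 0.7],
    [0.2, 0.2, 0.6],
    [0.3, 0.5, 0.2],
    [0.7, 0.2, 0.1],
])

# One-vs-rest: AUC is computed per class against the rest, then macro-averaged.
auc_ovr = roc_auc_score(y_true, y_prob, multi_class="ovr")
```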
Thinking about this more, I am not sure about the proposal in #10010 (comment). In the case of binary input only (1), where currently:

If we were to change the default to

I now think we could change all behaviour to: `None` means `pos_label=1` if `y_true` is in {-1, 1} or {0, 1}, otherwise an error is raised. It is consistent and works for binary and multi inputs.
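The rule proposed here could be sketched as a small helper; resolve_pos_label is a hypothetical name used only for illustration, not an existing scikit-learn function:

```python
import numpy as np

def resolve_pos_label(y_true, pos_label=None):
    """Hypothetical sketch of the proposed rule: pos_label=None maps to 1
    only when the observed labels are drawn from {-1, 1} or {0, 1};
    any other label set requires an explicit pos_label."""
    if pos_label is not None:
        return pos_label
    labels = set(np.unique(y_true).tolist())
    if labels <= {-1, 1} or labels <= {0, 1}:
        return 1
    raise ValueError(
        "pos_label=None is ambiguous for labels %r; pass pos_label explicitly"
        % sorted(labels, key=str)
    )
```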
ping @glemaitre @ogrisel (and anyone else) I think my last comment stands, and behaviour of:
is what I would hope for (due to not being able to know when
I agree with this consistency of
Currently, in scikit-learn, we have multiple definitions of pos_label=None, and the docs are not clear, which might confuse users.
(1) precision/recall/f1/fbeta: do not support pos_label=None
(2) brier_score_loss: default pos_label=None means the greater of the two values (the implementation is currently being fixed by me in #9562)
(3) precision_recall_curve/roc_curve: default pos_label=None means pos_label=1
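The inconsistency between (1) and (3) can be seen side by side in a small sketch (exact behaviour may vary by scikit-learn version):

```python
import numpy as np
from sklearn.metrics import precision_recall_curve, precision_score

y_true = np.array([0, 0, 1, 1])
y_score = np.array([0.1, 0.4, 0.35, 0.8])
y_pred = (y_score > 0.5).astype(int)

# (1) precision_score: the default is pos_label=1; historically it does
# not accept pos_label=None at all.
p = precision_score(y_true, y_pred)

# (3) precision_recall_curve: pos_label=None (the default) silently
# behaves like pos_label=1 for 0/1 labels.
prec_none, rec_none, _ = precision_recall_curve(y_true, y_score, pos_label=None)
prec_one, rec_one, _ = precision_recall_curve(y_true, y_score, pos_label=1)
```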
According to @jnothman
possible solutions:
(1) deprecate pos_label=None and change the default value to 1
(2) correct the behaviour of pos_label=None in precision_recall_curve/roc_curve
(3) ignore the difference and simply update the doc
Personally, I advocate for (1), at least in precision_recall_curve/roc_curve, because:
(1) It will not influence users' code too much, unless you are using something like y_true = ['a', 'b'] without providing a pos_label. In that case, it seems unreasonable to let scikit-learn set pos_label='b' for you.
(2) We don't currently have a correct implementation of pos_label=None in scikit-learn. The meaning of pos_label=None is hard for users to understand and hard to define. If there are two labels, OK, we pick the greater one. But what about the situation where there is only one label? Pick that one? I'm afraid that is not always right, e.g., when there are only negative samples.
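A minimal sketch of the "explicit over implicit" point with brier_score_loss: passing pos_label avoids relying on the "greater of two labels" convention (exact behaviour may vary by scikit-learn version):

```python
import numpy as np
from sklearn.metrics import brier_score_loss

y_true = np.array(["a", "b", "b"])
y_prob = np.array([0.1, 0.8, 0.9])

# Naming the positive class explicitly instead of relying on "the greater
# of the two labels" ('b' here, but only by accident of alphabetical order).
loss = brier_score_loss(y_true, y_prob, pos_label="b")
# Indicator for "b" is [0, 1, 1], so loss = (0.01 + 0.04 + 0.01) / 3 = 0.02
```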
cc @jnothman