-
-
Notifications
You must be signed in to change notification settings - Fork 25.9k
Various improvements and more details in release highlights for 1.5 #29056
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Various improvements and more details in release highlights for 1.5 #29056
Conversation
# Note however, that the balanced accuracy is not necessarily the most | ||
# meaningful model selection metric for a given application. It often makes | ||
# sense to optimize the decision threshold directly for a business metric of | ||
# interest. **Custom business metrics can be defined by assigning different costs | ||
# to false positives and false negatives or different gains to true positives | ||
# and true negatives.** Furthermore, those costs and gains can depend on ancilary | ||
# metadata specific to each individual data point such as the amount of a | ||
# transaction in a fraud detection system. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure this belongs to the highlights. This is already explained in the examples. And I mean all examples we show in the highlights are always not meaningful toy examples just for the purpose on simple demonstration.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think optimizing for custom is the single most important use of this new tool. However the text and the code snippet do not make this point strong enough. I think we need to explain a minimum what a custom metric is and why it can often rely on side metadata to give motivation to the reader to follow the link and read until the section that explains cost-sensitive learning with side metadata.
I agree that this will be too complex to put in the code snippet of the release highlight but I thought this paragraph could help as a middle ground.
are always not meaningful toy examples just for the purpose on simple demonstration.
Doing things sub-optimally in the past should not be a reason to no improve for the future ;)
If the demonstration is not realistic enough, it's hard to assess the value of the new feature and justify spending time and effort to learn more in the doc / examples.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we need to explain a minimum what a custom metric is and why it can often rely on side metadata to give motivation to the reader to follow the link and read until the section that explains cost-sensitive learning with side metadata.
Alright
Doing things sub-optimally in the past should not be a reason to no improve for the future
I'm not saying it's suboptimal, to me it's good to make extra simple snippets :)
If the demonstration is not realistic enough, it's hard to assess the value of the new feature and justify spending time and effort to learn more in the doc / examples.
I'm not sure about that. I have the opposite impression. I'd be interested in seing statistics. To me, it drives more attention if it just shows that you can now do something that you couldn't do before in a very simple way. Then if someone wants to see how far he can go, he will dig the docs deeper.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we please use a meaningful custom metric then? And avoid all the balanced accuracy stuff?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or simply use accuracy. Here for the threshold, it‘s fine as long as it’s not too extreme.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm okay with that
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
roc_auc_score
should be invariant to a change in threshold.
Yes but I do not see how it's related to this discussion. Using a threshold-invariant metric is pretty useless to tune the decision threshold. Arguably we should even raise a warning in that in that case.
I would not mind to use a very short custom metric like.
Alright. I will update this PR accordingly then.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes but I do not see how it's related to this discussion. Using a threshold-invariant metric is pretty useless to tune the decision threshold. Arguably we should even raise a warning in that in that case.
you can ignore, I made a suggestion but quickly realised it was non-sense so I edited my comment 😄
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I pushed 09bd60f. Hopefully it will help getting our message through while staying concise enough.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I changed the custom function to actually introduce a tradeoff between the two classes otherwise one would get a dummy classifier without realizing it.
feac314
to
db03bb8
Compare
Co-authored-by: Jérémie du Boisberranger <jeremie@probabl.ai>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, regardless of the resolution of #29056 (comment) and #29056 (comment)
Thanks for the reviews. I enabled auto-merge. |
@jeremiedbb I let you handle the backport to 1.5.X? Otherwise I can do the backport. |
I'll have a look at the confusion matrix issue. |
I should be good with @jeremiedbb's fix. |
….5 (scikit-learn#29056) Co-authored-by: Jérémie du Boisberranger <jeremie@probabl.ai>
@ogrisel Thanks Olivier for making it better. |
….5 (#29056) Co-authored-by: Jérémie du Boisberranger <jeremie@probabl.ai>
/cc @jeremiedbb.