DOC Rework ROC example with cross-validation #29611
Thanks for the PR.
I find the overlapping quantile regions hard to interpret. I think it would be simpler to plot a single 90% percentile region (the one computed by the 0.45 offset).
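To make the suggestion concrete, here is a minimal sketch of a single band derived from one offset around the median. The TPR matrix is made up for illustration (random noise around an arbitrary concave curve), and names such as `tprs`, `fpr_grid` and `offset` are assumptions, not taken from the PR diff.

```python
import numpy as np

# Hypothetical data: TPR values from 50 cross-validation splits,
# each already interpolated onto a common grid of 101 FPR values.
rng = np.random.default_rng(0)
fpr_grid = np.linspace(0.0, 1.0, 101)
base_curve = np.sqrt(fpr_grid)  # made-up concave ROC shape
noise = rng.normal(scale=0.05, size=(50, fpr_grid.size))
tprs = np.clip(base_curve + noise, 0.0, 1.0)

# One offset around the median defines one band:
# offset=0.45 selects the 5% and 95% quantiles, i.e. a 90% region.
offset = 0.45
lower, median, upper = np.quantile(
    tprs, [0.5 - offset, 0.5, 0.5 + offset], axis=0
)

# Fraction of (split, FPR) points that fall inside the band.
coverage = np.mean((tprs >= lower) & (tprs <= upper))
```

With only 50 splits the empirical coverage is close to, but not exactly, 90%.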
Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>
I think the point of the example is to illustrate that "your one ROC curve is not the truth, it will change and fluctuate because it is only an estimate of the true, unknowable ROC curve". Similar to how the mean of a set of observations is an estimate of the mean, not the true, unknowable value of the mean. I think we can use the median and a set of quantiles to illustrate the spread/variability. But I don't understand how that is better. Can you explain what the problem is with using the standard deviation in this example? Naively I'd assume that the mean of the true positive rates at a given value of false positive rate is a quantity you can treat like the sample mean of any other set of observations sampled from a random distribution (normal or not). And that the error on that sample mean is The thing we can't do is interpret the band drawn using the standard deviation as some form of confidence interval. But then we are drawing the standard deviation (in the original example) and not the standard error on the mean. So I assume it is anyway only there to illustrate the spread (in a cartoon kind of way, not a precise statistical statement about confidence intervals or some such).
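One possible way to compare the three quantities mentioned here (standard deviation, standard error of the mean, and quantiles) is a toy calculation at a single FPR value. The five TPR values below are invented for illustration, and the observation that a mean ± std band can leave the [0, 1] range is offered as one candidate concern, not as a summary of the reviewers' position.

```python
import numpy as np

# Hypothetical TPR values observed at one fixed FPR across five splits.
tpr_at_fpr = np.array([1.0, 1.0, 1.0, 1.0, 0.5])
n = tpr_at_fpr.size

mean = tpr_at_fpr.mean()
std = tpr_at_fpr.std(ddof=1)   # sample standard deviation: spread of the splits
sem = std / np.sqrt(n)         # standard error of the mean: uncertainty on the mean

# A mean +/- std band is symmetric, so it can extend past the valid TPR range.
std_band = (mean - std, mean + std)

# Quantiles are taken from the observed values, so they stay within [0, 1].
q_low, q_high = np.quantile(tpr_at_fpr, [0.05, 0.95])
```

Here the upper edge of `std_band` is above 1, while `q_high` cannot exceed the largest observed TPR.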
The main motivation is: now that we support tuning the decision threshold, confidence intervals are actually important, as they can be directly translated to confidence in a business metric and therefore decision making.
Visualizing different quantiles also implies different risk acceptance in terms of the business metric.
Now that #29727 has been merged, I think this PR is good for a second pass of reviews.
For information, I started to review this PR but I need to read a bit on the literature about ROC averaging before finalizing it.
Reference Issues/PRs
Somewhat related to #25856.
What does this implement/fix? Explain your changes.
Using quantiles to demonstrate the variance in ROC curves during cross-validation can be more appropriate than standard deviation because it does not assume a Gaussian distribution of the true positive rates (TPR) across different thresholds.
This PR also prefers using `StratifiedShuffleSplit` instead of a simple 5-fold cross-validation to better show the variability across splits. For that purpose I had to use a dataset with more points than iris and changed the SVM classifier to an HGBT for faster predictions.
Any other comments?
This PR also takes the opportunity to improve the wording of the example's abstract.