
DOC Rework ROC example with cross-validation #29611


Open
wants to merge 16 commits into
base: main
from

Conversation

ArturoAmorQ
Member

@ArturoAmorQ ArturoAmorQ commented Aug 2, 2024

Reference Issues/PRs

Somewhat related to #25856.

What does this implement/fix? Explain your changes.

Using quantiles to demonstrate the variance in ROC curves during cross-validation can be more appropriate than standard deviation because it does not assume a Gaussian distribution of the true positive rates (TPR) across different thresholds.

This PR also uses StratifiedShuffleSplit instead of a simple 5-fold cross-validation to better show the variability across splits. For that purpose I had to use a dataset with more points than iris, and I changed the SVM classifier to an HGBT (histogram-based gradient boosting) model for faster predictions.

Any other comments?

This PR also takes the opportunity to improve the wording of the example's abstract.


github-actions bot commented Aug 2, 2024

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: f975ff9. Link to the linter CI: here

Member

@ogrisel ogrisel left a comment

Thanks for the PR.

I find the overlapping quantile regions hard to interpret. I think it would be simpler to plot a single 90% percentile region (the one computed with the 0.45 offset).
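
To make the suggestion concrete: an offset `d` around the median gives the band between quantile levels `0.5 - d` and `0.5 + d`, which covers a fraction `2 * d` of the splits, so `d = 0.45` yields the 90% region. A minimal sketch, where `tpr_at_fpr` is a made-up sample of TPR values at one fixed FPR (not results from this PR) and `band` is a hypothetical helper:

```python
from statistics import quantiles

# Made-up TPR values observed across splits at one fixed FPR (illustrative only).
tpr_at_fpr = [0.71, 0.74, 0.76, 0.78, 0.79, 0.80, 0.81, 0.83, 0.85, 0.90]


def band(values, offset):
    """Return (low, high) quantiles at levels 0.5 - offset and 0.5 + offset."""
    # 100 groups -> 99 cut points; cuts[k - 1] is the k-th percentile.
    cuts = quantiles(values, n=100, method="inclusive")
    low_pct = round((0.5 - offset) * 100)
    high_pct = round((0.5 + offset) * 100)
    return cuts[low_pct - 1], cuts[high_pct - 1]


low, high = band(tpr_at_fpr, offset=0.45)  # levels 0.05 and 0.95 -> 90% region
coverage = 2 * 0.45  # fraction of splits expected inside the band
```

Calling `band` with several offsets at once produces nested bands around the same median, which is the overlapping picture described above.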

Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>
@betatim
Member

betatim commented Aug 6, 2024

I think the point of the example is to illustrate that "your one ROC curve is not the truth, it will change and fluctuate because it is only an estimate of the true, unknowable ROC curve". This is similar to how the sample mean of a set of observations is an estimate of the mean, not its true, unknowable value. I think we can use the median and a set of quantiles to illustrate the spread/variability, but I don't understand how that is better. Can you explain what the problem is with using the standard deviation in this example?

Naively I'd assume that the mean of the true positive rates at a given value of false positive rate is a quantity you can treat like the sample mean of any other set of observations sampled from a random distribution (normal or not). And that the error on that sample mean is std/sqrt(n). In our case n would be n_folds. The standard deviation (std) is the square root of the variance, which you can compute for (almost?) any distribution.

The thing we can't do is interpret the band drawn using the standard deviation as some form of confidence interval.

But then we are drawing the standard deviation (in the original example) and not the standard error on the mean. So I assume it is anyway only there to illustrate the spread (in a cartoon kind of way, not a precise statistical statement about confidence intervals or some such).
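
The distinction drawn above, between the spread of the per-fold values and the uncertainty on their mean, can be sketched as follows. The TPR values are made up for illustration, and `n_folds` is simply their count:

```python
from math import sqrt
from statistics import mean, stdev

# Made-up TPR values at one fixed FPR, one per fold (illustrative only).
tpr_per_fold = [0.78, 0.82, 0.80, 0.85, 0.75]
n_folds = len(tpr_per_fold)

tpr_mean = mean(tpr_per_fold)
tpr_std = stdev(tpr_per_fold)  # sample standard deviation: spread across folds
tpr_sem = tpr_std / sqrt(n_folds)  # standard error on the mean: std / sqrt(n)

# A band of +/- tpr_std shows how much individual folds vary;
# a band of +/- tpr_sem shows how well the mean itself is pinned down.
print(f"mean={tpr_mean:.3f} std={tpr_std:.3f} sem={tpr_sem:.3f}")
```

Since cross-validation folds share training data, the per-fold values are not independent, so `std / sqrt(n)` should be read as a rough approximation rather than an exact standard error.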

@ArturoAmorQ
Member Author

> The thing we can't do is interpret the band drawn using the standard deviation as some form of confidence interval.

The main motivation is: now that we support tuning the decision threshold, confidence intervals are actually important, as they can be directly translated to confidence in a business metric and therefore decision making.

> I find the overlapping quantile regions hard to interpret.

Visualizing different quantiles also implies different levels of risk acceptance in terms of the business metric.

@ArturoAmorQ ArturoAmorQ changed the title DOC Use quantiles instead of std in ROC example with cross-validation DOC Rework ROC example with cross-validation Sep 2, 2024
@ArturoAmorQ
Member Author

Now that #29727 has been merged, I think this PR is good for a second pass of reviews.

@ogrisel
Member

ogrisel commented Jan 20, 2025

For information, I started to review this PR but I need to read a bit on the literature about ROC averaging before finalizing it.

3 participants