Skip to content

ENH add from_cv_results in RocCurveDisplay (single RocCurveDisplay) #30399

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 72 commits into
base: main
Choose a base branch
from

Conversation

lucyleeow
Copy link
Member

@lucyleeow lucyleeow commented Dec 3, 2024

Reference Issues/PRs

Reference Issues/PRs

Supercedes #25939

This is part of a group of draft PRs to determine best API for adding plots for cv results to our displays.

For all 3 options we take the output of cross_validate, and use the fitted estimator and test indicies. No fitting is done in the display.

We do recalculate the predictions (which would have already been done in cross_validate), which could be avoided if we decided to change cross_validate to optionally return the predictions as well (note this would make cross_val_predict redundant).
See more thread: #25939 (comment)). I think should be outside of the scope of this body of work though.

What does this implement/fix? Explain your changes.

Not 100% I've implemented this optimally.

  • RocCurveDisplay object may contain data (fpr/tpf etc) for single or multi curves
  • RocCurveDisplay returns single mpl Artist object, or list of objects for multi curves
  • RocCurveDisplay.plot handles both single and multi-curve plotting, this has meant a lot more checking is required (c.f. the other 2 implementations, as this is the only case where you can use plot directly to plot multi-curves)

More specific concerns detailed in review comments

Plot looks like:

image

TODO

We should update visualization.rst after this PR is in to add a section about from_cv_results.

Copy link

github-actions bot commented Dec 3, 2024

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: 2de0f78. Link to the linter CI: here

Comment on lines 30 to 33
if n_multi is None:
name = self.estimator_name if name is None else name
else:
name = [f"{curve_type} fold {curve_idx}:" for curve_idx in range(n_multi)]
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we go with this implementation, I thought this change could be used for other multi cv displays. Not 100% sure on this change though.

@jeremiedbb
Copy link
Member

As discussed in today's meeting, this is my favorite solution because it's the simplest and least surprising one from a user point of view, even though it adds a bit more internal complexity than the others. And I think we can mitigate some of it by extracting parts of the plot code into dedicated _plot_single and _plot_multiple methods. Or just into small helpers, that would already help readability.

It also looks like a good portion of the added complexity will be exactly the same for other displays like PRCurveDisplay, so there might be a chance that we'll be able to factorize some parts to be used by several displays.

@lucyleeow
Copy link
Member Author

The changes in f0908e1 and 7e77d4c factorizes out common code (compared to #30508), adding helper function to either _BinaryClassifierCurveDisplayMixin (if function relevant to other binary displays) or sklearn/utils/_plotting.py (if function more generally applicable to more diplays - these potentially could be a parent class method?)

@glemaitre
Copy link
Member

Let me do another round of review.

Copy link
Member

@glemaitre glemaitre left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only a couple of nitpicks. It looks great on my side and almost ready to be merged.

@lucyleeow
Copy link
Member Author

I think I have addressed everything, thanks @glemaitre !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants