FEA add CumulativeAccuracyDisplay #28752
This is very similar to a Lorenz curve that we manually define in this example.

scikit-learn/examples/linear_model/plot_poisson_regression_non_normal_loss.py
Lines 495 to 507 in 6bf0ba5

The differences are (as far as I can see):

- `y_pred` is the estimated probability of positive class membership given by a classifier for CAP/Lift/Gain curves, while `y_pred` holds the continuous prediction of a positive target variable for a regressor.
- `sorted_indices = np.argsort(y_pred)[::-1]` is reversed in CAP, while it stays in ascending order in a Lorenz curve.

So I think we could indeed reuse the same class both for CAP/Lift/Gain curves and Lorenz curves. To switch between the two, we could probably introduce a keyword parameter, for instance `x_ordering_style="cap"` vs `x_ordering_style="lorenz"`.
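To make the ordering difference concrete, here is a minimal sketch (not the PR's implementation; `cumulative_curve` and its `descending` flag are hypothetical names) contrasting the CAP convention with the Lorenz convention:

```python
import numpy as np

def cumulative_curve(y_true, y_pred, descending=True):
    """Cumulative share of y_true, with samples ordered by y_pred.

    descending=True matches the CAP/Lift/Gain convention;
    descending=False matches the Lorenz-curve convention.
    """
    order = np.argsort(y_pred)
    if descending:
        order = order[::-1]
    y_sorted = np.asarray(y_true)[order]
    # Cumulative fraction of the total positive mass captured so far.
    cum = np.cumsum(y_sorted) / y_sorted.sum()
    # Fraction of the population examined so far.
    x = np.arange(1, len(y_sorted) + 1) / len(y_sorted)
    return x, cum

y_true = np.array([1, 0, 1, 1, 0, 0])
y_pred = np.array([0.9, 0.1, 0.8, 0.4, 0.3, 0.2])
x, cap = cumulative_curve(y_true, y_pred, descending=True)
_, lorenz = cumulative_curve(y_true, y_pred, descending=False)
```

With descending ordering the curve is concave (positives are captured first), while ascending ordering gives the convex Lorenz shape; both end at 1.0.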
As discussed in the main thread of the PR, let's keep that for a follow-up PR.
Since we agree to leave the Lorenz curve / regression case for a follow-up PR, I would defer this change for later. Speaking about regression on a class that is designed and documented to only accept classifier predictions would be a source of confusion.
Let's also extend the tests to check for the presence, shape, and dtype of other public attributes such as `y_true_cumulative` (new name suggested in https://github.com/scikit-learn/scikit-learn/pull/28752/files#r1553497449). It might also make sense to check that the values of those attributes match (with `assert_allclose`) when calling `CumulativeAccuracyDisplay.from_estimator` or `CumulativeAccuracyDisplay.from_predictions`.
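A minimal sketch of the kind of consistency check suggested above. Since `CumulativeAccuracyDisplay` only exists in this PR, the `y_true_cumulative` helper and `FakeClassifier` below are hypothetical stand-ins for the display's attribute computation and a fitted estimator:

```python
import numpy as np
from numpy.testing import assert_allclose

def y_true_cumulative(y_true, y_pred):
    # Cumulative count of positives, samples ordered by decreasing score
    # (mirrors the attribute discussed in the review).
    order = np.argsort(y_pred)[::-1]
    return np.cumsum(np.asarray(y_true)[order])

class FakeClassifier:
    """Stand-in for a fitted classifier with fixed scores."""
    def __init__(self, scores):
        self.scores = scores

    def predict_proba(self, X):
        return np.c_[1 - self.scores, self.scores]

y_true = np.array([1, 0, 1, 1, 0])
scores = np.array([0.9, 0.2, 0.8, 0.4, 0.1])
X = np.zeros((5, 1))  # dummy features; FakeClassifier ignores them

# The "from_predictions" path uses the scores directly; the
# "from_estimator" path derives them from predict_proba. Both
# should yield identical curve attributes.
from_pred = y_true_cumulative(y_true, scores)
from_est = y_true_cumulative(y_true, FakeClassifier(scores).predict_proba(X)[:, 1])
assert_allclose(from_pred, from_est)
```

The real test would replace the helper and the fake estimator with the display's two constructors and compare their public attributes the same way.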
I've made the change for `y_true_cumulative`, but I'm not sure about shape or dtypes. Are displays supposed to have these attributes (or just this one)?
I've just realised that I've misread it. You meant checking the attributes of `y_true_cumulative`, not the display itself.