FEA add CumulativeAccuracyDisplay #28752
Conversation
Thanks for the PR.
On a first pass, could you fix the linter CI? I would advise using pre-commit directly; it corresponds to the optional step 10 of the contributing guidelines (https://scikit-learn.org/dev/developers/contributing.html#how-to-contribute). It is optional but it makes life easier :)
Thanks very much for the PR. Here is a quick first pass of reviews.
sklearn/metrics/_plot/cap_curve.py (outdated)

```python
CumulativeAccuracyDisplay.from_predictions(y_test, y_scores, name='Logistic Regression')
plt.title('CAP Curve using from_predictions')
plt.show()
```
For information, when running this code I get:
[figure: the CAP curve rendered by the snippet above]
This looks fine, but I would also like an option to display the curve for oracle predictions and the curve for the random-predictions baseline.
EDIT: we need to check the literature to see if it's common to normalize the cumulated values to 1 instead of plotting absolute numbers. Maybe an option to select between relative and absolute scales would make sense.
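For illustration, a minimal sketch of what the oracle and random baselines could look like on a normalized (relative) scale; the toy data and variable names below are made up and are not code from this PR:

```python
import matplotlib.pyplot as plt
import numpy as np

# toy binary targets, already sorted by decreasing predicted score (assumption)
y_true_sorted = np.array([1, 1, 0, 1, 0, 0, 0, 0])
n_samples = len(y_true_sorted)
n_pos = y_true_sorted.sum()

x = np.arange(1, n_samples + 1) / n_samples  # fraction of samples examined
y_model = np.cumsum(y_true_sorted) / n_pos   # fraction of positives captured

plt.plot(x, y_model, label="model")
plt.plot([0, 1], [0, 1], "k--", label="random baseline")        # diagonal
plt.plot([0, n_pos / n_samples, 1], [0, 1, 1], label="oracle")  # all positives ranked first
plt.xlabel("fraction of samples")
plt.ylabel("fraction of positives")
plt.legend()
plt.show()
```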
In addition, I think we should set the x- and y-limits better (as in the ROC curve display).
Also, is it common for this curve to be in a square axis? I see this is kind of different from the ROC and PR curves in this regard, since here the x- and y-axes are not in the same unit.
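For illustration, a possible way to do that with Matplotlib; the exact limit values are an assumption modeled on the ROC display, not code from this PR:

```python
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
# hypothetical: the CAP curve would be drawn on `ax` here
ax.set(xlim=(-0.01, 1.01), ylim=(-0.01, 1.01))  # tight limits around [0, 1]
ax.set_aspect("equal")  # square axis; mostly meaningful with a normalized scale
plt.show()
```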
- An option for relative vs. absolute scales sounds relevant to me. I don't have an opinion on which should be the default.
- +1 for a square axis (especially with a normalized scale). This paper (Tasche 2006, "Validation of internal rating systems and PD estimates") contains one example of a square-axis CAP curve:

[figure: square-axis CAP curve from Tasche 2006]

It is reminiscent of the ROC curve, but maybe there are reasons not to do it that way?
> An option for relative vs. absolute scales sounds relevant to me. I don't have an opinion on which should be the default.

I would intuitively use relative as the default.
I've implemented some of the elements mentioned, but I don't have the time to do everything quickly. I also have several concerns:
The example I linked above is precisely an example of a Lorenz curve, which, based on the linked discussion, is very closely related to the CAP curve. EDIT: I also posted an extra reference in the parent issue: #10003 (comment).
For an example, you can have a look at the tests for the existing display classes. You can run all the tests in a module with the pytest command, and you can run a specific test by passing its name; since the display tests are parametrized, pytest will run every matching parametrization.
So for your branch, you can create a dedicated test file. More details on pytest usage are in https://scikit-learn.org/dev/developers/tips.html#pytest-tips or on the pytest website itself.
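For illustration, a minimal parametrized test sketch; the file name, the `pyplot` fixture (scikit-learn's Matplotlib test fixture), and the `line_` attribute are assumptions modeled on the existing display tests:

```python
# hypothetical file: sklearn/metrics/_plot/tests/test_cap_curve_display.py
import pytest

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import CumulativeAccuracyDisplay  # assumed public export


@pytest.mark.parametrize("constructor_name", ["from_estimator", "from_predictions"])
def test_cap_display_smoke(pyplot, constructor_name):
    X, y = make_classification(random_state=0)
    clf = LogisticRegression().fit(X, y)
    if constructor_name == "from_estimator":
        disp = CumulativeAccuracyDisplay.from_estimator(clf, X, y)
    else:
        disp = CumulativeAccuracyDisplay.from_predictions(y, clf.decision_function(X))
    assert disp.line_ is not None  # assumed attribute, as in RocCurveDisplay
```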
Some pre-commit tasks such as the black formatter and the ruff import statement formatter can auto-fix the code when you type `git commit`.
A few more comments:

```python
# compute cumulative sums for true positives and all cases
cumulative_true_positives = np.cumsum(y_true_sorted * sample_weight_sorted)
cumulative_total = np.cumsum(sample_weight_sorted)
```
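If the relative scale discussed above were adopted, a hypothetical continuation of this snippet could normalize both arrays so the curve ends at (1, 1):

```python
# hypothetical follow-up to the snippet above, not code from this PR
x_relative = cumulative_total / cumulative_total[-1]
y_relative = cumulative_true_positives / cumulative_true_positives[-1]
```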
This is very similar to a Lorenz curve that we manually define in this example.
scikit-learn/examples/linear_model/plot_poisson_regression_non_normal_loss.py, lines 495 to 507 in 6bf0ba5:

```python
def lorenz_curve(y_true, y_pred, exposure):
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    exposure = np.asarray(exposure)
    # order samples by increasing predicted risk:
    ranking = np.argsort(y_pred)
    ranked_frequencies = y_true[ranking]
    ranked_exposure = exposure[ranking]
    cumulated_claims = np.cumsum(ranked_frequencies * ranked_exposure)
    cumulated_claims /= cumulated_claims[-1]
    cumulated_exposure = np.cumsum(ranked_exposure)
    cumulated_exposure /= cumulated_exposure[-1]
    return cumulated_exposure, cumulated_claims
```
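For reference, a toy invocation of the quoted helper (the data below is made up):

```python
import numpy as np

y_true = np.array([0.0, 2.0, 0.0, 1.0])
y_pred = np.array([0.1, 0.9, 0.3, 0.8])
exposure = np.ones_like(y_true)
cum_exposure, cum_claims = lorenz_curve(y_true, y_pred, exposure)
# both output arrays are nondecreasing and end at 1.0
```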
The differences are (as far as I can see):
- `y_pred` is the estimated probability of positive class membership given by a classifier for CAP/Lift/Gain curves, while `y_pred` holds the continuous prediction of a positive target variable for a regressor (Lorenz curve);
- the ordering `sorted_indices = np.argsort(y_pred)[::-1]` is reversed (descending) in CAP, while it stays in ascending order in a Lorenz curve.

So I think we could indeed reuse the same class both for CAP/Lift/Gain curves and Lorenz curves. To switch between the two, we can probably introduce a keyword parameter, for instance `x_ordering_style="cap"` vs `x_ordering_style="lorenz"`.
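A minimal sketch of how that switch could look; the parameter name `x_ordering_style` is only the suggestion from this comment, not an implemented scikit-learn API:

```python
import numpy as np

def _ranking(y_pred, x_ordering_style="cap"):
    order = np.argsort(y_pred)  # ascending order, as in the Lorenz curve
    if x_ordering_style == "cap":
        order = order[::-1]  # descending order, as in CAP/Lift/Gain curves
    return order
```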
As discussed in the main thread of the PR, let's keep that for a follow-up PR.
Thanks! That helps a lot actually (also for the tests).
Great. Now that you have tests, you can see in the diff view that the test coverage analysis has resulted in a few automated inline comments that highlight sections of your code that are currently not tested: https://github.com/scikit-learn/scikit-learn/pull/28752/files
To merge this PR we will need improved test coverage. However, I would rather put the focus on upgrading the code itself first.
To keep this PR minimal, I would postpone this to a follow-up PR.
Alright, I agree to limit the scope of this PR to the case of CAP/Lift/Gain curves on classification tasks for now.
As I said above, I am OK to focus this PR on the classification case and leave regression for later. However, I think we should already choose public attribute names for the cumulated statistics that are compatible with the anticipated regression/Lorenz curve case (see inline comment below).
Furthermore, I think we should update one of the existing examples that display a ROC curve for a binary classifier to also include a CAP curve. I think that the following example, which already displays both a DET curve and a ROC curve, is a good candidate:
https://scikit-learn.org/stable/auto_examples/model_selection/plot_det.html
We should probably refactor it a bit to change the title and even the filename to make it explicit that it's about more than DET curves.
The alternative would be to create a new example file dedicated to CAP, but I think we all agree that the example gallery already has too many short examples and it's better to consolidate them into larger examples that combine related components of the scikit-learn API and better contrast the commonalities and differences.
```python
if plot_chance_level:
    assert (
        cap_display.chance_level_ is not None
    ), "Chance level line should be present"
```
Let's also extend the tests to check for the presence, shape, and dtype of other public attributes such as `y_true_cumulative` (new name suggested in https://github.com/scikit-learn/scikit-learn/pull/28752/files#r1553497449).
It might also make sense to check that the values of those attributes match (with `assert_allclose`) when calling `CumulativeAccuracyDisplay.from_estimator` or `CumulativeAccuracyDisplay.from_predictions`.
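A sketch of such a test, assuming the suggested attribute name `y_true_cumulative` and scikit-learn's `pyplot` test fixture; the exact shape and dtype expectations are illustrative only:

```python
from numpy.testing import assert_allclose

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import CumulativeAccuracyDisplay  # assumed public export


def test_cap_display_attribute_consistency(pyplot):
    X, y = make_classification(random_state=0)
    clf = LogisticRegression().fit(X, y)

    disp_est = CumulativeAccuracyDisplay.from_estimator(clf, X, y)
    disp_pred = CumulativeAccuracyDisplay.from_predictions(y, clf.decision_function(X))

    for disp in (disp_est, disp_pred):
        assert disp.y_true_cumulative.shape == (len(y),)  # one point per sample
        assert disp.y_true_cumulative.dtype.kind == "f"   # floating point
    # both construction paths should yield the same curve values
    assert_allclose(disp_est.y_true_cumulative, disp_pred.y_true_cumulative)
```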
I've made the change for `y_true_cumulative`, but I'm not sure about shape or dtypes. Are displays supposed to have these attributes (or just this one)?
I've just realised that I misread it. You meant checking the attributes of `y_true_cumulative`, not of the display itself.
Review of `from_estimator`.
> This is also known as a Gain or Lift Curve for classification, and a Lorenz
> curve for regression with a positively valued target.
Suggested change:
```diff
-This is also known as a Gain or Lift Curve for classification, and a Lorenz
-curve for regression with a positively valued target.
+This is also known as a Gain or Lift Curve, or the mirrored Lorenz
+curve of Machine Learning for regression with a positively valued target.
```
Since we agree to leave the Lorenz curve / regression case for a follow-up PR, I would defer this change for later. Speaking about regression in a class that is designed and documented to only accept classifier predictions would be a source of confusion.
I would also not add another example. We can replace the Lorenz curve with CAP in two examples (Poisson and Tweedie).
But this is only for regression. I would like to illustrate CAP in a classification context, and updating an example with ROC and DET curves would make that possible and also improve the discoverability of the plot. Furthermore, I find it interesting to visually compare the three plots for the same models and data.
The `sklearn.metrics` module should expose the `CumulativeAccuracyDisplay` class as part of the public API and include it in its `__all__` module-level attribute.
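For illustration, the change in `sklearn/metrics/__init__.py` could look roughly like this, assuming the module path `sklearn/metrics/_plot/cap_curve.py` used in this PR:

```python
from ._plot.cap_curve import CumulativeAccuracyDisplay

__all__ = [
    # ... existing entries kept as-is ...
    "CumulativeAccuracyDisplay",
]
```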
I think the docstrings of the 2 class methods should (a rough sketch follows below):
- mention that this curve assesses the ranking power of classifiers, that is, the ability to correctly order the individual predictions by their probability of actually being labeled with the positive class;
- make it explicit that this curve is invariant under any monotonic transformation of the predictions, and therefore cannot be used to assess the calibration of the predictions, only the ranking power;
- have a "See also" section (at the end of the docstring) and cross-link `RocCurveDisplay` as an alternative curve to assess the ranking power of classifiers.
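A hypothetical docstring excerpt covering those three points; the wording is only a suggestion, not text from this PR:

```python
"""Plot the Cumulative Accuracy Profile (CAP) curve.

This curve assesses the ranking power of a classifier, i.e. its ability
to correctly order individual predictions by their probability of being
labeled with the positive class. The curve is invariant under any
monotonic transformation of the predictions, so it cannot be used to
assess the calibration of the predictions, only their ranking power.

See Also
--------
RocCurveDisplay : Alternative curve to assess the ranking power of
    classifiers.
"""
```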
I just want to avoid a new example. Therefore, either extend existing examples or postpone to a follow-up PR. But please, no new example! (Strong opinion there.)
I'm more in favor of including regression from the start; that's where it adds additional value. For classification, it is kind of equivalent to ROC. On top of that, this helps to find good names.
@JosephBARBIERDARNAL Why did you close? This would be a valuable addition.
I forgot that this PR was still open when I deleted my fork, sorry! Any solution for this type of case?
I created a new PR (#28972) with the same code. I'll hopefully keep that one open. Again, sorry for the mistake!
Reference Issue
Fix for #10003.
What does this implement/fix?
Creation of a `CumulativeAccuracyDisplay` class for plots.
"The CAP of a model represents the cumulative number of positive outcomes along the y-axis versus the corresponding cumulative number of a classifying parameter along the x-axis. The output is called a CAP curve.[1] The CAP is distinct from the receiver operating characteristic (ROC) curve, which plots the true-positive rate against the false-positive rate." (Wikipedia definition)
It's mainly inspired by the `RocCurveDisplay` class.
Other
It's currently a work in progress.