ENH add CAP curve #28972
Conversation
@JosephBARBIERDARNAL Could you please address all the comments from #28752?
Yes, I'm ready to do it, but I won't be able to get back to it quickly (2 to 3 months) unfortunately. Hope that's okay.
Working on this PR again. Here are the current updates:
Some (likely non-exhaustive) issues:
import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import CapCurveDisplay

X, y = make_classification(n_samples=1000, n_features=20, n_classes=2, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

clf = LogisticRegression(max_iter=1000)
clf.fit(X_train, y_train)
y_scores = clf.decision_function(X_test)

# Plot the raw and the normalized CAP curves side by side.
fig, ax = plt.subplots(ncols=2, dpi=300, figsize=(12, 12))
display = CapCurveDisplay.from_predictions(
    ax=ax[0],
    y_true=y_test,
    y_pred=y_scores,
    name='normalize_scale=False',
    normalize_scale=False,
    plot_chance_level=True,
)
display = CapCurveDisplay.from_predictions(
    ax=ax[1],
    y_true=y_test,
    y_pred=y_scores,
    name='normalize_scale=True',
    normalize_scale=True,
    plot_chance_level=True,
)
I'll come back to this PR after the release. This will be one of my priorities to get merged for 1.7.
Ok cool. There are still a few things I need to do anyway before a review (coverage tests and adding the #30023 check).
sklearn/metrics/_plot/cap_curve.py (outdated)

    # compute cumulative sums for true positives and all cases
    y_true_cumulative = np.cumsum(y_true_sorted * sample_weight_sorted)
    cumulative_total = np.cumsum(sample_weight_sorted)
For information, there was a concurrent PR to fix the lack of cumsum of the sample_weight to define the x-axis of the Lorenz curves in one of our regression examples. To check that this was the correct fix, we ran a quick check on synthetic data with integer-valued sample weights to verify that there is an exact equivalence between repeating data points and reweighting them by exposure.
Maybe this PR could be expanded to test that this property also holds for CapCurveDisplay.from_predictions with non-constant, integer-valued sample_weight.
EDIT: I am not entirely sure how to write such a test, but possibly we could use numpy.interp to interpolate the CAP curve computed on the repeated data at the x-axis locations of the points of the weighted CAP curve, and check that the two curves match up to a small eps at those locations.
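Below is a rough sketch of what such a test could look like. It is only an illustration, not code from this PR: it assumes the display keeps the plotted coordinates in cumulative_total (x) and y_true_cumulative (y) as in the diff above, uses the CapCurveDisplay spelling from the example earlier in this thread, and relies on numpy.interp to compare the two curves. Ties in y_pred may require a looser tolerance.

import numpy as np
from sklearn.metrics import CapCurveDisplay  # class added by this PR


def test_sample_weight_repetition_equivalence():
    # Integer weights should be equivalent to repeating each sample `weight` times.
    rng = np.random.RandomState(0)
    y_true = rng.randint(0, 2, size=200)
    y_pred = rng.rand(200)
    sample_weight = rng.randint(1, 5, size=200)

    disp_weighted = CapCurveDisplay.from_predictions(
        y_true=y_true, y_pred=y_pred, sample_weight=sample_weight
    )
    disp_repeated = CapCurveDisplay.from_predictions(
        y_true=np.repeat(y_true, sample_weight),
        y_pred=np.repeat(y_pred, sample_weight),
    )

    # Interpolate the "repeated" curve at the x locations of the weighted
    # curve and compare the y values at those locations.
    y_interp = np.interp(
        disp_weighted.cumulative_total,
        disp_repeated.cumulative_total,
        disp_repeated.y_true_cumulative,
    )
    np.testing.assert_allclose(
        disp_weighted.y_true_cumulative, y_interp, atol=1e-7
    )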
There are several review comments in #28752 that have not yet been addressed.
I would like to make sure that we do not forget about them when iterating on this PR.
@JosephBARBIERDARNAL to sort things out, please reply in the threads of the review of #28752 and make it explicit which comments have been addressed in #28972 and how, and then mark those threads as resolved.
Also, we need some tests, and to update an existing example so that we compare ROC, DET and CAP curves on the same classifier. I suppose this example is the best candidate:
https://scikit-learn.org/stable/auto_examples/model_selection/plot_det.html
@ogrisel That sounds good to me. I won't be able to work on it immediately, but I'll definitely be able to get to it within the next few weeks. I'll ping you and/or @glemaitre for review.
2 things are important for me:
I resolved most of the conversations there. I didn't touch some of them when I wasn't sure. Feel free to ping me if any changes are needed. I'm just not sure what
A few questions:
@ogrisel You can review. See my comments in the TODO.
No strong opinion, but I think I would rather update the other displays to plot the baseline curve by default instead (to be done later).
I would rather do that in a follow-up PR that introduces an official Gini scoring function instead. |
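For context (not part of this PR), such a Gini scoring function for binary classification could be derived directly from the ROC AUC, since the accuracy ratio of the CAP curve equals 2 * AUC - 1. A minimal sketch with a hypothetical helper name:

from sklearn.metrics import roc_auc_score


def gini_score(y_true, y_score):
    # Hypothetical helper, only to illustrate the follow-up idea: for binary
    # classification the Gini coefficient (accuracy ratio of the CAP curve)
    # equals 2 * ROC-AUC - 1.
    return 2 * roc_auc_score(y_true, y_score) - 1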
Thanks @JosephBARBIERDARNAL. Here is another batch of feedback.
sklearn/metrics/_plot/cap_curve.py (outdated)

    self.cumulative_total = self.cumulative_total / x_max
    self.y_true_cumulative = self.y_true_cumulative / y_max
We need to be robust to edge-cases.
Suggested change:

    if x_max != 0:
        self.cumulative_total = self.cumulative_total / x_max
    if y_max != 0:
        self.y_true_cumulative = self.y_true_cumulative / y_max
and please add tests to check that no low-level warnings are raised for edge cases like:

    CAPCurveDisplay.from_predictions(np.zeros(3), np.asarray([0.1, 0.3, 1.0]))
    CAPCurveDisplay.from_predictions(np.asarray([0.1, 0.3, 1.0]), np.zeros(3))
    CAPCurveDisplay.from_predictions(np.zeros(3), np.zeros(3))

Please also test when the classification interpretation is forced by setting an explicit pos_label != None:

    CAPCurveDisplay.from_predictions(
        np.zeros(3), np.asarray([0.1, 0.3, 1.0]), pos_label=0
    )
    CAPCurveDisplay.from_predictions(
        np.zeros(3), np.asarray([0.1, 0.3, 1.0]), pos_label=1
    )
I am not 100% sure, but maybe some of those edge cases should instead trigger a high-level ValueError with an explicit message rather than plotting a meaningless chart.
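A rough sketch of what such tests could look like (not code from this PR; it assumes the class is importable as CapCurveDisplay, the spelling used in the example earlier in this thread, and the expected behaviour for each degenerate input, silent plot versus explicit ValueError, still needs to be decided):

import warnings

import numpy as np
import pytest

from sklearn.metrics import CapCurveDisplay  # class added by this PR


@pytest.mark.parametrize(
    "y_true, y_pred",
    [
        (np.zeros(3), np.asarray([0.1, 0.3, 1.0])),
        (np.asarray([0.1, 0.3, 1.0]), np.zeros(3)),
        (np.zeros(3), np.zeros(3)),
    ],
)
def test_edge_cases_from_predictions(y_true, y_pred):
    # No low-level warning (e.g. "divide by zero") should escape.
    with warnings.catch_warnings():
        warnings.simplefilter("error")
        CapCurveDisplay.from_predictions(y_true=y_true, y_pred=y_pred)


@pytest.mark.parametrize("pos_label", [0, 1])
def test_edge_cases_with_explicit_pos_label(pos_label):
    # Force the classification interpretation of an all-zero target.
    CapCurveDisplay.from_predictions(
        y_true=np.zeros(3), y_pred=np.asarray([0.1, 0.3, 1.0]), pos_label=pos_label
    )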
It's now tested with test_edge_cases_from_predictions()
Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>
Just a really quick review. I'll continue tomorrow.
    if sample_weight is None:
        sample_weight = np.ones_like(y_true, dtype=np.float64)

    sorted_indices = np.argsort(y_pred)
I could not see this comment in the GitHub frontend, but it seems important to address.
Thanks for the PR @JosephBARBIERDARNAL. This new feature is definitely very valuable. Here's a batch of comments regarding the modified example.
I don't think this name for the example would be pertinent anymore. Maybe we can go with something like "Compare ROC, DET and CAP curves". Then we would have to rename the section a bit further below.
Sounds good to me! Let me know when this is approved by others.
    DET curves are a variation of ROC curves where False Negative Rate (FNR) is
    plotted on the y-axis instead of the TPR. In this case the origin (bottom left
    corner) is the "ideal" point. Furthermore, the axes use a normal deviate scale
The use of a normal deviate scale is already mentioned below. But if you really think it's worth keeping here, let's rather mention it before the discussion of the "ideal" point, to have a more consistent flow of ideas.
It's ok for me if we remove it and keep only the mention below.
    ax=ax_roc,
    name=name,
    pos_label=pos_label,
    plot_chance_level=is_last,
As we already have the DummyClassifier, this line is overlapping. I think back in the day we decided that we'd better avoid mentioning "chance level" if possible.
    plot_chance_level=is_last,
See my comment above.
    # The diagonal black-dotted lines named "chance level" in the plots above
    # correspond to the expected value of a non-informative classifier on an
    # infinite evaluation set.
If we go with avoiding the term "chance level", I would rather remove this whole paragraph and only mention in the paragraph about the DummyClassifier below that non-informative classifiers can be used as a baseline.
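To illustrate the suggestion (this is a minimal sketch, not code from the PR or the example): a non-informative DummyClassifier included among the evaluated models naturally draws the baseline curve, so an explicit chance-level line becomes redundant.

import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.dummy import DummyClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import RocCurveDisplay
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

fig, ax = plt.subplots()
classifiers = {
    "LogisticRegression": LogisticRegression(max_iter=1000),
    "DummyClassifier (non-informative baseline)": DummyClassifier(strategy="prior"),
}
for name, clf in classifiers.items():
    clf.fit(X_train, y_train)
    # The dummy model produces the diagonal, so plot_chance_level is not needed.
    RocCurveDisplay.from_estimator(clf, X_test, y_test, name=name, ax=ax)
plt.show()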
Co-authored-by: Arturo Amor <86408019+ArturoAmorQ@users.noreply.github.com>
A few things that I think need to be addressed:
This PR is a duplicate of #28752. I'm recreating a PR because I accidentally deleted my fork with the old one.
Reference Issue
fix for #10003
What does this implement/fix?
creation of a CumulativeAccuracyDisplay class for plots
"The CAP of a model represents the cumulative number of positive outcomes along the y-axis versus the corresponding cumulative number of a classifying parameter along the x-axis. The output is called a CAP curve.[1] The CAP is distinct from the receiver operating characteristic (ROC) curve, which plots the true-positive rate against the false-positive rate." (wikipedia definition)
It's mainly inspired by the RocCurveDisplay class.
Other
It's currently a work in progress.
TODO

Binary classification
- ValueError in from_estimator if the estimator is not fitted or is a classifier that was fitted with more than 3 classes;
- pos_label handling when the positive class;
- response_method="decision_function" and response_method="predict_proba" for a LogisticRegression classifier fit with string labels and for all 3 possible values of pos_label;
- test_display_from_estimator_and_from_prediction;
- y_true_cumulative and cumulative_total have the same dtype as y_pred in the test about from_predictions. We can test for y_pred passed either as np.float32 or np.float64 (see the sketch right after this list);
- CAPCurveDisplay.from_estimator(LinearSVC().fit(X, y), ...) works (even if it does not have a predict_proba method). This should cover one of the lines reported as uncovered by codecov;
- test_common_curve_display.py to reuse some generic tests on CAPCurveDisplay and maybe remove redundant tests on invalid inputs from test_cap_curve_display.py, if any;
- despine argument?
  - *Display classes in scikit-learn. Feel free to open an issue to discuss this with screenshots, e.g. on ROC or PR curves, and your analysis of pros and cons.
  - despine keyword for ROC and PR curves (#26367). I'm not sure it makes much sense for ConfusionMatrixDisplay (?). I'll open an issue (once this PR is merged) for CAPCurveDisplay, PredictionErrorDisplay and DetCurveDisplay because I think they're the only ones that don't have this option.
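The sketch referenced in the dtype item above could look like the following (it assumes the attribute names y_true_cumulative and cumulative_total from the diff discussed earlier in this thread, and the CapCurveDisplay spelling used in the example at the top of the conversation):

import numpy as np
import pytest

from sklearn.metrics import CapCurveDisplay  # class added by this PR


@pytest.mark.parametrize("dtype", [np.float32, np.float64])
def test_cap_curve_preserves_y_pred_dtype(dtype):
    rng = np.random.RandomState(0)
    y_true = rng.randint(0, 2, size=100)
    y_pred = rng.rand(100).astype(dtype)

    disp = CapCurveDisplay.from_predictions(y_true=y_true, y_pred=y_pred)

    # Both plotted coordinate arrays should follow the dtype of y_pred.
    assert disp.y_true_cumulative.dtype == dtype
    assert disp.cumulative_total.dtype == dtype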
Regression
- ValueError with an informative error message if y_true has negative values;
- ValueError if all y_true are zeros (the plot would be degenerate and would raise a low-level divide by zero warning with normalize_scale=True);
- y_true are zeros, it will be considered a case of classification;
- (PoissonRegressor) and check that the regressor curve lies between the "chance level" and "perfect" curves;
- update the examples (examples/linear_model/plot_tweedie_regression_insurance_claims.py and examples/linear_model/plot_poisson_regression_non_normal_loss.py) to use the CAPCurveDisplay class instead of manually plotting the Lorenz curves.

Other
- add a changelog entry under doc/whats_new/upcoming_changes/;
- update the visualization docs (doc/visualization) to reference this new tool;
- DetCurveDisplay, RocCurveDisplay and then CAPCurveDisplay, which seems inconsistent.

Nice to have