ENH use cv_results in the different curve display to add confidence intervals #21211


Draft · glemaitre wants to merge 4 commits into main

Conversation

@glemaitre (Member) commented on Oct 1, 2021

This PR intends to add the capability of plotting the uncertainty of the different curves (calibration, precision-recall, ROC, etc.) by using the results of cross-validation (i.e. the output of cross_validate).

TODO:

  • add a parameter return_indices in cross_validate to store the train-test indices. This is the safest way to keep track of the train-test splits when using stochastic splitting strategies (see the sketch after this list).
  • add a method from_cv_results in the plotting display to take advantage of the CV computation.
  • add unit test for from_cv_results
  • add unit test for the new keyword parameters in CalibrationDisplay
  • add unit test for the new strategy of binning in calibration_curve
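
For reference, a minimal sketch of the result structure the first two items aim for; the "indices" key and its layout are assumptions of this proposal, not a finalized API:

# hypothetical shape of one cross_validate(..., return_estimator=True, return_indices=True) output
result = {
    "test_score": ...,  # per-split scores, as today
    "estimator": [...],  # one fitted estimator per split (return_estimator=True)
    "indices": {"train": [...], "test": [...]},  # one array of indices per split (proposed here)
}
# from_cv_results would pair each estimator with its test indices to recompute the
# curve on every fold and aggregate across folds.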

Usage example

# %%
import numpy as np
from sklearn.datasets import make_classification

X, y = make_classification(
    n_samples=10_000, weights=[0.1, 0.9], random_state=42, class_sep=1
)
sample_weight = np.zeros_like(y, dtype=np.float64)
sample_weight[y == 0] = 0.1
sample_weight[y == 1] = 0.9

# %%
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test, sw_train, sw_test = train_test_split(
    X, y, sample_weight, random_state=42
)

# %%
from sklearn.linear_model import LogisticRegression
from sklearn.calibration import CalibratedClassifierCV

calibration_method = "isotonic"
models = {
    "LR no weights": LogisticRegression(),
    "LR class weights": LogisticRegression(class_weight="balanced"),
    "Calibrated LR no weights": CalibratedClassifierCV(
        LogisticRegression(),
        method=calibration_method,
    ),
    "Calibrated LR class weights": CalibratedClassifierCV(
        LogisticRegression(class_weight="balanced"),
        method=calibration_method,
    ),
    "Calibrated LR sample weights": CalibratedClassifierCV(
        LogisticRegression(),
        method=calibration_method,
    ),
    "Calibrated LR class and sample weights": CalibratedClassifierCV(
        LogisticRegression(class_weight="balanced"),
        method=calibration_method,
    ),
}

# %%
import matplotlib.pyplot as plt
from sklearn.calibration import CalibrationDisplay
from sklearn.metrics import balanced_accuracy_score

fig, ax = plt.subplots()

calibration_display_params = {
    "n_bins": 20,
    "strategy": "quantile",
}
for name, model in models.items():
    if "sample weights" in name:
        model.fit(X_train, y_train, sample_weight=sw_train)
    else:
        model.fit(X_train, y_train)

    score = balanced_accuracy_score(y_test, model.predict(X_test))
    CalibrationDisplay.from_estimator(
        model,
        X_test,
        y_test,
        name=name + f" - {score:.3f}",
        ax=ax,
        **calibration_display_params,
    )
ax.legend(loc="center left", bbox_to_anchor=(1, 0.5), title="Model - Balanced Accuracy")
_ = fig.suptitle(f"Using {calibration_method} calibration")

# %%
from sklearn.model_selection import cross_validate
from sklearn.model_selection import KFold

cv_results = {}
cv = KFold(n_splits=5)
for name, model in models.items():
    if "sample weights" in name:
        fit_params = {"sample_weight": sample_weight}
    else:
        fit_params = {}
    cv_results[name] = cross_validate(
        model,
        X,
        y,
        cv=cv,
        fit_params=fit_params,
        scoring="balanced_accuracy",
        return_estimator=True,
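        # proposed in this PR: keep the train/test indices of each split in the output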
        return_indices=True,
    )

# %%
fig, ax = plt.subplots()
for name, results in cv_results.items():
    CalibrationDisplay.from_cv_results(
        results, X, y, ax=ax, name=name, **calibration_display_params
    )
ax.legend(loc="center left", bbox_to_anchor=(1, 0.5))

# %%
fig, ax = plt.subplots()
for name, results in cv_results.items():
    CalibrationDisplay.from_cv_results(
        results,
        X,
        y,
        ax=ax,
        name=name,
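        # proposed style: shaded band based on the standard deviation across folds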
        plot_uncertainty_style="fill_between",
        **calibration_display_params,
    )
ax.legend(loc="center left", bbox_to_anchor=(1, 0.5))

# %%
fig, ax = plt.subplots()
for name, results in cv_results.items():
    CalibrationDisplay.from_cv_results(
        results,
        X,
        y,
        ax=ax,
        name=name,
        plot_uncertainty_style="lines",
        **calibration_display_params,
    )
ax.legend(loc="center left", bbox_to_anchor=(1, 0.5))

# %%

[Four output figures: per-model calibration curves from from_estimator, and from_cv_results with the default, "fill_between", and "lines" uncertainty styles.]

@glemaitre changed the title from "ENH use uncertainty estimate" to "ENH use cv_results in the different curve display to add confidence intervals" on Oct 1, 2021
@glemaitre marked this pull request as a draft on Oct 1, 2021
@ogrisel self-requested a review on Oct 19, 2021
@@ -1067,6 +1135,22 @@ def plot(self, *, ax=None, name=None, ref_line=True, **kwargs):
If `True`, plots a reference line representing a perfectly
calibrated classifier.

plot_uncertainty_style : {"errorbar", "fill_between", "lines"}, \
default="errorbar"

@ogrisel (Member) commented on Oct 21, 2021

I think the default should be plot_uncertainty_style="lines", as it is the easiest to understand without being misled. For plot_uncertainty_style="errorbar" and plot_uncertainty_style="fill_between", one needs to know that they are based on the raw standard deviation (as opposed to a pseudo confidence interval based on the standard error of the mean, for instance).
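
For illustration, a minimal sketch (not part of this PR) of the two aggregations being contrasted here, applied to per-fold curve values on a shared grid:

import numpy as np

# one row per CV fold, one column per point of the shared x-grid (values are made up)
fold_curves = np.random.RandomState(0).normal(loc=0.5, scale=0.05, size=(5, 20))

mean_curve = fold_curves.mean(axis=0)
std_curve = fold_curves.std(axis=0)  # raw spread across folds ("errorbar" / "fill_between")
sem_curve = std_curve / np.sqrt(fold_curves.shape[0])  # standard error of the mean
# a pseudo 95% confidence band would be mean_curve +/- 1.96 * sem_curve; the SEM band
# is narrower than the +/- std_curve band by a factor of sqrt(n_folds)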

@ogrisel (Member) commented on Oct 21, 2021

We could also accept plot_uncertainty_style=None to only plot the mean CV calibration curve without any uncertainty markers on the plot.

Also plot_uncertainty_style="shade" or plot_uncertainty_style="shaded_area" might be easier to understand than plot_uncertainty_style="fill_between".

default="errorbar"
Style to plot the uncertainty information. Possibilities are:

- "errorbar": error bars representing one standard deviation;

two standard deviations: 1 above and 1 below.

I assume (I did not check ;)

I checked and I think I am right:

import numpy as np
import matplotlib.pyplot as plt


plt.errorbar(np.arange(5), np.ones(5), np.ones(5))
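# with yerr=np.ones(5), each bar spans from y - 1 to y + 1: one unit above and one below the point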

[plot output: each error bar extends one unit above and one unit below its point]

Comment on lines +173 to +174
return_indices : bool, default=False
Whether to return the train-test indices selected for each split.

Coming from #21664, I agree return_indices is useful. (I wanted to do something like this recently).
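
As a small illustration of why: with return_estimator=True and the proposed return_indices=True, each fold's test predictions can be recomputed after the fact. A minimal sketch reusing the usage example above (the "indices"/"test" key layout is an assumption of this proposal, not a released API):

results = cv_results["LR no weights"]  # one cross_validate output from the example above
for estimator, test_idx in zip(results["estimator"], results["indices"]["test"]):
    y_prob = estimator.predict_proba(X[test_idx])[:, 1]
    # (y[test_idx], y_prob) is what a display can turn into one per-fold curve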

@adrinjalali (Member) commented:

@glemaitre this seems cool, worth continuing!

@glemaitre (Member, Author) commented:

Yep, this is also part of the CZI proposal on inspection. This would be my next effort after the threshold-tuning classifier.
