10 changes: 5 additions & 5 deletions doc/modules/grid_search.rst
@@ -602,17 +602,17 @@ parameters of composite or nested estimators such as
>>> from sklearn.datasets import make_moons
>>> X, y = make_moons()
>>> calibrated_forest = CalibratedClassifierCV(
- ... base_estimator=RandomForestClassifier(n_estimators=10))
+ ... estimator=RandomForestClassifier(n_estimators=10))
>>> param_grid = {
- ... 'base_estimator__max_depth': [2, 4, 6, 8]}
+ ... 'estimator__max_depth': [2, 4, 6, 8]}
>>> search = GridSearchCV(calibrated_forest, param_grid, cv=5)
>>> search.fit(X, y)
GridSearchCV(cv=5,
estimator=CalibratedClassifierCV(...),
- param_grid={'base_estimator__max_depth': [2, 4, 6, 8]})
+ param_grid={'estimator__max_depth': [2, 4, 6, 8]})

Here, ``<estimator>`` is the parameter name of the nested estimator,
- in this case ``base_estimator``.
+ in this case ``estimator``.
If the meta-estimator is constructed as a collection of estimators as in
`pipeline.Pipeline`, then ``<estimator>`` refers to the name of the estimator,
see :ref:`pipeline_nested_parameters`. In practice, there can be several
@@ -625,7 +625,7 @@ levels of nesting::
... ('model', calibrated_forest)])
>>> param_grid = {
... 'select__k': [1, 2],
- ... 'model__base_estimator__max_depth': [2, 4, 6, 8]}
+ ... 'model__estimator__max_depth': [2, 4, 6, 8]}
>>> search = GridSearchCV(pipe, param_grid, cv=5).fit(X, y)

Please refer to :ref:`pipeline` for performing parameter searches over
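Reviewer note (illustration only, not part of this diff): the valid ``<estimator>__<parameter>`` names for a composite estimator can be listed with ``get_params()``, which is a quick way to verify the keys used in ``param_grid`` above. A minimal sketch, reusing the ``calibrated_forest`` object from the hunk above::

    >>> params = calibrated_forest.get_params()
    >>> 'estimator__max_depth' in params  # the key searched over above
    True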
8 changes: 8 additions & 0 deletions doc/whats_new/v1.2.rst
@@ -41,6 +41,14 @@ Changelog
:pr:`123456` by :user:`Joe Bloggs <joeongithub>`.
where 123456 is the *pull request* number, not the issue number.

+ :mod:`sklearn.calibration`
+ ..........................
+
+ - |API| Rename `base_estimator` to `estimator` in
+ :class:`CalibratedClassifierCV` to improve readability and consistency. The
+ parameter `base_estimator` is deprecated and will be removed in 1.4.
+ :pr:`22054` by :user:`Kevin Roice <kevroi>`.
+
:mod:`sklearn.cluster`
......................

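Reviewer note (illustration only, not part of this diff): the rename stays backward compatible for two releases; per the ``sklearn/calibration.py`` hunks below, the deprecation warning and the both-parameters error are raised in ``fit``, not in ``__init__``. A minimal sketch of the intended behavior::

    >>> from sklearn.calibration import CalibratedClassifierCV
    >>> from sklearn.naive_bayes import GaussianNB
    >>> from sklearn.datasets import make_classification
    >>> X, y = make_classification(random_state=42)
    >>> CalibratedClassifierCV(estimator=GaussianNB()).fit(X, y)  # new name
    CalibratedClassifierCV(...)
    >>> # CalibratedClassifierCV(base_estimator=GaussianNB()).fit(X, y)
    >>> #   -> FutureWarning, but still fits using the GaussianNB
    >>> # CalibratedClassifierCV(estimator=GaussianNB(),
    >>> #                        base_estimator=GaussianNB()).fit(X, y)
    >>> #   -> ValueError: only `estimator` should be set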
104 changes: 63 additions & 41 deletions sklearn/calibration.py
@@ -72,17 +72,19 @@ class CalibratedClassifierCV(ClassifierMixin, MetaEstimatorMixin, BaseEstimator)
for model fitting and calibration are disjoint.

The calibration is based on the :term:`decision_function` method of the
- `base_estimator` if it exists, else on :term:`predict_proba`.
+ `estimator` if it exists, else on :term:`predict_proba`.

Read more in the :ref:`User Guide <calibration>`.

Parameters
----------
- base_estimator : estimator instance, default=None
+ estimator : estimator instance, default=None
The classifier whose output needs to be calibrated to provide more
accurate `predict_proba` outputs. The default classifier is
a :class:`~sklearn.svm.LinearSVC`.

+ .. versionadded:: 1.2
+
method : {'sigmoid', 'isotonic'}, default='sigmoid'
The method to use for calibration. Can be 'sigmoid' which
corresponds to Platt's method (i.e. a logistic regression model) or
@@ -108,7 +110,7 @@ class CalibratedClassifierCV(ClassifierMixin, MetaEstimatorMixin, BaseEstimator)
Refer to the :ref:`User Guide <cross_validation>` for the various
cross-validation strategies that can be used here.

If "prefit" is passed, it is assumed that `base_estimator` has been
If "prefit" is passed, it is assumed that `estimator` has been
fitted already and all data is used for calibration.

.. versionchanged:: 0.22
@@ -130,7 +132,7 @@ class CalibratedClassifierCV(ClassifierMixin, MetaEstimatorMixin, BaseEstimator)
Determines how the calibrator is fitted when `cv` is not `'prefit'`.
Ignored if `cv='prefit'`.

- If `True`, the `base_estimator` is fitted using training data, and
+ If `True`, the `estimator` is fitted using training data, and
calibrated using testing data, for each `cv` fold. The final estimator
is an ensemble of `n_cv` fitted classifier and calibrator pairs, where
`n_cv` is the number of cross-validation folds. The output is the
@@ -139,39 +141,46 @@ class CalibratedClassifierCV(ClassifierMixin, MetaEstimatorMixin, BaseEstimator)
If `False`, `cv` is used to compute unbiased predictions, via
:func:`~sklearn.model_selection.cross_val_predict`, which are then
used for calibration. At prediction time, the classifier used is the
- `base_estimator` trained on all the data.
+ `estimator` trained on all the data.
Note that this method is also internally implemented in
:mod:`sklearn.svm` estimators with the `probability=True` parameter.

.. versionadded:: 0.24

+ base_estimator : estimator instance
+ This parameter is deprecated. Use `estimator` instead.
+
+ .. deprecated:: 1.2
+ The parameter `base_estimator` is deprecated in 1.2 and will be
+ removed in 1.4. Use `estimator` instead.
+
Attributes
----------
classes_ : ndarray of shape (n_classes,)
The class labels.

n_features_in_ : int
Number of features seen during :term:`fit`. Only defined if the
- underlying base_estimator exposes such an attribute when fit.
+ underlying estimator exposes such an attribute when fit.

.. versionadded:: 0.24

feature_names_in_ : ndarray of shape (`n_features_in_`,)
Names of features seen during :term:`fit`. Only defined if the
- underlying base_estimator exposes such an attribute when fit.
+ underlying estimator exposes such an attribute when fit.

.. versionadded:: 1.0

calibrated_classifiers_ : list (len() equal to cv or 1 if `cv="prefit"` \
or `ensemble=False`)
The list of classifier and calibrator pairs.

- When `cv="prefit"`, the fitted `base_estimator` and fitted
- When `cv="prefit"`, the fitted `estimator` and fitted
calibrator.
- When `cv` is not "prefit" and `ensemble=True`, `n_cv` fitted
- `base_estimator` and calibrator pairs. `n_cv` is the number of
+ `estimator` and calibrator pairs. `n_cv` is the number of
cross-validation folds.
- When `cv` is not "prefit" and `ensemble=False`, the `base_estimator`,
- When `cv` is not "prefit" and `ensemble=False`, the `estimator`,
fitted on all the data, and fitted calibrator.

.. versionchanged:: 0.24
@@ -204,9 +213,9 @@ class CalibratedClassifierCV(ClassifierMixin, MetaEstimatorMixin, BaseEstimator)
>>> X, y = make_classification(n_samples=100, n_features=2,
... n_redundant=0, random_state=42)
>>> base_clf = GaussianNB()
- >>> calibrated_clf = CalibratedClassifierCV(base_estimator=base_clf, cv=3)
+ >>> calibrated_clf = CalibratedClassifierCV(base_clf, cv=3)
>>> calibrated_clf.fit(X, y)
- CalibratedClassifierCV(base_estimator=GaussianNB(), cv=3)
+ CalibratedClassifierCV(...)
>>> len(calibrated_clf.calibrated_classifiers_)
3
>>> calibrated_clf.predict_proba(X)[:5, :]
@@ -224,12 +233,9 @@ class CalibratedClassifierCV(ClassifierMixin, MetaEstimatorMixin, BaseEstimator)
>>> base_clf = GaussianNB()
>>> base_clf.fit(X_train, y_train)
GaussianNB()
- >>> calibrated_clf = CalibratedClassifierCV(
- ... base_estimator=base_clf,
- ... cv="prefit"
- ... )
+ >>> calibrated_clf = CalibratedClassifierCV(base_clf, cv="prefit")
>>> calibrated_clf.fit(X_calib, y_calib)
- CalibratedClassifierCV(base_estimator=GaussianNB(), cv='prefit')
+ CalibratedClassifierCV(...)
>>> len(calibrated_clf.calibrated_classifiers_)
1
>>> calibrated_clf.predict_proba([[-0.5, 0.5]])
Expand All @@ -238,18 +244,20 @@ class CalibratedClassifierCV(ClassifierMixin, MetaEstimatorMixin, BaseEstimator)

def __init__(
self,
- base_estimator=None,
+ estimator=None,
*,
method="sigmoid",
cv=None,
n_jobs=None,
ensemble=True,
base_estimator="deprecated",
):
- self.base_estimator = base_estimator
+ self.estimator = estimator
self.method = method
self.cv = cv
self.n_jobs = n_jobs
self.ensemble = ensemble
+ self.base_estimator = base_estimator

def fit(self, X, y, sample_weight=None, **fit_params):
"""Fit the calibrated model.
@@ -282,25 +290,39 @@ def fit(self, X, y, sample_weight=None, **fit_params):
for sample_aligned_params in fit_params.values():
check_consistent_length(y, sample_aligned_params)

- if self.base_estimator is None:
+ # TODO(1.4): Remove when base_estimator is removed
+ if self.base_estimator != "deprecated":
+ if self.estimator is not None:
+ raise ValueError(
+ "Both `base_estimator` and `estimator` are set. Only set "
+ "`estimator` since `base_estimator` is deprecated."
+ )
+ warnings.warn(
+ "`base_estimator` was renamed to `estimator` in version 1.2 and "
+ "will be removed in 1.4.",
+ FutureWarning,
+ )
+ estimator = self.base_estimator
+ else:
+ estimator = self.estimator
+
+ if estimator is None:
# we want all classifiers that don't expose a random_state
# to be deterministic (and we don't want to expose this one).
- base_estimator = LinearSVC(random_state=0)
- else:
- base_estimator = self.base_estimator
+ estimator = LinearSVC(random_state=0)

self.calibrated_classifiers_ = []
if self.cv == "prefit":
- # `classes_` should be consistent with that of base_estimator
- check_is_fitted(self.base_estimator, attributes=["classes_"])
- self.classes_ = self.base_estimator.classes_
+ # `classes_` should be consistent with that of estimator
+ check_is_fitted(self.estimator, attributes=["classes_"])
+ self.classes_ = self.estimator.classes_

- pred_method, method_name = _get_prediction_method(base_estimator)
+ pred_method, method_name = _get_prediction_method(estimator)
n_classes = len(self.classes_)
predictions = _compute_predictions(pred_method, method_name, X, n_classes)

calibrated_classifier = _fit_calibrator(
- base_estimator,
+ estimator,
predictions,
y,
self.classes_,
Expand All @@ -315,10 +337,10 @@ def fit(self, X, y, sample_weight=None, **fit_params):
n_classes = len(self.classes_)

# sample_weight checks
- fit_parameters = signature(base_estimator.fit).parameters
+ fit_parameters = signature(estimator.fit).parameters
supports_sw = "sample_weight" in fit_parameters
if sample_weight is not None and not supports_sw:
- estimator_name = type(base_estimator).__name__
+ estimator_name = type(estimator).__name__
warnings.warn(
f"Since {estimator_name} does not appear to accept sample_weight, "
"sample weights will only be used for the calibration itself. This "
@@ -351,7 +373,7 @@ def fit(self, X, y, sample_weight=None, **fit_params):
parallel = Parallel(n_jobs=self.n_jobs)
self.calibrated_classifiers_ = parallel(
delayed(_fit_classifier_calibrator_pair)(
- clone(base_estimator),
+ clone(estimator),
X,
y,
train=train,
@@ -365,7 +387,7 @@ def fit(self, X, y, sample_weight=None, **fit_params):
for train, test in cv.split(X, y)
)
else:
- this_estimator = clone(base_estimator)
+ this_estimator = clone(estimator)
_, method_name = _get_prediction_method(this_estimator)
fit_params = (
{"sample_weight": sample_weight}
@@ -402,7 +424,7 @@ def fit(self, X, y, sample_weight=None, **fit_params):
)
self.calibrated_classifiers_.append(calibrated_classifier)

- first_clf = self.calibrated_classifiers_[0].base_estimator
+ first_clf = self.calibrated_classifiers_[0].estimator
if hasattr(first_clf, "n_features_in_"):
self.n_features_in_ = first_clf.n_features_in_
if hasattr(first_clf, "feature_names_in_"):
@@ -418,7 +440,7 @@ def predict_proba(self, X):
Parameters
----------
X : array-like of shape (n_samples, n_features)
- The samples, as accepted by `base_estimator.predict_proba`.
+ The samples, as accepted by `estimator.predict_proba`.

Returns
-------
Expand Down Expand Up @@ -446,7 +468,7 @@ def predict(self, X):
Parameters
----------
X : array-like of shape (n_samples, n_features)
- The samples, as accepted by `base_estimator.predict`.
+ The samples, as accepted by `estimator.predict`.

Returns
-------
@@ -570,7 +592,7 @@ def _get_prediction_method(clf):
return method, "predict_proba"
else:
raise RuntimeError(
"'base_estimator' has no 'decision_function' or 'predict_proba' method."
"'estimator' has no 'decision_function' or 'predict_proba' method."
)


@@ -669,7 +691,7 @@ class _CalibratedClassifier:

Parameters
----------
- base_estimator : estimator instance
+ estimator : estimator instance
Fitted classifier.

calibrators : list of fitted estimator instances
@@ -687,8 +709,8 @@ class _CalibratedClassifier:
non-parametric approach based on isotonic regression.
"""

- def __init__(self, base_estimator, calibrators, *, classes, method="sigmoid"):
- self.base_estimator = base_estimator
+ def __init__(self, estimator, calibrators, *, classes, method="sigmoid"):
+ self.estimator = estimator
self.calibrators = calibrators
self.classes = classes
self.method = method
@@ -710,11 +732,11 @@ def predict_proba(self, X):
The predicted probabilities. Can be exact zeros.
"""
n_classes = len(self.classes)
- pred_method, method_name = _get_prediction_method(self.base_estimator)
+ pred_method, method_name = _get_prediction_method(self.estimator)
predictions = _compute_predictions(pred_method, method_name, X, n_classes)

label_encoder = LabelEncoder().fit(self.classes)
- pos_class_indices = label_encoder.transform(self.base_estimator.classes_)
+ pos_class_indices = label_encoder.transform(self.estimator.classes_)

proba = np.zeros((_num_samples(X), n_classes))
for class_idx, this_pred, calibrator in zip(
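Reviewer note (illustration only, not part of this diff): the ``ensemble`` semantics described in the updated docstring can be checked quickly; with ``ensemble=True`` there is one (classifier, calibrator) pair per CV fold, while ``ensemble=False`` keeps a single pair backed by one ``estimator`` refit on all the data. A minimal sketch::

    >>> from sklearn.calibration import CalibratedClassifierCV
    >>> from sklearn.naive_bayes import GaussianNB
    >>> from sklearn.datasets import make_classification
    >>> X, y = make_classification(n_samples=100, random_state=42)
    >>> len(CalibratedClassifierCV(GaussianNB(), cv=3).fit(X, y)
    ...     .calibrated_classifiers_)  # ensemble=True (default): one pair per fold
    3
    >>> len(CalibratedClassifierCV(GaussianNB(), cv=3, ensemble=False)
    ...     .fit(X, y).calibrated_classifiers_)  # single pair
    1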