
API Replace n_iter in Bayesian Ridge and ARDRegression #25697


Merged
merged 14 commits on Apr 2, 2023
18 changes: 15 additions & 3 deletions doc/whats_new/v1.3.rst
@@ -273,9 +273,21 @@ Changelog
:mod:`sklearn.linear_model`
...........................

- |Enhancement| :class:`SGDClassifier`, :class:`SGDRegressor` and
:class:`SGDOneClassSVM` now preserve dtype for `numpy.float32`.
:pr:`25587` by :user:`Omar Salman <OmarManzoor>`
- |Enhancement| :class:`linear_model.SGDClassifier`,
:class:`linear_model.SGDRegressor` and :class:`linear_model.SGDOneClassSVM`
now preserve dtype for `numpy.float32`.
:pr:`25587` by :user:`Omar Salman <OmarManzoor>`.

- |API| Deprecates `n_iter` in favor of `max_iter` in
:class:`linear_model.BayesianRidge` and :class:`linear_model.ARDRegression`.
`n_iter` will be removed in scikit-learn 1.5. This change makes these
estimators consistent with the rest of the estimators in scikit-learn.
Member

We need to also mention that n_iter_ was added to ARDRegression.

Contributor Author
@jpangas jpangas Mar 20, 2023

I have done so. Please check and see if it looks good. Is there another particular reason behind including the `n_iter_` attribute in ARDRegression that I have missed?

Contributor

Would it be beneficial to include them as two entries? It's a bit clunky as a single entry.

Contributor Author

I think it could work if it were on its own. WDYT @glemaitre?

Member

We can make it two entries.

Contributor Author

On it. I will push the change before EOD.

:pr:`25697` by :user:`John Pangas <jpangas>`.

- |Enhancement| The `n_iter_` attribute has been included in
:class:`linear_model.ARDRegression` to expose the actual number of iterations
required to reach the stopping criterion.
:pr:`25697` by :user:`John Pangas <jpangas>`.

:mod:`sklearn.metrics`
......................
114 changes: 100 additions & 14 deletions sklearn/linear_model/_bayes.py
@@ -5,6 +5,7 @@
# Authors: V. Michel, F. Pedregosa, A. Gramfort
# License: BSD 3 clause

import warnings
from math import log
from numbers import Integral, Real
import numpy as np
@@ -15,7 +16,49 @@
from ..utils.extmath import fast_logdet
from scipy.linalg import pinvh
from ..utils.validation import _check_sample_weight
from ..utils._param_validation import Interval
from ..utils._param_validation import Interval, Hidden, StrOptions


# TODO(1.5) Remove
def _deprecate_n_iter(n_iter, max_iter):
"""Deprecates n_iter in favour of max_iter. Checks if the n_iter has been
used instead of max_iter and generates a deprecation warning if True.

Parameters
----------
n_iter : int
    Value of the `n_iter` attribute passed by the estimator.

max_iter : int, default=None
Value of max_iter attribute passed by the estimator.
If `None`, it corresponds to `max_iter=300`.

Returns
-------
max_iter : int
    The value of `max_iter` that the estimator will use.

Notes
-----
This function should be completely removed in 1.5.
"""
if n_iter != "deprecated":
if max_iter is not None:
raise ValueError(
"Both `n_iter` and `max_iter` attributes were set. Attribute"
" `n_iter` was deprecated in version 1.3 and will be removed in"
" 1.5. To avoid this error, only set the `max_iter` attribute."
)
warnings.warn(
"'n_iter' was renamed to 'max_iter' in version 1.3 and "
"will be removed in 1.5",
FutureWarning,
)
max_iter = n_iter
elif max_iter is None:
max_iter = 300
return max_iter


###############################################################################
# BayesianRidge regression
@@ -32,8 +75,12 @@ class BayesianRidge(RegressorMixin, LinearModel):

Parameters
----------
n_iter : int, default=300
Maximum number of iterations. Should be greater than or equal to 1.
max_iter : int, default=None
Maximum number of iterations over the complete dataset before
stopping independently of any early stopping criterion. If `None`, it
corresponds to `max_iter=300`.

.. versionchanged:: 1.3

tol : float, default=1e-3
Stop the algorithm if w has converged.
@@ -83,14 +130,21 @@ class BayesianRidge(RegressorMixin, LinearModel):
verbose : bool, default=False
Verbose mode when fitting the model.

n_iter : int
Maximum number of iterations. Should be greater than or equal to 1.

.. deprecated:: 1.3
`n_iter` is deprecated in 1.3 and will be removed in 1.5. Use
`max_iter` instead.

Attributes
----------
coef_ : array-like of shape (n_features,)
Coefficients of the regression model (mean of distribution)

intercept_ : float
Independent term in decision function. Set to 0.0 if
``fit_intercept = False``.
`fit_intercept = False`.

alpha_ : float
Estimated precision of the noise.
@@ -162,7 +216,7 @@ class BayesianRidge(RegressorMixin, LinearModel):
"""

_parameter_constraints: dict = {
"n_iter": [Interval(Integral, 1, None, closed="left")],
"max_iter": [Interval(Integral, 1, None, closed="left"), None],
"tol": [Interval(Real, 0, None, closed="neither")],
"alpha_1": [Interval(Real, 0, None, closed="left")],
"alpha_2": [Interval(Real, 0, None, closed="left")],
@@ -174,12 +228,16 @@
"fit_intercept": ["boolean"],
"copy_X": ["boolean"],
"verbose": ["verbose"],
"n_iter": [
Interval(Integral, 1, None, closed="left"),
Hidden(StrOptions({"deprecated"})),
],
}

def __init__(
self,
*,
n_iter=300,
max_iter=None, # TODO(1.5): Set to 300
tol=1.0e-3,
alpha_1=1.0e-6,
alpha_2=1.0e-6,
@@ -191,8 +249,9 @@ def __init__(
fit_intercept=True,
copy_X=True,
verbose=False,
n_iter="deprecated", # TODO(1.5): Remove
):
self.n_iter = n_iter
self.max_iter = max_iter
self.tol = tol
self.alpha_1 = alpha_1
self.alpha_2 = alpha_2
@@ -204,6 +263,7 @@ def __init__(
self.fit_intercept = fit_intercept
self.copy_X = copy_X
self.verbose = verbose
self.n_iter = n_iter

def fit(self, X, y, sample_weight=None):
"""Fit the model.
@@ -228,6 +288,8 @@ def fit(self, X, y, sample_weight=None):
"""
self._validate_params()

max_iter = _deprecate_n_iter(self.n_iter, self.max_iter)

X, y = self._validate_data(X, y, dtype=[np.float64, np.float32], y_numeric=True)

if sample_weight is not None:
@@ -274,7 +336,7 @@ def fit(self, X, y, sample_weight=None):
eigen_vals_ = S**2

# Convergence loop of the bayesian ridge regression
for iter_ in range(self.n_iter):
for iter_ in range(max_iter):

# update posterior mean coef_ based on alpha_ and lambda_ and
# compute corresponding rmse
@@ -430,8 +492,10 @@ class ARDRegression(RegressorMixin, LinearModel):

Parameters
----------
n_iter : int, default=300
Maximum number of iterations.
max_iter : int, default=None
Maximum number of iterations. If `None`, it corresponds to `max_iter=300`.

.. versionchanged:: 1.3

tol : float, default=1e-3
Stop the algorithm if w has converged.
@@ -470,6 +534,13 @@ class ARDRegression(RegressorMixin, LinearModel):
verbose : bool, default=False
Verbose mode when fitting the model.

n_iter : int
Maximum number of iterations.

.. deprecated:: 1.3
`n_iter` is deprecated in 1.3 and will be removed in 1.5. Use
`max_iter` instead.

Attributes
----------
coef_ : array-like of shape (n_features,)
@@ -487,6 +558,11 @@ class ARDRegression(RegressorMixin, LinearModel):
scores_ : float
if computed, value of the objective function (to be maximized)

n_iter_ : int
The actual number of iterations to reach the stopping criterion.

.. versionadded:: 1.3

intercept_ : float
Independent term in decision function. Set to 0.0 if
``fit_intercept = False``.
@@ -542,7 +618,7 @@ class ARDRegression(RegressorMixin, LinearModel):
"""

_parameter_constraints: dict = {
"n_iter": [Interval(Integral, 1, None, closed="left")],
"max_iter": [Interval(Integral, 1, None, closed="left"), None],
"tol": [Interval(Real, 0, None, closed="left")],
"alpha_1": [Interval(Real, 0, None, closed="left")],
"alpha_2": [Interval(Real, 0, None, closed="left")],
@@ -553,12 +629,16 @@ class ARDRegression(RegressorMixin, LinearModel):
"fit_intercept": ["boolean"],
"copy_X": ["boolean"],
"verbose": ["verbose"],
"n_iter": [
Interval(Integral, 1, None, closed="left"),
Hidden(StrOptions({"deprecated"})),
],
}

def __init__(
self,
*,
n_iter=300,
max_iter=None, # TODO(1.5): Set to 300
tol=1.0e-3,
alpha_1=1.0e-6,
alpha_2=1.0e-6,
@@ -569,8 +649,9 @@ def __init__(
fit_intercept=True,
copy_X=True,
verbose=False,
n_iter="deprecated", # TODO(1.5): Remove
):
self.n_iter = n_iter
self.max_iter = max_iter
self.tol = tol
self.fit_intercept = fit_intercept
self.alpha_1 = alpha_1
@@ -581,6 +662,7 @@ def __init__(
self.threshold_lambda = threshold_lambda
self.copy_X = copy_X
self.verbose = verbose
self.n_iter = n_iter

def fit(self, X, y):
"""Fit the model according to the given training data and parameters.
@@ -603,6 +685,8 @@ def fit(self, X, y):

self._validate_params()

max_iter = _deprecate_n_iter(self.n_iter, self.max_iter)

X, y = self._validate_data(
X, y, dtype=[np.float64, np.float32], y_numeric=True, ensure_min_samples=2
)
@@ -648,7 +732,7 @@ def update_coeff(X, y, coef_, alpha_, keep_lambda, sigma_):
else self._update_sigma_woodbury
)
# Iterative procedure of ARDRegression
for iter_ in range(self.n_iter):
for iter_ in range(max_iter):
sigma_ = update_sigma(X, alpha_, lambda_, keep_lambda)
coef_ = update_coeff(X, y, coef_, alpha_, keep_lambda, sigma_)

@@ -688,6 +772,8 @@ def update_coeff(X, y, coef_, alpha_, keep_lambda, sigma_):
if not keep_lambda.any():
break

self.n_iter_ = iter_ + 1

if keep_lambda.any():
# update sigma and mu using updated params from the last iteration
sigma_ = update_sigma(X, alpha_, lambda_, keep_lambda)
34 changes: 32 additions & 2 deletions sklearn/linear_model/tests/test_bayes.py
@@ -73,7 +73,7 @@ def test_bayesian_ridge_score_values():
alpha_2=alpha_2,
lambda_1=lambda_1,
lambda_2=lambda_2,
n_iter=1,
max_iter=1,
fit_intercept=False,
compute_score=True,
)
@@ -174,7 +174,7 @@ def test_update_of_sigma_in_ard():
# of the ARDRegression algorithm. See issue #10128.
X = np.array([[1, 0], [0, 0]])
y = np.array([0, 0])
clf = ARDRegression(n_iter=1)
clf = ARDRegression(max_iter=1)
clf.fit(X, y)
# With the inputs above, ARDRegression prunes both of the two coefficients
# in the first iteration. Hence, the expected shape of `sigma_` is (0, 0).
@@ -292,3 +292,33 @@ def test_dtype_correctness(Estimator):
coef_32 = model.fit(X.astype(np.float32), y).coef_
coef_64 = model.fit(X.astype(np.float64), y).coef_
np.testing.assert_allclose(coef_32, coef_64, rtol=1e-4)


# TODO(1.5) remove
@pytest.mark.parametrize("Estimator", [BayesianRidge, ARDRegression])
def test_bayesian_ridge_ard_n_iter_deprecated(Estimator):
"""Check the deprecation warning of `n_iter`."""
depr_msg = (
"'n_iter' was renamed to 'max_iter' in version 1.3 and will be removed in 1.5"
)
X, y = diabetes.data, diabetes.target
model = Estimator(n_iter=5)

with pytest.warns(FutureWarning, match=depr_msg):
model.fit(X, y)


# TODO(1.5) remove
@pytest.mark.parametrize("Estimator", [BayesianRidge, ARDRegression])
def test_bayesian_ridge_ard_max_iter_and_n_iter_both_set(Estimator):
"""Check that a ValueError is raised when both `max_iter` and `n_iter` are set."""
err_msg = (
"Both `n_iter` and `max_iter` attributes were set. Attribute"
" `n_iter` was deprecated in version 1.3 and will be removed in"
" 1.5. To avoid this error, only set the `max_iter` attribute."
)
X, y = diabetes.data, diabetes.target
model = Estimator(n_iter=5, max_iter=5)

with pytest.raises(ValueError, match=err_msg):
model.fit(X, y)
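
A side note on the two tests above, not part of the PR itself: the `match` argument of `pytest.raises` and `pytest.warns` is applied with `re.search`, so the unescaped dots in `err_msg` act as single-character wildcards rather than literal periods. The match still succeeds here, but `re.escape` would make such literal messages exact. A small stdlib-only illustration, with `err_msg` copied from the test:

```python
import re

# The exact message the new test matches against (copied from the diff).
err_msg = (
    "Both `n_iter` and `max_iter` attributes were set. Attribute"
    " `n_iter` was deprecated in version 1.3 and will be removed in"
    " 1.5. To avoid this error, only set the `max_iter` attribute."
)

# pytest's `match=` does re.search(pattern, str(exc)): the literal
# message matches itself, because "." matches "." among other things...
assert re.search(err_msg, err_msg) is not None
# ...but it would also match a message with every "." swapped for "!",
# since an unescaped "." matches any character:
assert re.search(err_msg, err_msg.replace(".", "!")) is not None
# re.escape turns the dots into literal "\." and rejects that variant:
assert re.search(re.escape(err_msg), err_msg.replace(".", "!")) is None
```

This looseness is harmless here, which is presumably why the tests pass the message unescaped.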