FIX switch to 'sparse_cg' solver in Ridge when X is sparse and fitting intercept #13995
Conversation
sklearn/linear_model/ridge.py
Outdated
            solver = 'sparse_cg'
            if self.solver not in ['auto', 'sparse_cg']:
                warnings.warn(
                    'setting solver to "sparse_cg" because X is sparse')
Better:

"solver={} does not support fitting the intercept on sparse data, "
"falling back to solver='sparse_cg'. To avoid this warning either change the solver "
"to 'sparse_cg' explicitly or set `fit_intercept=False`."
sklearn/linear_model/ridge.py
Outdated
@@ -545,6 +545,13 @@ def fit(self, X, y, sample_weight=None):
                                     accept_sparse=_accept_sparse,
                                     dtype=_dtype,
                                     multi_output=True, y_numeric=True)
        if sparse.issparse(X) and self.fit_intercept:
            solver = 'sparse_cg'
            if self.solver not in ['auto', 'sparse_cg']:
The solver resolution (e.g. for `auto`) is normally done in `_ridge_regression`; maybe it is better to move this there as well.
The difficulty is that depending on `fit_intercept` and whether `X` is dense, we
need to provide `_ridge_regression` with `X_offset` and `X_scale` (computed in
preprocessing) or not:

scikit-learn/sklearn/linear_model/ridge.py, line 571 in e871a56:
params = {'X_offset': X_offset, 'X_scale': X_scale}
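A rough sketch of the coupling being described; the helper and its exact branching are illustrative assumptions, not the real ridge.py code:

from scipy import sparse

def _params_for_ridge_regression(X, fit_intercept, X_offset, X_scale):
    # illustrative only: the sparse-with-intercept path needs the offsets
    # computed during preprocessing, the other paths do not
    if sparse.issparse(X) and fit_intercept:
        return {'X_offset': X_offset, 'X_scale': X_scale}
    return {}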
I would not have prevented someone from using 'sag' with fit_intercept=True. It's not broken per se, it is just that it needs a lot more iterations than the default value allows.
In this case should there be a warning? Users may be surprised by the number of
iterations they need to set:
>>> x, y = _make_sparse_offset_regression(n_samples=20, n_features=5, random_state=0)
>>> sp_ridge = Ridge(solver='sag', max_iter=10000000, tol=1e-8).fit(sparse.csr_matrix(x), y)
>>> ridge = Ridge(solver='sag', max_iter=10000000, tol=1e-8).fit(x, y)
>>> sp_ridge.n_iter_[0]
566250
>>> ridge.n_iter_[0]
100
>>> np.allclose(sp_ridge.intercept_, ridge.intercept_, rtol=1e-3)
False
@agramfort I restored the possibility of using 'sag' and added a warning, let me know if I should remove the warning
    assert_raises_regex(ValueError, "In Ridge,", sparse.fit, X_csr, y)
    for solver in ['saga', 'lsqr', 'sag']:
        sparse = Ridge(alpha=1., solver=solver, fit_intercept=True)
        assert_warns(UserWarning, sparse.fit, X_csr, y)
We do need to match the warning message (a lot of unrelated things can raise a
`UserWarning`), even if it means splitting checks for `sag` and the other solvers.
Also, I wonder if it isn't better to raise an exception on an unsupported solver
rather than switch solver with a warning.
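For example, matching the message could look roughly like this; a sketch only, written while the solver switch still emitted a warning, and the match string is illustrative:

import numpy as np
import pytest
from scipy import sparse
from sklearn.linear_model import Ridge


def test_sag_sparse_intercept_warning_message():
    rng = np.random.RandomState(0)
    X, y = rng.randn(20, 5), rng.randn(20)
    ridge = Ridge(solver='sag', fit_intercept=True)
    # match part of the message so unrelated UserWarnings do not make it pass
    with pytest.warns(UserWarning, match='sag'):
        ridge.fit(sparse.csr_matrix(X), y)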
thanks! I changed the warning to a `ValueError`.
IMO, we can consider it as a bug fix rather than a change of default.
So we will need an entry in what's new as a bug fix and document it in model changes as well.
    assert_raises_regex(ValueError, "In Ridge,", sparse.fit, X_csr, y)
    for solver in ['saga', 'lsqr', 'sag']:
        sparse = Ridge(alpha=1., solver=solver, fit_intercept=True)
        assert_raises_regex(
let's use pytest for this one
Co-Authored-By: Guillaume Lemaitre <g.lemaitre58@gmail.com>
@@ -371,7 +371,7 @@ def test_sag_regressor_computed_correctly():
     n_samples = 40
     max_iter = 50
     tol = .000001
-    fit_intercept = True
+    fit_intercept = False
Why do you need this @jeromedockes? SAG fails to get a good intercept?
Can you try initializing the intercept to np.mean(y_train) instead of 0 (if that is how it is done now)?
At the moment it does fail to fit a good intercept (with the default `tol` and
`n_iter`). The intercept is indeed initialized with zeros:

scikit-learn/sklearn/linear_model/ridge.py, line 486 in c315bf9:
init = {'coef': np.zeros((n_features + int(return_intercept), 1),

but initializing it with the mean of `y` probably won't change much, since this
mean is 0: `y` is always assumed to be dense and is centered in preprocessing.
The mean of `X` is what causes the intercept to be nonzero.
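A small numerical illustration of that last point (the data and coefficients are arbitrary assumptions): once y is centered its mean is ~0, so initializing the intercept at mean(y) changes little, while the intercept the solver must reach with respect to the uncentered X is driven by the column means of X.

import numpy as np

rng = np.random.RandomState(0)
coef = np.array([1., 0.5, -1.])
X = rng.randn(200, 3) + np.array([5., -2., 10.])   # nonzero column means
y = X @ coef + 0.01 * rng.randn(200)

y_centered = y - y.mean()      # preprocessing centers y
print(y_centered.mean())       # ~0: starting the intercept at mean(y) barely helps
print(-X.mean(axis=0) @ coef)  # ~6 here: the intercept to reach w.r.t. uncentered X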
I was thinking that we could revert to only allowing `sparse_cg` for sparse data
in this PR to quickly fix the bug and unlock #13246, and then see if support
for sparse data can be added to the `sag` solver in a separate PR. WDYT?
I am +1 for this path. I think that we should ensure that we give proper results. We can investigate later how to fix SAG for this case. @agramfort WDYT?
BTW, one of the reasons the intercept takes many iterations to converge can be
this decay:

scikit-learn/sklearn/linear_model/base.py, line 43 in 4a6264d:
SPARSE_INTERCEPT_DECAY = 0.01

scikit-learn/sklearn/linear_model/base.py, line 97 in 4a6264d:
return dataset, intercept_decay

scikit-learn/sklearn/linear_model/sag.py, line 299 in 4a6264d:
dataset, intercept_decay = make_dataset(X, y, sample_weight, random_state)
Couple of comments.
    dense = Ridge(alpha=1., tol=1.e-15, solver=solver, fit_intercept=True)
    sparse = Ridge(alpha=1., tol=1.e-15, solver=solver, fit_intercept=True)
    # for now only sparse_cg can fit an intercept with sparse X
    for solver in ['sparse_cg']:
You can remove the `for` loop. We will use pytest parametrization when we reintroduce `sag`, along the lines of the sketch below.
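Something like this, presumably (data generation and tolerances are placeholders, not the final test):

import numpy as np
import pytest
from scipy import sparse
from sklearn.linear_model import Ridge


# add 'sag' to the list once it is reintroduced
@pytest.mark.parametrize('solver', ['sparse_cg'])
def test_ridge_fit_intercept_sparse(solver):
    rng = np.random.RandomState(0)
    X = rng.randn(30, 4) + 5.
    y = X @ rng.randn(4) + 2.
    dense_ridge = Ridge(alpha=1., tol=1e-15, solver=solver,
                        fit_intercept=True).fit(X, y)
    sparse_ridge = Ridge(alpha=1., tol=1e-15, solver=solver,
                         fit_intercept=True).fit(sparse.csr_matrix(X), y)
    assert np.allclose(dense_ridge.coef_, sparse_ridge.coef_)
    assert np.allclose(dense_ridge.intercept_, sparse_ridge.intercept_)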
    assert_raises_regex(ValueError, "In Ridge,", sparse.fit, X_csr, y)
    for solver in ['saga', 'lsqr', 'sag']:
        sparse = Ridge(alpha=1., solver=solver, fit_intercept=True)
        with pytest.raises(
This will be a bit more compact
err_msg = "solver='{}' does not support".format(solver)
with pytest.raises(ValueError, match=err_msg):
sparse.fit(X_csr, y)
    sparse = Ridge(alpha=1., tol=1.e-15, solver=solver, fit_intercept=True)
    assert_raises_regex(ValueError, "In Ridge,", sparse.fit, X_csr, y)
    for solver in ['saga', 'lsqr', 'sag']:
        sparse = Ridge(alpha=1., solver=solver, fit_intercept=True)
Avoid calling it `sparse`. I think that we sometimes have the following import: `from scipy import sparse`.
+1
Co-Authored-By: Guillaume Lemaitre <g.lemaitre58@gmail.com>
    sparse = Ridge(alpha=1., tol=1.e-15, solver=solver, fit_intercept=True)
    assert_raises_regex(ValueError, "In Ridge,", sparse.fit, X_csr, y)
    for solver in ['saga', 'lsqr', 'sag']:
        sparse = Ridge(alpha=1., solver=solver, fit_intercept=True)
+1
doc/whats_new/v0.22.rst
Outdated
@@ -109,6 +111,10 @@ Changelog
   of the maximization procedure in :term:`fit`.
   :pr:`13618` by :user:`Yoshihiro Uchida <c56pony>`.

 - |Fix| :class:`linear_model.Ridge` now correctly fits an intercept when
   `X` is sparse, `solver="auto"` and `fit_intercept=True`. Setting the solver to
explain that it is because the default solver is now sparse_cg
sklearn/linear_model/ridge.py
Outdated
                    '"sag" solver requires many iterations to fit '
                    'an intercept with sparse inputs. Either set the '
                    'solver to "auto" or "sparse_cg", or set a low '
                    '"tol" and a high "max_iter".')
What bothers me is that you will get the warning whatever you use for tol or max_iter. Maybe only warn if the parameters used are the defaults?
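A sketch of that idea; the default values shown (tol=1e-3, max_iter=None, Ridge's defaults at the time) and the helper name are assumptions, not part of the diff:

import warnings


def _maybe_warn_default_sag_params(tol, max_iter):
    # warn only when the user kept Ridge's defaults, since they probably
    # have not tuned tol/max_iter for fitting an intercept on sparse data
    if tol == 1e-3 and max_iter is None:
        warnings.warn(
            '"sag" solver requires many iterations to fit an intercept '
            'with sparse inputs. Either set the solver to "auto" or '
            '"sparse_cg", or set a low "tol" and a high "max_iter".')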
    # tol and max_iter, sag should raise a warning and is handled in
    # test_ridge_fit_intercept_sparse_sag
    # "auto" should switch to "sparse_cg"
    dense_ridge = Ridge(alpha=1., solver='sparse_cg', fit_intercept=True)
sparse_cg for dense_ridge?
sparse_cg can fit both sparse and dense data. Since both "auto" and "sparse_cg" should result in "sparse_cg" being used when X is sparse, the reference is "sparse_cg" fitted on dense data, and Ridge(solver="auto") and Ridge(solver="sparse_cg"), fitted on sparse data, are compared to it.
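As a quick illustration of the first point (toy data, purely for demonstration):

import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.RandomState(0)
X, y = rng.randn(10, 3), rng.randn(10)
# 'sparse_cg' accepts dense input too, so it can serve as the dense reference
print(Ridge(solver='sparse_cg').fit(X, y).intercept_)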
@@ -464,6 +464,7 @@ def test_sag_regressor():
     y = 0.5 * X.ravel()

     clf1 = Ridge(tol=tol, solver='sag', max_iter=max_iter,
                  fit_intercept=False,
I would revert the changes to the tests if they are not necessary. Just catch the warning if need be.
I reverted them; there is no warning because in those tests the tol and max_iter used are not the default ones.
Co-Authored-By: Alexandre Gramfort <alexandre.gramfort@m4x.org>
…edockes/scikit-learn into fix_ridge_solver_selection
    # for now only sparse_cg can fit an intercept with sparse X with default
    # tol and max_iter, sag should raise a warning and is handled in
    # test_ridge_fit_intercept_sparse_sag
    # "auto" should switch to "sparse_cg"
Is this comment still relevant? I don't see sag warnings caught here.
If you update the comment, can you write what you answered to me just below? Thanks.
Thanks! I updated the comment. You don't see warnings here because sag's behaviour in this configuration is tested separately in test_ridge_fit_intercept_sparse_sag.
Let's wait for another approval before merging. thx @jeromedockes
A single nitpick and then I will merge. LGTM.
Co-Authored-By: Guillaume Lemaitre <g.lemaitre58@gmail.com>
Thanks @jeromedockes
thanks a lot for the help and advice @glemaitre @agramfort and @rth
`Ridge` is failing the check introduced in #13246 to verify that estimators
produce the same results for sparse and dense data. This PR enforces selecting
the `sparse_cg` solver when X is sparse and `fit_intercept=True`, since this
solver is at the moment the only one that correctly fits an intercept with the
Ridge default `tol` and `max_iter` when X is sparse.

@agramfort @glemaitre @ogrisel
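A minimal usage sketch of the behaviour being fixed (toy data and values are illustrative assumptions): with the default solver='auto', fitting on a sparse matrix should now give an intercept that matches the dense fit.

import numpy as np
from scipy import sparse
from sklearn.linear_model import Ridge

rng = np.random.RandomState(0)
X = rng.randn(50, 4) + 10.                    # large column means -> nonzero intercept
y = X @ np.array([1., 2., -1., .5]) + 7.

dense_model = Ridge().fit(X, y)                       # solver='auto'
sparse_model = Ridge().fit(sparse.csr_matrix(X), y)   # 'auto' now resolves to 'sparse_cg'
print(dense_model.intercept_, sparse_model.intercept_)  # should agree closely after the fix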