
Fix for ridge regression with sparse matrix input #2552


Closed · wants to merge 1 commit into scikit-learn:master from bdkearns:fix-ridge-with-sparse

Conversation

bdkearns (Contributor)

Fix for ridge regression with sparse matrix input

  • Re-enable selecting SVD for sparse input (even though it is converted to
    dense, one day sparse solving may be supported, and there are current
    valid use cases where sparse input is typical and doesn't blow up the
    SVD, but does blow up the eigen solver)

(Fixes #2354)

@coveralls

Coverage Status

Coverage remained the same when pulling 8facf356b1d668af863559538bf902ab46d98998 on bdkearns:fix-ridge-with-sparse into 101f0ee on scikit-learn:master.

@coveralls

Coverage Status

Coverage remained the same when pulling c7529c7 on bdkearns:fix-ridge-with-sparse into 101f0ee on scikit-learn:master.

@larsmans (Member)

I'd rather have sparse matrix SVD support directly, then, using either scipy.sparse.linalg.svds or sklearn.utils.extmath.randomized_svd. (The latter is very fast and stable.)
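
A minimal sketch of the two routines mentioned above (not part of this PR; the input shape and number of components are arbitrary example values):

```python
import scipy.sparse as sp
from scipy.sparse.linalg import svds
from sklearn.utils.extmath import randomized_svd

# Toy sparse input; shape and density are illustrative only.
X = sp.random(10000, 300, density=0.01, format='csr', random_state=0)

# ARPACK-based truncated SVD; operates on the sparse matrix directly.
U, s, Vt = svds(X, k=50)

# Randomized truncated SVD; also accepts sparse input directly.
U2, s2, Vt2 = randomized_svd(X, n_components=50, random_state=0)
```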

@larsmans (Member)

Ping @agramfort, @mblondel.

@bdkearns (Contributor, Author)

Sure, sparse SVD would be nice to have, but don't you consider it a regression that usages that worked perfectly under 0.13 not only raise MemoryError now (because they are forced into the eigen solver), but are actually forbidden from working even when the svd solver is specified manually? And for what gain? Why was this usage suddenly banned in 0.14? IMO the advice for the other user, who got a MemoryError with a nearly 100k x 100k square matrix (which tests the limits of either solver and just happened to trigger on SVD), would be to simply specify the solver by hand, not to suddenly force every sparse input into the eigen solver. That seems a bit knee-jerk.

Now someone with, say, 100k samples x 100 features (which only needs a 100x100 economy SVD, right?) who happens to have sparse input (say, coming from a text feature problem) is forced into eigen (where they used to get svd) and hits a MemoryError; and if they happen to deduce why and try to switch back to svd, they find it banned. I guess they can work around it by converting the input to dense beforehand, but that seems an odd thing to force users to do.
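
For concreteness, a back-of-envelope sketch of the memory the two paths need for that shape, assuming float64 (illustrative arithmetic only, based on the sizes discussed in this thread):

```python
n_samples, n_features = 100_000, 100

# eigen path: densifies the Gram matrix X @ X.T -> (n_samples, n_samples)
eigen_gram_gb = n_samples * n_samples * 8 / 1e9
print(eigen_gram_gb)    # 80.0 GB -> MemoryError on most machines

# svd path: economy SVD of the densified X needs roughly the
# (n_samples, n_features) array plus (n_features, n_features) factors
svd_mb = (n_samples * n_features + n_features * n_features) * 8 / 1e6
print(svd_mb)           # ~80 MB
```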

@@ -645,8 +645,8 @@ def _values(self, alpha, y, v, Q, QT_y):
         return y - (c / G_diag), c

     def _pre_compute_svd(self, X, y):
-        if sparse.issparse(X):
-            raise TypeError("SVD not supported for sparse matrices")
A Member commented on the diff above:

I'm -1 on this PR because densifying a sparse matrix behind the scenes is really not a good idea. I'd rather improve the error message to tell users to call X = X.toarray() in their script.
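
A minimal sketch of that workaround with placeholder data (only sensible when the dense array actually fits in memory):

```python
import numpy as np
import scipy.sparse as sp
from sklearn.linear_model import RidgeCV

# Placeholder sparse input and targets for illustration.
X_sparse = sp.random(1000, 20, density=0.05, format='csr', random_state=0)
y = np.random.RandomState(0).randn(1000)

# The user densifies explicitly, instead of the library doing it silently.
X_dense = X_sparse.toarray()
RidgeCV(gcv_mode='svd').fit(X_dense, y)
```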

@bdkearns (Contributor, Author)

Well, if that's the case, then why disable sparse inputs only for the SVD solver? The eigen solver does safe_sparse_dot(dense_output=True), creating an n_samples x n_samples dense matrix that can be even larger than the sparse input. And you changed it so sparse inputs previously fed to the SVD solver (which worked fine when densified) are now fed to eigen without warning, even when n_samples >> n_features, and give MemoryErrors in those cases. Consider text problems: the inputs lend themselves to sparse representation (users probably won't think to densify first, even when it is possible), and n_samples >> n_features, so eigen will now explode by default, even though densify+SVD would have worked, and did work, in the past.

See issue #2354 for an example that creates a memory error under 0.14, was fine under 0.13.

@mblondel (Member)

> The eigen solver does safe_sparse_dot(dense_output=True)

Yes but the resulting output has shape [n_samples, n_samples] when dense NumPy arrays are passed too. So densifying the output matrix doesn't change anything here.

> are now fed to eigen without warning, even when n_samples >> n_features, and give MemoryErrors in those cases

Indeed that's bad. Maybe we should disable this behavior then.

@mblondel (Member)

> I'd rather have sparse matrix SVD support directly, then, using either scipy.sparse.linalg.svds or sklearn.utils.extmath.randomized_svd. (The latter is very fast and stable.)

I'm not sure if we can use a truncated SVD here but it's worth a try.

@bdkearns (Contributor, Author)

> Yes but the resulting output has shape [n_samples, n_samples] when dense NumPy arrays are passed too. So densifying the output matrix doesn't change anything here.

Yes, but with dense matrices, anything with n_samples > n_features is sent to SVD, not eigen, which produces an n_features x n_features array instead. So this was silently changed for sparse input, while also banning it from SVD?

One can construct sparse matrices that blow up either solver, since they both convert to dense. IMO there are two paths: allow sparse inputs, clearly documenting that both solvers use dense computations, or flatly ban them from both. This "one user blew up one solver with this shape of problem, so let's ban sparse from that one and send every problem to the other" approach is careless.

@bdkearns (Contributor, Author)

Also, one might consider that other parts of sklearn, like feature_extraction.text.CountVectorizer, produce sparse matrices. Users probably expect to be able to feed these directly into linear models like this one, as in the examples. And these sparse matrices are typically of the sort with large n_samp and small n_feat. Changing the model here to ban sparse input, or to force it into a solver that produces huge arrays (n_samp x n_samp even when n_samp >> n_feat) when it didn't before, seems odd. You are producing a library here; one would think API consistency would be valued, and that typical usage patterns/examples would be evaluated before making changes like this that break consistency.
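
A sketch of the usage pattern being described, with hypothetical data (max_features keeps n_feat small, matching the large-n_samp / small-n_feat case discussed here):

```python
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import Ridge

# Placeholder corpus and targets for illustration.
docs = ["sparse text input", "more text", "even more sparse text"] * 1000
y = np.random.RandomState(0).randn(len(docs))

# CountVectorizer returns a scipy.sparse matrix; users feed it straight in.
X = CountVectorizer(max_features=100).fit_transform(docs)
Ridge().fit(X, y)
```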

@bdkearns (Contributor, Author)

And this breakage was for one user who was getting a MemoryError? And what part of the change actually "fixed" their problem? Hard-coding all sparse arrays to eigen; it wasn't even the removal of the densify in SVD. And now all users with the problems described above get MemoryErrors. What a "fix". Anyway, I'm surprised anyone would support hard-coding all sparse inputs to one solver (or, equally, be -1 against reverting it).

@bdkearns bdkearns closed this Oct 29, 2013
@bdkearns bdkearns reopened this Oct 29, 2013

@mblondel (Member)

What line creates a (n_features, n_features) matrix?

@@ -701,7 +701,7 @@ def fit(self, X, y, sample_weight=1.0):
         with_sw = len(np.shape(sample_weight))

         if gcv_mode is None or gcv_mode == 'auto':
-            if sparse.issparse(X) or n_features > n_samples or with_sw:
+            if n_features > n_samples or with_sw:
                 gcv_mode = 'eigen'
@bdkearns (Contributor, Author) commented on the diff above:

Here, sparse matrices with n_samp >> n_feat (typical of the type produced by text feature problems) are sent to eigen, where a dense n_samp x n_samp matrix is produced (X dot X.T), whereas dense matrices of this shape would be sent to SVD (and in the past sparse ones were too).
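
A paraphrase of that 'auto' dispatch as a standalone sketch (hypothetical helper, not the actual library code), annotated with the memory consequence described in this thread:

```python
import scipy.sparse as sp

def auto_gcv_mode(X, n_samples, n_features, with_sw):
    # 0.14-era behavior: any sparse input is routed to eigen.
    if sp.issparse(X) or n_features > n_samples or with_sw:
        return 'eigen'   # forms X @ X.T -> dense (n_samples, n_samples)
    return 'svd'         # economy factors of size n_features (per the discussion above)

# A sparse matrix with n_samp >> n_feat still lands on 'eigen':
X = sp.random(100_000, 100, density=0.01, format='csr', random_state=0)
print(auto_gcv_mode(X, *X.shape, with_sw=False))   # -> 'eigen'
```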

@mblondel (Member)

> Changing the model here to ban sparse or force them into a solver that produces huge arrays (n_feat x n_feat) when it didn't before seems odd

Please don't edit your posts. Otherwise, people who read the discussion later can't understand why I asked:

> What line creates a (n_features, n_features) matrix?

@bdkearns (Contributor, Author)

It was a simple typo; don't forget you're free to edit as well.

@amueller (Member)

Hmm, was there any conclusion here?

@mblondel (Member)

Silently densifying a sparse matrix is a bad idea and we've always tried to avoid it. The real fix would be to support sparse matrices directly (so this should be labeled as "new feature" rather than "bug"). We can try to compute a sparse SVD with svds or sklearn.utils.extmath.randomized_svd, but they only compute the first k singular vectors, so the results won't be the same. I would be curious to know whether this can work well (for sufficiently large k).

I suggest we close this PR. I agree with the problem but not with the solution.

@amueller (Member)

I only skimmed it, and it seemed like a regression, but if you say it is not then I'll take your word for it ;)

@amueller amueller removed the Bug label Mar 3, 2015
@amueller amueller removed this from the 0.16 milestone Mar 3, 2015
@rth (Member)

rth commented Jun 14, 2019

The associated issue was resolved in #13350, closing.

@rth rth closed this Jun 14, 2019
Linked issue: Changes in ridge.py break usage with sparse matrices (#2354)