ENH: Add support for lstsq on stacks of matrices #15777

eric-wieser · 2020-03-18T22:20:39Z

Closes #8720, at the cost of behavior changes in the resids return value. Marking as draft since I am publishing it primarily to facilitate discussion at that issue.

charris · 2020-03-26T16:02:14Z

numpy/linalg/linalg.py

    Parameters
    ----------
-    a : (M, N) array_like
+    a : (..., M, N) array_like


Hmm, the current docstring here is very old and could use improvement. Just array_like on one line, then a description of the shapes in the explanatory part and how they are treated.

This style of docstring with the shape in the argument name is fairly prevalent throughout this module.

Yep, could use improvement throughout. Not saying you should undertake the task in this PR.

charris · 2020-03-26T16:13:01Z

numpy/linalg/linalg.py

        Least-squares solution. If `b` is two-dimensional,
        the solutions are in the `K` columns of `x`.
-    residuals : {(1,), (K,), (0,)} ndarray
+    residuals : {(), (..., K,)} ndarray
        Sums of residuals; squared Euclidean 2-norm for each column in


Never much liked this, it isn't that useful. There have been a number of requests for returning the covariance matrix of the fitted coefficients.

charris · 2020-03-26T16:50:20Z

I'm going to suggest a new function here. I don't feel strongly about broadcasting either way, but there have been requests for the covariance matrix of the fitted parameters and I often end up computing that myself as it is more informative and allows for generating error bars on curve fits and such. Unfortunately, I can't see if DGELSD provides sufficient information to do that, the documentation simply says that the input design matrix is destroyed. DGELSS looks like an option but I can't find it in our fallback routines. I suppose we could implement it for ourselves. Multiple return options would be required as computing the covariance could take significant time.
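For reference, the hand-rolled covariance computation mentioned here usually looks something like the following. This is a minimal sketch, assuming a full-column-rank design matrix with more rows than columns and independent, equal-variance errors; `fit_with_covariance` is a hypothetical helper and not part of NumPy.

```python
import numpy as np


def fit_with_covariance(a, b):
    """Hypothetical helper: least-squares fit plus parameter covariance.

    Assumes `a` has shape (M, N) with M > N and full column rank, and
    `b` has shape (M,).
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    m, n = a.shape
    x, _, rank, _ = np.linalg.lstsq(a, b, rcond=None)
    if m <= n or rank < n:
        raise ValueError("need M > N and full column rank")
    r = b - a @ x
    # Unbiased estimate of the noise variance: RSS / (M - N).
    sigma2 = (r @ r) / (m - n)
    # Covariance of the fitted coefficients: sigma^2 * (A^T A)^{-1}.
    cov = sigma2 * np.linalg.inv(a.T @ a)
    return x, cov


# Fit a constant to four points; the estimate is their mean.
x, cov = fit_with_covariance(np.ones((4, 1)), [1.0, 2.0, 3.0, 4.0])
stderr = np.sqrt(np.diag(cov))  # one-sigma error bars on the coefficients
```

The explicit inverse of `a.T @ a` keeps the sketch short; a production version would more likely reuse the factorization already computed for the fit.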

eric-wieser · 2020-03-26T17:22:32Z

That sounds like an orthogonal suggestion here - the only point of this change is to enable broadcasting, since lstsq is one of the few remaining unbroadcast functions in np.linalg.

Unfortunately, the meaning of resids has to be tweaked ever so slightly to make those semantics possible.

seberg · 2020-03-26T17:30:01Z

It seemed to me the main change is that resids previously had (in some sense) a keepdims in there, while now it does not? I.e. you do not add a dimension of size 1 at the end anymore.
For all practical purposes that hopefully does not matter much, and I guess without it maybe it generalizes slightly less well.
Although, I am not immediately sure, is that change for the simplest call actually necessary (I bet you guys discussed it before)?
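The keepdims analogy can be illustrated with a plain reduction. The snippet below is only a sketch with a made-up residual vector `r`; it does not call lstsq.

```python
import numpy as np

r = np.array([0.5, -1.0, 0.5])  # made-up residual vector for a 1-D b

# Keeping the reduced axis gives shape (1,), like the old resids.
old_style = np.sum(r**2, axis=0, keepdims=True)
# Dropping the reduced axis gives shape (), like the proposed resids.
new_style = np.sum(r**2, axis=0)
```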

eric-wieser · 2020-03-26T17:35:40Z

It seemed to me the main change is that resids previously had (in some sense) a keepdims in there, while now it does not? I.e. you do not add a dimension of size 1 at the end anymore.

It's a little worse than that. The old behavior was to return any of:

A result of shape (0,) for an underconstrained solution (for any input shape)
A result of shape (1,) for a well-constrained solution with input shape (M,)
A result of shape (K,) for a well-constrained solution with input shape (M, K)

The new behavior is

A result of shape (..., N,) for input shape (..., N, M)
A result of shape () for input shape (M,)

So there are two incompatible changes here, not just one.
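The three legacy shapes listed above can be checked directly. This is a small sketch, assuming a NumPy version that does not include the changes from this PR; the matrices are arbitrary illustrative values.

```python
import numpy as np

# Overdetermined, full column rank: M = 3 rows, N = 2 columns.
a = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]])
b_vec = np.array([1.0, 2.0, 2.0])                        # shape (M,)
b_mat = np.stack([b_vec, 2 * b_vec, 3 * b_vec], axis=1)  # shape (M, K), K = 3

resid_vec = np.linalg.lstsq(a, b_vec, rcond=None)[1]
resid_mat = np.linalg.lstsq(a, b_mat, rcond=None)[1]

# Underdetermined: M = 2 rows, N = 3 columns.
resid_under = np.linalg.lstsq(a.T, np.array([1.0, 2.0]), rcond=None)[1]

print(resid_vec.shape, resid_mat.shape, resid_under.shape)
```

Code that tests `len(resids)` to detect the underconstrained case relies on the last of these shapes, which is the compatibility concern raised in this thread.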

seberg · 2020-03-26T17:47:15Z

I suppose I was guessing that the majority of use cases is likely currently either in the (K,) category (unchanged) or in the (1,) category which changed to (). The old underconstrained one seems strange, but of course it is not impossible someone actually checks len(constr) or so to see whether it was underconstrained... So in that sense, breaking the (1,) case may be better anyway, because at least it should break loudly when it breaks.

So the only option would be keeping the (0,) and (1,) shapes even though they are ugly... It is super hard to say how disruptive that change could be. It seems a question of boldness, this is the right thing if we are bold enough to risk breaking someone.
With release notes, I think I may be happy if we expect that almost all code that actually could break by this, should break loudly.

charris · 2020-03-26T18:52:28Z

That sounds like an orthogonal suggestion here

Yes, I thought it might be more useful in the long run :) Computing the sum of squared residuals also has a noticeable impact on performance, which has also been the subject of complaints.

But back to the case at hand with some explanation for other reviewers.

The problem with the stacked case is that the dimensions of the returned residual array must be determined by M, N, and K and cannot depend on runtime results like matrix rank. DEGLS will always compute residuals if M > N, regardless of whether the matrix is rank N or not; there just might be some missing residuals if it isn't rank N, so I would just return the results from those residuals in that case, although that becomes problematic when there are few residuals. When the matrix has rank >= M, zeros seem the appropriate return if we are not going to return an empty array. In all other cases NaN seems appropriate, meaning "unknown".

Unfortunately, we fixed the function rather than the documentation in #9986, so there is some question as to whether we should retain the NaN array return when the matrix rank is < N regardless.

The upshot is that we need to make the change to deal with stacked arrays. It is enough of a corner case that I think we can do that, having zero/NaN arrays should deal with the case of downstream users being caught without being aware that something has changed.

charris · 2020-03-26T20:08:03Z

In short, in pseudocode:

if M > N:
    # There are residuals returned, use them
    # Note that this result should be divided by (M - N)
    # to get an unbiased estimate of variance if the model is any good.
    return sum_of_squares
elif rank(a) >= M:
    # No residuals returned, but they are zero
    return zeros
else:
    # No residuals returned, but they exist
    return NaNs

And an array is always returned for the residuals.
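A runnable sketch of the rule above for a single 2-D problem might look as follows. `fixed_shape_residuals` is a hypothetical helper, and it recomputes the sums of squares from the solution instead of reading them from LAPACK output, which is an assumption made to keep the example self-contained.

```python
import numpy as np


def fixed_shape_residuals(a, b):
    """Hypothetical helper following the rule sketched above.

    `a` has shape (M, N) and `b` has shape (M, K); the result always has
    shape (K,), regardless of the rank of `a`.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    m, n = a.shape
    k = b.shape[1]
    x, _, rank, _ = np.linalg.lstsq(a, b, rcond=None)
    if m > n:
        # Assumption: recompute the sums of squares from the solution
        # rather than reading them out of the LAPACK work array.
        return np.sum((b - a @ x) ** 2, axis=0)
    elif rank >= m:
        # Full row rank: the system is solved exactly.
        return np.zeros(k)
    else:
        # Rank-deficient with M <= N: residuals exist but are not reported.
        return np.full(k, np.nan)


b = np.array([[1.0], [2.0], [2.0]])
tall = fixed_shape_residuals([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]], b)
square = fixed_shape_residuals(np.eye(3), b)
deficient = fixed_shape_residuals(np.ones((3, 3)), b)
```

In all three calls the result has shape `(K,)`, so the output shape depends only on the input shapes, which is what the stacked case requires.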

rgommers · 2021-06-05T06:06:23Z

@IvanYashchuk since you've just spent a bunch of time dealing with this residuals shape issue for PyTorch, would you mind having a look at this PR?

IvanYashchuk

The residuals change looks good to me. It's great to see lstsq work on stacks of matrices.

My only concern is the behavior when the b array is a stack of vectors: numpy.linalg.solve works for this case, while numpy.linalg.lstsq doesn't, which is weird since they would compute the same result for the well-determined square case.

IvanYashchuk · 2021-06-07T08:27:48Z

numpy/linalg/linalg.py

@@ -2180,11 +2180,14 @@ def lstsq(a, b, rcond="warn"):
    Euclidean 2-norm :math:`||b - ax||`. If there are multiple minimizing 
    solutions, the one with the smallest 2-norm :math:`||x||` is returned.

+    .. versionchanged:: 1.19


Should this be 1.21.0?

IvanYashchuk · 2021-06-07T08:32:20Z

numpy/linalg/linalg.py

        "Coefficient" matrix.
-    b : {(M,), (M, K)} array_like
+    b : {(M,), (..., M, K)} array_like


numpy.linalg.solve has the following description for b : {(…, M,), (…, M, K)}, array_like. solve works for the (…, M,) case, while lstsq from this PR raises LinAlgError: Incompatible dimensions.

Should the behavior of this function follow solve?

That's probably a good idea, I hadn't realized that we can make that change without affecting compatibility. We'd replace the current is_1d logic with something like

numpy/numpy/linalg/linalg.py

Lines 384 to 389 in 29561ed

    # We use the b = (..., M,) logic, only if the number of extra dimensions
    # match exactly
    if b.ndim == a.ndim - 1:
        gufunc = _umath_linalg.solve1
    else:
        gufunc = _umath_linalg.solve
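To make the stack-of-vectors case concrete, here is a small sketch that solves a stack of square systems and compares against a per-matrix lstsq loop. How `b` is interpreted when it has one fewer dimension than `a` may differ between NumPy versions, so the sketch adds the trailing axis explicitly rather than relying on the dispatch quoted above; the input data is arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stack of four diagonally dominant (hence invertible) 3x3 matrices.
a = 10 * np.eye(3) + rng.uniform(-1, 1, size=(4, 3, 3))
b = rng.uniform(-1, 1, size=(4, 3))  # stack of four vectors, shape (..., M)

# Add an explicit trailing axis so each vector becomes an (M, 1) matrix,
# then drop it again from the result.
x_stacked = np.linalg.solve(a, b[..., np.newaxis])[..., 0]

# Reference: call lstsq once per matrix, which needs no stacking support.
x_loop = np.stack(
    [np.linalg.lstsq(a_i, b_i, rcond=None)[0] for a_i, b_i in zip(a, b)]
)
```

For these well-determined square systems the two results agree to floating-point tolerance, which is the consistency between solve and lstsq that the review comment asks for.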

eric-wieser added 01 - Enhancement component: numpy.linalg labels Mar 18, 2020

ENH: Add support for lstsq on stacks of matrices

22d955c

eric-wieser force-pushed the lstsq-resid-change branch from b5d0e0d to 22d955c Compare March 18, 2020 22:23

eric-wieser mentioned this pull request Mar 18, 2020

ENH: broadcast lstsq #8720

Open

seberg added the 56 - Needs Release Note. Needs an entry in doc/release/upcoming_changes label Mar 25, 2020

charris reviewed Mar 26, 2020

seberg added the 62 - Python API Changes or additions to the Python API. Mailing list should usually be notified. label Mar 26, 2020

Base automatically changed from master to main March 4, 2021 02:04

eric-wieser mentioned this pull request Jun 3, 2021

Support for stacked inputs to np.linalg.lstsq #19156

Closed

eric-wieser added 2 commits June 3, 2021 15:40

Merge branch 'main' into lstsq-resid-change

819f41c

Create 15777.compatibility.rst

25a51c2

Update numpy/linalg/linalg.py

cf156b0

eric-wieser added 3 commits June 3, 2021 15:48

lint fixes

5c10d20

Update linalg.py

8884d9e

Update test_linalg.py

370d7e3

eric-wieser removed the 56 - Needs Release Note. Needs an entry in doc/release/upcoming_changes label Jun 3, 2021

Update test_linalg.py

29561ed

IvanYashchuk reviewed Jun 7, 2021

charris added the 52 - Inactive Pending author response label Apr 6, 2022

charris closed this Apr 6, 2022

seberg added the 64 - Good Idea Inactive PR with a good start or idea. Consider studying it if you are working on a related issue. label Apr 6, 2022
