Skip to content

ENH: Add support for lstsq on stacks of matrices #15777

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 8 commits into from

Conversation

eric-wieser
Copy link
Member

Close #8720, at the cost of behavior changes in the resids return value. Marking as draft since I am publishing it primarily to facilitate discussion at that issue.

@seberg seberg added the 56 - Needs Release Note. Needs an entry in doc/release/upcoming_changes label Mar 25, 2020
Parameters
----------
a : (M, N) array_like
a : (..., M, N) array_like
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, the current docstring here is very old and could use improvement. Just array_like on one line, then a description of the shapes in the explanatory part and how they are treated.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This style of docstring with the shape in the argument name is fairly prevalent throughout this module.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, could use improvement throughout. Not saying you should undertake the task in this PR.

Least-squares solution. If `b` is two-dimensional,
the solutions are in the `K` columns of `x`.
residuals : {(1,), (K,), (0,)} ndarray
residuals : {(), (..., K,)} ndarray
Sums of residuals; squared Euclidean 2-norm for each column in
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Never much liked this, it isn't that useful. There have been a number of requests for returning the covariance matrix of the fitted coefficients.

@charris
Copy link
Member

charris commented Mar 26, 2020

I'm going to suggest a new function here. I don't feel strongly about broadcasting either way, but there have been requests for the covariance matrix of the fitted parameters and I often end up computing that myself as it is more informative and allows for generating error bars on curve fits and such. Unfortunately, I can't see if DGELSD provides sufficient information to do that, the documentation simply says that the input design matrix is destroyed. DGELSS looks like an option but I can't find it in our fallback routines. I suppose we could implement it for ourselves. Multiple return options would be required as computing the covariance could take significant time.

@eric-wieser
Copy link
Member Author

That sounds like an orthogonal suggestion here - the only point of this change is to enable broadcasting, since lstsq is one of the few remaining unbroadcast functions in np.linalg.

Unfortunately, the meaning of resids has to be tweaked every so slightly to make those semantics possible.

@seberg
Copy link
Member

seberg commented Mar 26, 2020

It seemed to me the main change is that resids previously had (in some sense) a keepdims in there, while now it does not? I.e. you do not add a dimension of size 1 at the end anymore.
For all practical purposes that probably does not matter much hopefully and I guess without it maybe it generalizes slight less well.
Although, I am not immediately sure, is that change for the simplest call actually necessary (I bet you guys discussed it before)?

@eric-wieser
Copy link
Member Author

eric-wieser commented Mar 26, 2020

It seemed to me the main change is that resids previously had (in some sense) a keepdims in there, while now it does not? I.e. you do not add a dimension of size 1 at the end anymore.

It's a little worse than that. The old behavior was to return any of:

  • A result of shape (0,) for an underconstrained solution (for any input shape)
  • A result of shape (1,) for a well-constrained solution with input shape (M,)
  • A result of shape (K,) for a well-constrained solution with input shape (M, K)

The new behavior is

  • A result of shape (..., N,) for input shape (..., N, M)
  • A result of shape () for input shape (M,)

So there are two incompatible changes here, not just one.

@seberg
Copy link
Member

seberg commented Mar 26, 2020

I suppose I was guessing that the majority of use cases is likely currently either in the (K,) category (unchanged) or in the (1,) category which changed to (). The old underconstrained one seems strange, but of course it is not impossible someone actually checks len(constr) or so to see whether it was underconstrained... So in that sense, breaking the (1,) case may be better anyway, because at least it should break loudly when it breaks.

So the only option would be keeping the (0,) and (1,) shapes even though they are ugly... It is super hard to say how disruptive that change could be. It seems a question of boldness, this is the right thing if we are bold enough to risk breaking someone.
With release notes, I think I may be happy if we expect that almost all code that actually could break by this, should break loudly.

@seberg seberg added the 62 - Python API Changes or additions to the Python API. Mailing list should usually be notified. label Mar 26, 2020
@charris
Copy link
Member

charris commented Mar 26, 2020

That sounds like an orthogonal suggestion here

Yes, I thought it might be more useful in the long run :) Computing the sum of squared residuals also has a noticeable impact on performance, which has also been the subject of complaints.

But back to the case at hand with some explanation for other reviewers.

The problem with the stacked case it that the dimensions of the returned residual array must be determined by M, N, and K and cannot depend on runtime results like matrix rank. DEGLS will always compute residuals if M > N, regardless of if the matrix is rank N or not, there just might be some missing residuals if it isn't rank N, so I would just return the results from those residuals in that case, although that becomes problematical when there are few residuals. When the matrix has rank >= M, zeros seems the appropriate return if we are not going to return an empty array. In all other cases NaN seems appropriate, meaning "unknown".

Unfortunately, we fixed the function rather than the documentation in #9986, so there is some question as to whether we should retain the NaN array return when the matrix rank is < N regardless.

The upshot is that we need to make the change to deal with stacked arrays. It is enough of a corner case that I think we can do that, having zero/NaN arrays should deal with the case of downstream users being caught without being aware that something has changed.

@charris
Copy link
Member

charris commented Mar 26, 2020

In short pseudo code

if M > N:
    # There are residuals returned, use them
    # Note that this result should be divided by (M - N)
    # to get unbiased estimate of variance if the model is any good.
    return sum_of_squares
elseif rank(a) >= M:
   # No residuals returned, but they are zero
    return zeros
else:
   #  No residuals returned, but they exist
   return NaNs

And an array always returned for the residuals.

Base automatically changed from master to main March 4, 2021 02:04
@eric-wieser eric-wieser removed the 56 - Needs Release Note. Needs an entry in doc/release/upcoming_changes label Jun 3, 2021
@rgommers
Copy link
Member

rgommers commented Jun 5, 2021

@IvanYashchuk since you've just spent a bunch of time dealing with this residuals shape issue for PyTorch, would you mind having a look at this PR?

Copy link
Contributor

@IvanYashchuk IvanYashchuk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The residuals change looks good to me. It's great to see the lstsq work on the stack of matrices.

I only have concerns about the behavior when the b array is a stack of vectors, numpy.linalg.solve works for this case, while numpy.linalg.lstsq doesn't, which is weird since they would compute the same result for the well-determined square case.

@@ -2180,11 +2180,14 @@ def lstsq(a, b, rcond="warn"):
Euclidean 2-norm :math:`||b - ax||`. If there are multiple minimizing
solutions, the one with the smallest 2-norm :math:`||x||` is returned.

.. versionchanged:: 1.19
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be 1.21.0?

"Coefficient" matrix.
b : {(M,), (M, K)} array_like
b : {(M,), (..., M, K)} array_like
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

numpy.linalg.solve has the following description for b : {(…, M,), (…, M, K)}, array_like. solve works for the (…, M,) case, while lstsq from this PR raises LinAlgError: Incompatible dimensions.

Should the behavior of this function follow the solve?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's probably a good idea, I hadn't realized that we can make that change without affecting compatibility. We'd replace the current is_1d logic with something like

# We use the b = (..., M,) logic, only if the number of extra dimensions
# match exactly
if b.ndim == a.ndim - 1:
gufunc = _umath_linalg.solve1
else:
gufunc = _umath_linalg.solve

@charris charris added the 52 - Inactive Pending author response label Apr 6, 2022
@charris charris closed this Apr 6, 2022
@seberg seberg added the 64 - Good Idea Inactive PR with a good start or idea. Consider studying it if you are working on a related issue. label Apr 6, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
01 - Enhancement 52 - Inactive Pending author response 62 - Python API Changes or additions to the Python API. Mailing list should usually be notified. 64 - Good Idea Inactive PR with a good start or idea. Consider studying it if you are working on a related issue. component: numpy.linalg
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ENH: broadcast lstsq
5 participants