MAINT: Remove similar branches from linalg.lstsq #9986

eric-wieser · 2017-11-08T04:10:00Z

Working towards being able to fix #8720.

This doesn't change any behavior, but it does add comments pointing out the existing, somewhat questionable behavior.

eric-wieser · 2017-11-08T04:12:24Z

numpy/linalg/linalg.py

+
+    # as documented
+    if rank != n or m <= n:
+        resids = array([], result_real_t)


This is a bizarre interface, and resids already contains 0 in the m <= n case, which is a more meaningful way to say "no residual" than []. But we're stuck with it, because that's how it's documented.
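A minimal sketch of the documented behavior under discussion, assuming a NumPy version whose `lstsq` accepts `rcond=None`. The matrices are made up for illustration and do not come from the PR. The residual array comes back empty whenever `rank != n` or `m <= n`, even when an exact fit exists.

```python
import numpy as np

# Illustrative inputs only; none of these come from the PR.
b3 = np.array([1.0, 2.0, 4.0])

# Overdetermined and full rank (m=3 > n=2): one residual per column of b.
a_tall = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]])
_, resids_tall, rank_tall, _ = np.linalg.lstsq(a_tall, b3, rcond=None)

# Square system (m == n): an exact fit exists, yet resids is empty, not [0.].
a_square = np.eye(2)
_, resids_square, rank_square, _ = np.linalg.lstsq(a_square, b3[:2], rcond=None)

# Rank-deficient (rank 1 < n=2): resids is empty as well.
a_deficient = np.ones((3, 2))
_, resids_deficient, rank_deficient, _ = np.linalg.lstsq(a_deficient, b3, rcond=None)

print(resids_tall.shape, resids_square.shape, resids_deficient.shape)
```

Callers that want a number in every case would have to check `resids.size` first, which is the awkwardness the comment points at.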

eric-wieser · 2017-11-08T04:13:33Z

numpy/linalg/linalg.py

-                               dtype=result_real_t)
-            else:
-                resids = array([sum((ravel(bstar)[n:])**2)],
-                               dtype=result_real_t)


For no apparent reason, this branch produces the same result as the one that follows it.

eric-wieser · 2017-11-08T06:58:02Z

numpy/linalg/linalg.py


-    st = s[:min(n, m)].astype(result_real_t, copy=True)


This slice was pointless, because len(s) == min(n, m).
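A quick check of that claim through the public API, as a sketch: the shapes and the seed are arbitrary choices, and `rcond=None` is assumed to be accepted by the installed NumPy.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, for reproducibility only
for m, n in [(5, 3), (3, 5), (4, 4)]:
    a = rng.standard_normal((m, n))
    b = rng.standard_normal(m)
    s = np.linalg.lstsq(a, b, rcond=None)[3]  # singular values of a
    # s already has min(m, n) entries, so s[:min(n, m)] selects everything.
    assert s.shape == (min(m, n),)
    assert np.array_equal(s[:min(n, m)], s)
```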

mhvk

This all looks good. My comments are mostly nitpicks.

mhvk · 2017-11-08T13:22:43Z

numpy/linalg/linalg.py

@@ -1915,7 +1915,7 @@ def lstsq(a, b, rcond="warn"):
    x : {(N,), (N, K)} ndarray
        Least-squares solution. If `b` is two-dimensional,
        the solutions are in the `K` columns of `x`.
-    residuals : {(), (1,), (K,)} ndarray
+    residuals : {(0,), (1,), (K,)} ndarray


I'd just remove the (0,) - now it is just confusing and the note states what happens for wrong input.

It may be confusing, but that's because the behavior is confusing too!

And it's not even for "wrong input" - just for cases when an exact match is possible. I could move the (0,) to the last item in the list.

Yes, maybe having it as the last item is best, since it is not the most common output.

mhvk · 2017-11-08T13:24:53Z

numpy/linalg/linalg.py

@@ -1997,8 +2001,6 @@ def lstsq(a, b, rcond="warn"):
    if rcond is None:
        rcond = finfo(t).eps * ldb

-    result_real_t = _realType(result_t)
-    real_t = _linalgRealType(t)
    bstar = zeros((ldb, n_rhs), t)
    bstar[:b.shape[0], :n_rhs] = b.copy()


Not for this PR, but when I looked at this before, I wondered what the point of .copy() was: no view is taken, and it cannot be of much speed benefit for the routine as a whole.

Yeah, the copy stuff here is weird.
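To illustrate why the `.copy()` looks redundant, here is a small sketch with made-up sizes (`ldb` and `n_rhs` are hypothetical values, not the ones the routine computes). Slice assignment already writes the values into the destination buffer, so the result never aliases `b` either way.

```python
import numpy as np

b = np.arange(6.0).reshape(3, 2)   # made-up right-hand side
ldb, n_rhs = 4, 2                  # hypothetical padded sizes

with_copy = np.zeros((ldb, n_rhs))
with_copy[:b.shape[0], :n_rhs] = b.copy()   # as written in the routine

without_copy = np.zeros((ldb, n_rhs))
without_copy[:b.shape[0], :n_rhs] = b       # same values, one fewer temporary

# Writing into the padded array leaves b untouched in both variants.
without_copy[0, 0] = -1.0
print(np.shares_memory(without_copy, b), b[0, 0])
```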

mhvk · 2017-11-08T13:28:17Z

numpy/linalg/linalg.py

+    x       = b_out[:n,:]
+    r_parts = b_out[n:,:]
+    if isComplexType(t):
+        resids = sum(abs(r_parts)**2, axis=-2)


I wish we had a sensible power or so, but to avoid a needless square root, one can do
sum(r_parts.real**2 + r_parts.imag**2, axis=-2)

r_parts * r_parts.conj() is probably a little faster, and also removes the branching. I'd rather leave this untouched though for now, since that would probably change results by a ULP.

It's not (at least on my machine), but fine to let this be.

It's not faster, or it's not guilty of introducing the ULP error?

Seems to me that there must be some value for which abs(x)**2 != x * x.conj(). Of course, the x * x.conj() value is closer to the true result.

I just meant that x*x.conj() is slower than x.real**2 + x.imag**2 (which makes sense, as the former does a few useless multiplications that cancel). I do agree that there must be values for which abs(x)**2 is slightly less correct, given the sqrt and subsequent squaring after calculating x.real**2+x.imag**2.

Anyway, fine to not worry about it here!

I've contemplated adding a ufunc for the squared absolute value; the main problem seems to be the name.
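The three formulations from this thread can be compared side by side. This is only a sketch: the array `z` is made up, and it says nothing about relative speed, which the thread notes is machine-dependent.

```python
import numpy as np

z = np.array([[1 + 1j, 3 - 4j],
              [0.5 + 2j, -1 - 1j]])   # made-up residual components

via_abs = np.sum(np.abs(z)**2, axis=-2)             # sqrt, then square again
via_parts = np.sum(z.real**2 + z.imag**2, axis=-2)  # no square root
via_conj = np.sum(z * z.conj(), axis=-2).real       # complex multiply, imaginary part is 0

# All three agree to rounding error...
print(np.allclose(via_abs, via_parts), np.allclose(via_parts, via_conj))

# ...but not necessarily bit for bit: for 1+1j the sqrt round trip is inexact.
one = np.complex128(1 + 1j)
print(np.abs(one)**2 == 2.0, one.real**2 + one.imag**2 == 2.0)
```

The last line is the kind of one-ULP difference that motivated leaving the existing expression untouched in this PR.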

mhvk · 2017-11-08T13:32:06Z

numpy/linalg/linalg.py

+        resids = array([], result_real_t)
+
+    # coerce output arrays
+    s = s.astype(result_real_t, copy=True)


Why the copy=True here? s is created in this routine, so there seems to be no need to copy.

I agree - kept only because it was there before.

mhvk · 2017-11-08T13:34:51Z

numpy/linalg/linalg.py

+    # coerce output arrays
+    s = s.astype(result_real_t, copy=True)
+    resids = resids.astype(result_real_t, copy=False)  # array is temporary
+    x = x.astype(result_t, copy=True)


x is a view, so I guess it makes sense to copy. Maybe note that? (Also, copy=True is the default.)

The copy=Trues confuse me, since as you note, they're the default. In fact, before #9888 there was a reasonable amount of code devoted to passing that argument.

Maybe this is trying to deal with a subclass that has a different default for copy?

Comment seems reasonable here

Yes, that's fine. If one were to design this from scratch, one would do the coercion only if an output array was given...

Looking back, the copy=False arguments were introduced in #5909, and the =True is deliberate and for clarity.

If one were to design this from scratch, one would do the coercion only if an output array was given

Or maybe just work with the dtype passed in, rather than always promoting to double before handing off to the ufunc.

That works only if one also uses different LAPACK routines (which is fine, of course), and would be less precise. But seems more logical in any case; just a different rcond.
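For reference, a sketch of how `astype` treats the `copy` argument, using a made-up work array in place of the solver's `b_out`. With `copy=False` the input itself is returned when the dtype already matches, while the default `copy=True` always detaches the result from the underlying buffer, which matters for a view such as `x`.

```python
import numpy as np

b_out = np.arange(12.0).reshape(4, 3)   # stand-in for the solver's work array
x_view = b_out[:2, :]                   # like x = b_out[:n, :], a view

same_dtype_no_copy = x_view.astype(np.float64, copy=False)
default_copy = x_view.astype(np.float64)            # copy=True is the default
converted = x_view.astype(np.float32, copy=False)   # dtype differs, so a new array is needed

print(same_dtype_no_copy is x_view)            # no new array was made
print(np.shares_memory(default_copy, b_out))   # independent of the work array
print(np.shares_memory(converted, b_out))
```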

eric-wieser · 2017-11-08T17:15:19Z

Do you want me to go through and remove the needless copies in another commit, or leave that for another PR?

mhvk · 2017-11-08T18:46:07Z

@eric-wieser - it's up to you whether you want to bother. If you think you'll get to it anyway with the gufunc implementation, I'm also happy to just merge this.

eric-wieser · 2017-11-09T07:52:37Z

Nits addressed.

This takes gh-5909 a little further.

mhvk · 2017-11-09T14:08:31Z

Looks all OK. Maybe squash the commits?

charris · 2017-11-09T15:47:37Z

numpy/linalg/linalg.py

+    b_out = bstar.T
+
+    # b_out contains both the solution and the components of the residuals
+    x       = b_out[:n,:]


PEP8, no alignment like this.

Actually, PEP8 shows a bunch of other whitespace violations in linalg.py, so we could probably use a style PR to clean those up at some point.

charris · 2017-11-09T15:57:27Z

LGTM apart from PEP8 nit.

charris · 2017-11-09T16:05:14Z

I'll fix that nit here and put this in. Thanks Eric.

eric-wieser added the "03 - Maintenance" and "component: numpy.linalg" labels Nov 8, 2017

eric-wieser requested a review from mhvk November 8, 2017 04:10

eric-wieser commented Nov 8, 2017


eric-wieser force-pushed the simplify-lstsq branch 2 times, most recently from b563394 to d813f1a, on November 8, 2017 06:47

eric-wieser added 2 commits November 7, 2017 22:48

MAINT: Remove similar branches from linalg.lstsq

a1af647

MAINT: collect together type mangling

a311a8d

eric-wieser force-pushed the simplify-lstsq branch from d813f1a to a311a8d on November 8, 2017 06:49

eric-wieser commented Nov 8, 2017


mhvk approved these changes Nov 8, 2017


DOC: Fix incorrect shape in documentation

6f83089

eric-wieser force-pushed the simplify-lstsq branch from 86489a4 to 69d5d6c on November 9, 2017 07:50

MAINT: Avoid extra copies in linalg.lstsq

e3a50a9

This takes gh-5909 a little further.

eric-wieser force-pushed the simplify-lstsq branch from 69d5d6c to e3a50a9 on November 9, 2017 07:55

charris reviewed Nov 9, 2017


STY: Fix PEP8 vertical alignment violation.

3402dcf

charris merged commit d185ece into numpy:master Nov 9, 2017

charris mentioned this pull request Mar 26, 2020

ENH: Add support for lstsq on stacks of matrices #15777

Closed
