-
-
Notifications
You must be signed in to change notification settings - Fork 10.8k
API: Remove broadcasting ambiguity from np.linalg.solve #25914
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Previously the np.linalg.solve documentation stated: a : (..., M, M) array_like Coefficient matrix. b : {(..., M,), (..., M, K)}, array_like however, this is inherently ambiguous. For example, if a has shape (2, 2, 2) and b has shape (2, 2), b could be treated as a (2,) stack of (2,) column vectors, in which case the result should have shape (2, 2), or as a single 2x2 matrix, in which case, the result should have shape (2, 2, 2). NumPy resolved this ambiguity in a confusing way, which was to treat b as (..., M) whenever b.ndim == a.ndim - 1, and as (..., M, K) otherwise. A much more consistent way to handle this ambiguity is to treat b as a single vector if and only if it is 1-dimensional, i.e., use b : {(M,), (..., M, K)}, array_like This is consistent with, for instance, the matmul operator, which only uses the special 1-D vector logic if an operand is exactly 1-dimensional, and treats the operands as (stacks of) 2-D matrices otherwise. This updates np.linalg.solve() to use this behavior. This is a backwards compatibility break, as any instance where the b array has more than one dimension and exactly one fewer dimension than the a array will now use the matrix logic, potentially returning a different result with a different shape. The previous behavior can be manually emulated with something like np.solve(a, b[..., None])[..., 0] since b as a (M,) vector is equivalent to b as a (M, 1) matrix (or the user could manually import and use the internal solve1() gufunc which implements the b-as-vector logic). This change aligns the solve() function with the array API, which resolves this broadcasting ambiguity in this way. See https://data-apis.org/array-api/latest/extensions/generated/array_api.linalg.solve.html#array_api.linalg.solve and data-apis/array-api#285. Fixes numpy#15349 Fixes numpy#25583
|
It is. |
The failures are for: >>> a = np.ones((3, 0, 0))
>>> b = np.ones((3, 0))
>>> np.linalg.solve(a, b) # with `main`
array([], shape=(3, 0), dtype=float64)
>>> np.linalg.solve(a, b) # with this PR
ValueError: solve: Input operand 1 has a mismatch in its core dimension 0, with gufunc signature (m,m),(m,n)->(m,n) (size 3 is different from 0) Second failure is the same, but then with shapes |
That's really annoying. Why are the full tests not run by default? The full test suite only takes two and half minutes to run on my machine. And this option isn't even documented in |
The linalg tests are formulated in a rather odd way. All the test cases have numpy/numpy/linalg/tests/test_linalg.py Lines 530 to 536 in 15691c3
|
The dot_generalized() function is no longer needed. It was primarily there to handle the odd case for solve(), but the updated logic for that is now put in TestSolve itself, and dot_generalized is replaced with matmul() everywhere else.
OK, I've updated the tests. The linalg test cases had some odd The linalg test cases are still not great, but I'm not sure I'm up for trying to improve them here. Like I said, they all oddly reuse test cases with |
Aside:
For me
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Historically, there is a difference bewteen the broadcast rules between np.dot
and np.matmul
, maybe the current behaviour is due to that discrepancy somehow. The rule here and in the Array API is much clearer.
This does need a release note. I sent a message to the mailing list since it is an API change.
Agreed that reducing this ambiguity is a good idea, especially since we have the opportunity to do so with 2.0. |
Let's merge this, we can revert if the mailing list conversation gets heated (unlikely). |
Thanks @asmeurer |
Ah, I completely forgot that I never wrote a release note for this. Do you still want me to do that, or did you take care of it? |
For completeness: the release note is included in gh-25937. Thanks for this PR! |
Previously the np.linalg.solve documentation stated:
however, this is inherently ambiguous. For example, if a has shape (2, 2, 2) and b has shape (2, 2), b could be treated as a (2,) stack of (2,) column vectors, in which case the result should have shape (2, 2), or as a single 2x2 matrix, in which case, the result should have shape (2, 2, 2).
NumPy resolved this ambiguity in a confusing way, which was to treat b as (..., M) whenever b.ndim == a.ndim - 1, and as (..., M, K) otherwise.
A much more consistent way to handle this ambiguity is to treat b as a single vector if and only if it is 1-dimensional, i.e., use
This is consistent with, for instance, the matmul operator, which only uses the special 1-D vector logic if an operand is exactly 1-dimensional, and treats the operands as (stacks of) 2-D matrices otherwise.
This updates np.linalg.solve() to use this behavior.
This is a backwards compatibility break, as any instance where the b array has more than one dimension and exactly one fewer dimension than the a array will now use the matrix logic, potentially returning a different result with a different shape. The previous behavior can be manually emulated with something like
since b as a (M,) vector is equivalent to b as a (M, 1) matrix (or the user could manually import and use the internal solve1() gufunc which implements the b-as-vector logic).
This change aligns the solve() function with the array API, which resolves this broadcasting ambiguity in this way. See
https://data-apis.org/array-api/latest/extensions/generated/array_api.linalg.solve.html#array_api.linalg.solve and data-apis/array-api#285.
Fixes #15349
Fixes #25583