-
-
Notifications
You must be signed in to change notification settings - Fork 10.8k
BUG: assert_almost_equal fails on subclasses that cannot handle bool #8452
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@@ -726,14 +725,13 @@ def chk_same_position(x_id, y_id, hasval='nan'): | |||
raise AssertionError(msg) | |||
|
|||
if isnumber(x) and isnumber(y): | |||
x_id, y_id = zeros_like(x, dtype=bool_), zeros_like(y, dtype=bool_) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A more trivial change would replace just this line with
x_id = zeros(x.shape, dtype=bool_)
y_id = zeros(y.shape, dtype=bool_)
but it seems unnecessary to create these arrays in the first place -- and indeed the code becomes quite a bit shorter and, to me at least, clearer by just removing elements from the arrays.
I am perfectly happy with the actual fix to the test functions here (indeed, the code is cleaner than before), but I think it is absolute insanity for numpy functions to support subclasses that explicitly disable built-in methods like
My preference would be to make this change, but leave out the unit tests for the ndarray subclass without the |
@shoyer - I agree that the test at some level is rather odd, even though it is quite obvious that there will be other subclasses for which a boolean dtype simply does not make sense but for which comparisons are well defined; in a sense, the problem here really was that But while I don't quite like my own test either, it does seem this is a regression one might as well ensure does not recur; it doesn't seem that unlikely to me there are others than astropy that rely on this working. Would the test be less odd to you if the sample subclass had an explicit p.s. We do have our own |
Well, not putting a test because we don't want to necessary take care to not break it is not too useful IMHO. Might as well just say in a comment that the test documents a behaviour but is not considered guaranteed. I realize that some people may be discouraged by a failing test immidiately, but my guess is you should always look what kind of test you are breaking and then you see the comment. |
x_id |= x_isnan | ||
y_id |= y_isnan | ||
x = x[~x_isnan] | ||
y = y[~y_isnan] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, I am lazy now, but do these assert functions check the shape exactly? Because otherwise, I think there may be a problem with broadcasting if x and y are only broadcastable to one another.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, the chk_same_position
ensures that, when broadcast, x_isnan
and y_isnan
take out the same elements. As a result, x
and y
will still broadcast against each other.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, shape is checked exactly in any case, so no worries.
Yes, that's fine. To be clear my complaint here is really more theoretical than practical. As long as you are testing against numpy master and promptly supplying patches when things break, keeping numpy functions working for AstroPy's ndarray subclasses is totally reasonable (assuming no significant performance or code complexity costs). I just don't want to hobble NumPy with a commitment to supporting this behavior in the long term. The nature of subclassing in Python is that without a clearly delimited public API for subclasses (which doesn't exist for NumPy) it is extremely difficult (in practice impossible) to avoid leaking implementation details in the API. So this sort of breakage happens almost inevitably. |
numpygh-8410 breaks a large number of astropy tests, because it sets up a boolean array for values that should actually be compared (i.e., are not `nan` or `inf`) using `zeros_like`. The latter means that for subclasses, the boolean test array is not a plain `ndarray` but the subclass. But for astropy's `Quantity`, the `all` method is undefined. This commit ensures the test arrays from `isinf` and `isnan` are used directly.
06032a3
to
fe46cd6
Compare
OK, all makes sense. I pushed a version with a revised comment. With that, I think this is ready to go in. |
Hehe, the comment could be more obvious about not being too afraid of making the test fail, but OK. Thanks! |
This seems to break scipy:
```
python3 runtests.py -j2 -t
scipy/sparse/tests/test_base.py:TestLIL.test_elementwise_divide
```
results to
```
======================================================================
FAIL: test_base.TestLIL.test_elementwise_divide
----------------------------------------------------------------------
Traceback (most recent call last):
File
"/home/pauli/tmp/numpy/build/testenv/lib/python3.5/site-packages/numpy/testing/utils.py",
line 707, in chk_same_position
assert_array_equal(x_id, y_id)
File
"/home/pauli/tmp/numpy/build/testenv/lib/python3.5/site-packages/numpy/testing/utils.py",
line 842, in assert_array_equal
verbose=verbose, header='Arrays are not equal')
File
"/home/pauli/tmp/numpy/build/testenv/lib/python3.5/site-packages/numpy/testing/utils.py",
line 725, in assert_array_compare
raise AssertionError(msg)
AssertionError:
Arrays are not equal
(shapes (1, 6), (6,) mismatch)
x: [repr failed for <matrix>: The truth value of an array with more
than one element is ambiguous. Use a.any() or a.all()]
y: array([False, False, False, True, False, False], dtype=bool)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/lib/python3/dist-packages/nose/case.py", line 198, in runTest
self.test(*self.arg)
File "/home/pauli/tmp/scipy/scipy/sparse/tests/test_base.py", line
1358, in test_elementwise_divide
assert_array_equal(todense(self.datsp / denom), expected)
File
"/home/pauli/tmp/numpy/build/testenv/lib/python3.5/site-packages/numpy/testing/utils.py",
line 842, in assert_array_equal
verbose=verbose, header='Arrays are not equal')
File
"/home/pauli/tmp/numpy/build/testenv/lib/python3.5/site-packages/numpy/testing/utils.py",
line 741, in assert_array_compare
chk_same_position(x == +inf, y == +inf, hasval='+inf')
File
"/home/pauli/tmp/numpy/build/testenv/lib/python3.5/site-packages/numpy/testing/utils.py",
line 713, in chk_same_position
raise AssertionError(msg)
AssertionError:
Arrays are not equal
x and y +inf location mismatch:
x: [repr failed for <matrix>: The truth value of an array with more
than one element is ambiguous. Use a.any() or a.all()]
y: array([ 1. , 0.5 , -3. , inf, 0.25, 0. ])
```
|
Maybe problem with handling np.matrix
|
Yeah, surprising numpy tests did not notice. Possibly the always 2D stuff breaks things, matrices don't quite support the boolean indexing after all. @mhvk can you check it out? |
I guess this is the case. You can trigger the error easily by doing:
This will give the following error message:
I guess the reason is, that a matrix stays two dimensional even after removing the |
I'll have a look... |
Indeed, it was a problem with slicing and matrix keeping 2 dimensions. Possibly |
Maybe I'm missing something, but why does |
scipy.sparse doesn't use it on sparse matrices, all the inputs are
np.matrix which is an ndarray subtype.
|
Just for reference: |
gh-8410 breaks a large number of astropy tests, because it sets up a boolean array for values that should actually be compared (i.e., are not
nan
orinf
) usingzeros_like
. The latter means that for subclasses, the boolean test array is not a plainndarray
but the subclass. But for astropy'sQuantity
, theall
method is undefined.This commit ensures the test arrays from
isinf
andisnan
are used directly, and adds test cases to ensure this does not regress again.