-
-
Notifications
You must be signed in to change notification settings - Fork 10.8k
BUG: Remove check requiring natural alignment of float/double to run AVX code #15615
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
What is this part of the test meant to check? The stated purpose is to check the gather properties of the AVX loops. So I think this PR is avoiding the problem: it should only check strides that will end up using the AVX semantics. If you are hitting the glibc implementation, I think you are not testing what you want to test. Do you agree this strided part of the test is somehow escaping the AVX implementation? We will eventually get a bug report on some obscure platform that needs 2 or 3 ULP, and will spend time searching for why they report tests failing, but we cannot reproduce. This points out the problem with having architecture-specific code: we need to be very careful with our tests. NumPy supports many obscure platforms and antiquated systems, and the developers on those platforms want to be able to run |
The gather part of the code does not depend on the ufunc itself, so you really don't need to check all the ufuncs here. Perhaps a better way to test this part of the code would be to add another test ufunc that uses a Edit: be more specific in the description |
For reciprocal, no matter what the value of stride But I think I might have a clue as to why
I also realized this condition is completely unnecessary. AVX code uses unaligned loads and does not care what the alignment is. So, perhaps I could get rid of this condition and this could be fixed? |
Sounds reasonable for a fix, but we still should have some kind of test that the avx path is being followed |
Doesn't this #15558 exactly address this problem once Universal Intrinsics patch is merged? |
7221c45
to
519f7ac
Compare
In a x86-32 bit system, doubles need not be naturally aligned to 8 Byte boundary (see: -malign-double section of https://gcc.gnu.org/onlinedocs/gcc/x86-Options.html). Having this check meant it ran different code paths (AVX v/s scalar) depending on the alignment of data which leads to different results and test failing intermittently. AVX code uses un-aligned loads and this check is unnecessary to begin with.
519f7ac
to
3253ab7
Compare
Phew! This was a fun exercise, but I think I am reasonably certain this should fix the failing test. The fact that kept bugging me the most was that the vector and scalar code gave different answers. Reciprocal is ultimately just a divide ( But turns out for some odd reason, gcc (GCC) 7.3.1 20180303 (Red Hat 7.3.1-5) in the manylinux2010 docker decides to use the x87 |
Thanks @r-devulap |
MacPython/numpy-wheels#73 is still failing |
I am not entirely sure what is causing the CI failure in #15610. But the test failure seems to be because of a small ULP error (Max relative difference: 1.1176388e-16
which should be < 1ULP). This patch uses assert_array_max_ulp instead of assert_equal which should hopefully fix the test failure.