Skip to content

BUG: Bad SIMD integer // on windows in numpy 1.21.2 #20025

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
Macky1979 opened this issue Oct 4, 2021 · 5 comments · Fixed by #20153
Closed

BUG: Bad SIMD integer // on windows in numpy 1.21.2 #20025

Macky1979 opened this issue Oct 4, 2021 · 5 comments · Fixed by #20153
Labels
00 - Bug component: SIMD Issues in SIMD (fast instruction sets) code or machinery

Comments

@Macky1979
Copy link

Describe the issue:

I would like to report different behavior of % and // operands under numpy 1.20.3 and 1.21.2 running on Windows 10. The below code produces [0 0 0 0 0 0 0 0] for numpy 1.20.3 and [1 1 1 1 1 1 1 1] for numpy 1.21.2.

After upgrading numpy, I have noticed that pandas function groupby() is returning error of type IndexError: index 2 is out of bounds for axis 0 with size 2. The root of the cause is in function decons_group_index(), which is present in pandas file sorting.py. The issue is that groupby() determines unique values in individual columns and function decons_group_index() creates corresponding vector of indices. In my particular case I have two distinct values (e.g. 'a' and 'b') but three indices ([0, 1, 2]) when using numpy 1.21.2. Thus the error. In numpy 1.20.3 everything works fine. I suspect the following performance improvement being responsible for the issue.

Reproduce the code example:

import numpy as np

x = np.array([2076999867579399,
              2077965839147919,
              2078931810716439,
              2079897782284959,
              2080863753853479,
              2081829725421999,
              2082795696990519,
              2083761668559039])

y = np.array([0, 0, 0, 0, 0, 0, 0, 0])
factor = 160995261420
shape = 1

labels = (x - y) % (factor * shape) // factor

print(labels)

Error message:

No response

NumPy/Python version information:

1.21.2 3.9.6 (default, Aug 18 2021, 15:44:49) [MSC v.1916 64 bit (AMD64)]

@seberg
Copy link
Member

seberg commented Oct 4, 2021

Thanks for the report!

I am not aware that divmod (%) changed (meaningfully). @ganesh-k13 do you happen to have a thought on what is going on here?

@Macky1979 it would be very helpful if you can simplify the code a bit (e.g. y is unnecessary). E.g. check the intermediate result to see if the % or the // operation is wrong. It is even possible that the issue only appears if the operations are done in a single line, that would be extremely helpful to narrow things down also!

@seberg seberg added this to the 1.21.3 release milestone Oct 4, 2021
@charris charris added the 09 - Backport-Candidate PRs tagged should be backported label Oct 4, 2021
@bashtage
Copy link
Contributor

bashtage commented Oct 4, 2021

Looks to be windows only.

Simplest reproducer I found is

import numpy as np

factor = 160995261420
a = np.array([160995261399, 160995261399, 160995261399, 160995261399])
a // factor
array([1, 1, 1, 1], dtype=int64)

Funny enough

import numpy as np

factor = 160995261420
a = np.array([160995261399, 160995261399])
a // factor
array([0, 0], dtype=int64)

@seberg
Copy link
Member

seberg commented Oct 4, 2021

Thanks @bashtage, I think that is all the information needed to start digging efficiently. If the size matters, this is quite certainly SIMD related.

@ganesh-k13
Copy link
Member

ganesh-k13 commented Oct 4, 2021

Yep maybe SIMD, in first case we use this loop, which could be faulty:

for (; len >= vstep; len -= vstep, src += vstep, dst += vstep) {
npyv_@sfx@ nsign_d = npyv_setall_@sfx@(scalar < 0);
npyv_@sfx@ a = npyv_load_@sfx@(src);
npyv_@sfx@ nsign_a = npyv_cvt_@sfx@_b@len@(npyv_cmplt_@sfx@(a, nsign_d));
nsign_a = npyv_and_@sfx@(nsign_a, npyv_setall_@sfx@(1));
npyv_@sfx@ diff_sign = npyv_sub_@sfx@(nsign_a, nsign_d);
npyv_@sfx@ to_ninf = npyv_xor_@sfx@(nsign_a, nsign_d);
npyv_@sfx@ trunc = npyv_divc_@sfx@(npyv_add_@sfx@(a, diff_sign), divisor);
npyv_@sfx@ floor = npyv_sub_@sfx@(trunc, to_ninf);
npyv_store_@sfx@(dst, floor);

it works in second case as we use loop below that uses normal C /

Yeah it has to be that loop:

>>> a = np.array([160995261399, 160995261399, 160995261399, 160995261399, 160995261399])
>>> a // factor
array([1, 1, 1, 1, 0], dtype=int64)

@seberg seberg added the Priority: high High priority, also add milestones for urgent issues label Oct 6, 2021
@seberg seberg changed the title BUG: Incorrect implementation of operand % and // in numpy 1.21.2 BUG: Bad SIMD integer // on windows in numpy 1.21.2 Oct 20, 2021
@seberg seberg added the component: SIMD Issues in SIMD (fast instruction sets) code or machinery label Oct 20, 2021
@seiko2plus
Copy link
Member

seiko2plus commented Oct 23, 2021

My bad, this bug is caused by the SIMD implementation of fast integer division(#18178) when the divisor is a scalar and the dividend length >= 2 or 4 or 8 depending on the enabled SIMD extension on runtime.

as @bashtage mentioned the bug occurs at windows only, to be more precise on msvc <= 2017 builds. #20153 should fixes this issue. also during my investigation, I discovered another bug but this time related to the unsigned 8-bit division which has been reported within #20168.

@charris charris removed Priority: high High priority, also add milestones for urgent issues 09 - Backport-Candidate PRs tagged should be backported labels Sep 6, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
00 - Bug component: SIMD Issues in SIMD (fast instruction sets) code or machinery
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants