Skip to content

sum gives wrong result for structured arrays #9876

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
akhilman opened this issue Oct 17, 2017 · 15 comments · Fixed by #9887
Closed

sum gives wrong result for structured arrays #9876

akhilman opened this issue Oct 17, 2017 · 15 comments · Fixed by #9887
Labels

Comments

@akhilman
Copy link

>>> a = np.zeros(1000, dtype=[('a', np.int64), ('b', np.float64), ('c', np.int32)])
>>> a['b'].sum()
0.0
>>> a['a'] = np.arange(1, 1+len(a))
>>> a['b'].sum()  # expecting 0 here
1.7060846163323196e-309
>>> 

all works fine if I change last one field to int64

>>> a = np.zeros(1000, dtype=[('a', np.int64), ('b', np.float64), ('c', np.int64)])
>>> a['a'] = np.arange(1, 1+len(a))
>>> a['b'].sum()
0.0
>>> np.__version__
'1.13.3'
>>> sys.version
'3.6.1 (default, May 30 2017, 02:29:58) \n[GCC 6.3.0 20170516]'

Debian stable 32bit

@eric-wieser
Copy link
Member

eric-wieser commented Oct 17, 2017

Looks like an alignment problem to me. Do you get a problem with the simpler

>>> a = np.zeros(1000, dtype=[('a', np.int32), ('b', np.float64)])
>>> a['a'] = np.arange(1, 1+len(a))
>>> a['b'].sum()

@eric-wieser
Copy link
Member

And in your original, is np.all(a['b'].view(np.uint8) == 0)?

@akhilman
Copy link
Author

>>> a = np.zeros(1000, dtype=[('a', np.int32), ('b', np.float64)])
>>> a['a'] = np.arange(1, 1+len(a))
>>> a['b'].sum()
2.3671924051608192e-309

np.all(a['b'].view(np.uint8) == 0) gives me ValueError, but np.all(a['b'].view(np.float64) == 0) returns True

@eric-wieser
Copy link
Member

eric-wieser commented Oct 17, 2017

Oops, I think I meant np.all(a['b'].view((np.uint8, 8)) == 0) - I want to get the fpu out of the equation for checking it's really all zero.

@eric-wieser
Copy link
Member

And can you try with a['a'] = -1, to set as many bits as possible?

@akhilman
Copy link
Author

>>> a['a'] = -1
>>> a['b'].sum()
nan
>>> np.all(a['b'].view((np.uint8, 8)) == 0)
True

@eric-wieser
Copy link
Member

And on much smaller arrays? Is length 1 or 2 enough to cause the bug?

@akhilman
Copy link
Author

>>> a = np.zeros(1, dtype=[('a', np.int32), ('b', np.float64)])
>>> a['a'] = -1
>>> a['b'].sum()
0.0
>>> a = np.zeros(2, dtype=[('a', np.int32), ('b', np.float64)])
>>> a['a'] = -1
>>> a['b'].sum()
2.1219957904712067e-314
>>> a.tobytes()
b'\xff\xff\xff\xff\x00\x00\x00\x00\x00\x00\x00\x00\xff\xff\xff\xff\x00\x00\x00\x00\x00\x00\x00\x00'

@eric-wieser
Copy link
Member

eric-wieser commented Oct 18, 2017

Definitely an alignment bug. Perhaps not surprisingly, the underlying bits of the lower half of the float are all set.

>>> hex(np.float64(2.1219957904712067e-314).view(np.int64))
'0xffffffff'

Unfortunately, I can't reproduce this on my machine. You can almost certainly work around it with dtype=np.dtype([('a', np.int32), ('b', np.float64)], align=True), but that doesn't change the fact that this is a bug.

@eric-wieser
Copy link
Member

Did this work in previous numpy versions?

@akhilman
Copy link
Author

Don't know about previous version. Which version I should try?
Align doesn't work

>>> a = np.zeros(2, dtype=np.dtype([('a', np.int32), ('b', np.float64)], align=True))
>>> a['a'] = -1
>>> a['b'].sum()
2.1219957904712067e-314

I can reproduce this bug on debian's prepackaged numpy and python on another 32bit machine

>>> sys.version
'3.5.3 (default, Jan 19 2017, 14:11:04) \n[GCC 6.3.0 20170118]'
>>> np.__version__
'1.12.1'

But not on Fedora 64bit

>>> sys.version
'3.6.2 (default, Oct  2 2017, 16:51:32) \n[GCC 7.2.1 20170915 (Red Hat 7.2.1-2)]'
>>> np.__version__
'1.12.1'

I think it is 32bit version issue

@ahaldane
Copy link
Member

I just managed to reproduce this bug in an i686 chroot.

It's not due to an error in subscipting since the subscripted arrays shows the right values:

>>> a = np.zeros(2, dtype=[('a', np.int32), ('b', np.float64)])
>>> a['a'] = -1
>>> y = a['b']
>>> y
array([ 0.,  0.])
>>> y.sum()
2.1219957904712067e-314

since arr.sum is implemented as np.add.reduce(arr) I suspect the reduction code.

@ahaldane
Copy link
Member

Well, the stride calculation is going wrong when DOUBLE_add assumes the data is aligned here, but presumably the bug is actually earlier somewhere where we should be checking for alignment and making an aligned copy if needed.

Currently this line looks suspicious to me.

@ahaldane
Copy link
Member

Yeah, as far as I see we simply never accounted for alignment in the reduction code (in PyUFunc_Reduce, PyUFunc_ReduceWrapper). We need to write new code there to check alignment and make a copy if the array isn't aligned.

It's time for me to press my cheeks against the sweet cheeks of the pillow, so I'll leave this for today. Anyone feel free to pick up in the meantime!

@ahaldane
Copy link
Member

After another look, I think it is the implementation and optimizations in the pairwise_sum_TYPE function that are wrong.

Here's the issue: On x86 (ie 32bit) in linux a double (8 bytes) counts as 4-byte aligned (by gcc, see wikipedia (link) which I confirm in my chroot). Therefore in the examples above, a['b'].flags.aligned is True on x86, so we don't make a copy of the array before processign the ufunc. So, to restate, a['b'] is an array which counts as "aligned", which is has a 4-byte offset, whose itemsize is 8 bytes, but whose stride is 12 bytes.

But then, in pairwise_sum_@TYPE@ we assume that the array elements are at integer multiples of the sizeof(type), which allows us to access successive values using array-indexing/pointer-arithmetic as arr[i*skip] for integers i, and where skip is the stride in units of the itemsize (8 bytes in this case), computed as skip=stride/sizeof(type).

Unfortunately, for the array in the examples above the stride is not a multiple of the itemsize, so it is not possible to access the array values by pointer-arithmetic, at least with a pointer of the type. Probably the fix is to use void-pointer arithmetic with the actual stride. Not sure if that will kill the optimizations.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants