BUG: np.save() and np.load() are not idempotent when align=True or fields are discontiguous #8100

Closed
jzwinck opened this issue Sep 30, 2016 · 4 comments · Fixed by #10411 or #12447
Labels: 00 - Bug, component: numpy._core, Priority: high

@jzwinck
Contributor

jzwinck commented Sep 30, 2016

Given an array like this:

arr = np.array([(1, 2), (3, 4)], dtype=np.dtype([('a','u2'), ('b','u4')], align=True))

The dtype is:

{'names':['a','b'], 'formats':['<u2','<u4'], 'offsets':[0,4], 'itemsize':8, 'aligned':True}

Note the difference with the default align=False dtype:

[('a', '<u2'), ('b', '<u4')]
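
To make the layout difference concrete, here is a minimal sketch that prints the byte offset of each field for both dtypes. The helper name `field_offsets` is made up for illustration; it only reads the standard `dtype.fields` mapping.

```python
import numpy as np

fields = [('a', 'u2'), ('b', 'u4')]
packed = np.dtype(fields)               # align=False is the default
aligned = np.dtype(fields, align=True)  # pads fields to their natural alignment

def field_offsets(dt):
    """Map each field name to its byte offset within one record."""
    return {name: dt.fields[name][1] for name in dt.names}

for label, dt in (('packed', packed), ('aligned', aligned)):
    print(label, field_offsets(dt), 'itemsize =', dt.itemsize)
```

The packed dtype places `b` directly after `a`, while the aligned one leaves two bytes of padding before `b`. That padding is what the save/load round trip has to preserve.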

Now, the problem is that np.save('foo.npy', arr) saves the aligned dtype this way:

[('a', '<u2'), ('', '|V2'), ('b', '<u4')]

Then, np.load('foo.npy') produces:

array([(1, [62, -18], 2), (3, [62, -18], 4)], 
      dtype=[('a', '<u2'), ('f1', 'V2'), ('b', '<u4')])
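
The round trip can be checked without touching the disk. The sketch below saves to an in-memory buffer and compares the dtype before and after; `roundtrip` is a hypothetical helper, and what it prints depends on the NumPy version (the output above is from 1.11.1, and later releases may preserve the dtype).

```python
import io
import numpy as np

def roundtrip(arr):
    """Save arr to an in-memory .npy file and load it back."""
    buf = io.BytesIO()
    np.save(buf, arr)
    buf.seek(0)
    return np.load(buf)

dt = np.dtype([('a', 'u2'), ('b', 'u4')], align=True)
arr = np.array([(1, 2), (3, 4)], dtype=dt)
loaded = roundtrip(arr)

# On an affected version an extra padding field shows up in the names.
print('field names before:', arr.dtype.names)
print('field names after: ', loaded.dtype.names)
print('dtype preserved:   ', loaded.dtype == arr.dtype)
```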

This happens even though np.load() can parse the original dtype in its names/formats/offsets form; np.save() simply doesn't write it that way. Editing foo.npy by hand to use that form lets np.load() produce an array identical to arr (if you try it, the 9th byte, which is the low byte of the header length, must change from 0x66 to 0x96 to allow for the longer descr).
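
For anyone who wants to look at the header rather than hex-edit it, the following sketch parses the start of an NPY file by hand. It assumes the documented NPY layout: a 6-byte magic string, two version bytes, then a little-endian header length (2 bytes for format version 1.x, 4 bytes for later versions), followed by a Python-literal dict. The function name `read_npy_header` is made up for this example.

```python
import ast
import io
import struct
import numpy as np

def read_npy_header(raw):
    """Return version, header length, parsed header dict and data offset."""
    if raw[:6] != b'\x93NUMPY':
        raise ValueError('not an NPY file')
    major, minor = raw[6], raw[7]
    if major == 1:
        (hlen,) = struct.unpack('<H', raw[8:10])   # bytes 9 and 10 of the file
        start = 10
    else:
        (hlen,) = struct.unpack('<I', raw[8:12])
        start = 12
    text = raw[start:start + hlen].decode('latin1')
    return {
        'version': (major, minor),
        'header_len': hlen,
        'header': ast.literal_eval(text),
        'data_offset': start + hlen,
    }

arr = np.array([(1, 2), (3, 4)],
               dtype=np.dtype([('a', 'u2'), ('b', 'u4')], align=True))
buf = io.BytesIO()
np.save(buf, arr)
raw = buf.getvalue()

info = read_npy_header(raw)
print(info['version'], hex(info['header_len']))
print(info['header']['descr'])
```

The header is padded with spaces so that the array data starts on an aligned boundary, which is why a longer descr moves the header length to the next padded size rather than by the exact number of added characters.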

As far as I can tell, fixing this would be as simple as storing str(arr.dtype) instead of arr.dtype.descr in the NPY file. For unaligned arrays the two seem to produce the same string. But I'm not sure if there's some reason this wasn't done in the first place.
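
As a rough illustration of the two representations, the sketch below prints `dtype.descr` for the aligned dtype and then rebuilds the dtype from an explicit names/formats/offsets dict. `explicit_spec` is a hypothetical helper, not something NumPy's save path is known to use, and it only handles flat (non-nested) structured dtypes.

```python
import numpy as np

def explicit_spec(dt):
    """Describe a flat structured dtype by names, formats, offsets and itemsize."""
    return {
        'names': list(dt.names),
        'formats': [dt.fields[n][0].str for n in dt.names],
        'offsets': [dt.fields[n][1] for n in dt.names],
        'itemsize': dt.itemsize,
        'aligned': dt.isalignedstruct,
    }

aligned = np.dtype([('a', 'u2'), ('b', 'u4')], align=True)

# descr expresses the padding as an unnamed void entry between the fields.
print(aligned.descr)

# The dict form records offsets directly, so no placeholder field is needed.
spec = explicit_spec(aligned)
rebuilt = np.dtype(spec)
print(spec)
print(rebuilt.names, rebuilt.itemsize)
```

The dict form carries the layout without inventing a field for the padding, which is the property the suggestion above relies on.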

I'm using NumPy 1.11.1.

@ahaldane
Member

Yeah, this is an old problem. Arrays with aligned dtypes cannot be loaded in general. See my comments in #3176, #6359, and #2215.

It is on my list of things to fix after my PR #6053 gets in (hopefully in 1.13).

@ahaldane
Member

This was mistakenly closed by #10411, reopening.

I have more extensive comments on what needs to be fixed in #10931 and #7797.

@ahaldane ahaldane reopened this Aug 17, 2018
@mattip mattip changed the title from "np.save() and np.load() are not idempotent when align=True or fields are discontiguous" to "BUG: np.save() and np.load() are not idempotent when align=True or fields are discontiguous" Nov 1, 2018
@mattip mattip added the 00 - Bug, Priority: high, and component: numpy._core labels Nov 1, 2018
@mattip mattip added this to the 1.16.0 release milestone Nov 1, 2018
@charris
Member

charris commented Nov 18, 2018

@mattip @ahaldane What is the current status of the fixes for this issue?

@ahaldane
Member

It's fixed! (by #12358). I'll go ahead and close.
