Skip to content

FIX: remove a spurious check in get_cast_transfer_function #2863

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Dec 28, 2012

Conversation

njsmith
Copy link
Member

@njsmith njsmith commented Dec 28, 2012

At least, I hope it's spurious. Certainly it's creating a spurious
error message, is unexpected by callers (one of whom specifically
makes the opposite check before calling get_cast_transfer_function),
and even if it is a useful check for some reason I can't see, it
certainly doesn't belong in this function (which is otherwise just
taking care of byte-swapping and alignment issues and doesn't know
anything about dtypes). And worst case, we'll have turned an exception
into a crash; even if I'm wrong, this shouldn't cause any code to go
from working to not working, just from broken to slightly-more-broken.

Test and original diagnosis by @cgohlke.

Fixes gh-2798.

At least, I hope it's spurious. Certainly it's creating a spurious
error message, is unexpected by callers (one of whom specifically
makes the *opposite* check before calling get_cast_transfer_function),
and even if it is a useful check for some reason I can't see, it
certainly doesn't belong in this function (which is otherwise just
taking care of byte-swapping and alignment issues and doesn't know
anything about dtypes). And worst case, we'll have turned an exception
into a crash; even if I'm wrong, this shouldn't cause any code to go
from working to not working, just from broken to slightly-more-broken.

Test and original diagnosis by @cgohlke.

Fixes numpygh-2798.
@njsmith
Copy link
Member Author

njsmith commented Dec 28, 2012

We'll see how the tests go, but I think this is safe to merge so long as they pass -- as argued in the commit message, this is very unlikely to break working code.

@certik
Copy link
Contributor

certik commented Dec 28, 2012

It looks good to me. @rgommers or @charris, if you give this a +1 too, we can merge it into the release.

@charris
Copy link
Member

charris commented Dec 28, 2012

One wonders why the check was there. Maybe the function changed along the way? I'll take a closer look at this later.

@charris
Copy link
Member

charris commented Dec 28, 2012

That function looks like it goes way back and I suspect the check came in with the original datetime work. There has been a fair amount of change in the calling functions since then. Because datetime and timedelta were the other types involved, it might be worth seeing if there is a test with those types that would test if they operate correctly. Hmm, currently this segfaults if In[2] is repeated a couple of times with or without the patch

In [1]: dt = 'datetime64[ms]'

In [2]: a = np.array([1], dtype=dt).astype((object, [('name', dt)]))

But that is a different issue.

@njsmith
Copy link
Member Author

njsmith commented Dec 28, 2012

Yes, this patch doesn't affect behaviour for datetime/timedelta at all.

I tried briefly to work out which python-level operations would be affected
by this, but was defeated... the logic that leads up to this function has
tons of special cases and convolutions. I'm sure it could be worked out
given time, but unless someone does, there are basically two options for
dealing with such code: (1) tip-toe around it and pray it doesn't break (or
break worse), (2) make some careful educated guesses, and if that breaks
things, trust that either the test suite or users will find it, and either
way we'll learn something. Neither option is great, but option (2) at least
offers the possibility of ending up with maintainable code eventually.

On Fri, Dec 28, 2012 at 7:19 PM, Charles Harris notifications@github.comwrote:

That function looks like it goes way back and I suspect the check came in
with the original datetime work. There has been a fair amount of change in
the calling functions since then. Because datetime and timedelta were the
other types involved, it might be worth seeing if there is a test with
those types that would test if they operate correctly. Hmm, currently this
segfaults if In[2] is repeated a couple of times with or without the patch

In [1]: dt = 'datetime64[ms]'

In [2]: a = np.array([1], dtype=dt).astype((object, [('name', dt)]))

But that is a different issue.


Reply to this email directly or view it on GitHubhttps://github.com//pull/2863#issuecomment-11739197.

@charris
Copy link
Member

charris commented Dec 28, 2012

Yep. Well, I'll put this in and we can see what turns up in testing, it looks like it will solve the h5py problem in any case. The segfault example should probably raise an error somewhere. It works fine if [2] is corrected to

In[2] a = np.array([1], dtype=dt).astype((dt, [('name', dt)]))

It should raise

ValueError: mismatch in size of old and new data-descriptor

charris added a commit that referenced this pull request Dec 28, 2012
FIX: remove a spurious check in get_cast_transfer_function
@charris charris merged commit 4a3c347 into numpy:master Dec 28, 2012
@charris
Copy link
Member

charris commented Dec 28, 2012

Change that, I suspect that mixing the object type with a numeric type causes illegal dereferences. On my 64 bit OS

In [1]: a = np.array([1], dtype='O').astype(('O', [('name', 'Q')]))

In [2]: a['name']

segfaults immediately.

@certik
Copy link
Contributor

certik commented Dec 28, 2012

@charris, can you create an issue for the segfault?

@certik
Copy link
Contributor

certik commented Dec 28, 2012

This is what I get before this patch:

>>> import numpy as np
>>> a = np.array([1], dtype='O').astype(('O', [('name', 'Q')]))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: low level cast function is for unequal type numbers

So it seems that the patch makes it segfault, but I see what you mean. The segfault has probably been there before as well. But given this, I really don't feel like merging this into the 1.7 branch as it is now. So I am going to release rc1 now and let's look into this issue before rc2.

@certik
Copy link
Contributor

certik commented Dec 28, 2012

Or should I put this patch into rc1?

@certik
Copy link
Contributor

certik commented Dec 28, 2012

(I am releasing now, so let me know.)

@charris
Copy link
Member

charris commented Dec 28, 2012

I'm pretty sure the segfault comes from mixing object and integer types, something that is normally disallowed but apparently not detected in the more complex dtype, so that is a different problem. I think Nathaniel's patch needs to be in the release for testing. As he says, that whole nasty bit of code could use an audit, but that is a long term project..

@certik
Copy link
Contributor

certik commented Dec 28, 2012

Ok, I am backporting this for rc1.

@certik
Copy link
Contributor

certik commented Dec 28, 2012

Backported in d930c88.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants