Skip to content

BUG: simplify void scalar getfield/setfield #5642

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from

Conversation

ahaldane
Copy link
Member

@ahaldane ahaldane commented Mar 6, 2015

This PR removes most of the logic in voidtype_getfield and voidtype_setfield. Instead they simply call the ndarray getfield and setfield. This solves bugs related to void-scalar assignment

This fixes #3126, #3561. An alternate PR for solving these issues is #4556.

However, this PR introduces a backwards incompatible change, which causes some current scipy tests to fail: Assignment to a void scalar's object array fields now unpacks any arrays on the lhs of the assignment, unlike before (details below). Edit: No longer a problem

In addition, this PR would eliminate potential segfaults involving object arrays if applied in combination with my other PR #5548. This PR currently does not pass numpy tests for this reason, as it exposes a potential bugs that were silent before. Specifically, in this PR, some tests now raise errors of the form "Cannot change data-type for object array" because they attempt to directly modify memory in object arrays. This was previously silently allowed, but now raises an error (which I think is better). My other PR would elimitate the errors. Probably my other PR should be considered before this one is.

More detailed justification for the changes:

Removed byteswapping code

I removed code that does byteswapping in both getfield and setfield. After looking at it for a while, I am convinced this code is unnecessary (though this definitely needs to be checked by others). The reason is that in both cases we use gentype_generic_method to convert the void scalar to a 0-d ndarray, do the get/setfield, and then convert the returned value to a scalar using PyArray_ToScalar (in PyArray_Return). However, note that PyArray_ToScalar already does the byteswap for us. Therefore, any scalars that reach voidtype_getfield are already in NBO and there is no need to swap again.

Change to object array assignment

This PR causes setfield called to behave differently in the specific case where the user tries to set a field of Object dtype, on a void scalar, to a value which is an ndarray. This causes some scipy tests to fail, involving loading matlab save files. The following example demonstrates what is going on:

Setup:

>>> st2 = np.empty((1,1), dtype=[('one', 'O'), ('two', 'O')])
>>> v = np.array([[("ABCD",)]], dtype=[('three', 'O')])

Old Behavior:

>>>#setfield of a void scalar
>>> st2[0,0]['two'] = v
>>> st2
array([[(None, [[('ABCD',)]])]], 
      dtype=[('one', 'O'), ('two', 'O')])
>>> st2[0,0]['two']
array([[('ABCD',)]], 
     dtype=[('three', 'O')])

>>>#setfield of an ndarray
>>> st2['two'] = v
>>> st2[0,0]['two']
('ABCD',)

New Behavior:

>>>#setfield of a void scalar
>>> st2[0,0]['two'] = v
>>> st2
array([[(None, ('ABCD',))]], 
     dtype=[('one', 'O'), ('two', 'O')])
>>> st2[0,0]['two']
('ABCD',)

>>>#setfield of an ndarray
>>> st2['two'] = v
>>> st2[0,0]['two']
('ABCD',)

In the old behavior, setfield on a void scalar on a field of Object type was a special case in numpy code - if the value being assigned was an ndarray it would not be broadcast, unlike all other types of assignment where it would be, and unlike for non-object dtypes. I think there is a case to be made that the new behavior is more consistent - it does not have the (undocumented) 'special case'. On the other hand, this seems to have been the way scipy uses to set an object field with an ndarray as the value without broadcasting the ndarray - scipy code would have to change. It is a bit tricker to do that now - but still possible, eg with st2[0,0] = (None, v).

@charris
Copy link
Member

charris commented Mar 7, 2015

Be good to post the description to the list so more folks see it.

@ahaldane
Copy link
Member Author

ahaldane commented Mar 9, 2015

I'll post it, probably later this week

@ahaldane
Copy link
Member Author

I updated it so it no longer breaks backwards compatibility in scipy. I belive all failed tests in both scipy and numpy would now be fixed by #5548.

Why the change? I realized that the new behavior was the wrong one: Setitem and setfield on structured object arrays should behave identically:

>>> b = np.zeros(1, dtype=[('x', 'O')])
>>> b[0]['x'] = arange(3)
>>> b['x'][0] = arange(3)

Both assignments now produce

 >>> b['x'][0]
 array([0, 1, 2])

Now that there is not such a serious problem, is it still worth posting to the mailing list?

One thing that seems a little hacky in the new code is the way I assign to the 0-d array (which may actually be a higher dim array in the case of subarrays). I effectively do

 >>> array(voidscalar['field'])[()] = value

Is there a better way of assigning to 'everything' that doesn't involve creating the empty tuple passign it to PyObject_SETITEM?

This commit modifiesvoidtype_get/setfield to that they call the ndarray
get/setfield. This solves bugs related to void-scalar assignment.

This fixes issues numpy#3126,
numpy#3561.
@charris charris added this to the 1.10.0 release milestone May 8, 2015
@charris
Copy link
Member

charris commented May 11, 2015

@ahaldane Needs a rebase. I'd like to put in #5548 first, then this.

@charris
Copy link
Member

charris commented May 11, 2015

@ahaldane Would some of the tests in #4556 be useful here?

@charris
Copy link
Member

charris commented May 11, 2015

BTW, you need separate Closes #1. Closes #2. etc. to close multiple issues.

@charris
Copy link
Member

charris commented Jun 5, 2015

@ahaldane Needs rebase.

@ahaldane
Copy link
Member Author

ahaldane commented Jun 6, 2015

I made a fresh re-start for this PR in #5947

@ahaldane ahaldane closed this Jun 6, 2015
@ahaldane ahaldane deleted the void_getsetfield branch January 17, 2018 23:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Inconsistent array assignment for structured arrays
3 participants