Skip to content

BUG: Fixed deepcopy of object arrays. #6456

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 3 commits into from

Conversation

pelson
Copy link
Contributor

@pelson pelson commented Oct 12, 2015

Fixes a bug with deepcopy of object arrays in F-order:

In [1]: import numpy as np

In [2]: np.__version__       
Out[2]: '1.11.0.dev0+9cc55dc'

In [3]: a = {'a': 1}

In [4]: b = {'b': 2}

In [5]: np.array([[a, b], [a, b]], order='F')
Out[5]: 
array([[{'a': 1}, {'b': 2}],
       [{'a': 1}, {'b': 2}]], dtype=object)

In [6]: import copy

In [7]: copy.deepcopy(np.array([[a, b], [a, b]], order='F'))
Out[7]: 
array([[{'a': 1}, {'a': 1}],
       [{'b': 2}, {'b': 2}]], dtype=object)

This was discovered as part of a bug found with biggus in SciTools/biggus#157.

@@ -457,7 +457,7 @@ def test_swap_real(self, level=rlevel):

def test_object_array_from_list(self, level=rlevel):
# Ticket #270
np.array([1, 'A', None]) # Should succeed
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does succeed - I just wanted it to make an assertion.
Happy to revert though.

@@ -1426,21 +1429,44 @@ array_deepcopy(PyArrayObject *self, PyObject *args)
if (deepcopy == NULL) {
return NULL;
}
it = (PyArrayIterObject *)PyArray_IterNew((PyObject *)self);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem is that IterNew is iterating in C order, and optr is therefore getting out of synch.

@jaimefrio
Copy link
Member

Not that I'm against using the new iterator, but wouldn't you get the same result if you changed this line:

_deepcopy_call(it->dataptr, optr, PyArray_DESCR(self), deepcopy, visit);

with

_deepcopy_call(it->dataptr, it->dataptr, PyArray_DESCR(self), deepcopy, visit);

and got rid of all the optr code, let the iterator handle all the iteration?

@pelson
Copy link
Contributor Author

pelson commented Oct 12, 2015

and got rid of all the optr code, let the iterator handle all the iteration?

I think you may be right. I was concerned about the fact that it is iterating over self, not the result array, but I think that is fine in this case. To be honest, I took this as an opportunity to understand the new iterator API - do you have a preference on my next steps - obviously, if I take the easier solution it will be easier to apply to a potential v1.10.1, but on the other hand if I stick with my original implementation it is making use of the new iterator API...

_deepcopy_call(it->dataptr, optr, PyArray_DESCR(self), deepcopy, visit);
optr += PyArray_DESCR(self)->elsize;
PyArray_ITER_NEXT(it);
NpyIter_IterNextFunc *iternext = NpyIter_GetIterNext(iter, NULL);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has to be declared at the beginning of the function to comply with the C89 strictness of MSVC.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All declarations must go at the beginning of a block.

@jaimefrio
Copy link
Member

My personal view is that, for the new iterator to replace the old one, it needs a middle layer that makes implementing simple iterations like this one much simpler and less verbose: the alternative to the 26 lines that your PR adds to methods.c is to keep using the old iterator and remove 3. And there is no obvious benefit from the change, so I would not have included the new iterator as part of this PR.

Then again, this is your PR, not mine, and there is nothing fundamentally wrong with using NpyIter. It certainly should be encouraged for new development. So if you manage to fix the test errors, I wouldn't mind merging it as is.

assert_equal(arr, arr_cp)
self.assertTrue(arr is not arr_cp)
# Ensure that we have actually copied the item.
self.assertIsNot(arr[0, 1], arr_cp[1, 1])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

assertIs and assertIsNot are not available in Python 2.6, use assertTrue and assertFalse with proper arguments.

@pelson
Copy link
Contributor Author

pelson commented Jan 6, 2016

Thanks @jaimefrio and @charris - I've updated the PR based on your review. I've kept the new API implementation, but am still not averse to choosing the easier solution of updating the existing implementation (either way, we can keep the new tests).


if (!PyArg_ParseTuple(args, "O", &visit)) {
return NULL;
}
ret = (PyArrayObject *)PyArray_NewCopy(self, NPY_KEEPORDER);
copied_array = (PyArrayObject*) PyArray_NewCopy(self, NPY_KEEPORDER);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should be checking that copied_array is not NULL, and if it is, return NULL to signal an error happened.

@jaimefrio
Copy link
Member

There seemed to be a potential memory leak in case of failure. Was already there, but would be nice if you could fix it also as part of this PR.

@pelson
Copy link
Contributor Author

pelson commented Jan 7, 2016

There seemed to be a potential memory leak in case of failure. Was already there, but would be nice if you could fix it also as part of this PR.

Yep, that seems reasonable! I've added a commit (d158c59). Happy to squash if that is necessary.

@pelson
Copy link
Contributor Author

pelson commented Jan 7, 2016

Additionally, if there are stylistic improvements to the code, do let me know - I'm happy to fix them, and would appreciate critique.

@pelson
Copy link
Contributor Author

pelson commented Jan 18, 2016

Has this missed the v1.11 boat @charris?

@charris
Copy link
Member

charris commented Jan 18, 2016

@pelson I'm going to leave this with @jaimefrio for when he gets back. Don't worry, we are shooting for branching 1.12 in April, not that far off.

@homu
Copy link
Contributor

homu commented Sep 24, 2016

☔ The latest upstream changes (presumably #6054) made this pull request unmergeable. Please resolve the merge conflicts.

@charris charris modified the milestones: 1.11.2 release, 1.12.0 release Sep 24, 2016
@charris
Copy link
Member

charris commented Sep 24, 2016

Hmm, we should get this in. @pelson, could you rebase this?

@charris
Copy link
Member

charris commented Oct 7, 2016

Squashed and rebased in #8125, so closing this. Thanks @pelson .

@charris charris closed this Oct 7, 2016
@pelson pelson deleted the deepcopy_object branch October 8, 2016 05:17
@pelson
Copy link
Contributor Author

pelson commented Oct 8, 2016

Thanks for taking this forward @charris. Great to have it fixed in master 👍

@aojeda
Copy link

aojeda commented Jan 31, 2017

Thanks guys for fixing the issue so quickly!
Onward!

@charris
Copy link
Member

charris commented Jan 31, 2017

@aojeda Thanks, but this is the PR that caused your problem 8-)

@aojeda
Copy link

aojeda commented Jan 31, 2017

Ah ah true, I'll take it back for now then ;-)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants