Skip to content

change updateifcopy semantics in nditer #9714

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
mattip opened this issue Sep 19, 2017 · 10 comments
Closed

change updateifcopy semantics in nditer #9714

mattip opened this issue Sep 19, 2017 · 10 comments

Comments

@mattip
Copy link
Member

mattip commented Sep 19, 2017

a continuation of issue #7054
pr #9639 starts deprecating the resolution of UPDATEIFCOPY semantics in ndarray dealloc. The nditer class also uses the UPDATEIFCOPY mechanism, whenever a read-write iterator is used or if the updateifcopy flag is explicitly set via opflags, leading to code like this::

    a = arange(24, dtype='<i4').reshape(2, 3, 4)
    i = nditer(a, ['buffered'], order='F', casting='unsafe',
                op_dtypes='>f8', buffersize=5)
    j = i.copy()
    i = None

I can see two solutions to removing the need to do i = None to trigger the resolution of temporary data:

  • totally deprecate the use of these semantics in the case of nditers, and raise an error if the nditer instantiation would require updateifcopy semantics
  • add context manager semantics to nditer, so that the code above would become
    a = arange(24, dtype='<i4').reshape(2, 3, 4)
    with nditer(a, ['buffered'], order='F', casting='unsafe',
                op_dtypes='>f8', buffersize=5) as i:
        j = i.copy()

I can issue a pull request for either option, which one is preferable? Once we have a resolution for this issue, we can enable the DeprecationWarning on any use of UPDATEIFCOPY in favor of WRITEBACKIFCOPY + use of PyArray_ResolveWritebackIfCopy

@ghost
Copy link

ghost commented Nov 8, 2017

totally deprecate the use of these semantics in the case of nditers, and raise an error if the nditer instantiation would require updateifcopy semantics

+1. The correct solution is to use itertools.tee.

@njsmith
Copy link
Member

njsmith commented Nov 8, 2017

@xoviat I don't see how itertools.tee is relevant here, can you elaborate?

@ghost
Copy link

ghost commented Nov 8, 2017

Just from reading the documentation:

Get a copy of the iterator in its current state.

If we think about other Python iterators, .copy() is not a supported operation. So the question is: what is the .copy() method for?

@mattip
Copy link
Member Author

mattip commented Nov 8, 2017

The example is not a good one. Here is a better one:

a = np.arange(24, dtype='f8').reshape(2, 3, 4).T
i = np.nditer(a, [], [['readwrite', 'updateifcopy']],
            casting='same_kind', op_dtypes=[np.dtype('f4')])
# Check that UPDATEIFCOPY is activated
i.operands[0][2, 1, 1] = -12.5
assert  a[2, 1, 1] != -12.5
i = None                     # magic!!!
assert a[2, 1, 1] == -12.5

The real issue is the nditer creation/destruction which uses a scratch buffer via UPDATEIFCOPY semantics. The operands have a different memory layout than the original array a and so a scratch buffer is needed to allow operand manipulation

The scratch buffer is copied back to a when i is destroyed, The new API requires a less magic, explicit means of resolving or copying back to a, the one that pops to mind for me is a context manager.

Alternatively we could deprecate this whole mess, under the assumption that no-one uses it.

@ghost
Copy link

ghost commented Nov 8, 2017

The assumption here is that nditer should have a .copy() method. But that assumption isn't backed by any other examples in Python, so the question is: why should the method exist? .copy() functionality is already provided by Python itself with itertools.tee, and if that has a performance penalty, there's nothing stopping implementations from rewriting it in C or rpython.

@njsmith
Copy link
Member

njsmith commented Nov 8, 2017 via email

@ahaldane
Copy link
Member

ahaldane commented Nov 8, 2017

So when we talk about "removing the functionality", we mean raising an error and requiring the user to make an iterable copy of their array themselves? Ie, they should write:

a = arange(24, dtype='<i4').reshape(2, 3, 4)
cpy = ascontiguousarray(a)
i = nditer(cpy, ['buffered'], casting='unsafe',
                op_dtypes='>f8', buffersize=5)
# do something here
a[...] = cpy

I have only just learned about all of this. But personally the context-manager seems friendlier, and doesn't seem too hard to implement given you already wrote the C-api for all of this in PyArray_SetWritebackIfCopyBase.

@mattip
Copy link
Member Author

mattip commented Nov 10, 2017

pull request #9998 opened for comment and review

@mattip
Copy link
Member Author

mattip commented Dec 13, 2017

pull request #10184 solves this issue via a third alternative, a python-level nditer.close which provides a path for PyPy compatibility without requiring current code to be refactored.

@mattip
Copy link
Member Author

mattip commented Apr 21, 2018

Closed by PR #9998

@mattip mattip closed this as completed Apr 21, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants