
DOC: caution against relying upon NumPy's implementation in subclasses #13633


Merged: merged 1 commit on May 30, 2019

Conversation

@shoyer (Member) commented May 26, 2019

xref #12028

I think this is an important warning to include for subclass authors.
Otherwise, we will be expanding our exposure of internal APIs as part of
`__array_function__`. All things being equal, it's great when things "just
work" for subclasses, but I don't want to guarantee it. In particular, I would
be very displeased if `__array_function__` leads to NumPy adding more
subclass-specific hacks like always calling `mean()` inside `median()` (GH-3846).
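
To make the concern concrete, here is a minimal sketch (the class name and
output are illustrative): a subclass might override only `mean()` and appear to
get a working `median()` for free, purely because NumPy's current `median()`
happens to call `mean()` internally, which is exactly the kind of
implementation detail we shouldn't promise:

    import numpy as np

    class LoudArray(np.ndarray):
        def mean(self, *args, **kwargs):
            print("subclass mean() called")
            return super().mean(*args, **kwargs)

    a = np.arange(5.0).view(LoudArray)
    np.median(a)  # whether this reaches the subclass mean() is an
                  # implementation detail, not a guarantee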

@mhvk: please take a look.

shoyer added a commit to shoyer/numpy that referenced this pull request May 26, 2019
This is a partial reprise of the optimizations from GH-13585.

The trade-offs here are about readability, performance and whether these
functions automatically work on ndarray subclasses.

You'll have to judge the readability costs for yourself, but I think this is
pretty reasonable.

Here are the performance numbers for three relevant functions with the
following IPython script:

    import numpy as np
    x = np.array([1])
    xs = [x, x, x]
    for func in [np.stack, np.vstack, np.block]:
        %timeit func(xs)

| Function | Master | This PR |
| --- | --- | --- |
| `stack` | 6.36 µs ± 175 ns | 6 µs ± 174 ns |
| `vstack` | 7.18 µs ± 186 ns | 5.43 µs ± 125 ns |
| `block` | 15.1 µs ± 141 ns | 11.3 µs ± 104 ns |

The performance benefit for `stack` is somewhat marginal (perhaps it should be
dropped), but it's much more meaningful inside `vstack`/`hstack` and `block`,
because these functions call other dispatched functions within a loop.

For automatically working on ndarray subclasses, the main concern would be that
by skipping dispatch with `concatenate`, subclasses that define `concatenate`
won't automatically get implementations for `*stack` functions. (But I don't
think we should consider ourselves obligated to guarantee these implementation
details, as I write in GH-13633.)

`block` also will not get an automatic implementation, but given that `block`
uses two different code paths depending on argument values, this is probably a
good thing, because there's no way the code path not involving `concatenate`
could automatically work (it uses `empty()`).
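
As a minimal sketch of the pattern described above (illustrative names, not
NumPy's actual internals): dispatch runs once at the public entry point, and
inner calls go straight to an undispatched implementation, assuming the
dispatched function exposes its implementation via `__wrapped__`:

    import numpy as np

    # Stand-in for an internal, undispatched implementation; dispatched
    # NumPy functions expose the underlying implementation as __wrapped__.
    _concatenate_impl = np.concatenate.__wrapped__

    def my_vstack(tup):
        arrs = [np.atleast_2d(a) for a in tup]
        # Calling the undispatched implementation here skips a second
        # round of __array_function__ dispatch, but it also means that
        # subclasses overriding concatenate are not consulted.
        return _concatenate_impl(arrs, 0)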
@hameerabbasi (Contributor) commented May 27, 2019

If we actually follow Python protocol rules:

>>> class A:
...   def __add__(self, other):
...     return (self, other)
... 
>>> class B(A):
...   def __add__(self, other):
...     return NotImplemented
... 
>>> B() + B()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for +: 'B' and 'B'

We should only allow this when there IS an ndarray present. One can always do the following:

    def __array_function__(self, func, types, args, kwargs):
        if func not in IMPLEMENTATIONS:
            # Defer to the superclass for anything not explicitly implemented.
            return super().__array_function__(func, types, args, kwargs)
        return IMPLEMENTATIONS[func](*args, **kwargs)
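
Here `IMPLEMENTATIONS` is a registry mapping NumPy functions to
subclass-specific overrides, populated with a small decorator in the spirit of
the NEP 18 examples (a sketch; names are illustrative):

    import numpy as np

    IMPLEMENTATIONS = {}

    def implements(numpy_func):
        """Register a subclass implementation of a NumPy function."""
        def decorator(func):
            IMPLEMENTATIONS[numpy_func] = func
            return func
        return decorator

    @implements(np.stack)
    def _stack(arrays, axis=0):
        # Subclass-specific behaviour goes here; coercing to plain
        # ndarrays is just a trivial placeholder.
        return np.stack([np.asarray(a) for a in arrays], axis=axis)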

@shoyer (Member, Author) commented May 27, 2019

@hameerabbasi I'm not sure I understand your example. I think `__array_function__` is already consistent with this behavior, e.g.,

    import numpy as np

    class SubArray(np.ndarray):
        def __array_function__(self, func, types, args, kwargs):
            return NotImplemented

    array = np.array([1]).view(SubArray)
    np.stack([array, array])  # TypeError: no implementation found for 'numpy.stack'

You only get the base-class behavior if at least one argument inherits from ndarray and hasn't overridden `__array_function__`.
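
For contrast, a subclass that leaves `__array_function__` alone does fall back
to NumPy's implementation (`PlainSub` is just an illustrative name):

    import numpy as np

    class PlainSub(np.ndarray):
        pass  # inherits ndarray.__array_function__ unchanged

    a = np.array([1]).view(PlainSub)
    np.stack([a, a])  # no TypeError: NumPy's own implementation runs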

@hameerabbasi (Contributor) commented:

In this case, I think the warning should be extended to everything that uses `np.ndarray.__array_function__`.

@shoyer (Member, Author) commented May 27, 2019

> In this case, I think the warning should be extended to everything that uses `np.ndarray.__array_function__`.

How else would you use `np.ndarray.__array_function__`, if not from a subclass?

OK, I suppose you could indeed call it directly, but calling special methods directly is generally not recommended in Python. The methods exist for the sake of a protocol, not arbitrary usage.

In any case, if you call `ndarray.__array_function__` with any objects that define `__array_function__` other than subclasses, you will just get `NotImplemented` back. If you intentionally misuse the protocol and put the wrong arguments in `types`, then yes, you could cause NumPy to expose implementation details for non-subclasses. But hopefully it should be obvious that we make no guarantees about such behavior.
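
For example (a sketch of the direct call being discouraged here, relying on
the NEP 18 behavior of `ndarray.__array_function__`):

    import numpy as np

    class Foreign:
        def __array_function__(self, func, types, args, kwargs):
            return "custom"

    arr = np.array([1])
    # Passing a non-subclass in `types` makes ndarray.__array_function__
    # decline rather than run NumPy's implementation.
    result = np.ndarray.__array_function__(
        arr, np.stack, (Foreign,), ([arr, arr],), {})
    print(result)  # NotImplemented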

@mhvk (Contributor) commented May 27, 2019

@shoyer - This warning is fine and indeed appropriate. The big difference from gh-3846 is that now, when an implementation changes, we can tell anybody relying on it to simply override the function; with gh-3846 there was no way to get the old behaviour back.
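
For instance, a subclass can pin down the behaviour it needs with something
like this sketch (`my_median` is a hypothetical stand-in for whatever old
behaviour is wanted):

    import numpy as np

    def my_median(a, **kwargs):
        # Hypothetical stand-in for the behaviour the subclass wants to keep.
        return np.sort(np.asarray(a).ravel())[a.size // 2]

    class MyArray(np.ndarray):
        def __array_function__(self, func, types, args, kwargs):
            if func is np.median:
                # Override np.median explicitly instead of depending on
                # details of NumPy's implementation.
                return my_median(*args, **kwargs)
            return super().__array_function__(func, types, args, kwargs)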

@mhvk (Contributor) commented May 27, 2019

p.s. Hopefully, the bugs smoked out by Quantity and MaskedColumn provide a nice counterweight to the occasional pain of having to fix regressions...

@charris (Member) commented May 30, 2019

Thanks @shoyer.

shoyer added a commit that referenced this pull request Jun 12, 2019
* MAINT: avoid nested dispatch in numpy.core.shape_base


* MAINT: only remove internal use in np.block

* MAINT: fixup comment on np.block optimization