Closed
Description
Consider a class T that implements the __array__ protocol with shape (0, 0) and has length 0, so it is consistent with itself. When calling np.array([T()]), the result has shape (1, 0); 1 from the list and 0 since that is what T-as-sequence reports as its length. But should the result shape be (1, 0, 0), i.e. should the __array__ protocol take precedence over the sequence?
This came up in #13659 when discussing the case of pandas.DataFrame(), which exhibits this behavior. Below is a class taken from the test to fix that issue.
import numpy as np

class T(object):
    def __array__(self):
        return np.ndarray(shape=(0,0))

    # Make sure __array__ is used instead of Sequence methods.
    def __iter__(self):
        return iter([])

    def __getitem__(self, idx):
        raise AssertionError("__getitem__ was called")

    def __len__(self):
        return 0