Inconsistent advanced indexing behavior #4940

Closed
@pbazant

Description

The comments below were partly prompted by this passage of the NumPy documentation:

For the discussion below, when the selection object is not a tuple, it will be referred to as if it had been promoted to a 1-tuple, which will be called the selection tuple.

To avoid confusion: by "wrapping into a tuple" I mean a -> (a,), whereas by "converting to a tuple" I mean a -> tuple(a).
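The distinction can be shown in plain Python, without NumPy. The helper names `wrap` and `convert` exist only for this sketch and are not NumPy functions:

```python
def wrap(obj):
    """Wrap: put the object, unchanged, inside a new 1-tuple."""
    return (obj,)


def convert(obj):
    """Convert: the object's own items become the items of the tuple."""
    return tuple(obj)


index = [[0]]
wrapped = wrap(index)        # one entry, which is still a nested list
converted = convert(index)   # the outer list level has been consumed

print(wrapped)     # ([[0]],)
print(converted)   # ([0],)
```

Wrapping keeps every level of nesting in the index, while converting removes the outermost one. That is why the two operations can produce results of different dimensionality.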

Let's start with some array:

In [176]: a=np.array([10,11,12])

If an indexing object is not a tuple, it is wrapped in a tuple:

In [178]: a[0]
Out[178]: 10

If the indexing object is a list, we have advanced indexing:

In [179]: a[[0]]
Out[179]: array([10])

Providing the trivial tuple wrap ourselves, we get the same result:

In [180]: a[[0],]
Out[180]: array([10])

If the indexing object has several levels of nesting, so does the result:

In [181]: a[[[0]],]
Out[181]: array([[10]])

Relying on the automatic wrapping into a tuple, we might want to remove the explicit tuple construction:

In [182]: a[[[0]]]
Out[182]: array([10])

This reveals the inconsistency -- the result has changed as if the outer list were actually reinterpreted as a tuple. This theory is consistent with the following behavior:

In [185]: a[[[0,0],[0,0]]]
IndexError: unsupported iterator index
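The theory can be written down as a small pure-Python model. This is only a sketch inferred from the results shown in this report, not NumPy's actual implementation; the function name `interpret_index` is hypothetical, and the real rule may consider more cases than nested lists.

```python
def interpret_index(obj):
    """Hypothetical model of the observed behavior (not NumPy code).

    Returns the selection tuple that the results in this report suggest.
    """
    if isinstance(obj, tuple):
        return obj  # already a selection tuple, used as given
    if isinstance(obj, list) and any(isinstance(item, (list, tuple)) for item in obj):
        return tuple(obj)  # outer list is reinterpreted as a tuple
    return (obj,)  # everything else is wrapped into a 1-tuple


print(interpret_index(0))                  # (0,)
print(interpret_index([0]))                # ([0],)
print(interpret_index([[0]]))              # ([0],)
print(interpret_index([[0, 0], [0, 0]]))   # ([0, 0], [0, 0])
```

Under this model, `[[0]]` and `[0]` yield the same selection tuple, which matches Out[182] being equal to Out[179]. The last call yields two entries for a one-dimensional array, which would be consistent with In [185] failing.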

Explicitly wrapping into a tuple makes it work:

In [186]: a[[[0,0],[0,0]],]
Out[186]: 
array([[10, 10],
       [10, 10]])

Alternatively, explicitly converting to an array also works:

In [187]: a[np.array([[0,0],[0,0]])]
Out[187]: 
array([[10, 10],
       [10, 10]])

It would be more logical to convert the object to a tuple only when the result of wrapping it is not usable as an index. Better still, NumPy should not backtrack much at all while trying to make sense of the indexing object, because backtracking makes the behavior very hard to understand.
The following behavior seems to be an instance of the same problem:

In [177]: b=np.array([[1,2],[3,4]])

In [191]: b[[0,0]] # wraps into a tuple and pads it to get ([0,0],:)
Out[191]: 
array([[1, 2],
       [1, 2]])

In [193]: b[[[0,1],[0,1]]] # does not wrap into a tuple and instead converts to a tuple
Out[193]: array([1, 4])

In [194]: b[[[0,1],[0,1]],] # wrapping it ourselves solves the problem
Out[194]: 
array([[[1, 2],
        [3, 4]],

       [[1, 2],
        [3, 4]]])
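The two readings of the same nested list can also be requested explicitly, which avoids relying on the automatic choice. A minimal sketch, assuming NumPy is importable as `np`; the names `per_axis` and `first_axis` are illustrative:

```python
import numpy as np

b = np.array([[1, 2], [3, 4]])
idx = [[0, 1], [0, 1]]

# Reading 1: convert to a tuple, one index list per axis -> b[0, 0] and b[1, 1].
per_axis = b[tuple(idx)]

# Reading 2: wrap into a tuple, a single (2, 2) index applied to the first axis.
first_axis = b[(idx,)]

print(per_axis)           # [1 4]
print(first_axis.shape)   # (2, 2, 2)
```

The first reading reproduces Out[193] and the second reproduces Out[194].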

In my opinion, either the order in which the various interpretations are tried should be strictly defined or, even better (though problematic for backwards compatibility), an error should be raised when there is "too much" ambiguity.
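The second option could look roughly like the following pure-Python sketch. It is a hypothetical rule for illustration, not an existing NumPy function, and the name `strict_selection_tuple` and the error message are assumptions:

```python
def strict_selection_tuple(obj):
    """Hypothetical strict rule: a list is never reinterpreted as a tuple."""
    if isinstance(obj, tuple):
        return obj  # explicit tuples are taken at face value
    if isinstance(obj, list) and any(isinstance(item, (list, tuple)) for item in obj):
        # A nested list could mean one index or one index per axis: refuse to guess.
        raise TypeError(
            "ambiguous index: use tuple(obj) for one index per axis, "
            "or (obj,) or an array for a single index"
        )
    return (obj,)  # flat lists and scalars are wrapped


print(strict_selection_tuple([0, 0]))   # ([0, 0],)
try:
    strict_selection_tuple([[0, 1], [0, 1]])
except TypeError as exc:
    print("rejected:", exc)
```

With such a rule, every accepted index has exactly one interpretation, at the cost of rejecting code that currently relies on the implicit conversion.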
