ENH: `at`: add `__setitem__` fancy indexing fallback #395

amacati · 2025-08-21T00:56:29Z

Fancy indexing is currently not supported for __setitem__, which blocks some PRs in scipy (scipy/scipy#23425).

As discussed in data-apis/array-api#864 (comment), all array API frameworks already implement this feature, but not necessarily consistently for duplicate indices. It is currently not part of the standard. This PR adds a workaround for array-api-strict to allow fancy indexing in xpx.at(x, ...).
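
For readers less familiar with the term, here is a minimal NumPy sketch of integer-array ("fancy") indexing in `__setitem__`. It is illustrative only and does not go through `xpx.at`. For duplicate indices, NumPy's documentation does not guarantee which write is kept, so no expected output is shown for that case.

```python
import numpy as np

# Unique integer indices: each target position receives exactly one value.
x = np.asarray([0, 0, 0, 0])
x[np.asarray([3, 1])] = np.asarray([7, 9])
print(x)  # [0 9 0 7]

# Duplicate indices: position 0 is written twice. Which write survives can
# differ between backends, which is the inconsistency mentioned above.
z = np.asarray([0])
z[np.asarray([0, 0])] = np.asarray([2, 3])
print(z)
```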

lucascolley

thanks @amacati !

src/array_api_extra/_lib/_at.py

amacati · 2025-08-23T21:30:47Z

The fallback is now implemented as error handling for __setitem__ and the behaviour for data races is documented. I also fixed the handling of negative indices and the assignment of scalar values.

One remaining concern is out-of-bounds indices. For anything negative or positive > len(x), the fallback currently does not throw an error. We would need to add value-based checks to catch that.

lucascolley · 2025-08-23T21:38:23Z

For anything negative or positive > len(x), the fallback currently does not throw an error. We would need to add value-based checks to catch that.

Seems fine to leave that for now. I guess the pattern would be to hide things behind not is_lazy_array(x) like SciPy?
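
A rough sketch of what such a guarded, value-based bounds check could look like. `check_bounds` and its `lazy` flag are hypothetical names for this illustration; in practice the flag would come from a helper such as the `is_lazy_array` mentioned above. NumPy is used only to keep the example runnable.

```python
import numpy as np


def check_bounds(idx, size, *, lazy=False):
    """Raise IndexError if any index lies outside [-size, size).

    Hypothetical helper for illustration; not part of array-api-extra.
    """
    if lazy:
        # Inspecting values would force computation on a lazy backend,
        # so the check is skipped and out-of-bounds indices go undetected.
        return
    if idx.size == 0:
        return
    lo, hi = int(idx.min()), int(idx.max())
    if lo < -size or hi >= size:
        msg = f"index out of bounds for axis 0 with size {size}"
        raise IndexError(msg)


check_bounds(np.asarray([0, -3, 2]), 3)  # all indices valid, no error
```

The trade-off is that eager backends get a clear error, while lazy backends keep the current silent behaviour.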

lucascolley

looks good to me, thanks @amacati !

Would you like to take a look @crusaderky ?

lucascolley · 2025-08-23T22:13:41Z

the codecov failure is not a worry

lucascolley · 2025-08-24T11:30:05Z

lint failures are real but trivial

lucascolley · 2025-08-26T14:24:41Z

I'll release 0.8.1 soon, and merge this for 0.9.0 by the end of the week if there is no further feedback

crusaderky

I have concerns regarding user experience.

This PR introduces support for integer array indices on backends that don't support it in __setitem__.

However, it makes it work

exclusively on set(); not on add() nor any other methods;
exclusively when it is on axis 0;
exclusively when it is not expressed as a tuple;
exclusively for data of trivial size (anything in the magnitude of megabytes will crash with MemoryError).

This I suspect will result in a rather unpleasant user experience for those that don't fall in this very specific use case.

crusaderky · 2025-08-28T11:46:46Z

src/array_api_extra/_lib/_at.py

+        >>> import numpy as np
+        >>> import array_api_strict as xp
+        >>> import array_api_extra as xpx
+        >>> xpx.at(np.asarray([0]), np.asarray([0, 0])).set(np.asarray([2, 3]))
+        array([3])
+        >>> xpx.at(xp.asarray([0]), xp.asarray([0, 0])).set(xp.asarray([2, 3]))
+        Array([3], dtype=array_api_strict.int64)


This example leaves me confused. I don't think it adds anything?

The aim is to show that np's and xpx's behavior is identical. For torch tensors on the GPU you would see

>>> xpx.at(torch.tensor([0]).cuda(), torch.tensor([0, 0]).cuda()).set(torch.tensor([2, 3]).cuda())
torch.Tensor([2], dtype=torch.int64)

crusaderky · 2025-08-28T11:58:51Z

src/array_api_extra/_lib/_at.py

+        except IndexError as e:
+            if "Fancy indexing" not in str(e):  # Avoid masking other index errors
+                raise e


This is quite fragile as it cherry-picks array-api-strict's behaviour. Different libraries would have different error messages and different exceptions.

Suggested change

-        except IndexError as e:
-            if "Fancy indexing" not in str(e):  # Avoid masking other index errors
-                raise e
+        except Exception as e:

Yes, I thought about this as well. However, I am strongly opposed to a blanket except. We would mask errors for regular frameworks, which would then enter an unexpected code path that may throw obscure errors. Hence the comment on masking other index errors. This feels almost worse than the added benefit of having array-api-strict support for integer indexing.
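
The trade-off in this exchange can be sketched with a toy backend. Everything here (`StrictArray`, `set_item`, `loop_fallback`, and the exact error message) is hypothetical and written in plain Python; it only shows how a narrow message check keeps genuine index errors visible, at the cost of depending on one library's wording.

```python
class StrictArray:
    """Toy stand-in for a backend that rejects list indices in __setitem__."""

    def __init__(self, data):
        self.data = list(data)

    def __setitem__(self, idx, value):
        if isinstance(idx, list):
            # Hypothetical message; real libraries word this differently.
            raise IndexError("Fancy indexing is not supported")
        self.data[idx] = value  # may raise a genuine IndexError


def loop_fallback(x, idx, values):
    """Assign one element at a time; the last write wins for duplicates."""
    for i, v in zip(idx, values):
        x[i] = v


def set_item(x, idx, value, fallback):
    try:
        x[idx] = value
    except IndexError as e:
        if "Fancy indexing" not in str(e):
            raise  # a real out-of-bounds error must not reach the fallback
        fallback(x, idx, value)


x = StrictArray([0, 1, 2])
set_item(x, [0, 2], [7, 9], loop_fallback)
print(x.data)  # [7, 1, 9]
```

Replacing the message check with `except Exception` would also route an out-of-bounds scalar index into `fallback`, which is the masking concern raised in the reply.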

crusaderky · 2025-08-28T12:13:47Z

src/array_api_extra/_lib/_at.py

+                    )
+                    raise IndexError(msg) from e
+
+                x_mask = xp.any(xp.arange(x.shape[0])[..., None] == u_idx_pos, axis=-1)


Suggested change

-                x_mask = xp.any(xp.arange(x.shape[0])[..., None] == u_idx_pos, axis=-1)
+                x_rng = xp.arange(x.shape[0], device=device(u_idx_pos))
+                x_mask = xp.any(x_rng[..., None] == u_idx_pos, axis=-1)

Could you add a comment explaining what you're doing here?

I think you need to add a note to the documentation above warning that the implementation is quadratic.

If x is 10 MiB along axis 0 and u_idx_pos is 10 MiB, this line transiently consumes 100 terabytes of RAM.
Have you considered using searchsorted?
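
To make the memory concern and the `searchsorted` suggestion concrete, here is a NumPy sketch. It assumes 10 Mi elements on each side with one byte per boolean for the size estimate, and assumes the indices have already been wrapped to be non-negative. `mask_broadcast` and `mask_searchsorted` are hypothetical names, not functions from the PR.

```python
import numpy as np

# Elements in the broadcast comparison for 10 Mi positions x 10 Mi indices,
# expressed in Ti (2**40); at one byte per boolean this is 100 TiB.
n_elements = (10 * 2**20) ** 2
print(n_elements / 2**40)  # 100.0


def mask_broadcast(n, idx):
    """Membership mask via an (n, k) comparison: O(n * k) memory."""
    return np.any(np.arange(n)[:, None] == idx, axis=-1)


def mask_searchsorted(n, idx):
    """Same mask via sort + searchsorted: O(n + k) memory."""
    if idx.shape[0] == 0:
        return np.zeros(n, dtype=bool)
    s = np.sort(idx)
    rng = np.arange(n)
    pos = np.searchsorted(s, rng)          # leftmost insertion point
    pos = np.minimum(pos, s.shape[0] - 1)  # clamp positions past the end
    return s[pos] == rng


idx = np.asarray([4, 1, 1])
print(mask_broadcast(6, idx))     # [False  True False False  True False]
print(mask_searchsorted(6, idx))  # same mask
```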

crusaderky · 2025-08-28T12:44:44Z

src/array_api_extra/_lib/_at.py

@@ -355,9 +370,70 @@ def _op(
        # Backends without boolean indexing (other than JAX) crash here
        if in_place_op:  # add(), subtract(), ...
            x[idx] = in_place_op(x[idx], y)


These remain broken.

Yes. I'm not sure if we should attempt to fix them. See my general comment.

amacati · 2025-08-28T13:38:33Z

I have concerns regarding user experience.

This PR introduces support for integer array indices on backends that don't support it in __setitem__.

However, it makes it work

exclusively on set(); not on add() nor any other methods;

exclusively when it is on axis 0;

exclusively when it is not expressed as a tuple;

exclusively for data of trivial size (anything in the magnitude of megabytes will crash with MemoryError).

This I suspect will result in a rather unpleasant user experience for those that don't fall in this very specific use case.

Correct. I think this should be discussed before moving forward.

The only way to use array indices for these operations while remaining compliant with the standard (please correct me if I am wrong!) is to use boolean masking. However, masks that exactly replicate the selection from integer indices are hard to create. The aim of restricting the fallback to axis 0, no tuples, and only set() was to reduce the potential for errors and keep the complexity manageable. The current implementation already has several special cases that need to be accounted for (e.g. wrapping of negative indices, uniqueness, etc.). I expect this to grow significantly if we open up the narrow scope of this PR.

If we can't express the logic of integer indexing easily with existing array API functions, we might want to rethink if adding it makes sense.
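
As a rough illustration of the masking approach described in this comment, the NumPy sketch below wraps negative indices, builds a boolean mask over axis 0, and sets a scalar with `where`. It is a simplification for a 1-D array and a scalar value, not the PR's implementation.

```python
import numpy as np

x = np.asarray([10, 20, 30, 40])
idx = np.asarray([-1, 1])
n = x.shape[0]

# Wrap negative indices into [0, n).
idx_pos = np.where(idx < 0, idx + n, idx)  # [3, 1]

# True at every position of x that appears in idx_pos.
mask = np.any(np.arange(n)[:, None] == idx_pos, axis=-1)

# Scalar assignment needs no ordering information, so the mask is enough.
out = np.where(mask, 99, x)
print(out)  # [10 99 30 99]
```

The mask discards the order of the indices, which is harmless for a scalar value but becomes a problem as soon as each index carries its own value.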

amacati · 2025-08-28T16:24:18Z

I have updated the code with most of the suggested changes from the review.

However:

The current implementation is broken for non-unique indices. One test case should fail. To see why, consider the following code

import array_api_extra as xpx
import array_api_strict as xp


x = xp.asarray([0, 1, 2])
y = xp.asarray([3, 4, 5])
idx = xp.asarray([0, 1, 0])
print(xpx.at(x, idx).set(y))  # [4, 5, 2], should be [5, 4, 2]

The reason why this happens is that we construct two masks for the set operation, one for x and one for y. The x mask is True where the index of array x is in idx, which in this case is [True, True, False]. The y mask is True where the unique indices in idx appear for the last time, i.e. [False, True, True].

However, we have no way to express the fact that the first True in the x mask belongs to the second True in the y mask. Thus, the values get assigned in the wrong order. One way to fix this would be to create an integer index for y instead of a mask and shuffle the values around such that the order matches x. But this isn't exactly making things less brittle.
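
The mismatch and the integer-index fix described above can be reproduced with plain NumPy. This is a sketch only: it relies on `np.unique(..., return_index=True)` and on NumPy's own fancy indexing, so it shows the logic rather than a version that runs on array-api-strict.

```python
import numpy as np

x = np.asarray([0, 1, 2])
y = np.asarray([3, 4, 5])
idx = np.asarray([0, 1, 0])
k = idx.shape[0]

# True where a position of x appears in idx: [True, True, False]
x_mask = np.any(np.arange(x.shape[0])[:, None] == idx, axis=-1)

# Position in idx of the last occurrence of each unique index,
# ordered by index value: [2, 1]
_, first_in_reversed = np.unique(idx[::-1], return_index=True)
last = k - 1 - first_in_reversed

# Boolean y mask of last occurrences: [False, True, True]
y_mask = np.zeros(k, dtype=bool)
y_mask[last] = True

wrong = x.copy()
wrong[x_mask] = y[y_mask]  # values taken in y order, not x order
print(wrong)  # [4 5 2]

fixed = x.copy()
fixed[x_mask] = y[last]    # values gathered in x order
print(fixed)  # [5 4 2]
```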

@crusaderky What's your opinion on this? These workarounds don't exactly feel great given we are currently doing this only for array-api-strict, which is only used for testing. Having such a complex separate code path where the testing framework deviates from the other frameworks kind of defeats the point.

crusaderky · 2025-08-29T10:53:31Z

These workarounds don't exactly feel great given we are currently doing this only for array-api-strict, which is only used for testing

array-api-strict is a proxy for not explicitly tested, possibly not yet known or even existing, additional libraries.

I honestly doubt that this PR should be merged, given its complication and its way-too-many caveats.
However, it still has value in demonstrating how painful it is to work around this limitation of the Array API, for functionality that is being leveraged as we speak by scipy. It should be used as exhibit A when pushing for inclusion of integer array indices in __setitem__ in the Array API standard.

amacati · 2025-08-29T11:07:44Z

I sadly have to agree with the feeling. There are too many complications for a too narrow scope. If it was a general solution for integer indexing for all operations it might have been worth the trouble, but as it stands it's probably best to wait for the standard to advance (if it ever does).

lucascolley · 2025-08-29T11:27:34Z

To jog my memory, is the blocker for the standard that existing libraries implement orthogonal semantics, so standardising one set of behaviour would mean a significant break for at least some library?
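
For context on the terminology in this question, the NumPy sketch below contrasts pointwise (vectorised) integer indexing with orthogonal (outer) indexing. It only illustrates the two semantics; whether this difference is the actual blocker for the standard is the open question here.

```python
import numpy as np

x = np.arange(9).reshape(3, 3)
rows = np.asarray([0, 2])
cols = np.asarray([0, 2])

# Pointwise: index arrays are paired up, selecting x[0, 0] and x[2, 2].
pointwise = x[rows, cols]
print(pointwise)  # [0 8]

# Orthogonal: every row is combined with every column (a 2x2 block).
outer = x[np.ix_(rows, cols)]
print(outer)  # [[0 2]
              #  [6 8]]
```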

amacati added 2 commits August 21, 2025 02:49

Add fancy __setitem__ support for array-api-strict

7f42065

Remove unnecessary variable assignment

9df1c65

amacati changed the title ~~Add __setitem__ fancy indexing support for array-api-strict~~ ENH: Add __setitem__ fancy indexing support for array-api-strict Aug 21, 2025

lucascolley added the enhancement New feature or request label Aug 22, 2025

lucascolley reviewed Aug 22, 2025

lucascolley changed the title ~~ENH: Add __setitem__ fancy indexing support for array-api-strict~~ ENH: Add __setitem__ fancy indexing fallback Aug 22, 2025

lucascolley changed the title ~~ENH: Add __setitem__ fancy indexing fallback~~ ENH: at: add __setitem__ fancy indexing fallback Aug 22, 2025

lucascolley mentioned this pull request Aug 23, 2025

ENH: spatial.transform.RigidTransform: add array API backend scipy/scipy#23425

Open

2 tasks

Add docs. Fix 0D and scalar cases. Handle negative indices

2c5e0aa

lucascolley approved these changes Aug 23, 2025

lucascolley added this to the 0.9.0 milestone Aug 23, 2025

Add handling and tests for negative and positive out-of-bounds indices

6fe2ffb

Fix linting

acd5365

crusaderky suggested changes Aug 28, 2025

[wip] Fix _at logic. Update tests

07e6bda
