ENH: Use array indexing preparation routines for flatiter objects #28590

lysnikolaou · 2025-03-26T10:19:05Z

Use prepare_index in iter_subscript and iter_ass_subscript. This fixes various cases that were broken before:
- arr.flat[[True, True]]
- arr.flat[[1.0, 1.0]]
- arr.flat[()] = 0
Add more extensive tests for flatiter indexing operations

ngoldbaum · 2025-03-26T19:03:12Z

numpy/_core/tests/test_regression.py

-        assert_raises(ValueError, ia, x.flat, s, np.zeros(9, dtype=float))
-        assert_raises(ValueError, ia, x.flat, s, np.zeros(11, dtype=float))
+        assert_raises(IndexError, ia, x.flat, s, np.zeros(9, dtype=float))
+        assert_raises(IndexError, ia, x.flat, s, np.zeros(11, dtype=float))


While this is certainly more consistent and I'd even call it a bugfix, it is a behavior change and someone might have code relying on the old behavior. Needs a release note at least. You also need another release note for the new features.

Agreed. Added a release note that lists all the most important changes.
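
For readers following along, here is a minimal sketch of what the changed assertions exercise. The definitions of `x` and `s` are assumptions chosen for illustration (the diff above does not show them), and `ia` mirrors the helper name used in the diff. Plain array indexing raises IndexError for a boolean mask of the wrong length. According to the diff, the flatiter path raised ValueError before this PR and raises IndexError after it, so the sketch accepts either.

```python
import numpy as np

# Assumed setup for illustration: a 1-element array and a 10-element mask.
x = np.array([15.0])
s = np.ones(10, dtype=float)


def ia(target, mask_source, values):
    # Boolean-mask assignment, mirroring the helper named in the diff.
    target[mask_source > 0] = values


# Plain array indexing rejects a mask whose length does not match.
try:
    ia(x, s, np.zeros(9, dtype=float))
    array_error = None
except IndexError as exc:
    array_error = type(exc).__name__

# The flatiter path: ValueError on older NumPy, IndexError after this change.
try:
    ia(x.flat, s, np.zeros(9, dtype=float))
    flat_error = None
except (ValueError, IndexError) as exc:
    flat_error = type(exc).__name__

print(array_error, flat_error)
```

Code that catches only ValueError around such a flatiter assignment is the kind of code that could be affected by the change.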

ngoldbaum · 2025-03-28T14:17:44Z

This is a big refactor, so I think we'll need at least two experienced developers to go over the C code changes, which might take a while. I'll try to do a pass focusing on the correctness of the C code soon. On a first, high-level pass this looks like mostly simplification and cleanup.

I think you should also try running the indexing benchmarks to see if there are any significant regressions in existing benchmarks. I think bench_indexing already captures several workflows that go through the changed low-level C code path.

It would also be nice to get new entries in the FlatIterIndexing benchmark for newly added functionality.

lysnikolaou · 2025-03-28T16:45:26Z

These are the results of running the (old & new) benchmarks:

| Change   | Before [93898621] <main>   | After [cfcdabf0] <use-prepare-index-flatiter>   |     Ratio | Benchmark (Parameter)                                       |
|----------|----------------------------|-------------------------------------------------|-----------|-------------------------------------------------------------|
| +        | 87.0±0.4ns                 | 43.0±0.2ms                                      | 494276    | bench_indexing.FlatIterIndexing.time_flat_empty_tuple_index |
| +        | 115±3ns                    | 479±8ns                                         |      4.15 | bench_indexing.FlatIterIndexing.time_flat_bool_index_0d     |
| +        | 39.4±0.3ms                 | 42.8±0.3ms                                      |      1.09 | bench_indexing.FlatIterIndexing.time_flat_ellipsis_index    |
| +        | 3.95±0.04ms                | 4.29±0.07ms                                     |      1.09 | bench_indexing.FlatIterIndexing.time_flat_slice_index       |

It looks like having special cases for tuple, ellipses etc. (instead of going through prepare_index) did have an impact on performance. Should we try and keep those special cases in?
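
As a rough sanity check on the Ratio column, the ratios can be recomputed from the rounded timings shown in the table. This is only a sketch: the benchmark tool presumably derives its ratios from unrounded measurements, so the recomputed values agree only approximately. The dictionary layout and variable names are illustrative.

```python
# Rounded timings copied from the table above, converted to seconds.
NS, MS = 1e-9, 1e-3
timings = {
    "time_flat_empty_tuple_index": (87.0 * NS, 43.0 * MS),
    "time_flat_bool_index_0d": (115 * NS, 479 * NS),
    "time_flat_ellipsis_index": (39.4 * MS, 42.8 * MS),
    "time_flat_slice_index": (3.95 * MS, 4.29 * MS),
}

# Ratio = time after the change / time before the change.
ratios = {name: after / before for name, (before, after) in timings.items()}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.3g}x")
```

The empty-tuple case stands out because the timing moves from nanoseconds to milliseconds, while the ellipsis and slice cases change by less than 10%.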

ngoldbaum · 2025-03-28T17:01:15Z

> Should we try and keep those special cases in?

Probably

lysnikolaou · 2025-04-23T13:31:15Z

> > Should we try and keep those special cases in?
>
> Probably

I added a couple of special cases for an empty tuple and boolean indexes. This fixes the two worst performance regressions. I feel that the rest are acceptable, since this goes through a much more complex code path to make sure that everything is set up correctly.

ngoldbaum · 2025-04-23T17:46:16Z

I added the 2.3 milestone to make sure we don't drop reviewing this before cutting the release.

charris · 2025-05-19T18:25:18Z

> I added the 2.3 milestone

@ngoldbaum I am about to push this off to 2.4 unless you want to put it in very soon.

ngoldbaum · 2025-05-19T19:38:36Z

I spoke with Lysandros and he said it's OK to push this off. We'll coordinate on getting this reviewed soon.

seberg

Sorry for not looking at it much. Overall looks nice; I need to do a pass to check for refcount issues, etc.

I am slightly worried that some of the bad bool cases should maybe have a FutureWarning (or just go to an error for a bit?!), to enforce correct behavior.

Overall, I am happy that this seems to have integrated well. The diff is a bit unwieldy, but it can't be helped.

doc/release/upcoming_changes/28590.improvement.rst

numpy/_core/src/multiarray/iterators.c

numpy/_core/tests/test_indexing.py

numpy/_core/src/multiarray/mapping.c

lysnikolaou · 2025-07-17T11:14:50Z

@seberg Sorry for taking so long with this and thanks a lot for the thorough review. It was really helpful!

I've addressed all of the feedback so it should be in a better state now. Looking forward to hearing your thoughts.

seberg

Sorry this is so tedious. I have to look further, but wanted to comment on those things.
I do wonder more now if we shouldn't push the errors/warning down into the index prep helper. As annoying as it is, we just duplicate things and while that code is spaghetti, it is at least pretty linear spaghetti.

doc/release/upcoming_changes/28590.improvement.rst

numpy/_core/src/multiarray/iterators.c

seberg · 2025-07-17T13:36:22Z

numpy/_core/src/multiarray/iterators.c

+    if (!PyArray_Check(ind) && !PyArrayIter_Check(ind)) {
+        PyErr_SetString(PyExc_IndexError,
+            "Non-array indices are not supported because they can lead to unexpected behavior. "
+            "This is expected to change in future versions.");
        goto finish;


This is tedious :/. I don't think it is right here, we should move things down because arr.flat[[1, 2, 3, 4]] is really OK.

So there are two things here:

- a.flat[[1.]]: floats are, at this point, rejected by the original preparation. Making this stop working is also the biggest break (of sensible behavior). I could see adding a branch to the index-prep code to deprecate it there (or we gamble and see if someone notices; even considering that array indexing has been strict for a decade, I am not sure).
- a.flat[[True, False]] was always nonsense, and I am certainly happy to not worry about deprecation. Two possible solutions:
  - Just move this (similarly) into the HAS_BOOL branch. That works, although the current code will also fail for arr.flat[np.array([True, False]),] as it is a tuple. Admittedly, that is rather niche.
  - We also push this into the prep-index code, where we have the original object at hand to give the warning.

The float case is now fixed in that it works but is deprecated (added a branch in the index-prep code). I'm a bit confused about the bool case though. When bool indices are used that do not match the iterator's shape, that raises an IndexError. That's what we want, so I think we're okay like this?

lysnikolaou · 2025-08-04T12:49:39Z

Friendly ping here. I'm gonna be on PTO starting Aug 11th, so if we could get another round of feedback on this before I go, that'd be very helpful!

seberg

Thanks, had a look through. I think two of the comments are probably relevant (avoid the often unnecessary copy and actually raise the error we said we would raise).

But otherwise, I think this is turning out very nice and much simpler than the old code (even if that means making the index prep function slightly larger).

doc/release/upcoming_changes/28590.improvement.rst

numpy/_core/tests/test_custom_dtypes.py

numpy/_core/src/multiarray/mapping.c

seberg · 2025-08-04T13:14:28Z

numpy/_core/src/multiarray/iterators.c

-    else {
-        Py_INCREF(ind);
-        obj = ind;
+    if (index_type == HAS_BOOL) {


The release notes promise that this raises an error for the cases where the result would change.
We have to actually add that error here (or actually probably easier in the index prep function as we don't have to worry about tuples there).

That is, raise a specific IndexError that informs the user that this will be treated as a boolean index in the future (depending on where the error is, a non-matching length of the boolean one may be raised earlier, I don't care):

    In [6]: a = np.arange(3)

    In [7]: a[[True, False, True]]
    Out[7]: array([0, 2])  # changed behavior

May be nice to mention that this will be allowed in the future and that np.asarray(index) is a work-around.

Or we decide to live with it and put it more prominently in the release notes.

I think it's okay now. This path raises as well.
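
The np.asarray(index) work-around mentioned above can be sketched as follows. This is an illustrative example rather than code from the PR, and the variable names are made up. It relies only on behavior that a regular array and a flatiter share when the index is already a boolean array of matching length.

```python
import numpy as np

a = np.arange(3)
mask = [True, False, True]

# On a regular array, a list of booleans is treated as a boolean mask.
from_array = a[mask]

# For flatiter, converting the list to an array first makes the intent
# explicit, so the index is handled as a boolean mask.
from_flat = a.flat[np.asarray(mask)]

print(from_array, from_flat)
```

Passing the bare list to `a.flat[...]` is the case under discussion here, so the sketch deliberately avoids it.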

numpy/_core/src/multiarray/iterators.c

lysnikolaou · 2025-08-06T17:40:00Z

Tests look good. This should be really close now @seberg! Also, thanks for the reviews and the patience!

seberg

Thanks, there is one issue around [] I think, but I'll just apply the fix and hope CI passes :).

numpy/_core/src/multiarray/iterators.c

seberg · 2025-08-08T13:33:56Z

numpy/_core/src/multiarray/iterators.c

-    Py_DECREF(obj);
+    PyErr_SetString(PyExc_IndexError,
+        "only integers, slices (`:`), ellipsis (`...`) and integer or boolean "
+        "arrays are valid indices"


Ah, I suspect this can't be reached, but it seems good to have as a back-stop. I'll just apply the == above and hope it works fine.

(Just to be a bit pedantic.)

numpy/_core/src/multiarray/iterators.c

numpy/_core/src/multiarray/mapping.c

numpy/_core/tests/test_indexing.py

lysnikolaou · 2025-08-08T13:55:05Z

Unfortunately the change to include PyArray_SIZE(tmp_arr) != 0 fails the following test because it doesn't warn anymore. What do we wanna do here? Is this acceptable?

    def test_empty_string_flat_index_on_flatiter(self):
        a = np.arange(9).reshape((3, 3))
        b = np.array([], dtype="S")
        with pytest.warns(DeprecationWarning,
                          match="Invalid non-array indices for iterator objects are "
                                "deprecated"):
            assert_equal(a.flat[b.flat], np.array([]))

seberg · 2025-08-08T14:06:09Z

I want to fix it to do the same thing as for arrays, but I am slightly confused why np.arange(3)[np.array([], dtype="S")] does the right thing.

lysnikolaou · 2025-08-08T14:08:55Z

That's probably because of this new if in mapping.c.

            if (PyArray_SIZE(tmp_arr) == 0
                || (is_flatiter_object && !PyArray_ISINTEGER(tmp_arr) && !PyArray_ISBOOL(tmp_arr))) {

The string array is cast to an integer array and, since it has no elements, there are no issues there.

seberg · 2025-08-08T14:11:03Z

Oh, nvm, let's just remove the test, or only test for a string array. The array case just lets this one pass, because it can't distinguish between a [] and array([]).flat.

seberg · 2025-08-08T14:12:10Z

The point is that it isn't a string array, it's a string flatiter... And we never handle it correctly, so I don't think we should care for this PR.
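
To illustrate the distinction being discussed, here is a small sketch of how regular array indexing treats two kinds of empty index. It exercises only the array path, not flatiter, and it uses an empty float array (an assumption made for illustration) instead of the string case from the test above.

```python
import numpy as np

a = np.arange(3)

# An empty list carries no meaningful dtype, so it is accepted as an
# (empty) integer index and selects nothing.
empty_from_list = a[[]]

# An empty array that already has a non-integer dtype is rejected.
try:
    a[np.array([], dtype=float)]
    error_name = None
except IndexError as exc:
    error_name = type(exc).__name__

print(empty_from_list.shape, error_name)
```

A non-array object that happens to be empty falls into the first category, which is why an empty flatiter index is hard to tell apart from `[]` at that point.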

seberg · 2025-08-08T14:15:01Z

numpy/_core/src/multiarray/mapping.c

+                            "only integers, slices (`:`), ellipsis (`...`)%s and integer or boolean "
+                            "arrays are valid indices",
+                            is_flatiter_object ? "" : ", numpy.newaxis (`None`)"
+                        );


I just realized that this also changes the array path. We may want to add a change release note, but overall it's not super important.
(Also wonder whether we should chain the error, but doubt that it is helpful).

It doesn't for this specific case. Do you know of any examples that would reach this?

    >>> import numpy as np
    >>> a = np.arange(3)
    >>> b = np.array(["a"], dtype="S")
    >>> a[b]
    Traceback (most recent call last):
      File "<python-input-8>", line 1, in <module>
        a[b]
        ~^^^
    IndexError: arrays used as indices must be of integer (or boolean) type
    >>> a.flat[b.flat]
    Traceback (most recent call last):
      File "<python-input-9>", line 1, in <module>
        a.flat[b.flat]
        ~~~~~~^^^^^^^^
    IndexError: only integers, slices (`:`), ellipsis (`...`) and integer or boolean arrays are valid indices

Oh, I suppose it is practically impossible to reach indeed, since it only enters for length zero.

numpy/_core/tests/test_indexing.py

seberg · 2025-08-08T14:57:38Z

Thanks @lysnikolaou this was a really nice cleanup, I think. And fixing up flatiter a bit more is also nice.

lysnikolaou changed the title ~~[ENH] Use array indexing preparation routines for flatiter objects~~ ENH: Use array indexing preparation routines for flatiter objects Mar 26, 2025

lysnikolaou force-pushed the use-prepare-index-flatiter branch from 198df6b to 75aaed0 Compare March 26, 2025 10:23

lysnikolaou added 4 commits March 26, 2025 11:30

[ENH] Use array indexing preparation routines for flatiter objects

bc55a60

Fix assign subscript and add tests for it

f19745e

Fix regression test

a563242

Remove unnecessary dims array

9f2d51f

lysnikolaou force-pushed the use-prepare-index-flatiter branch from 75aaed0 to 9f2d51f Compare March 26, 2025 10:30

ngoldbaum reviewed Mar 26, 2025

Add release note

33109dd

lysnikolaou force-pushed the use-prepare-index-flatiter branch from 8f2b322 to 33109dd Compare March 28, 2025 13:52

ngoldbaum self-assigned this Mar 28, 2025

lysnikolaou added 3 commits March 28, 2025 17:34

Remove unnecessary branch from iter_subscript

dc362b4

Add more benchmarks

8255e5a

Merge branch 'main' into use-prepare-index-flatiter

cfcdabf

lysnikolaou added 2 commits April 23, 2025 15:09

Merge branch 'main' into use-prepare-index-flatiter

368b404

Add special cases to fix performance regression

ae101a5

ngoldbaum added this to the 2.3.0 release milestone Apr 23, 2025

charris modified the milestones: 2.3.0 release, 2.4.0 release May 19, 2025

seberg added the 56 - Needs Release Note. Needs an entry in doc/release/upcoming_changes label Jun 11, 2025

seberg self-requested a review June 11, 2025 17:49

seberg reviewed Jun 12, 2025

Merge branch 'main' into use-prepare-index-flatiter

e567311

lysnikolaou added 4 commits July 16, 2025 17:20

Address more feedback

149dadd

Address more feedback; raise error on non-array index

4028768

Fix linting errors

5319889

Update changelog item

50a6045

lysnikolaou added 2 commits July 17, 2025 13:31

Fix tests

0aea49e

Fix typing tests

8f87da2

seberg reviewed Jul 17, 2025

lysnikolaou added 2 commits July 29, 2025 10:05

Address review feedback

fa1540b

Fix linter errors

080d24d

seberg reviewed Aug 4, 2025

Address feedback

2559633

seberg reviewed Aug 6, 2025

lysnikolaou added 3 commits August 6, 2025 17:25

Remove wrong comments

e3630eb

Fix linting errors

4443708

Fix test_deprecations

564f086

seberg approved these changes Aug 8, 2025

Apply suggestions from code review

a77c43e

seberg reviewed Aug 8, 2025

Update numpy/_core/tests/test_indexing.py

9211898

seberg merged commit a0515ad into numpy:main Aug 8, 2025
79 checks passed
