BUG,ENH: Fix generic scalar infinite recursion issues #26904


Merged: 7 commits merged into numpy:main from seberg:scalar-recursion-fix on Jul 24, 2024

Conversation

seberg
Member

@seberg seberg commented Jul 10, 2024

This reorganizes the avoidance of infinite recursion to happen
in the generic scalar fallback, rather than attempting to do so
(badly) in the scalarmath, where it only applies to longdouble
to begin with.

This was also needed for the follow up to remove special handling
of python scalar subclasses.

Closes gh-26767
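For context, a hedged illustration of the failure mode this closes (the example below uses float64, which was never affected): an unsupported bitwise operation with an unconvertible object fails cleanly with TypeError, which is the behavior longdouble gains here instead of recursing until it segfaults (gh-26767).

```python
import numpy as np

# float64 (never affected) already fails cleanly for an unsupported
# bitwise op against an unconvertible object; after this PR,
# longdouble behaves the same instead of recursing to a segfault.
try:
    np.float64(3) | None
    raised = False
except TypeError:
    raised = True

print("TypeError raised cleanly:", raised)
```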

seberg added 3 commits July 10, 2024 16:54
    res = PyArray_GenericBinaryFunction(m1, other_op, n_ops.power);
}
else {
    res = PyArray_GenericBinaryFunction(other_op, m2, n_ops.power);
Member Author

I started calling the ufunc directly now. This has advantages and disadvantages, but overall I don't think it should matter much, and it felt saner than working around the oddities of power.

(Not done yet for the comparison, where it might also make sense eventually)
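A Python-level sketch of the distinction (my analogy, not the C code in this PR): the scalar operator previously fell back through the generic array binary function, whereas the new path invokes the ufunc itself; for well-behaved inputs the two routes agree.

```python
import numpy as np

a, b = np.float64(2.0), np.float64(3.0)

via_operator = a ** b        # scalar __pow__, which may fall back
via_ufunc = np.power(a, b)   # calling the ufunc directly

# Both routes agree for ordinary inputs.
assert via_operator == via_ufunc == 8.0
```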

@seberg
Member Author

seberg commented Jul 10, 2024

I don't think we need to backport this; the bug has been around forever.

# As of NumPy 2.1, longdouble behaves like other types and can coerce
# e.g. lists. (Not necessarily better, but consistent.)
assert_array_equal(op(sctype(3), [1, 2]), op(3, np.array([1, 2])))
assert_array_equal(op([1, 2], sctype(3)), op(np.array([1, 2]), 3))
Member Author

So yes, the new path aligns longdouble with the others here. One could argue this isn't right, in which case we would remove the array conversion step.

(It may mean deciding that one test in gh-26905 just can't pass, because we simply cannot deal with subclasses of Python floats even if they get converted to a float array later. Although it can go either way, since you can still convert but only allow the was_scalar path.)
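To make the alignment concrete, a small sketch using float64, which has long coerced list operands this way (per the test above, longdouble matches it as of NumPy 2.1):

```python
import numpy as np
import operator

# A scalar op with a list coerces the list to an array, so the
# result is an array, matching op(3, np.array([1, 2])).
scalar_result = operator.add(np.float64(3), [1, 2])
array_result = operator.add(3, np.array([1, 2]))

assert isinstance(scalar_result, np.ndarray)
assert (scalar_result == array_result).all()
```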

@seberg seberg added this to the 2.1.0 release milestone Jul 12, 2024
@ngoldbaum ngoldbaum self-requested a review July 13, 2024 15:34
Member

@ngoldbaum ngoldbaum left a comment


I don't have a lot of context for the details here, but I think removing special cases for longdoubles makes a lot of sense, and the logic behind the retrying for object arrays is explained in a lot of detail.

Sorry for taking so long to look at this.

/* self_item can be used to retry the operation */
*self_op = self_item;
return 0;
}
Member

Don't you leak self_item if you get to this last return 0? Maybe also in the case where you assign it to self_op; I haven't looked at how reference counting works in the caller.

* Generally, we try dropping through to the array path here,
* but this can lead to infinite recursions for (c)longdouble.
* We drop through to the generic path here which checks for the
* (c)longdouble infinite recursion problem.
Member

Might make sense to just refer to this as the "infinite recursion problem", without special casing longdouble anymore. Maybe also leave a gh-XXXX reference.

Member Author

For this file only longdouble is problematic, but fair; removed it and added some references.

@@ -875,8 +876,8 @@ def test_operator_object_left(o, op, type_):


@given(sampled_from(objecty_things),
-      sampled_from(reasonable_operators_for_scalars),
+      sampled_from(reasonable_operators_for_scalars + [operator.xor]),
       sampled_from(types))
Member

Does stuff break if you add xor to reasonable_operators_for_scalars? If not, I'd just add it there. If so, maybe add a comment explaining why only xor is added here?

Member Author

I'll think a bit about whether I can come up with a change. I think that would be a rename of the variable rather than a comment here.

@ngoldbaum ngoldbaum added the triage review Issue/PR to be discussed at the next triage meeting label Jul 22, 2024
@ngoldbaum ngoldbaum merged commit 7d6973a into numpy:main Jul 24, 2024
68 checks passed
@ngoldbaum
Member

We chatted about this at the triage meeting and I got some extra context about why we're doing this.

@seberg seberg deleted the scalar-recursion-fix branch July 24, 2024 18:29
@mroeschke
Contributor

mroeschke commented Aug 5, 2024

I think this broke a use case in pandas where scalar arithmetic should have delegated to __array_ufunc__?

In [3]: class Array:
   ...:     def __array_ufunc__(self, ufunc, method, *inputs, **kwargs):
   ...:         return 1
   ...: 

In [4]: np.timedelta64(120,'m') + Array()
Out[4]: 1

In [5]: np.__version__
Out[5]: '1.26.4'

In [3]: class Array:
   ...:     def __array_ufunc__(self, ufunc, method, *inputs, **kwargs):
   ...:         return 1
   ...: 

In [4]: np.timedelta64(120,'m') + Array()
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[4], line 1
----> 1 np.timedelta64(120,'m') + Array()

TypeError: unsupported operand type(s) for +: 'datetime.timedelta' and 'Array'

In [5]: np.__version__
Out[5]: '2.1.0.dev0+git20240802.0469e1d'

@seberg
Member Author

seberg commented Aug 5, 2024

Hah, yes it defers still, but if you don't implement __radd__ then the ufunc is never called now. Hmmm...
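A user-side sketch of that observation (assuming only what the comment above states, not the eventual fix): defining __radd__ lets Python's reflected-operator fallback route back into the ufunc machinery, which then dispatches to __array_ufunc__.

```python
import numpy as np

class Array:
    def __array_ufunc__(self, ufunc, method, *inputs, **kwargs):
        return 1

    def __radd__(self, other):
        # Reflected add re-enters the ufunc machinery, which
        # dispatches to __array_ufunc__ above.
        return np.add(other, self)

result = np.timedelta64(120, "m") + Array()
assert result == 1
```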

@seberg
Member Author

seberg commented Aug 6, 2024

OK, there are two problems here (the second is even more subtle, I guess):

  1. If the other object has the __array_ufunc__ attribute, we promise that the ufunc is called (unless we already deferred because it is None). It might be nice to incorporate that into the macro, but that seems more tedious than worth it.
    Basically, the implementations all work on the assumption that if we don't defer, we quickly end up calling the ufunc. This PR broke that assumption partially because that is necessary to break recursion.
  2. There is a subtle issue that converting the array early might lose the __array_wrap__ information. I doubt there is a great solution for that, since I don't want to duplicate logic. Since this is already fast-path for most cases, not sure if there is a need to worry about the "unnecessary" array conversion.

@seberg
Member Author

seberg commented Aug 6, 2024

OK, gh-27117 should fix 1. I looked at 2. a bit closer and I think it's not worth worrying about it, because the only way one could think it matters is for things that have the exact scalar array-priority, which is weird.
Otherwise, I suppose it could matter for scalar subclasses that define a priority, but we never supported that anyway. (Which seems fine to me: subclassing them is bound to create inconsistencies all over the place anyway.)
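For reference, a sketch of the array-priority deferral being discussed (the class name here is hypothetical): a non-array object with a sufficiently high __array_priority__ and a reflected method wins the binary op against a NumPy scalar.

```python
import numpy as np

class Prio:
    # High priority tells NumPy scalars/arrays to defer to us.
    __array_priority__ = 1000.0

    def __radd__(self, other):
        return "deferred"

# The scalar defers, so Python falls back to Prio.__radd__.
assert np.float64(1.0) + Prio() == "deferred"
```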

@mroeschke
Contributor

At least from the pandas side, issue 2. shouldn't be significant since there's only one use of __array_wrap__ which doesn't have significant logic to it.

Labels: 00 - Bug, triage review (Issue/PR to be discussed at the next triage meeting)
Successfully merging this pull request may close these issues.

BUG: Segmentation fault when doing a "Bitwise OR" on a np.longdouble with a None value
3 participants