BUG: Ensure einsum uses chunking (now that nditer doesn't) #28043
Conversation
The nditer refactor enabled `GROWINNER` for reductions; this was not the case before (I am not sure about cases where the reduction is in an outer axis, but I don't think so). Either way, as the test says, chunking always improves the precision of large sums if their mean is nonzero, and sklearn noticed the lower precision. We could remove `GROWINNER` only when there is a reduction, but it seems like only a 1% performance hit for the simplest (non-trivial) case where `GROWINNER` had done something before.
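To illustrate the effect, a toy sketch (not the PR's test: float32 and np.cumsum are only used here to force one long sequential accumulation so the loss shows up quickly):

import numpy as np

# Toy illustration, not the PR's test: float32 and np.cumsum force one long
# sequential accumulation so the precision loss is easy to see.
num = 20_000_000
arr = np.ones(num, dtype=np.float32)

one_long_sum = np.cumsum(arr)[-1]                        # single running total
chunk_sums = arr.reshape(-1, 10_000).sum(axis=1, dtype=np.float32)
chunked = chunk_sums.sum(dtype=np.float32)               # combine small per-chunk totals

print(one_long_sum)  # 16777216.0 -- the running total stalls at 2**24
print(chunked)       # 20000000.0 -- each chunk stays small enough to be summed exactly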
Thanks a lot for looking at this! I can confirm that this PR fixes scikit-learn/scikit-learn#30509.
Seems a sensible solution for now. Two small comments inline.
numpy/_core/tests/test_einsum.py
Outdated
""" | ||
num = 100000000 | ||
value = 1.00000000002 | ||
res = np.einsum("i->", np.full(num, value)) / num |
I'd tend to use float32 and fewer elements, just to keep the test fast (now 147 ms on my machine).
Doesn't work :). Einsum uses double for the accumulation, so for float32, the results are actually better without chunking...
But let me use broadcasting and make the number smaller; it certainly doesn't need this much.
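To illustrate the point (assuming, as just said, that the float32 einsum loop accumulates in double):

import numpy as np

# Assumes, per the comment above, that einsum accumulates float32 input in
# double: the reduction then comes out exact even for totals a plain float32
# running sum could no longer represent (compare the cumsum sketch further up).
arr = np.ones(20_000_000, dtype=np.float32)
print(np.einsum("i->", arr))   # expected 20000000.0, i.e. no float32-style saturation at 2**24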
@@ -1035,7 +1035,6 @@ PyArray_EinsteinSum(char *subscripts, npy_intp nop,
     iter_flags = NPY_ITER_EXTERNAL_LOOP|
                  NPY_ITER_BUFFERED|
                  NPY_ITER_DELAY_BUFALLOC|
-                 NPY_ITER_GROWINNER|
How about adding a comment here? Partially as a reminder that a pair-wise sum might be a good idea...
Yeah, didn't think of the "let's do pairwise" spin, so wasn't sure.
Super!
Great, thanks a lot for the investigation and the fix! Maybe one day I will manage to understand these tricky floating point things, maybe one day 😅.
Conceptually, (in decimals) it comes down to this: every addition into a total that has already reached roughly 100000000 gets rounded to the digits the total can still hold, so each add can be off by up to half of its last retained digit, and doing that a hundred million times piles those errors up. But if you chunk it up into batches of 10 (or a few thousand, as the buffering does), each batch is summed while the partial total is still small and essentially exact, and only the far fewer batch totals are added into the big running sum.
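The same thing in actual doubles (toy numbers, not the values from the PR's test):

import numpy as np

# Once the running total is large, the spacing between representable doubles
# is bigger than the tiny fractional part of each added element, so that part
# is simply rounded away.
total = 1.0e8
print(np.spacing(total))                # ~1.49e-08: smallest change the total can register
print((total + 1.00000000002) - total)  # 1.0 -- the ...00000000002 is lost entirely

# A small partial sum has spacing around 1e-15, so the fraction survives:
small = 10.0
print((small + 1.00000000002) - small)  # ~1.00000000002

# Chunking keeps the partial sums in this small regime and only combines them
# at the end, which is why it improves the precision of large sums.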
I suspect this is a pragmatic thing (little downside, better precision for large arrays). An even better thing would be pairwise summation, within einsum I suppose.
I.e. the reason for the chunking used to be that the iterator didn't manage to optimize this (there is an explicit code comment about it). But now it has become a deliberate choice to help the sum precision a bit...
See also scikit-learn/scikit-learn#30509, ping @lesteve
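For reference, pairwise summation recursively splits the array and adds the halves, so no single running total accumulates many rounding errors. A rough Python sketch, purely as an illustration (this is not what einsum does, and np.sum already uses a similar strategy internally):

import numpy as np

def pairwise_sum(a, block=128):
    """Toy pairwise summation: split recursively, sum small blocks directly."""
    if a.size <= block:
        total = 0.0
        for x in a:                # plain sequential sum for a small block
            total += float(x)
        return total
    mid = a.size // 2
    return pairwise_sum(a[:mid], block) + pairwise_sum(a[mid:], block)

arr = np.full(1_000_000, 1.00000000002)
print(pairwise_sum(arr) / arr.size)    # stays very close to 1.00000000002

With a scheme like this the rounding error grows roughly with log(n) instead of n; the chunking restored by this PR is a cheaper step in the same direction.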