Skip to content

Commit 27de286

Browse files
authored
bpo-41513: Save unnecessary steps in the hypot() calculation (#21994)
1 parent e6dcd37 commit 27de286

File tree

1 file changed

+10
-5
lines changed

1 file changed

+10
-5
lines changed

Modules/mathmodule.c

Lines changed: 10 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -2447,6 +2447,14 @@ precision. When a value at or below 1.0 is correctly rounded, it
24472447
never goes above 1.0. And when values at or below 1.0 are squared,
24482448
they remain at or below 1.0, thus preserving the summation invariant.
24492449
2450+
Another interesting assertion is that csum+lo*lo == csum. In the loop,
2451+
each scaled vector element has a magnitude less than 1.0. After the
2452+
Veltkamp split, *lo* has a maximum value of 2**-27. So the maximum
2453+
value of *lo* squared is 2**-54. The value of ulp(1.0)/2.0 is 2**-53.
2454+
Given that csum >= 1.0, we have:
2455+
lo**2 <= 2**-54 < 2**-53 == 1/2*ulp(1.0) <= ulp(csum)/2
2456+
Since lo**2 is less than 1/2 ulp(csum), we have csum+lo*lo == csum.
2457+
24502458
The square root differential correction is needed because a
24512459
correctly rounded square root of a correctly rounded sum of
24522460
squares can still be off by as much as one ulp.
@@ -2519,11 +2527,8 @@ vector_norm(Py_ssize_t n, double *vec, double max, int found_nan)
25192527
csum += x;
25202528
frac += (oldcsum - csum) + x;
25212529

2522-
x = lo * lo;
2523-
assert(fabs(csum) >= fabs(x));
2524-
oldcsum = csum;
2525-
csum += x;
2526-
frac += (oldcsum - csum) + x;
2530+
assert(csum + lo * lo == csum);
2531+
frac += lo * lo;
25272532
}
25282533
h = sqrt(csum - 1.0 + frac);
25292534

0 commit comments

Comments
 (0)