Skip to content

ext/bcmath: Improving bcpow() performance #18099

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 14 commits into from
Apr 29, 2025

Conversation

SakiTakamachi
Copy link
Member

@SakiTakamachi SakiTakamachi commented Mar 17, 2025

I ran 500,000 comparisons with PHP 8.3 results. (I would like to perform more comparisons, but PHP 8.3's bcpow() is too heavy and the comparisons never finish.)
No problems were detected.

Benchmarks

The speed difference is especially noticeable when base is less than 1 and there are many leading zeros.
This is because try to eliminate as many zeros as possible during the calculation, and then fill in the missing digits with zeros later.

1:

for ($i = 0; $i < 2000000; $i++) {
    bcpow('123', '4', 0);
}
Benchmark 1: /php-dev/sapi/cli/php /mount/bc/pow/1.php
  Time (mean ± σ):     244.8 ms ±   1.3 ms    [User: 235.5 ms, System: 4.2 ms]
  Range (min … max):   242.8 ms … 247.7 ms    12 runs
 
Benchmark 2: /master/sapi/cli/php /mount/bc/pow/1.php
  Time (mean ± σ):     279.4 ms ±   3.4 ms    [User: 269.1 ms, System: 5.2 ms]
  Range (min … max):   275.7 ms … 287.4 ms    10 runs
 
Summary
  '/php-dev/sapi/cli/php /mount/bc/pow/1.php' ran
    1.14 ± 0.02 times faster than '/master/sapi/cli/php /mount/bc/pow/1.php'

2:

for ($i = 0; $i < 200000; $i++) {
    bcpow('123456.789', '64', 10);
}
Benchmark 1: /php-dev/sapi/cli/php /mount/bc/pow/2.php
  Time (mean ± σ):     360.9 ms ±   1.1 ms    [User: 350.9 ms, System: 4.8 ms]
  Range (min … max):   359.4 ms … 363.2 ms    10 runs
 
Benchmark 2: /master/sapi/cli/php /mount/bc/pow/2.php
  Time (mean ± σ):     506.4 ms ±   4.3 ms    [User: 496.2 ms, System: 4.7 ms]
  Range (min … max):   501.0 ms … 514.3 ms    10 runs
 
Summary
  '/php-dev/sapi/cli/php /mount/bc/pow/2.php' ran
    1.40 ± 0.01 times faster than '/master/sapi/cli/php /mount/bc/pow/2.php'

3:

for ($i = 0; $i < 100000; $i++) {
    bcpow('0.00123456', '120', 10);
}
Benchmark 1: /php-dev/sapi/cli/php /mount/bc/pow/3.php
  Time (mean ± σ):     304.9 ms ±   2.9 ms    [User: 295.4 ms, System: 4.4 ms]
  Range (min … max):   299.9 ms … 308.7 ms    10 runs
 
Benchmark 2: /master/sapi/cli/php /mount/bc/pow/3.php
  Time (mean ± σ):     808.8 ms ±   9.0 ms    [User: 797.4 ms, System: 5.6 ms]
  Range (min … max):   799.3 ms … 827.3 ms    10 runs
 
Summary
  '/php-dev/sapi/cli/php /mount/bc/pow/3.php' ran
    2.65 ± 0.04 times faster than '/master/sapi/cli/php /mount/bc/pow/3.php'

4:

for ($i = 0; $i < 1000; $i++) {
    bcpow('0.00000000000000123456', '2000', 10);
}
Benchmark 1: /php-dev/sapi/cli/php /mount/bc/pow/4.php
  Time (mean ± σ):     433.7 ms ±   1.9 ms    [User: 422.2 ms, System: 6.2 ms]
  Range (min … max):   431.6 ms … 437.6 ms    10 runs
 
Benchmark 2: /master/sapi/cli/php /mount/bc/pow/4.php
  Time (mean ± σ):      6.024 s ±  0.071 s    [User: 6.011 s, System: 0.007 s]
  Range (min … max):    5.986 s …  6.225 s    10 runs
 
  Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet system without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.
 
Summary
  '/php-dev/sapi/cli/php /mount/bc/pow/4.php' ran
   13.89 ± 0.18 times faster than '/master/sapi/cli/php /mount/bc/pow/4.php'

@SakiTakamachi SakiTakamachi marked this pull request as ready for review March 18, 2025 00:06
@SakiTakamachi SakiTakamachi marked this pull request as draft March 18, 2025 00:31
@SakiTakamachi
Copy link
Member Author

There were still some things I wanted to change, so I reverted it to a draft.

@SakiTakamachi SakiTakamachi force-pushed the bcmath/pow branch 2 times, most recently from 33bfabf to f303bd5 Compare March 19, 2025 15:09
@SakiTakamachi SakiTakamachi marked this pull request as ready for review March 19, 2025 16:46
@SakiTakamachi
Copy link
Member Author

Ready for review.
The failed tests are irrelevant.

cc: @nielsdos

@SakiTakamachi SakiTakamachi force-pushed the bcmath/pow branch 2 times, most recently from a1f6bbf to 76d0e61 Compare March 23, 2025 10:01
@SakiTakamachi SakiTakamachi marked this pull request as draft March 23, 2025 12:04
@SakiTakamachi
Copy link
Member Author

I force pushed by mistake.

I'll open it again once I've confirmed the code 🙏

@SakiTakamachi SakiTakamachi marked this pull request as ready for review March 23, 2025 14:41
@SakiTakamachi
Copy link
Member Author

I reverted to the original code.

}

size_t base_arr_size = BC_ARR_SIZE_FROM_LEN(base_len);
size_t max_power_arr_size = base_arr_size * exponent;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this overflow?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added check for base->n_len * exponent.
So guaranteed not to overflow here. I added a comment.
b96f2d9

size_t max_power_arr_size = base_arr_size * exponent;

/* The allocated memory area is reused on a rotational basis, so the same size is required. */
BC_VECTOR *buf = safe_emalloc(max_power_arr_size * 3, sizeof(BC_VECTOR), 0);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should probably be safe_emalloc(max_power_arr_size, 3 * sizeof(BC_VECTOR), 0); for safety

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 2556951

Comment on lines 104 to 107
/* Remove the leading zeros as they will be filled in later. */
while (*base_ptr++ == 0) {
base_len--;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will already move base_ptr even if the first element is not 0. Did you mean this?

Suggested change
/* Remove the leading zeros as they will be filled in later. */
while (*base_ptr++ == 0) {
base_len--;
}
/* Remove the leading zeros as they will be filled in later. */
while (*base_ptr == 0) {
base_ptr++;
base_len--;
}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, damn, you're absolutely right. Thanks!
Fixed in 2556951

bc_square_ex(power, &power, pwrscale);
exponent = exponent >> 1;
size_t base_len = base->n_len + base->n_scale;
size_t power_len = base->n_len * exponent;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder which of these can overflow

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added check in b96f2d9
Thanks!


/* Pad with leading zeros if necessary. */
while (power_leading_zeros > sizeof(uint32_t)) {
bc_write_bcd_representation(0, pptr);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an inefficient way of writing the zeros, better use one call to memset instead of this while+for loops actually.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had seen somewhere that memset was slow, so I had been avoiding it.
I've now updated the code to use memset.

in 2556951

if (!bc_raise(first, exponent, &result, scale)) {
zend_throw_exception_ex(zend_ce_division_by_zero_error, 0, "Negative power of zero");
goto cleanup;
switch (bc_raise(first, exponent, &result, scale)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code to handle the errors is duplicated. I think it would be great to have a separate function that throws the correct exception when bc_raise fails. Then, here and in bcmath_number_pow_internal you can call that function is bc_raise return value != BC_RAISE_STATUS_OK.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 20c9309

@@ -74,6 +74,10 @@ typedef struct bc_struct {
#define MAX(a, b) ((a)>(b)?(a):(b))
#define MIN(a, b) ((a)>(b)?(b):(a))

#ifndef SIZE_T_MAX
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a standard macro SIZE_MAX in limits.h

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 54729fd

/* check overflow */
if (UNEXPECTED(base->n_len > SIZE_T_MAX / exponent)) {
bc_free_num (result);
*result = bc_copy_num(BCG(_one_));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is it necessary to return a number in result anyway after this check fails? (Same question below)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, there was nothing that required action here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 91e3890

@SakiTakamachi SakiTakamachi merged commit c5f3281 into php:master Apr 29, 2025
9 checks passed
@SakiTakamachi SakiTakamachi deleted the bcmath/pow branch April 29, 2025 23:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants