-
Notifications
You must be signed in to change notification settings - Fork 7.8k
Fix GH-16870: gmp_pow(64, 11) throws overflow exception #16884
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
The threshold logic should be unified with #16880 and other places in gmp.c. |
ext/gmp/gmp.c
Outdated
|
||
if (Z_TYPE_P(base_arg) == IS_LONG && Z_LVAL_P(base_arg) >= 0) { | ||
INIT_GMP_RETVAL(gmpnum_result); | ||
if ((log(Z_LVAL_P(base_arg)) * exp) > powmax) { | ||
if ((log(Z_LVAL_P(base_arg)) / log(256) * exp) > powmax) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is basically what is already happening in this patch. The difference is that I'm counting bytes, not bits. The logic is to estimate the number of bytes the result would have, and if this is greater than some threshold (powmax
is badly named; should be rather max_size_in_bytes
or so), trigger a value error.
mpz_sizeinbase()
is only used in the else branch, because if the input is a zend_long
, that wouldn't work. So I'm calculating log256(base_arg) * exp
instead.
I'll update to counting bytes.
Oh, I had only checked with 64bit builds, but now noticed that two test cases are failing on 32bit. I could further reduce |
I think this fix is correct, so we may as well apply it to lower branches. We could also have different thresholds for 32 and 64 bit. |
The current threshold is very low; it would only allow for numbers with roughly 2^14 bits (while in theory 2^34 bits on 32bit systems, and 2^37 bits on 64bit systems would be supported). While that would be sufficient for the proposed test cases #16898 and #16896, I think we should actually raise that threshold (and fix the failing test cases). Even 2^24 bits shouldn't cause serious issues regarding memory consumption (would take about 2MB per number; default memory_limit is 128MB). And for users doing some scientific computing, even that might be too low. |
No, not really. The problem is that Negative bases will never take the special casing for ints, but otherwise have the same issue regarding the miscalculation for PS: from the
So subtracting 1 indeed makes sense. |
That is equivalent to `floor(log2(Z_LVAL_P(base_arg)))` and as such more suitable[1]. Since this now fails gmp_pow_fpe.phpt, we use an int there. [1] <php#16884 (comment)>
Added two commits, rebased onto PHP-8.2 and force-pushed. |
The current guard to prevent FPEs is way too restrictive; `64 ** 11` is a perfectly reasonable operation. Instead, we now estimate the number of bytes of the resulting GMP (assuming that numbers are stored base 256 encoded), and fail if that exceeds a given threshold. The chosen threshold is somewhat arbitrary. We also ensure that we do not prematurely convert a given non int base to an int to avoid overflow which could circumvent our early check.
That is equivalent to `floor(log2(Z_LVAL_P(base_arg)))` and as such more suitable[1]. Since this now fails gmp_pow_fpe.phpt, we use an int there. [1] <php#16884 (comment)>
`[-1, 1]` are special, since the result always has a single digit (ignoring the sign). We also test the edge cases regarding the difference between `log2(x)` and `mpz_sizeinbase(x, 2) - 1`, where the latter may support higher numbers (always less than an order of binary magnitude). These tests need to be adapted if we change `max_bits` (what we likely will do).
Rebased and force-pushed again (it's a pain to switch target branches; I'm not even sure this will now run on PHP-8.2 ; note to self: first change target branch, then force-push rebased PR). |
|
||
if (Z_TYPE_P(base_arg) == IS_LONG && Z_LVAL_P(base_arg) >= 0) { | ||
INIT_GMP_RETVAL(gmpnum_result); | ||
if ((log(Z_LVAL_P(base_arg)) * exp) > powmax) { | ||
if ((log2(Z_LVAL_P(base_arg)) * exp) > max_bits) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For parity with the else branch, this should be
if ((log2(Z_LVAL_P(base_arg)) * exp) > max_bits) { | |
if ((floor(log2(Z_LVAL_P(base_arg))) * exp) > max_bits) { |
But that would fail the last check in gmp_pow_fpe.phpt.
@@ -1350,22 +1350,20 @@ ZEND_FUNCTION(gmp_pow) | |||
RETURN_THROWS(); | |||
} | |||
|
|||
double powmax = log((double)ZEND_LONG_MAX); | |||
double max_bits = 16000; // TODO: value is very small, but passed current test suite |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
max_bits
is a misnomer now; it's actually something like max_double_bits
. Maybe someone has an idea for a good name. Alternatively, we could double the value here, and check against > max_bits / 2.0
below. Shouldn't be a performance issue, if we declare const
or use a macro.
var_dump((string) gmp_pow(64, 11)); | ||
var_dump((string) gmp_pow(-1, PHP_INT_MAX)); | ||
var_dump((string) gmp_pow(0, PHP_INT_MAX)); | ||
var_dump((string) gmp_pow(1, PHP_INT_MAX)); | ||
var_dump((string) gmp_pow("-1", PHP_INT_MAX)); | ||
var_dump((string) gmp_pow("0", PHP_INT_MAX)); | ||
var_dump((string) gmp_pow("1", PHP_INT_MAX)); | ||
var_dump((string) gmp_pow(3, 10094)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add tests that use the **
operator, as those would also be affected I guess? Or did we somehow not covered this in the initial bug fix?
Maybe we shouldn't try to determine the threshold, scientific computation should be allowed to do whatever they want within their memory bounds. I must say the whole bits computation and comparison is confusing to me, like I understand the logic of it, but the implementation of that is another story, which is why I think your previous fix was correct (ignoring the threshold). I don't understand why you would want to double the |
We're trying to estimate the number of bits the result needs, but we're actually estimating the number of double-bits (= number of bits divided by 2), since we're using
Usually I'm very fine with this approach, but libgmp Anyhow, this is not a hill I'm willing to die on. I think we can just go with #16934 instead. |
I think the issue found by the fuzzer is just very theoretical, and unlikely TM to happen. Ideally, setting up GMP to use ZendMM would be best, but that apparently also has issues which is annoying... |
The current guard to prevent FPEs is way too restrictive;
64 ** 11
is a perfectly reasonable operation. Instead, we now estimate the number of bytes of the resulting GMP (assuming that numbers are stored base 256 encoded), and fail if that exceeds a given threshold. The chosen threshold is somewhat arbitrary.We also ensure that we do not prematurely convert a given non int base to an int to avoid overflow which could circumvent our early check.
Note that the threshold of 2000 seems way too small. To avoid triggering overflow handling during allocations it should rather be close to
SIZE_MAX
(or whatever libgmp/mpir assume there), but even 10,000 would causephp-src/ext/gmp/tests/gmp_pow_fpe.phpt
Lines 20 to 29 in ff3b4ec
to no longer throw, but instead yield reasonable results: numbers with a bit less than 10,000 and 5,000 digits, respectively. It's hard for me to tell whether these examples would actually cause libgmp's overflow check to trigger, since mpir doesn't even have these overflow checks.
In any way, it might not be the best idea to push the limits of our early overflow check just to avoid overflow checks to be triggered by libgmp, since we're likely hitting OOM, which causes abort.