Skip to content

gh-118164: str(10**10000) hangs if the C _decimal module is missing #118503

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 17 commits into from
May 4, 2024

Conversation

tim-one
Copy link
Member

@tim-one tim-one commented May 2, 2024

This is somewhat of a conceptual mess, related to two issues:

#118164

str(10**10000) hangs if the C _decimal module is missing

#118027

_pydecimal.Decimal.pow tries to compute 10**1000000000000000000 and thus doesn't terminate

These only occur if the C decimal module isn't available, so _pydecimal.py is used instead. But CPython doesn't use _pydecimal.py for anything anymore, so this can only happen in a damaged CPython installation.. Or, significantly, under PyPy, which hasn't incorporated the C libmpdec CPython uses to implement decimal.

Confounding it is that the first issue above can't be tested even after this PR's changes. This PR repairs the proximate cause of the failure, which causes it to fail instead later. That failure is the subject of a different PR:

#118483

If someone can think of a sane way to test this, I'd be grateful 😄l.

@rhettinger rhettinger removed their request for review May 3, 2024 00:15
@serhiy-storchaka
Copy link
Member

I came to the same solution.

The simple test can be based on the #118027 example.

But the case of xc exactly equal to 10**p is not currently covered by tests. So you can remove _is_power_of_10, and the tests will still pass. How difficult is to write a test for this case?

@tim-one
Copy link
Member Author

tim-one commented May 3, 2024

EDIT: never mind. I still don't understand how _pydecimal gets tested at all (if it in fact does), but I did find a simple way to force test_decimal.py to try the simple test described below using _pydecimal.py.

I don't know anything about how to write tests for _pydecimal, and after too many minutes of staring at test_decimcal.py couldn't even figure out how (or if) it gets tested at all.

@rhettinger, is this shallow to you? I need help to get started. I'd like to add a test that sets the context Emax to decimal.MAX_EMAX, and then evaluates decimal.Decimal(2) ** 117. That's it. But it has to be tried under _pydecimal.py (where it currently appears to hang, while computing 10**MAX_EMAX).

@tim-one
Copy link
Member Author

tim-one commented May 3, 2024

But the case of xc exactly equal to 10**p is not currently covered ... How difficult is to write a test for this case?

No idea. xc at this point has been transformed from what it was at input, and so far nobody seems to know what its possible range may be at this point in the code (nor at the other point where 10**p is computed) . Neither how to contrive an input that ends up with xc being large at these points. Any input xc that's an integer power of 10 >= 1 was handled long before reaching these points.

@tim-one
Copy link
Member Author

tim-one commented May 4, 2024

A smidgen of progress: in the first block that tests whether xc > 10**p, xc was transformed to be one of 2**e or 5**e. Neither can be a power of 10 (unless e is 0), so only len(str(xs)) <= p is possibly relevant.

@tim-one
Copy link
Member Author

tim-one commented May 4, 2024

And another smidgen: in the other block, xc ix changed to xc**m. That could be very large on its own, but the code doing that - before doing that - does a fast check using a lower bound on log10(xc) which weeds out results materially larger than 10**p. So our str(xc) in this block can at worst only have a few more than p digits. Checking it against 10**p exactly before converting to string wasn't really important, but faster in "too large!" cases. Which never happen when p is in the quintillions 😉

@tim-one
Copy link
Member Author

tim-one commented May 4, 2024

After the comment here:

        # if m > p*100//_log10_lb(xc) then m > p/log10(xc), hence xc**m >
        # 10**p and the result is not representable.
        if xc > 1 and m > p*100//_log10_lb(xc):
            return None

I inserted:

        assert xc != 1, self
        assert not _is_power_of_10(str(xc)), self

The second assert is more general. No code I have triggers them (including test_decimal), and I haven't been able to contrive inputs that trigger them.

So I've so far failed to falsify my original guess that, near the end of this code, xc is never a power of 10. In which case, xc**m can never be a power of 10 either, so that in the original xc > 10**p, xc == 10**p isn't possible.

Yet the code guards against xc == 1 anyway. For example, in the code quoted above, xc > 1 and ..., best I can tell xc > 1 is always true (for other reasons xc cannot be <= 0).

If so, doesn't mean there's a bug, just that the code is being a tiny bit more cautious than necessary. For example, perhaps this code was written before the much earlier code in the function that special-cases the snot out of inputs where xc is a power of 10 >= 1 (and it's because of that code that I first believed all such cases were probably handled in full early on).

Also if so, str(xc) > p alone is both necessary and sufficient for this function to return None. The case of a 1 followed by p (or any other number of) zeros can't happen in the two code blocks at issue.

tim-one and others added 4 commits May 4, 2024 13:39
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
Serhiy and I independently concluded that exact powers of 10
aren't possible in these contexts, so just checking the
string length is sufficient.
Copy link
Member

@serhiy-storchaka serhiy-storchaka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@tim-one tim-one merged commit 999f0c5 into python:main May 4, 2024
@tim-one tim-one deleted the pydec2 branch May 4, 2024 23:22
@serhiy-storchaka serhiy-storchaka added the needs backport to 3.12 only security fixes label May 5, 2024
@miss-islington-app
Copy link

Thanks @tim-one for the PR 🌮🎉.. I'm working now to backport this PR to: 3.12.
🐍🍒⛏🤖

miss-islington pushed a commit to miss-islington/cpython that referenced this pull request May 5, 2024
…sing (pythonGH-118503)

* Initial stab.

* Test the tentative fix. Hangs "forever" without this change.

* Move the new test to a better spot.

* New comment to explain why _convert_to_str allows any poewr of 10.

* Fixed a comment, and fleshed out an existing test that appeared unfinished.

* Added temporary asserts. Or maybe permanent ;-)

* Update Lib/_pydecimal.py

Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>

* Remove the new _convert_to_str().

Serhiy and I independently concluded that exact powers of 10
aren't possible in these contexts, so just checking the
string length is sufficient.

* At least for now, add the asserts to the other block too.

* 📜🤖 Added by blurb_it.

---------

(cherry picked from commit 999f0c5)

Co-authored-by: Tim Peters <tim.peters@gmail.com>
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
Co-authored-by: blurb-it[bot] <43283697+blurb-it[bot]@users.noreply.github.com>
@bedevere-app
Copy link

bedevere-app bot commented May 5, 2024

GH-118584 is a backport of this pull request to the 3.12 branch.

@bedevere-app bedevere-app bot removed the needs backport to 3.12 only security fixes label May 5, 2024
@serhiy-storchaka
Copy link
Member

I think that this was a bug, and this change should be backported.

@tim-one
Copy link
Member Author

tim-one commented May 5, 2024

@serhiy-storchaka, backporting is a fine idea. I'll leave it to you 😄.

serhiy-storchaka added a commit that referenced this pull request May 5, 2024
…ssing (GH-118503) (GH-118584)

Serhiy and I independently concluded that exact powers of 10
aren't possible in these contexts, so just checking the
string length is sufficient.

(cherry picked from commit 999f0c5)

Co-authored-by: Tim Peters <tim.peters@gmail.com>
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
SonicField pushed a commit to SonicField/cpython that referenced this pull request May 8, 2024
…sing (python#118503)

* Initial stab.

* Test the tentative fix. Hangs "forever" without this change.

* Move the new test to a better spot.

* New comment to explain why _convert_to_str allows any poewr of 10.

* Fixed a comment, and fleshed out an existing test that appeared unfinished.

* Added temporary asserts. Or maybe permanent ;-)

* Update Lib/_pydecimal.py

Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>

* Remove the new _convert_to_str().

Serhiy and I independently concluded that exact powers of 10
aren't possible in these contexts, so just checking the
string length is sufficient.

* At least for now, add the asserts to the other block too.

* 📜🤖 Added by blurb_it.

---------

Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
Co-authored-by: blurb-it[bot] <43283697+blurb-it[bot]@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants