-
-
Notifications
You must be signed in to change notification settings - Fork 31.8k
gh-96512: Modernize int_max_str_digits protection in 3.12 #96568
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't get the rationale for audit events on operations which can be slow. Is that a new trend? If you consider that it's so important to emit audit events on "slow" operations, why not adding on every single Python operation which is "slow"? Like... time.sleep()
, 10 ** (10 ** 10)
, or well... anything else.
It's hard for me to see how converting a large integer to a string, or a long string to an integer, is a threat. There is already the ValueErrror if it's larger than the limit.
Calling PySys_Audit() is cheap but it's not free. It has a minor impact on performance. Once it's added, it will hard to justify to remove it to "optimize Python". So I would prefer that we think twice about it.
When you're done making the requested changes, leave the comment: |
What's the rationale for that? It makes this specific option special and requires special code in Python/initconfig.c and other places. I would prefer to use an int for all integer options. If you want to put min or max values, you can add debug checks in config_check_consistency() and checks in _PyConfig_Read(). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR is hard to review, it does many things at once:
- Add PyConfig.int_max_str_digits: you should add tests
- Add a new int32_t type for PyConfig
- Add new audit events
- Update the int_max_str_digits documentation
You might split it into separated PR, like move the addition of audit events to a new PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With the adding of PyConfig.int_max_str_digits, IMO PyInterpreterState.int_max_str_digits becomes redundant and can introduce inconsistencies. I suggest removing it.
In PyInterpreterState, PyInterpreterState.config is a directly structure, not a pointer. At the machine code level, reading PyInterpreterState.config.int_max_str_digits is efficient: the offset in the memory is constant.
nitpick: in the commit message, you should mention that the change also removes _Py_global_config_int_max_str_digits. I suggest to start with a PR with replaces _Py_global_config_int_max_str_digits + PyInterpreterState.int_max_str_digits with PyConfig.int_max_str_digits. |
Thanks! Yep this is entirely reasonable to split up into separate PRs. The audit hook can go away, it was just an idea. I've personally never used those. I used individual commits for each list item to make wrangling this into smaller PRs after concept review easy, will do. Using |
|
Let's just leave it as is. |
No, not safely, unless we know for sure that I'm more than happy to make |
In PyConfig, we can reject values which don't fit in an uint32_t, to make sure that the cast is safe. No? What's the issue on |
Assuming you mean
The usual one whenever you're converting between two signed integer types that aren't known to have the same width - potential for undefined behaviour from overflow. There's no guarantee from the C standard that I've put significant effort over the years into making sure that Object/longobject.c is highly portable and depends on very little beyond what's guaranteed by the C standards; I care very much about keeping that the case. But really, this isn't worth the discussion at this point; I really wasn't anticipating that my request to use Py_ssize_t instead of |
Will do. Crafting the change was an informative exercise regardless. I'm a fan of always using explicitly sized types, so any attempt to do so in a codebase that didn't start out that way reveals how ready it is or isn't. I'll dive back in on this code next week sometime, but questions in the back of my mind before I do that you may know already: PyConfig vs PyInterpreterState vs subinterpreters? Right now I believe each subinterpreter gets its own limit. We need a test for that. I wouldn't want a Similarly, PyConfig... is that intended to be mutable beyond process startup? I thought might be meant to represent the environment and flag values at start, not not necessarily settings that were changed via APIs afterwards. thoughts? |
Mark:
I don't think that only this specific change is a good motivation to require 32-bit int to build Python. So far, we did our best to support all architectures. We accepted a change to support m68k (the code was generalized recently: commit dec0757). Hum. I checked and it seems like m68k uses 32-bit int, at least on Linux ;-) Well, it seems like on Linux, int is always 32-bit. I heard (Wikipedia) about ILP64 and SILP64 which use 64-bit int, but it seems to be outdated:
All modern operating systems use 32-bit int. |
Mark:
My concern is more about the maintenance burden of Python/initconfig.c. Adding a new variable already requires a lot of code, as you can see in this PR. I would prefer to keep the code as simple as possible, by using only a few types: I'm fine with using Gregory:
So maybe all PyConfig integer members should become int32_t? I don't know. The |
Each interpreter has its own config in Oh, there was a test API to set _isolated_interpreter=1, but it seems like it's gone. Or I misremember. Example modifying the sub-interpreter config with _testinternalcapi.set_config(): import _testcapi
import _testinternalcapi
code = """
import _testinternalcapi
config = _testinternalcapi.get_config()
config['bytes_warning'] = 2
_testinternalcapi.set_config(config)
print("bytes warnings of the sub-interpreter?",
_testinternalcapi.get_config()['bytes_warning'])
"""
_testcapi.run_in_subinterp(code)
print("bytes warnings of main interpreter?",
_testinternalcapi.get_config()['bytes_warning']) Output:
Modifying the sub-interpreter config doesn't modify the main interpreter config ;-) |
the pair of PRs mentioned above replace this with & address the feedback. |
This takes care of the TODOs outlined in the linked issue:
int_max_str_digits
into PyConfig in 3.12.int32_t
orPy_ssize_t
instead ofint
for the max digits value internally.PySys_Audit
hook calls at the new ValueError raise spots.