gh-128942: make arraymodule.c free-thread safe (lock-free) #130771
base: main
Conversation
ping @colesbury
Nice! Disclaimer: I'm not an expert on the FT list implementation, so take some of my comments with a grain of salt.
Seeing good single-threaded performance is nice, but what about multi-threaded scaling? The number of locks that are still here scares me a little--it would be nice if this scaled well for concurrent use as well, especially for operations that don't require concurrent writes (e.g., comparisons and copies).
Note, this is not ready to go; there is the memory issue, which needs resolving.
@ZeroIntensity you can remove the do-not-merge, it's not an
The main thing here for acceptance is a benchmark run, which I am not able to start (I only did a local pyperformance check against main), so someone with access will have to initiate that to compare with main.
I haven't gotten a chance to look through arraymodule.c yet. I'll review that later this week.
The overall approach here seems good. A few comments below.
The actual
Are there any other places where this needs to take place? It's the test and trying to run it with
Which is not Left the bad
I'd like
Yes,
Almost there!
Co-authored-by: Peter Bierma <zintensitydev@gmail.com>
It's been a while, so I forgot that the array module shouldn't be in Setup.stdlib.in but rather in Setup.bootstrap.in, so that it is compiled as a built-in as per #130771 (comment). Sent up a fix.
Modules/arraymodule.c
Outdated
         }
     }
     else { // typecode == 'w'
         Py_ssize_t ustr_length = PyUnicode_GetLength(ustr);
         Py_ssize_t old_size = Py_SIZE(self);
         Py_ssize_t new_size = old_size + ustr_length;

-        if (new_size < 0 || (size_t)new_size > PY_SSIZE_T_MAX / sizeof(Py_UCS4)) {
+        if (new_size < 0 || !arraydata_size_valid(new_size, sizeof(Py_UCS4))) {
new_size < 0 checks for overflow here. Maybe it is worth clarifying that.
Moved the < 0 check into arraydata_size_valid() instead. I can technically remove some of the other checks like Py_SIZE(self) > PY_SSIZE_T_MAX - Py_SIZE(b) with this, but want opinion first.
I'm not a big expert in C, but AFAIK signed integer overflow is undefined behavior in C, so it should be avoided as far as I can tell.
Well, if you really want to enforce that, then this code was non-compliant to begin with and continues to be so, with or without the new_size < 0 check. But I'm guessing the "correct" behavior would be overkill, as it is not used elsewhere here either.
Or if you want I could make them all compliant, your call.
I see two ways:
1. size_t new_size = old_size + ustr_length; if (new_size > PY_SSIZE_T_MAX || ...
2. if ((size_t)old_size + ustr_length > PY_SSIZE_T_MAX) { return PyErr_NoMemory(); } Py_ssize_t new_size = old_size + ustr_length; ...
It is up to you which is best here.
Yeah, that is mostly what I meant. But if you want to dot all the i's, size_t new_size = old_size + ustr_length is UB from what I understand without the explicit cast: size_t new_size = (size_t)old_size + ustr_length.
Also note that the if ((size_t)old_size + ustr_length > PY_SSIZE_T_MAX) check is handled implicitly by the check in arraydata_size_valid(). And array_resize(self, new_size) called with a size_t new_size that is guaranteed to be <= PY_SSIZE_T_MAX is well defined.
size_t new_size = (size_t)old_size + ustr_length: yeah, good catch!
if ((size_t)old_size + ustr_length > PY_SSIZE_T_MAX): I mean that you should check for overflow before declaring a Py_ssize_t new_size variable. If you use size_t new_size, it is not necessary.
Yeah, we are in agreement.
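For readers following the overflow discussion above, here is a minimal standalone sketch of the agreed pattern: cast to size_t before adding, then compare against the maximum divided by the item size. It is not the PR's code. ssize_sketch_t, SSIZE_SKETCH_MAX and checked_new_size() are hypothetical stand-ins (for Py_ssize_t, PY_SSIZE_T_MAX and the real arraydata_size_valid() check respectively) so that the example compiles without Python.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for Py_ssize_t / PY_SSIZE_T_MAX (assumption: hypothetical names
   used only so this sketch builds without Python.h). */
typedef ptrdiff_t ssize_sketch_t;
#define SSIZE_SKETCH_MAX PTRDIFF_MAX

/* Hypothetical helper, not the PR's arraydata_size_valid().
   Stores old_size + extra in *result and returns 0 when both the element
   count and the byte count (count * itemsize) fit in ssize_sketch_t;
   returns -1 otherwise and leaves *result untouched. */
static int
checked_new_size(ssize_sketch_t old_size, ssize_sketch_t extra,
                 size_t itemsize, ssize_sketch_t *result)
{
    assert(old_size >= 0 && extra >= 0 && itemsize > 0);

    /* Cast BEFORE adding. Signed overflow is undefined behavior, whereas
       unsigned arithmetic is defined. On mainstream platforms SIZE_MAX is
       at least 2 * PTRDIFF_MAX, so the sum of two non-negative values
       cannot wrap here. */
    size_t sum = (size_t)old_size + (size_t)extra;

    /* One comparison covers both "count too large" and "byte size too
       large", because MAX / itemsize <= MAX. */
    if (sum > (size_t)SSIZE_SKETCH_MAX / itemsize) {
        return -1;
    }
    *result = (ssize_sketch_t)sum;
    return 0;
}
```

With this shape the separate new_size < 0 test becomes unnecessary, since the signed sum is never formed unless it is known to fit.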
Lib/test/test_array.py
Outdated
    # array module is not free-thread safe.

    def setUp(self):
        self.enterContext(warnings.catch_warnings())
What warnings are emitted?
I honestly don't remember. It might be from when the thing was non-racy and running TSAN tests, but useless now so will remove.
I don't think we need these changes anymore as we aren't using _Alignas.
I notice _Alignof in a few places, so will leave it as a keyword but remove the _Alignas rules.
I added lock-free single element reads and writes by mostly copying the list object's homework. TL;DR: pyperformance scimark seems to be back to about what it was without the free-thread safe stuff (pending confirmation of course). Tried a few other things, but the list strategy seems good enough (except for the negative index thing I mentioned in #130744, if that is an issue). Timings: the relevant ones are "OLD" (non free-thread safe arraymodule), "SLOW" (the previous slower PR) and the last two, "LFREERW".
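As a rough illustration of what a lock-free single element read or write can look like, here is a minimal C11 sketch. It is an assumption-laden toy, not the PR's implementation and not the list object's actual code: buf_sketch, array_sketch and all four functions are hypothetical names. The idea shown is that the reader loads the data pointer once, then bounds-checks against the length stored in that same block, so it never pairs a stale pointer with a newer size.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

/* Data block that carries its own length (hypothetical layout). */
typedef struct {
    size_t length;          /* fixed once the block is published */
    _Atomic int items[];    /* elements accessed with relaxed atomics */
} buf_sketch;

typedef struct {
    _Atomic(buf_sketch *) data;   /* swapped as a whole on resize */
} array_sketch;

static buf_sketch *
buf_sketch_new(size_t length)
{
    buf_sketch *b = malloc(sizeof(buf_sketch) + length * sizeof(_Atomic int));
    if (b == NULL) {
        return NULL;
    }
    b->length = length;
    for (size_t i = 0; i < length; i++) {
        atomic_init(&b->items[i], 0);
    }
    return b;
}

/* Writer side of a resize: make a fully initialized block visible. */
static void
array_sketch_publish(array_sketch *a, buf_sketch *b)
{
    assert(a != NULL);
    atomic_store_explicit(&a->data, b, memory_order_release);
}

/* Lock-free read: returns 0 and stores the value, or -1 if out of range. */
static int
array_sketch_get(array_sketch *a, size_t index, int *out)
{
    buf_sketch *b = atomic_load_explicit(&a->data, memory_order_acquire);
    if (b == NULL || index >= b->length) {
        return -1;
    }
    *out = atomic_load_explicit(&b->items[index], memory_order_relaxed);
    return 0;
}

/* Lock-free write of one element: returns 0, or -1 if out of range. */
static int
array_sketch_set(array_sketch *a, size_t index, int value)
{
    buf_sketch *b = atomic_load_explicit(&a->data, memory_order_acquire);
    if (b == NULL || index >= b->length) {
        return -1;
    }
    atomic_store_explicit(&b->items[index], value, memory_order_relaxed);
    return 0;
}
```

The sketch deliberately leaves out the hard part: an old block may only be freed once no reader can still hold a pointer to it, which needs some deferred-reclamation scheme, and a write racing with a resize can land in the old block and be lost unless resizes are coordinated with writers.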
array module is not free-thread safe. #128942