-
-
Notifications
You must be signed in to change notification settings - Fork 11.1k
[NOGIL] thread local promotion state #26368
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[NOGIL] thread local promotion state #26368
Conversation
Looks like there are some linux issues that need smoothing out at least |
I don't mind doing this, it certainly is better/correct also in the non no-gil branch. But, do you have an actual reason/failure you need this for? I.e. I would like to just delete all of this, and the main reason I haven't already started with that, is that I think as long as backports to 2.0 are very likely, it might make things complicated. |
The only way I could consistently trigger the data race was with a test like one I added. I understand the desire to ease backports and if you don't want this that's fine. I worry a little bit about occasional breakage in threaded nogil tests because there are a few spots in the code where the promotion state is used like a context variable. That said, those code paths have a data race even with the GIL I think, so maybe not a big deal? If we don't upstream this I'd like to figure out the linux build issues, see if there are any other build issues, and upstream the |
Well, I realize that this is a bug fix effectively. To be clear: I don't mind adding this on the grounds of backports! What I mostly wanted to say is that
So, I don't mind reviewing+merging this at all and it seems that build issue is useful. But it seems to me like this fix, while correct, might not be worth the trouble. |
The new test is really only using those functions to simulate the data race that happens when the promotion state gets juggled internally. Happy to remove all of this when we delete the promotion state code. |
OK, will be nice and eventually necessary to get the |
035c9bc
to
04f941b
Compare
Almost working, but it looks like some of the windows test failures are real so I need to do some debugging on a windows box to figure out why |
c653580
to
0cc359d
Compare
I think I've resolved all the build issues, although I'd like to run the tests with the spin issue fixed to see if there are problems with clang-cl. |
0cc359d
to
2e82f75
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Assuming this is almost ready, so had a closer look. LGTM, modulo the 2-3 simple comments/fixes. So will approve, but those need fixing.
I don't know if all the thread-local definitions are right, but they look good.
npy_promotion_state == NPY_USE_WEAK_PROMOTION_AND_WARN | ||
int promotion_state = get_npy_promotion_state(); | ||
if (promotion_state == NPY_USE_LEGACY_PROMOTION || ( | ||
get_npy_promotion_state() == NPY_USE_WEAK_PROMOTION_AND_WARN |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
get_npy_promotion_state() == NPY_USE_WEAK_PROMOTION_AND_WARN | |
promotion_state == NPY_USE_WEAK_PROMOTION_AND_WARN |
@@ -12,7 +12,8 @@ extern NPY_NO_EXPORT npy_intp REQUIRED_STR_LEN[]; | |||
#define NPY_USE_LEGACY_PROMOTION 0 | |||
#define NPY_USE_WEAK_PROMOTION 1 | |||
#define NPY_USE_WEAK_PROMOTION_AND_WARN 2 | |||
extern NPY_NO_EXPORT int npy_promotion_state; | |||
|
|||
extern NPY_NO_EXPORT NPY_TLS int npy_promotion_state; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
extern NPY_NO_EXPORT NPY_TLS int npy_promotion_state; |
Since you add the functions, keeping the definition strange?
Unless you go back to using it directly (if that is even technically possible).
NPY_NO_EXPORT PyObject *NO_NEP50_WARNING_CTX = NULL; | ||
NPY_NO_EXPORT PyObject *npy_DTypePromotionError = NULL; | ||
NPY_NO_EXPORT PyObject *npy_UFuncNoLoopError = NULL; | ||
|
||
NPY_NO_EXPORT NPY_TLS int npy_promotion_state = NPY_USE_LEGACY_PROMOTION; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NPY_NO_EXPORT NPY_TLS int npy_promotion_state = NPY_USE_LEGACY_PROMOTION; | |
static NPY_TLS int npy_promotion_state = NPY_USE_LEGACY_PROMOTION; |
Make it static now that it you have the function accessors.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for pointing this out, it didn't occur to me but of course this is nicer.
@@ -262,7 +264,7 @@ foreach optional_attr: optional_variable_attributes | |||
return 0; | |||
} | |||
''' | |||
if cc.compiles(code) | |||
if cc.compiles(code, name: optional_attr[0]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Full discloser, I don't know why this is needed, but it seems fine :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It adds output for the numpy build stdout when it checks for these:
Checking if "thread_local" compiles: NO
Checking if "_Thread_local" compiles: YES
Checking if "__thread" compiles: YES
Checking if "__declspec(thread)" compiles: NO
Checking for function "fallocate" : NO
I found it useful for debugging on CI.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice solution.
This makes the promotion state a thread local variable and mediates access to it via getter and setter functions.
In order to use it I extended the existing
NPY_TLS
macro to also check for some C11/C23 possibilities. This macro is public - is it problematic to update it? A github search indicates it's not used much. I'm not sure if the macro works on all supported platforms, let's see. I don't think there are any supported platforms or compilers that are not covered by one of the spellings for thread-local storage I added to the meson config but please let me know if you're aware of any issues.I decided to mark this variable as thread-local irrespective of the GIL setting. I don't think marking these variables as thread-local adds any overhead. Are there any benchmarks that might be sensitive to overhead in setting or getting the promotion state?
The test I added reliably fails before the promotion state in the nogil build on
main
.