Skip to content

[NOGIL] thread local promotion state #26368

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 7 commits into from
May 9, 2024

Conversation

ngoldbaum
Copy link
Member

@ngoldbaum ngoldbaum commented May 1, 2024

This makes the promotion state a thread local variable and mediates access to it via getter and setter functions.

In order to use it I extended the existing NPY_TLS macro to also check for some C11/C23 possibilities. This macro is public - is it problematic to update it? A github search indicates it's not used much. I'm not sure if the macro works on all supported platforms, let's see. I don't think there are any supported platforms or compilers that are not covered by one of the spellings for thread-local storage I added to the meson config but please let me know if you're aware of any issues.

I decided to mark this variable as thread-local irrespective of the GIL setting. I don't think marking these variables as thread-local adds any overhead. Are there any benchmarks that might be sensitive to overhead in setting or getting the promotion state?

The test I added reliably fails before the promotion state in the nogil build on main.

@ngoldbaum ngoldbaum added the 39 - free-threading PRs and issues related to support for free-threading CPython (a.k.a. no-GIL, PEP 703) label May 1, 2024
@ngoldbaum
Copy link
Member Author

Looks like there are some linux issues that need smoothing out at least

@seberg
Copy link
Member

seberg commented May 2, 2024

I don't mind doing this, it certainly is better/correct also in the non no-gil branch. But, do you have an actual reason/failure you need this for?

I.e. I would like to just delete all of this, and the main reason I haven't already started with that, is that I think as long as backports to 2.0 are very likely, it might make things complicated.

@ngoldbaum
Copy link
Member Author

The only way I could consistently trigger the data race was with a test like one I added. I understand the desire to ease backports and if you don't want this that's fine. I worry a little bit about occasional breakage in threaded nogil tests because there are a few spots in the code where the promotion state is used like a context variable. That said, those code paths have a data race even with the GIL I think, so maybe not a big deal?

If we don't upstream this I'd like to figure out the linux build issues, see if there are any other build issues, and upstream the NPY_TLS changes at least. I suspect there's other global state that will need to be made thread-local elsewhere in numpy. I don't care much if we decide to do it in NPY_TLS or in a private macro.

@seberg
Copy link
Member

seberg commented May 2, 2024

Well, I realize that this is a bug fix effectively. To be clear: I don't mind adding this on the grounds of backports!

What I mostly wanted to say is that np._set_promotion_state() shouldn't exist in 2.1 and even in 2.0 it shouldn't matter except for testing. The only places where np._set_promotion_state() actually changes away from the global default should be:

  • The NumPy test suite itself (i.e. you could even delete those tests if you don't want to backport this)!
  • Downstream users who are testing, i.e. not production code!

So, I don't mind reviewing+merging this at all and it seems that build issue is useful. But it seems to me like this fix, while correct, might not be worth the trouble.

@ngoldbaum
Copy link
Member Author

The new test is really only using those functions to simulate the data race that happens when the promotion state gets juggled internally. Happy to remove all of this when we delete the promotion state code.

@seberg
Copy link
Member

seberg commented May 2, 2024

OK, will be nice and eventually necessary to get the NPY_TLS thing right, so might as well use this to figure it out, even if it is probably very temporary.

@ngoldbaum ngoldbaum force-pushed the nogil-thread-local-promotion-state branch 2 times, most recently from 035c9bc to 04f941b Compare May 6, 2024 21:13
@ngoldbaum
Copy link
Member Author

Almost working, but it looks like some of the windows test failures are real so I need to do some debugging on a windows box to figure out why __declspec(thread) isn't sufficient.

@ngoldbaum ngoldbaum force-pushed the nogil-thread-local-promotion-state branch 4 times, most recently from c653580 to 0cc359d Compare May 7, 2024 20:03
@ngoldbaum
Copy link
Member Author

I think I've resolved all the build issues, although I'd like to run the tests with the spin issue fixed to see if there are problems with clang-cl.

@ngoldbaum ngoldbaum force-pushed the nogil-thread-local-promotion-state branch from 0cc359d to 2e82f75 Compare May 8, 2024 16:56
Copy link
Member

@seberg seberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assuming this is almost ready, so had a closer look. LGTM, modulo the 2-3 simple comments/fixes. So will approve, but those need fixing.

I don't know if all the thread-local definitions are right, but they look good.

npy_promotion_state == NPY_USE_WEAK_PROMOTION_AND_WARN
int promotion_state = get_npy_promotion_state();
if (promotion_state == NPY_USE_LEGACY_PROMOTION || (
get_npy_promotion_state() == NPY_USE_WEAK_PROMOTION_AND_WARN
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
get_npy_promotion_state() == NPY_USE_WEAK_PROMOTION_AND_WARN
promotion_state == NPY_USE_WEAK_PROMOTION_AND_WARN

@@ -12,7 +12,8 @@ extern NPY_NO_EXPORT npy_intp REQUIRED_STR_LEN[];
#define NPY_USE_LEGACY_PROMOTION 0
#define NPY_USE_WEAK_PROMOTION 1
#define NPY_USE_WEAK_PROMOTION_AND_WARN 2
extern NPY_NO_EXPORT int npy_promotion_state;

extern NPY_NO_EXPORT NPY_TLS int npy_promotion_state;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
extern NPY_NO_EXPORT NPY_TLS int npy_promotion_state;

Since you add the functions, keeping the definition strange?

Unless you go back to using it directly (if that is even technically possible).

NPY_NO_EXPORT PyObject *NO_NEP50_WARNING_CTX = NULL;
NPY_NO_EXPORT PyObject *npy_DTypePromotionError = NULL;
NPY_NO_EXPORT PyObject *npy_UFuncNoLoopError = NULL;

NPY_NO_EXPORT NPY_TLS int npy_promotion_state = NPY_USE_LEGACY_PROMOTION;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
NPY_NO_EXPORT NPY_TLS int npy_promotion_state = NPY_USE_LEGACY_PROMOTION;
static NPY_TLS int npy_promotion_state = NPY_USE_LEGACY_PROMOTION;

Make it static now that it you have the function accessors.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for pointing this out, it didn't occur to me but of course this is nicer.

@@ -262,7 +264,7 @@ foreach optional_attr: optional_variable_attributes
return 0;
}
'''
if cc.compiles(code)
if cc.compiles(code, name: optional_attr[0])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Full discloser, I don't know why this is needed, but it seems fine :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It adds output for the numpy build stdout when it checks for these:

Checking if "thread_local" compiles: NO
Checking if "_Thread_local" compiles: YES
Checking if "__thread" compiles: YES
Checking if "__declspec(thread)" compiles: NO
Checking for function "fallocate" : NO

I found it useful for debugging on CI.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice solution.

@ngoldbaum ngoldbaum merged commit 2a9b913 into numpy:main May 9, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
39 - free-threading PRs and issues related to support for free-threading CPython (a.k.a. no-GIL, PEP 703)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants