gh-127266: avoid data races when updating type slots #131174

nascheme · 2025-03-13T00:54:29Z

Avoid data races when updating type slots and type flags by "stopping-the-world" first. This is only needed if the type is potentially exposed to multiple threads. There are a couple places this happens:

type_setattro() when the assigned item is a "dunder" method that corresponds to one or more slot
assigning to __bases__
_abc.c calling _PyType_SetFlags() to set the collection flags (SEQUENCE or MAPPING)

Two examples of code that causes this: dataclasses will assign dunder methods after the type object has been created, when the decorator runs. That triggers this and since the slots are assigned one-by-one, we stop for each assignment. Not great. Another example is unittest MagicMock. Again, it assigns dunder methods after the class is defined.

In order to assess the performance impact of this change, I've measured the following things. The time it takes to run the unit test suite does not seem significantly worse, even when running tests serially using -j 1. Comparing the pyperformance results before and after this change shows no significant change. See link below for the chart.

Running with TSAN enabled and running the --tsan and --tsan-parallel --parallel-threads=4 tests show no warnings. I also have a modified pyperformance test suite that I can run in parallel and that shows no TSAN warnings related to this as well.

Additional changes after feedback from Sam:

Replace the critical sections that use TYPE_LOCK with a simple mutex instead. To avoid dead-locks, the code was analyzed for re-entrancy and adjusted as needed. There are a couple places where the mutex is dropped before calling something that's possibly re-entrant.
Remove some locking where it is not needed. Slots can only be changed when the world is stopped (e.g. reading tp_mro no longer needs to hold the lock).
Create type_lookup_ex() as a lower-level version of _PyType_LookupStackRefAndVersion that optionally acquires the TYPE_LOCK or requires that it is already held.
Add a new type flag that is used for debug enabled builds. It tries to check if a type is potentially exposed to multiple threads (and therefore needs stop-the-world before modifying slots). This flag could perhaps be removed before the final merge of this PR (since type flag bits are kind of in short supply).

Benchmark using pyperformance

Issue: Type slots are not thread-safe in free-threaded builds #127266

kumaraditya303 · 2025-03-14T14:35:31Z

Two examples that cause this: dataclasses will assign dunder methods after the type object has been created. That triggers this and since the slots are assigned one-by-one, we stop for each assignment. Not great. Another example is unittest MockMock. Again, it assigns dunder methods after the class is defined.

We can avoid stop the world pauses in both cases by checking if the type is uniquely ref by the current thread, the type object is newly created here and not exposed to other threads so we can assign without stopping the world.

This would be better to be part of the pystats info.

Objects/typeobject.c

colesbury

I'm concerned about atomicity here (see comment below).

Objects/typeobject.c

colesbury · 2025-03-17T12:24:34Z

Objects/typeobject.c

+    PyInterpreterState *interp = _PyInterpreterState_GET();
+    _PyEval_StopTheWorld(interp);


_PyEval_StopTheWorld() is always defined and is a no-op in the default build.

Objects/typeobject.c

colesbury · 2025-03-17T15:50:19Z

We might be able to make BEGIN_TYPE_LOCK() a regular mutex acquisition instead of a critical section, but we'll have to be careful that nothing between BEGIN_TYPE_LOCK() and END_TYPE_LOCK() leads to reentrancy.

For TYPE_LOCK, replace critical sections with a simple mutex. A number of changes were needed to avoid deadlocking on reentrancy (trying to re-acquire the mutex). Split _PyType_LookupRefAndVersion() into a locked and unlocked version. Remove atomic operations where they are not required. Remove some cases of TYPE_LOCK being held that is not required.

nascheme · 2025-03-26T22:43:32Z

I've converted TYPE_LOCK from being used as a critical section to being a regular mutex. There was a number of cases of reentrancy and so things got complex. I think I nearly have that sorted out now. The mro_invoke() function needs work yet. At least, unit tests pass without deadlocking. One somewhat ugly part yet is that _PyType_LookupRefAndVersion() got split into two versions with only slight differences between the functions (taking lock if needed vs caller already holding lock or the lock not being needed). Perhaps I can find a way to refactor it to avoid code duplication.

colesbury

Nice!

I think there's still too much mixing of the type lock and the stop the world. I think data should either be protected by the type lock OR by stop-the-world, but not both.

From reading through this:

type lock:

tp_version_tag
_spec_cache

stop the world:

tp_flags (modifications)
bases (modifications)

Additionally, there's a few functions that are named with _lock_held but are called in a stop-the-world pause (not with the type lock held), which I found a bit confusing.

Objects/typeobject.c

Tools/tsan/suppressions_free_threading.txt

Include/object.h

colesbury · 2025-03-27T22:36:05Z

Objects/typeobject.c

+    types_mutex_unlock();
    new_mro = mro_invoke(type);  /* might cause reentrance */
+    types_mutex_lock();


This doesn't seem right to me. This can be called with both the type lock held and the world stopped. If mro_invoke can call arbitrary code, it needs to start the world around the call.

Yes, that can call a custom mro() method on the meta class. The only thing in Lib that does that is a test_descr test but obviously we need to handle it.

I believe this has been resolved. Inside mro_invoke() I either start-the-world before calling mro() (if it's custom) or I unlock the mutex. The latter case happens during type_ready().

Objects/typeobject.c

nascheme · 2025-04-21T22:50:32Z

I'm not convinced that the switch of the type lock from a critical section to a plain mutex is profitable here.

Okay, I've reverted back to using a critical section for TYPE_LOCK.

This avoids the watcher callback potentially using invalid cache entries.

* Since type_modified_unlocked() can be called with the world stopped then we need to re-start if making re-entrant calls (like calling type watcher callbacks). * Re-order code before type_modified_unlocked(). This fixes a potential bug if called functions end up re-assigning the type version tag. If they do, the cache can contain invalid entries. We should do the dictionary update first then set the version to zero. * Replace 'is_new_type' with 'world_stopped' boolean parameter. This seems a bit more clear.

Rather than trying to update tp_base, tp_bases, and tp_mro while the world is stopped, hold TYPE_LOCK when updating them. This is much more similar to how the base version of the code works and requires many fewer changes. Using stop-the-world turned out to be difficult to make work correctly, mostly due to custom mro() methods but also for some other reasons (for example, functions that potentially re-enter the interpreter). This re-work requires that we can call stop-the-world inside a critical section and have the mutex still held after re-starting. This seems to work and I've added asserts to confirm it. Stop-the-world is now only used when updating either tp_flags or other type slots and not the three type members listed above.

nascheme · 2025-04-23T20:17:54Z

Major rework, reverting back to using TYPE_LOCK and critical sections to protect tp_mro, tp_bases, tp_base. In a previous version of this PR, I was making updates to those only while the world was stopped. Because we are so limited on what APIs you can call while the world is stopped it is hard to make it work without major overhaul to how types work. The mro() method on the metatype is particularly troublesome but there are other similar problems.

Here is a summary of what's changed vs the main branch:

Update tp_flags and type slots (other than tp_base, tp_bases, and tp_mro) only while the world is stopped. This allows us to read those type fields without atomics and without locking.
Add _Py_TYPE_REVEALED_FLAG as a debug build check to help prevent code that unsafely writes to a type (missing lock or stop-the-world).
Fix a data race with the pname/ptrs "little cache" for resolve_slotdups(). That cache is not safe in the free-threaded build so disable it.
A few code cleanup things that are not strictly required but leftover from the previous version of this PR. I think they improve the code so I left them in.

Updated pyperformance run.

These are not required as part of this PR and can be reviewed separately.

colesbury

This LGTM once the issue with test_opcache is resolved

Now that these slot updates use stop-the-world, these two tests are quite a lot slower. Reduce size of the items list from 1000 to 100.

Objects/typeobject.c

…thongh-131174) This is triggering deadlocks in test_opcache.

… (gh-133129) This is triggering deadlocks in test_opcache. See GH-133130 for stack trace.

wip: update type slots, stop-the-world

0e995ad

bedevere-app bot mentioned this pull request Mar 13, 2025

Type slots are not thread-safe in free-threaded builds #127266

Open

nascheme added 2 commits March 13, 2025 15:03

Remove unneeded atomics for tp_flags.

b173f17

Use stop-the-world for tp_flag changes too.

d4ce112

nascheme added 4 commits March 14, 2025 11:54

Remove 'world_stops' and 'sys._get_world_stops'.

44eb332

This would be better to be part of the pystats info.

Improve code comments.

d132fab

Remove TSAN suppressions that seem unneeded.

658bcd5

Add NEWS file.

ef2f07b

nascheme marked this pull request as ready for review March 14, 2025 21:37

nascheme requested a review from markshannon as a code owner March 14, 2025 21:37

bedevere-app bot added the awaiting core review label Mar 14, 2025

nascheme requested review from mpage and colesbury March 15, 2025 18:11

kumaraditya303 reviewed Mar 16, 2025

View reviewed changes

Objects/typeobject.c Outdated Show resolved Hide resolved

colesbury reviewed Mar 17, 2025

View reviewed changes

nascheme added 2 commits March 26, 2025 00:22

Merge 'origin/main' into pythongh-127266-type-slots-ts

b97a4b4

nascheme marked this pull request as draft March 26, 2025 07:49

bedevere-app bot removed the awaiting core review label Mar 26, 2025

Fix non-debug build.

ca00e74

nascheme added 3 commits March 26, 2025 15:45

Improve comments.

6db4542

Merge 'origin/main' into pythongh-127266-type-slots-ts

398ac14

Avoid unused function warning.

75d6b71

nascheme marked this pull request as ready for review March 27, 2025 15:49

bedevere-app bot added the awaiting core review label Mar 27, 2025

nascheme requested a review from colesbury March 27, 2025 15:50

colesbury reviewed Mar 27, 2025

View reviewed changes

nascheme added 7 commits April 21, 2025 16:45

Invalidate type cache before calling watchers.

a81e9e3

This avoids the watcher callback potentially using invalid cache entries.

Merge 'origin/main' into pythongh-127266-type-slots-ts

da2a0ad

Fix non-debug build.

986f23a

Revert unneeded code changes.

c404ed4

Merge branch 'origin/main' into pythongh-127266-type-slots-ts

55af4ba

nascheme requested a review from colesbury April 23, 2025 20:18

nascheme added 2 commits April 23, 2025 17:00

Restore comment

0eb77da

Revert more changes.

16f15b2

These are not required as part of this PR and can be reviewed separately.

colesbury approved these changes Apr 24, 2025

View reviewed changes

bedevere-app bot added awaiting merge and removed awaiting core review labels Apr 24, 2025

nascheme added 2 commits April 24, 2025 19:17

Merge 'origin/main' into pythongh-127266-type-slots-ts

0c328cc

Reduce item list size for a few tests.

64547e9

Now that these slot updates use stop-the-world, these two tests are quite a lot slower. Reduce size of the items list from 1000 to 100.

kumaraditya303 reviewed Apr 25, 2025

View reviewed changes

Objects/typeobject.c Outdated Show resolved Hide resolved

kumaraditya303 reviewed Apr 25, 2025

View reviewed changes

Objects/typeobject.c Outdated Show resolved Hide resolved

kumaraditya303 reviewed Apr 25, 2025

View reviewed changes

Objects/typeobject.c Outdated Show resolved Hide resolved

nascheme added 2 commits April 28, 2025 12:56

Merge 'origin/main' into pythongh-127266-type-slots-ts

fff1bd2

Minor code tidy.

5672352

nascheme enabled auto-merge (squash) April 28, 2025 20:26

nascheme merged commit e414a2d into python:main Apr 28, 2025
48 checks passed

bedevere-app bot removed the awaiting merge label Apr 28, 2025

sergey-miryanov mentioned this pull request Apr 29, 2025

gh-132042: Prebuild mro_dict for find_name_in_mro to speedup class creation #132618

Open

nascheme added a commit to nascheme/cpython that referenced this pull request Apr 29, 2025

Revert pythongh-127266: avoid data races when updating type slots (py…

5292ead

…thongh-131174) This is triggering deadlocks in test_opcache.

nascheme mentioned this pull request Apr 29, 2025

Deadlock in test_opcache with gh-131174 applied #133130

Open

nascheme added a commit that referenced this pull request Apr 29, 2025

Revert gh-127266: avoid data races when updating type slots (gh-131174)…

eecafc3

… (gh-133129) This is triggering deadlocks in test_opcache. See GH-133130 for stack trace.

nascheme mentioned this pull request Apr 29, 2025

gh-127266: avoid data races when updating type slots v2 #133177

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

gh-127266: avoid data races when updating type slots #131174

gh-127266: avoid data races when updating type slots #131174

nascheme commented Mar 13, 2025 •

edited

Loading

kumaraditya303 commented Mar 14, 2025 •

edited

Loading

colesbury left a comment

colesbury Mar 17, 2025

colesbury commented Mar 17, 2025

nascheme commented Mar 26, 2025

colesbury left a comment

colesbury Mar 27, 2025

nascheme Mar 28, 2025

nascheme Mar 31, 2025

nascheme commented Apr 21, 2025

nascheme commented Apr 23, 2025 •

edited

Loading

colesbury left a comment

		PyInterpreterState *interp = _PyInterpreterState_GET();
		_PyEval_StopTheWorld(interp);

gh-127266: avoid data races when updating type slots #131174

gh-127266: avoid data races when updating type slots #131174

Conversation

nascheme commented Mar 13, 2025 • edited Loading

kumaraditya303 commented Mar 14, 2025 • edited Loading

colesbury left a comment

Choose a reason for hiding this comment

colesbury Mar 17, 2025

Choose a reason for hiding this comment

colesbury commented Mar 17, 2025

nascheme commented Mar 26, 2025

colesbury left a comment

Choose a reason for hiding this comment

colesbury Mar 27, 2025

Choose a reason for hiding this comment

nascheme Mar 28, 2025

Choose a reason for hiding this comment

nascheme Mar 31, 2025

Choose a reason for hiding this comment

nascheme commented Apr 21, 2025

nascheme commented Apr 23, 2025 • edited Loading

colesbury left a comment

Choose a reason for hiding this comment

nascheme commented Mar 13, 2025 •

edited

Loading

kumaraditya303 commented Mar 14, 2025 •

edited

Loading

nascheme commented Apr 23, 2025 •

edited

Loading