
gh-134761: Use deferred reference counting for threading concurrency primitives #134762


Open
wants to merge 6 commits into
base: main
19 changes: 19 additions & 0 deletions Lib/test/test_sys.py
@@ -1343,6 +1343,25 @@ def test_pystats(self):
    def test_disable_gil_abi(self):
        self.assertEqual('t' in sys.abiflags, support.Py_GIL_DISABLED)

    @test.support.cpython_only
    @unittest.skipUnless(hasattr(sys, '_defer_refcount'), "requires _defer_refcount()")
    def test_defer_refcount(self):
        _testinternalcapi = import_helper.import_module('_testinternalcapi')

        class Test:
            pass

        ref = Test()
        if support.Py_GIL_DISABLED:
            self.assertTrue(sys._defer_refcount(ref))
            self.assertTrue(_testinternalcapi.has_deferred_refcount(ref))
            self.assertFalse(sys._defer_refcount(ref))
            self.assertFalse(sys._defer_refcount(42))
        else:
            self.assertFalse(sys._defer_refcount(ref))
            self.assertFalse(_testinternalcapi.has_deferred_refcount(ref))
            self.assertFalse(sys._defer_refcount(42))


@test.support.cpython_only
class UnraisableHookTest(unittest.TestCase):
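The return values asserted in the test above can be sketched from the caller's side as follows. This is a minimal illustration, assuming the private `sys._defer_refcount()` proposed in this PR; `try_defer_refcount` and `Shared` are hypothetical names used only here, and the function is looked up defensively because it may not exist on a given interpreter.

```python
import sys


def try_defer_refcount(obj):
    """Return True only if deferred reference counting was newly enabled."""
    # sys._defer_refcount is the private function proposed in this PR
    # (an assumption here); fall back to False when it is absent.
    defer = getattr(sys, "_defer_refcount", None)
    if defer is None:
        return False
    return bool(defer(obj))


class Shared:
    """Hypothetical stand-in for an object shared between threads."""


obj = Shared()
first = try_defer_refcount(obj)   # may be True on a free-threaded build
second = try_defer_refcount(obj)  # a repeated request reports False
```

Going by the test's assertions, `first` would be true only on a free-threaded build that provides the function, while `second` and any call on an integer such as `42` would be false everywhere.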
9 changes: 9 additions & 0 deletions Lib/threading.py
@@ -73,6 +73,11 @@
_profile_hook = None
_trace_hook = None

def _defer_refcount(op):
    """Improve multithreaded scaling on the free-threaded build."""
    if hasattr(_sys, "_defer_refcount"):
        _sys._defer_refcount(op)

def setprofile(func):
    """Set a profile function for all threads started from the threading module.

@@ -298,6 +303,7 @@ def __init__(self, lock=None):
        if hasattr(lock, '_is_owned'):
            self._is_owned = lock._is_owned
        self._waiters = _deque()
        _defer_refcount(self)

    def _at_fork_reinit(self):
        self._lock._at_fork_reinit()
@@ -466,6 +472,7 @@ def __init__(self, value=1):
            raise ValueError("semaphore initial value must be >= 0")
        self._cond = Condition(Lock())
        self._value = value
        _defer_refcount(self)

    def __repr__(self):
        cls = self.__class__
@@ -595,6 +602,7 @@ class Event:
    def __init__(self):
        self._cond = Condition(Lock())
        self._flag = False
        _defer_refcount(self)

    def __repr__(self):
        cls = self.__class__
@@ -700,6 +708,7 @@ def __init__(self, parties, action=None, timeout=None):
        self._parties = parties
        self._state = 0  # 0 filling, 1 draining, -1 resetting, -2 broken
        self._count = 0
        _defer_refcount(self)

    def __repr__(self):
        cls = self.__class__
@@ -0,0 +1,2 @@
Improve performance when using :mod:`threading` primitives across multiple
threads.
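A rough way to observe the kind of workload this entry presumably targets is to have several threads repeatedly touch one shared primitive and time the run. The sketch below is an assumption about how such a measurement might look, not the benchmark used for this PR; `hammer` and `time_shared_event` are hypothetical names, and any speedup would only be expected on a free-threaded build.

```python
import threading
import time


def hammer(event, iterations):
    # Every call goes through the same shared Event object, so each
    # thread repeatedly takes and drops references to it.
    for _ in range(iterations):
        event.is_set()


def time_shared_event(num_threads=4, iterations=50_000):
    """Return the wall-clock seconds taken by all threads together."""
    event = threading.Event()
    threads = [
        threading.Thread(target=hammer, args=(event, iterations))
        for _ in range(num_threads)
    ]
    start = time.perf_counter()
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return time.perf_counter() - start


elapsed = time_shared_event()
print(f"4 threads sharing one Event: {elapsed:.4f}s")
```

Comparing the timing with and without the change, and across different thread counts, would give a first impression of scaling; a single run like this is too noisy to draw conclusions from.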
2 changes: 2 additions & 0 deletions Modules/_threadmodule.c
@@ -951,6 +951,7 @@ lock_new_impl(PyTypeObject *type)
        return NULL;
    }
    self->lock = (PyMutex){0};
    _PyObject_SetDeferredRefcount((PyObject *)self);
    return (PyObject *)self;
}

@@ -1222,6 +1223,7 @@ rlock_new_impl(PyTypeObject *type)
        return NULL;
    }
    self->lock = (_PyRecursiveMutex){0};
    _PyObject_SetDeferredRefcount((PyObject *)self);
    return (PyObject *) self;
}

32 changes: 31 additions & 1 deletion Python/clinic/sysmodule.c.h

Some generated files are not rendered by default.

19 changes: 19 additions & 0 deletions Python/sysmodule.c
@@ -15,6 +15,7 @@ Data members:
*/

#include "Python.h"
#include "object.h"
#include "pycore_audit.h" // _Py_AuditHookEntry
#include "pycore_call.h" // _PyObject_CallNoArgs()
#include "pycore_ceval.h" // _PyEval_SetAsyncGenFinalizer()
@@ -2653,6 +2654,23 @@ sys__is_gil_enabled_impl(PyObject *module)
#endif
}

/*[clinic input]
sys._defer_refcount -> bool
Member

Please document it in Doc/library/sys.rst. If it "should not be used", add a clear explanation why it should not be used there. If it's not documented, the lack of documentation doesn't prevent users from using it.

Member Author

See Ken and Donghee's comments.

Member

Well, I disagree with them. IMO we should document sys functions.

Member

I would only be supportive of documenting this if we were allowed to change it in a minor version with no deprecation period. My understanding is that PyUnstable in the C API allows that, but exposing it as sys._x means we are stuck with at least 2 deprecation cycles, and 5 are recommended. Users should not rely on this function in the first place except in very specific scenarios.

One way to "bypass" this is to make the function a no-op in future versions of Python once we solve this issue altogether. But I don't know what users will rely on by then, so I'm a bit worried.

Member Author

Hmm, I thought we were allowed to change sys._x things in minor versions without deprecation. If not, that's a problem.

Member Author

I think we're getting a bit hung up on this point. We can add or remove the documentation for _defer_refcount later; it's not too important. Does everything else look fine here?

Member

Well, I am preparing a better proposal for this approach. Give me a few hours.

Member Author (@ZeroIntensity, May 28, 2025)

Ok, cool. Feel free to cc me on it.

Something we also need to consider is whether we want to address this for 3.14. Should this general idea be considered a bugfix or a feature?

Member

See: #134819

Member

I think that this is an improvement rather than a bug fix.


    op: object
    /

Defer reference counting for the object, allowing for better scaling across multiple threads.

This function should be used for specialized purposes only.
[clinic start generated code]*/

static int
sys__defer_refcount_impl(PyObject *module, PyObject *op)
/*[clinic end generated code: output=3b965122056085f5 input=a081971a76c49e64]*/
{
    return PyUnstable_Object_EnableDeferredRefcount(op);
}

#ifndef MS_WINDOWS
static PerfMapState perf_map_state;
@@ -2834,6 +2852,7 @@ static PyMethodDef sys_methods[] = {
    SYS__GET_CPU_COUNT_CONFIG_METHODDEF
    SYS__IS_GIL_ENABLED_METHODDEF
    SYS__DUMP_TRACELETS_METHODDEF
    SYS__DEFER_REFCOUNT_METHODDEF
    {NULL, NULL} // sentinel
};
};
