-
-
Notifications
You must be signed in to change notification settings - Fork 31.9k
gh-112075: Make some dict
operations thread-safe without GIL
#112247
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
For `dict.__len__`, use `_Py_atomic_load_ssize_relaxed` to access `PyDictObject` `ma_used` field. For the following methods * `dict.fromkeys` * `dict.copy` * `dict_richcompare` * `dict.clear` * `dict.__sizeof__` * `dict.__or__` * `dict.__ior__` * `dict.__reversed__` * `dict.keys` * `dict.items` * `dict.values` use critical section API, either in form of AC directive or macro.
@@ -117,6 +117,7 @@ As a consequence of this, split keys have a maximum size of 16. | |||
#include "pycore_call.h" // _PyObject_CallNoArgs() | |||
#include "pycore_ceval.h" // _PyEval_GetBuiltin() | |||
#include "pycore_code.h" // stats | |||
#include "pycore_critical_section.h" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add a comment like: #include "pycore_critical_section.h" // Py_BEGIN_CRITICAL_SECTION()
@@ -2746,6 +2755,7 @@ dict_items(PyDictObject *mp) | |||
} | |||
|
|||
/*[clinic input] | |||
@critical_section |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
dict_fromkeys_impl
should not have a critical_section. This locks the type
object, which isn't necessary.
We may need critical sections for some code paths of _PyDict_FromKeys
, but that will need to be within _PyDict_FromKeys
.
@@ -4610,14 +4641,15 @@ PyTypeObject PyDictRevIterKey_Type = { | |||
|
|||
|
|||
/*[clinic input] | |||
@critical_section |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't want a critical section here. Creating the reverse iterator is thread-safe without any special handling, although iterating over the reverse iterator may require special handling.
return _PyDictView_New(dict, &PyDictKeys_Type); | ||
PyObject *view; | ||
Py_BEGIN_CRITICAL_SECTION(dict); | ||
view = _PyDictView_New(dict, &PyDictKeys_Type); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No critical section here. Creating the view objects, like _PyDictView_New
(and the reverse iterator above) do not access any of the dictionary internals so they don't need to lock the dictionary.
return _PyDictView_New(dict, &PyDictItems_Type); | ||
PyObject *view; | ||
Py_BEGIN_CRITICAL_SECTION(dict); | ||
view = _PyDictView_New(dict, &PyDictItems_Type); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No critical section here.
return _PyDictView_New(dict, &PyDictValues_Type); | ||
PyObject *view; | ||
Py_BEGIN_CRITICAL_SECTION(dict); | ||
view = _PyDictView_New(dict, &PyDictValues_Type); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No critical section here
@@ -3511,7 +3529,9 @@ dict_setdefault_impl(PyDictObject *self, PyObject *key, | |||
static PyObject * | |||
dict_clear(PyDictObject *mp, PyObject *Py_UNUSED(ignored)) | |||
{ | |||
Py_BEGIN_CRITICAL_SECTION(mp); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need public APIs like PyDict_Clear()
to be thread-safe. The critical sections need to be pushed down into that API.
I suggest doing something like in colesbury/nogil-3.12@d896dfc8db:
- Rename the current implementation of
PyDict_Clear()
and make itstatic
(e.g.static void _dict_clear(...)
) - Make
PyDict_Clear()
a simple wrapper around_dict_clear()
that adds critical section calls. - Just call
PyDict_Clear()
from this functiondict_clear(...)
.
Py_ReprLeave((PyObject *)mp); | ||
return PyUnicode_FromString("{}"); | ||
repr = PyUnicode_FromString("{}"); | ||
goto end; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thin this may incorrectly call _PyUnicodeWriter_Dealloc()
on an uninitialized writer
below if repr
is NULL (i.e., if PyUnicode_FromString
returns an error).
For these functions with complex control flow, I think it's better to add a wrapper function than try to work Py_BEGIN_CRITICAL_SECTION()
calls into the existing control flow. In other words:
- Rename
dict_repr
to e.g.dict_repr_impl
. - Add a new
dict_repr
that just callsdict_repr_impl
under aPy_BEGIN_CRITICAL_SECTION()
calls.
@colesbury, is this PR still relevant, given @DinoV's recent work on |
I think dict thread-safety has been addressed by Dino's PRs and this PR can be closed now. Thanks for your work @chgnrdv. |
#112075
For
dict.__len__
,_Py_atomic_load_ssize_relaxed
is used to accessPyDictObject
ma_used
field.For the following methods
dict.fromkeys
dict.copy
dict_richcompare
dict.clear
dict.__sizeof__
dict.__or__
dict.__ior__
dict.__reversed__
dict.keys
dict.items
dict.values
dict.update
(indict_update_common
)dict.__init__
(indict_update_common
)dict.__repr__
the critical section API is used, either in form of AC directive or macro.