-
-
Notifications
You must be signed in to change notification settings - Fork 31.9k
gh-112075: Fix race in constructing dict for instance #118499
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We also need to fix:
- PyObject_GenericGetDict (non-managed dict case)
- PyObject_GenericSetDict
Objects/dictobject.c
Outdated
@@ -7222,37 +7222,39 @@ _PyObjectDict_SetItem(PyTypeObject *tp, PyObject **dictptr, | |||
PyInterpreterState *interp = _PyInterpreterState_GET(); | |||
|
|||
assert(dictptr != NULL); | |||
if ((tp->tp_flags & Py_TPFLAGS_HEAPTYPE) && (cached = CACHED_KEYS(tp))) { | |||
assert(dictptr != NULL); | |||
dict = *dictptr; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to use atomics to load from the dictptr
or there will be a data race between the read here and the write within the critical section.
a9ad363
to
56a79e6
Compare
Objects/dictobject.c
Outdated
|
||
Py_BEGIN_CRITICAL_SECTION(dict); | ||
res = set_or_del_lock_held((PyDictObject *)dict, key, value); | ||
Py_END_CRITICAL_SECTION(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the ASSERT_CONSISTENT
needs to be inside the critical section.
Objects/dictobject.c
Outdated
{ | ||
PyDictKeysObject *cached; | ||
|
||
PyObject *dict = FT_ATOMIC_LOAD_PTR_RELAXED(*dictptr); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
acquire (or seq-cst)
Objects/dictobject.c
Outdated
else { | ||
dict = PyDict_New(); | ||
} | ||
FT_ATOMIC_STORE_PTR_RELAXED(*dictptr, dict); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
release (or seq-cst)
Objects/dictobject.c
Outdated
if (dict == NULL) { | ||
dictkeys_decref(interp, cached, false); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think new_dict_with_shared_keys
already handles the dictkeys_decref
in the error case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, that's super subtle, I don't think there's any reason to spread this out, I'll move the incref into new_dict_with_shared_keys
Objects/object.c
Outdated
@@ -1797,7 +1797,11 @@ PyObject_GenericSetDict(PyObject *obj, PyObject *value, void *context) | |||
"not a '%.200s'", Py_TYPE(value)->tp_name); | |||
return -1; | |||
} | |||
#ifdef Py_GIL_DISABLED | |||
Py_XDECREF(_Py_atomic_exchange_ptr(dictptr, Py_NewRef(value))); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we need to be consistent about either using atomic exchange/cas or using the object locks to protect the dictptr. I don't think we can safely mix them.
Objects/dictobject.c
Outdated
} | ||
#endif | ||
PyTypeObject *tp = Py_TYPE(obj); | ||
if ((tp->tp_flags & Py_TPFLAGS_HEAPTYPE) && (cached = CACHED_KEYS(tp))) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe: _PyType_HasFeature(tp, Py_TPFLAGS_HEAPTYPE)
Objects/dictobject.c
Outdated
} | ||
#endif | ||
OBJECT_STAT_INC(dict_materialized_on_request); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are we consistent about when we increment this stat?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We seem to actually have an extra call in materialize_managed_dict_lock_held
which is the inline values case. This one seems like maybe it shouldn't exist at all, as we no longer have a simple values array which is outside of the dict, so even though we have shared keys here we will always have a dict. I'll remove it from here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Two minor comments below
Lock obj instead of using atomic, and fix reads/writes
Fix a small thread safety issue w/ dicts. When we are constructing the dict for an object with either a non-inline managed dict or with a non-managed dict we're not properly protecting the creation with a lock. This locks the object in
_PyObjectDict_SetItem
if we don't already have a dict to ensure only one thread creates the lock at a time.dict
objects thread-safe in--disable-gil
builds #112075