Skip to content

gh-112075: Fix race in constructing dict for instance #118499

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
May 6, 2024

Conversation

DinoV
Copy link
Contributor

@DinoV DinoV commented May 2, 2024

Fix a small thread safety issue w/ dicts. When we are constructing the dict for an object with either a non-inline managed dict or with a non-managed dict we're not properly protecting the creation with a lock. This locks the object in _PyObjectDict_SetItem if we don't already have a dict to ensure only one thread creates the lock at a time.

Copy link
Contributor

@colesbury colesbury left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We also need to fix:

  • PyObject_GenericGetDict (non-managed dict case)
  • PyObject_GenericSetDict

@@ -7222,37 +7222,39 @@ _PyObjectDict_SetItem(PyTypeObject *tp, PyObject **dictptr,
PyInterpreterState *interp = _PyInterpreterState_GET();

assert(dictptr != NULL);
if ((tp->tp_flags & Py_TPFLAGS_HEAPTYPE) && (cached = CACHED_KEYS(tp))) {
assert(dictptr != NULL);
dict = *dictptr;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to use atomics to load from the dictptr or there will be a data race between the read here and the write within the critical section.

@DinoV DinoV force-pushed the nogil_dict_creation branch 2 times, most recently from a9ad363 to 56a79e6 Compare May 3, 2024 19:50
@colesbury colesbury self-requested a review May 3, 2024 20:01

Py_BEGIN_CRITICAL_SECTION(dict);
res = set_or_del_lock_held((PyDictObject *)dict, key, value);
Py_END_CRITICAL_SECTION();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the ASSERT_CONSISTENT needs to be inside the critical section.

{
PyDictKeysObject *cached;

PyObject *dict = FT_ATOMIC_LOAD_PTR_RELAXED(*dictptr);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

acquire (or seq-cst)

else {
dict = PyDict_New();
}
FT_ATOMIC_STORE_PTR_RELAXED(*dictptr, dict);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

release (or seq-cst)

Comment on lines 7217 to 7219
if (dict == NULL) {
dictkeys_decref(interp, cached, false);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think new_dict_with_shared_keys already handles the dictkeys_decref in the error case.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, that's super subtle, I don't think there's any reason to spread this out, I'll move the incref into new_dict_with_shared_keys

Objects/object.c Outdated
@@ -1797,7 +1797,11 @@ PyObject_GenericSetDict(PyObject *obj, PyObject *value, void *context)
"not a '%.200s'", Py_TYPE(value)->tp_name);
return -1;
}
#ifdef Py_GIL_DISABLED
Py_XDECREF(_Py_atomic_exchange_ptr(dictptr, Py_NewRef(value)));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need to be consistent about either using atomic exchange/cas or using the object locks to protect the dictptr. I don't think we can safely mix them.

}
#endif
PyTypeObject *tp = Py_TYPE(obj);
if ((tp->tp_flags & Py_TPFLAGS_HEAPTYPE) && (cached = CACHED_KEYS(tp))) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe: _PyType_HasFeature(tp, Py_TPFLAGS_HEAPTYPE)

}
#endif
OBJECT_STAT_INC(dict_materialized_on_request);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we consistent about when we increment this stat?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We seem to actually have an extra call in materialize_managed_dict_lock_held which is the inline values case. This one seems like maybe it shouldn't exist at all, as we no longer have a simple values array which is outside of the dict, so even though we have shared keys here we will always have a dict. I'll remove it from here.

@DinoV DinoV force-pushed the nogil_dict_creation branch from e7e5ed4 to f328d37 Compare May 6, 2024 18:40
Copy link
Contributor

@colesbury colesbury left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Two minor comments below

@DinoV DinoV force-pushed the nogil_dict_creation branch from f328d37 to 9547a5f Compare May 6, 2024 22:01
@ambv ambv enabled auto-merge (squash) May 6, 2024 23:18
@ambv ambv merged commit 636b8d9 into python:main May 6, 2024
34 checks passed
SonicField pushed a commit to SonicField/cpython that referenced this pull request May 8, 2024
@DinoV DinoV deleted the nogil_dict_creation branch May 31, 2024 18:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants