
MAINT: convert convert_datatype and dtypemeta to c++ #28253


Closed
ngoldbaum wants to merge 1 commit

Conversation

@ngoldbaum (Member) commented Jan 30, 2025

Towards solving #28048, to allow me to use C++ standard library synchronization primitives.

This ended up being a little more involved than I initially expected based on just compiling with clang on my Mac, but it looks like things are working now.

I think the only non-trivial bits are due to moving away from designated initializer lists, which require C++20. Happy to go with alternate approaches to handle those. The segfaults I saw earlier were from not zero-initializing some of the structs that used to use designated initializers. I now explicitly zero-initialize all function-local structs (static structs are always zero-initialized).

NULL,
0, /* type_num filled in below */
NULL,
NPY_DT_LEGACY,
@ngoldbaum (Member Author):

not sure if there's a less verbose way to deal with MSVC not liking designated initializer lists in C++ code.
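
For context, a minimal self-contained sketch of the options being discussed; the struct and field values below are purely illustrative and not actual NumPy types:

    #include <cstddef>

    // Illustrative struct only; not an actual NumPy type.
    struct ExampleSpec {
        void *typeobj;
        int type_num;
        void *slots;
        int flags;
    };

    // Designated initializers (C99, or C++20) would look like:
    //     ExampleSpec spec = {.flags = 1};
    // Pre-C++20 MSVC rejects that in C++ mode, leaving the two patterns
    // used in this PR:

    // 1) Positional aggregate initialization with comments naming the
    //    fields (the style in the hunk above), for static/file-scope data:
    static ExampleSpec static_spec = {
        NULL,
        0,     /* type_num filled in later */
        NULL,
        1,     /* flags */
    };

    // 2) For function-local structs: value-initialize, then assign only
    //    the members that matter.
    int make_example(void) {
        ExampleSpec spec = {};   // value-initialization zeroes every member
        spec.flags = 1;
        return spec.type_num + static_spec.type_num;   // both still 0
    }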

A comment from @ngoldbaum was marked as resolved.

@ngoldbaum marked this pull request as draft January 30, 2025 23:01
@ngoldbaum marked this pull request as ready for review January 31, 2025 22:48
@ngoldbaum (Member Author):

@mhvk maybe you would like to look at this?

I'm also happy to actually implement the locking on the castingimpls dict and make sure it fixes the issue before merging this.

@mhvk (Contributor) left a comment:

In this PR, I of course cannot see any of the benefits; it seems like quite a bit of churn. Despite probably getting quite blind to mistakes because of the repetitive nature, I think I may have spotted a single forgotten decref. Otherwise, I'll trust you that this will help eventually. As is, it is not exactly an advertisement for the benefits of C++, but that's probably unfair; we would have written things differently had we started in C++...

PyArray_DTypeMeta *dtypes[2] = {NPY_DTYPE(descr), given_DType};
PyArray_Descr *given_descrs[2] = {descr, NULL};
PyArray_Descr *loop_descrs[2];

PyObject *tmp = PyArray_GetCastingImpl(NPY_DTYPE(descr), given_DType);
PyArrayMethodObject *meth = (PyArrayMethodObject *)tmp;
@mhvk (Contributor):

Move the meth declaration down to just before the if?

@@ -2121,13 +2115,13 @@ simple_cast_resolve_descriptors(

loop_descrs[0] = NPY_DT_CALL_ensure_canonical(given_descrs[0]);
if (loop_descrs[0] == NULL) {
return -1;
return (NPY_CASTING)-1;
@mhvk (Contributor):

Probably for another PR, but I wonder if one shouldn't just do return _NPY_ERROR_OCCURRED_IN_CAST;?
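For context on why the cast appears at all: C allows an implicit int-to-enum conversion on return, while C++ does not. A minimal sketch with an illustrative enum (not the actual NPY_CASTING definition):

    // Illustrative enum only; not the actual NPY_CASTING definition.
    enum ExampleCasting {
        EXAMPLE_NO_CASTING   = 0,
        EXAMPLE_SAFE_CASTING = 1,
        EXAMPLE_ERROR        = -1,   // a named error value, as suggested above
    };

    ExampleCasting resolve(bool ok)
    {
        if (!ok) {
            // return -1;                // fine in C, an error in C++
            return (ExampleCasting)-1;   // what this PR does
            // return EXAMPLE_ERROR;     // the named-constant alternative
        }
        return EXAMPLE_SAFE_CASTING;
    }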

.dtypes = dtypes,
.slots = slots,
};
PyArrayMethod_Spec spec = {};
@mhvk (Contributor):

This is pretty sad indeed, just because of one crappy compiler. Oh well.
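One slightly less verbose alternative (just a sketch, not something this PR does) is a small helper that fills the spec in one place, assuming the public PyArrayMethod_Spec fields (name, nin, nout, casting, flags, dtypes, slots) and the usual NumPy internal headers:

    /* Hypothetical helper, not part of this PR: centralize the
     * zero-init-plus-assign boilerplate so each cast setup stays short. */
    static PyArrayMethod_Spec
    make_cast_spec(const char *name, NPY_CASTING casting,
                   NPY_ARRAYMETHOD_FLAGS flags,
                   PyArray_DTypeMeta **dtypes, PyType_Slot *slots)
    {
        PyArrayMethod_Spec spec = {};  /* members not set below stay zeroed */
        spec.name = name;
        spec.nin = 1;
        spec.nout = 1;
        spec.casting = casting;
        spec.flags = flags;
        spec.dtypes = dtypes;
        spec.slots = slots;
        return spec;
    }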

}

result = 0;
finish:
@mhvk (Contributor):

Why remove this? It seemed nicer not to duplicate that decref everywhere.

}

result = 0;
finish:
Py_XDECREF(other_dt);
@mhvk (Contributor):

Are we sure the decref is not needed in the non-error path?
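For readers unfamiliar with the pattern under discussion, a minimal hypothetical illustration (not the actual NumPy function) of the single finish:/Py_XDECREF cleanup style: the label runs on both the success and the error path, which is what the question above is about.

    #include <Python.h>

    /* Hypothetical illustration only; not the actual NumPy code. */
    static int
    example_helper(PyObject *obj)
    {
        int result = -1;
        PyObject *other_dt = PyObject_GetAttrString(obj, "dtype");
        if (other_dt == NULL) {
            goto finish;       /* error path: XDECREF of NULL is a no-op */
        }
        /* ... use other_dt ... */
        result = 0;
      finish:
        Py_XDECREF(other_dt);  /* runs on success and on error alike */
        return result;
    }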

@@ -3385,7 +3379,8 @@ object_to_object_get_loop(
NpyAuxData **out_transferdata,
NPY_ARRAYMETHOD_FLAGS *flags)
{
*flags = NPY_METH_REQUIRES_PYAPI | NPY_METH_NO_FLOATINGPOINT_ERRORS;
*flags = (NPY_ARRAYMETHOD_FLAGS)(
@mhvk (Contributor):

This is so weird: one cannot even OR enum flags together and keep the same type?
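What trips C++ up here: the bitwise OR of two enumerators yields an int, and C++ (unlike C) refuses to implicitly convert that int back to the enum type. A minimal sketch with an illustrative enum (not the real NPY_ARRAYMETHOD_FLAGS), plus the usual operator| workaround:

    // Illustrative flags enum only; not the actual NPY_ARRAYMETHOD_FLAGS.
    enum ExampleFlags {
        EXAMPLE_REQUIRES_PYAPI = 1 << 0,
        EXAMPLE_NO_FP_ERRORS   = 1 << 1,
    };

    void set_flags(ExampleFlags *flags)
    {
        // In C this compiles without a cast; in C++ the | yields an int,
        // and the implicit int -> enum conversion is an error, hence:
        *flags = (ExampleFlags)(EXAMPLE_REQUIRES_PYAPI | EXAMPLE_NO_FP_ERRORS);
    }

    // A common alternative is to define the operator once, so call sites
    // can keep writing a | b without a cast:
    inline ExampleFlags operator|(ExampleFlags a, ExampleFlags b)
    {
        return (ExampleFlags)((int)a | (int)b);
    }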

@ngoldbaum added the "39 - free-threading" label (PRs and issues related to support for free-threading CPython, a.k.a. no-GIL, PEP 703) on Feb 4, 2025
@seberg (Member) commented Feb 5, 2025

I have not had a close look. For what it's worth, I tend to think that at least most of this is probably good: we should make e.g. the headers C++-ready, because that is probably where things are heading in the future. (Even if some things are unclear; for example, we should maybe stop using the magic /**NUMPY_API comment, or make that magic create an extern "C" wrapper? The zombie state between C and C++ in some of these places is a bit unfortunate.)

So I am happy if this is pushed forward (modulo someone reviewing the details).

On the other hand, I still tend to think that the issue that triggered this isn't a good reason for the conversion, and that we should just do the following:

The diff is large, but only because we need to refactor things into a helper function, since a critical section creates a block. (Sorry, the first version had a small bug.)

I have not tested that this fixes the bug on a free-threading build, but concrete code is easier to discuss.
The critical section enclosure could be made larger (the second dict addition always adds the identical object, so I am not sure TSAN would complain); the "legacy check" would then either need to be moved out of the critical section, or we would need to also add Py_None to the dictionary in that branch (which is totally fine, IMO!).

diff --git a/numpy/_core/src/multiarray/convert_datatype.c b/numpy/_core/src/multiarray/convert_datatype.c
index a0f09b71a3..7317d82ae5 100644
--- a/numpy/_core/src/multiarray/convert_datatype.c
+++ b/numpy/_core/src/multiarray/convert_datatype.c
@@ -63,6 +63,37 @@ static PyObject *
 PyArray_GetObjectToGenericCastingImpl(void);
 
 
+static PyObject *
+ensure_castingimpl_exists(PyArray_DTypeMeta *from, PyArray_DTypeMeta *to) {
+    if (from != to) {
+        /* A cast function must have been registered */
+        PyArray_VectorUnaryFunc *castfunc = PyArray_GetCastFunc(
+                from->singleton, to->type_num);
+        if (castfunc == NULL) {
+            PyErr_Clear();
+            /* Remember that this cast is not possible */
+            if (PyDict_SetItem(NPY_DT_SLOTS(from)->castingimpls,
+                        (PyObject *) to, Py_None) < 0) {
+                return NULL;
+            }
+            Py_RETURN_NONE;
+        }
+    }
+
+    /* PyArray_AddLegacyWrapping_CastingImpl find the correct casting level: */
+    /*
+    * TODO: Possibly move this to the cast registration time. But if we do
+    *       that, we have to also update the cast when the casting safety
+    *       is registered.
+    */
+    if (PyArray_AddLegacyWrapping_CastingImpl(from, to, -1) < 0) {
+        return NULL;
+    }
+    /* The castingimpl now always exists (or is None, so we can recurse) */
+    return PyArray_GetCastingImpl(from, to);
+}
+
+
 /**
  * Fetch the casting implementation from one DType to another.
  *
@@ -75,15 +106,14 @@ PyArray_GetObjectToGenericCastingImpl(void);
 NPY_NO_EXPORT PyObject *
 PyArray_GetCastingImpl(PyArray_DTypeMeta *from, PyArray_DTypeMeta *to)
 {
-    PyObject *res;
+    PyObject *res = NULL;
     if (from == to) {
-        res = (PyObject *)NPY_DT_SLOTS(from)->within_dtype_castingimpl;
+        res = Py_XNewRef((PyObject *)NPY_DT_SLOTS(from)->within_dtype_castingimpl);
     }
-    else {
-        res = PyDict_GetItemWithError(NPY_DT_SLOTS(from)->castingimpls, (PyObject *)to);
+    else if (PyDict_GetItemRef(NPY_DT_SLOTS(from)->castingimpls, (PyObject *)to, &res) < 0) {
+        return NULL;
     }
-    if (res != NULL || PyErr_Occurred()) {
-        Py_XINCREF(res);
+    if (res != NULL) {
         return res;
     }
     /*
@@ -120,31 +150,20 @@ PyArray_GetCastingImpl(PyArray_DTypeMeta *from, PyArray_DTypeMeta *to)
         return NULL;
     }
     else {
-        if (from != to) {
-            /* A cast function must have been registered */
-            PyArray_VectorUnaryFunc *castfunc = PyArray_GetCastFunc(
-                    from->singleton, to->type_num);
-            if (castfunc == NULL) {
-                PyErr_Clear();
-                /* Remember that this cast is not possible */
-                if (PyDict_SetItem(NPY_DT_SLOTS(from)->castingimpls,
-                            (PyObject *) to, Py_None) < 0) {
-                    return NULL;
-                }
-                Py_RETURN_NONE;
-            }
-        }
-
-        /* PyArray_AddLegacyWrapping_CastingImpl find the correct casting level: */
         /*
-         * TODO: Possibly move this to the cast registration time. But if we do
-         *       that, we have to also update the cast when the casting safety
-         *       is registered.
+         * We need to create and add the CastingImpl while making sure no other
+         * threads does so.  Before we add it check again (another thread may
+         * have added it in the meantime).
          */
-        if (PyArray_AddLegacyWrapping_CastingImpl(from, to, -1) < 0) {
-            return NULL;
+        Py_BEGIN_CRITICAL_SECTION((NPY_DT_SLOTS(from)->castingimpls));
+        if (PyDict_GetItemRef(NPY_DT_SLOTS(from)->castingimpls, (PyObject *)to, &res) < 0) {
+            /* We will return NULL on error */
         }
-        return PyArray_GetCastingImpl(from, to);
+        else if (res == NULL) {
+            res = ensure_castingimpl_exists(from, to);
+        }
+        Py_END_CRITICAL_SECTION();
+        return res;
     }
 
     if (res == NULL) {

@ngoldbaum (Member Author):

I chatted with @colesbury about this and he confirmed that Sebastian's approach should be fine because PyDict is a concurrent hash table. This pattern is only thread-unsafe if you are working with a hash table that doesn't support concurrent use (like PyArray_IdentityHash).

That said, he pointed out that, because critical sections are reentrant, it's probably better to attach a PyMutex to the dtypemeta struct on the free-threaded build and use that for the locking instead of a critical section.
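A rough sketch of what that could look like, assuming CPython 3.13's public PyMutex API; the castingimpls_mutex field and the lock macros are assumptions for illustration, not the follow-up PR. Since a PyMutex is not reentrant, the creation step would have to be done inline rather than by recursing back into PyArray_GetCastingImpl while the lock is held.

    /* Hypothetical sketch; castingimpls_mutex is an assumed new field on
     * the DType slots, and the macros below are illustrative only. */
    #ifdef Py_GIL_DISABLED
    #define LOCK_CASTINGIMPLS(slots)   PyMutex_Lock(&(slots)->castingimpls_mutex)
    #define UNLOCK_CASTINGIMPLS(slots) PyMutex_Unlock(&(slots)->castingimpls_mutex)
    #else
    #define LOCK_CASTINGIMPLS(slots)
    #define UNLOCK_CASTINGIMPLS(slots)
    #endif

    /* ... inside PyArray_GetCastingImpl, in place of the critical section: */
    LOCK_CASTINGIMPLS(NPY_DT_SLOTS(from));
    /* Re-check under the lock: another thread may have added it meanwhile. */
    if (PyDict_GetItemRef(NPY_DT_SLOTS(from)->castingimpls,
                          (PyObject *)to, &res) == 0 && res == NULL) {
        /* create the legacy casting impl and insert it into the dict here,
           without calling back into PyArray_GetCastingImpl */
    }
    UNLOCK_CASTINGIMPLS(NPY_DT_SLOTS(from));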

I'm going to close this and redo the fix with an approach similar to Sebastian's diff, but using a PyMutex. If someone wants to take another stab at converting bits of the multiarray module to C++, they can use this PR as a place to start.

@ngoldbaum closed this Feb 5, 2025
Labels: 03 - Maintenance; 39 - free-threading