-
-
Notifications
You must be signed in to change notification settings - Fork 25.8k
RFC Guideline for usage of Cython types #25572
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
I am +1 on stopping the use of When this is a fused type to work with several precision levels, then I think our usual For tempita variable names, maybe we could also reuse the For integers, when they are used for data in our Cython code, they are mostly used as contiguous category or class indices among a precomputed list of possible values, typically the |
I am against using type names with implicit precision levels ( |
I also want to stop using all the My preference is to avoid # sklearn/utils/_typedefs.pyd
ctypedef Py_ssize_t intp_t
ctypedef unsigned char bool_t
ctypedef signed int int32_t
ctypedef signed long long int64_t
ctypedef float float32_t
ctypedef double float64_t
# etc and use those typedefs everywhere. This means our Cython code would look like: import numpy as np
from sklearn.utils._typedefs cimport float32_t
cdef float32_t[:] X = np.zeros(10, dtype=np.float32) I like having the precision levels explicit in the type, but I also want to avoid including NumPy headers just for the types. The cost is copying some of the typedefs from NumPy. |
I agree with what @ogrisel and @thomasjpfan propose. In scikit-learn/sklearn/tree/_tree.pxd Lines 16 to 20 in 9956210
Yet, it is Moreover, should we create our own fused types? |
I think we have an agreement with @thomasjpfan proposal in #25572 (comment). Now, we need a volunteer... One remark about predefined fused types like def fun(floating a[:], floating b[:]):
cdef:
int i
int n = a.shape[0]
for i in range(n):
a[i] += b[i] This enforces |
For fused types, I prefer using Cython's For commonly fused types, I can see doing something like scikit-image: https://github.com/scikit-image/scikit-image/blob/main/skimage/_shared/fused_numerics.pxd |
I agree with @thomasjpfan's proposal, and regarding fused types I'd like to echo my sentiment from a prior discussion: #24153 (comment) TL;DR With that being said, I think Cython standard fused types (e.g. |
I'd also be glad to get rid of all the DTYPE_t and friends. |
Note that we won't be able to complete this until we require cython 3 due to a bug I just found where cython doesn't properly infer types: %%cython
cimport numpy as cnp
ctypedef Py_ssize_t intp_t # like npy_intp and intp_t are defined in numpy
cdef void f(cnp.npy_intp* x):
return x
cdef intp_t a = 0
f(&a) # cython error: Cannot assign type 'intp_t *' to 'npy_intp *' The function expects pointer to cnp.npy_intp, but we pass pointer to intp_t and it crashes although they are defined the same way. I checked and it's fixed in the master branch of cython, but not in the 0.29.x branch. |
I opened #25942 to finalize the removal of old defined types in utils._typedefs. The next step will be to use the new defined types in utils._typedefs in cython modules that used to not import types from there but defined DTYPE_t, ITYPE_t, ... locally. |
What's left to do? A short entry in the dev doc? Anything else? |
There are still several files that use custom typedefs defined locally that should now use the types defined in |
Some are useful. For HGBT, for instance, you can easily play around with the effect of different types only because it uses its own typedefs. |
I noticed this when building a downstream application that leverages Cython code from scikit-learn. In #27352, I changed https://numpy.org/doc/stable/reference/arrays.scalars.html#numpy.uintc It seems that numpy's uint32 is an alias for either cc: @lorentzenchr and @jjerphan |
Have you observed problems, @adam2392? |
The above links no longer work with the latest version of Cython. But I found: https://github.com/cython/cython/blob/3.0.9/Cython/Includes/libc/stdint.pxd#L15 and on numpy's side: https://github.com/numpy/numpy/blob/v1.26.4/numpy/__init__.pxd#L280 which are both consistent with aliasing |
Goal
Have a documented consensus on which types to use in Cython code.
Types
We should distinguish between floating point numbers and integers. We may also split the use cases of integers: As data value and as index for pointers and memoryviews.
Linked issues
#24153, #25555 (comment), #23865 (comment), #23865 (comment)
Also note the Cython bug with
const
types: cython/cython#5230The text was updated successfully, but these errors were encountered: