ENH: Add shape to *_like() array creation #13046

Conversation
Please note I've optimistically added the …

adding …
This change maintains backwards compatibility by not passing new arguments to PyArray_NewLikeArray().
Alright, I think with the changes I just pushed, we're not breaking the API. Should I also update …

Yes.
Arrays created with new shapes should not take the original array's strides into consideration, and ndim == 0 should not be treated as a case that ignores the new shape, since the caller may request a 0-d array.
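A minimal sketch of the two points above, assuming NumPy >= 1.17 (where the `*_like()` functions gained the `shape` keyword):

```python
import numpy as np

# A prototype with Fortran-order strides:
proto = np.zeros((4, 5), dtype=np.int32, order='F')

# Requesting a new shape builds fresh strides instead of reusing the
# prototype's strides, which would not fit the new dimensions.
b = np.zeros_like(proto, shape=(3, 7))
print(b.shape)      # (3, 7)

# ndim == 0 is a legitimate request, not a signal to ignore ``shape``:
scalar = np.zeros_like(proto, shape=())
print(scalar.ndim)  # 0
```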
* Add comments to cversions.txt (new PyArray_NewLikeArrayWithShape function)
* Increment C_API_VERSION to 1.17 in setup_common.py
This change has a few benefits compared to the original proposal:

* Its format is compatible with NumPy's copy of empty sliced arrays, as formats must now match
* It avoids accidental overwriting of cupy_array when formats don't match
* The current CUDA stream may be used implicitly

This change implies that cupy_array is created beforehand. Eventually, we can benefit from numpy/numpy#13046 and this change combined.
Requirements:

* numpy/numpy#13046
* cupy/cupy#2079
It looks like this is ready for another round of review.

Do we need to expose this in the public numpy C API, or can it be done only at the python level? It would be nice to keep the C API to the minimum required, and this seems to be yet another wrapper around …

@mattip we need to expose the new …

We cannot currently remove already published API functions. Is there a need to expose the new function at the C-API level? Do we expect C-extension writers to use …

@pentschev What @mattip means is that we should simply remove the function from the header and make it internal to NumPy. That way we avoid changing the C API.

Not at the moment, but what is normally NumPy's C-API policy? Should it avoid exposing new functionality if there's no known immediate use case? Sorry if this question has been answered before.

Sorry, I didn't see that comment before. I understand it, and it's doable. My previous question is a general one: should we always prefer not exposing new C-API functions, or is there a general rule for when we should choose to expose them?

I don't recall that we have an explicit policy apart from not removing functions from the API, although they may be deprecated. Generally, new functions are added along with new basic functionality, particularly when the new functionality starts on the C side. The addition should also be discussed on the mailing list, and these days we might want an NEP as well. Needs some discussion.

@mrocklin I could use a suggestion from you here: for our use case with Dask and libraries like CuPy and Sparse, do you see us needing to pass …
From an applications perspective, I do not anticipate the libraries above needing to use the C-API. If dropping that accelerates things, then I'm for dropping it.
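As an illustration of the kind of downstream pattern being discussed (a hypothetical sketch, not code from this PR), a library such as Dask can hold a small zero-length "meta" array that preserves the dtype and ndim of a prototype without allocating its full shape, which only needs the Python-level `shape=` keyword:

```python
import numpy as np

# Hypothetical use case: a zero-length "meta" array carrying dtype/ndim.
# Requires NumPy >= 1.17 for the ``shape`` keyword on *_like().
x = np.arange(12, dtype=np.float32).reshape(3, 4)
meta = np.empty_like(x, shape=(0,) * x.ndim)
print(meta.shape, meta.dtype)  # (0, 0) float32
```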
Ok, looks like …

Checking in here. Is there anything we can do to accelerate this?
- If order='K', try to keep it; otherwise order='C' is implied
- Do not raise ValueError anymore
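A sketch of the behavior those commits describe (assuming NumPy >= 1.17): with the default order='K' and an unchanged number of dimensions, the prototype's layout is kept; when ndim changes, C order is used instead of raising:

```python
import numpy as np

f = np.zeros((2, 3, 4), dtype=np.uint8, order='F')

# Same ndim: order='K' (the default) keeps the Fortran layout.
same_ndim = np.zeros_like(f, shape=(3, 2, 4))
print(same_ndim.flags['F_CONTIGUOUS'])  # True

# Different ndim: falls back to C order rather than raising ValueError.
diff_ndim = np.zeros_like(f, shape=(2, 3, 4, 5))
print(diff_ndim.flags['C_CONTIGUOUS'])  # True
```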
I pushed some changes which I believe now handle @eric-wieser's latest request.

What I was hoping for is a distinction between order=None and order='K' at the C level, and some feedback from other maintainers on whether that's a good idea.

I guess that would probably need a change of signature of either …

It seems to me that the failures are unrelated to this PR. Could anyone confirm this?

@eric-wieser could we merge this PR as-is and open an issue to fix …

This has been approved by two core contributors, and seems to address @eric-wieser's major concerns. Eric, do you want to go ahead and merge?
I started writing a justification for changing the default to …

Given this, I think we should do one of three things for when …

My preference here is 1.ii, but really my goal is to ensure we don't pick 3.
```diff
@@ -1259,14 +1267,14 @@ PyArray_NewLikeArray(PyArrayObject *prototype, NPY_ORDER order,
         for (idim = ndim-1; idim >= 0; --idim) {
             npy_intp i_perm = strideperm[idim].perm;
             strides[i_perm] = stride;
-            stride *= shape[i_perm];
+            stride *= dims[i_perm];
         }
```
My 1.ii proposal would look like changing everything between the new line 1261 and here to:

```c
if (PyArray_NDIM(prototype) >= ndim) {
    int leading_dims = PyArray_NDIM(prototype) - ndim;
    /* Use only the trailing strides */
    PyArray_CreateSortedStridePerm(ndim,
                                   PyArray_STRIDES(prototype) + leading_dims,
                                   strideperm);
    /* Build the new strides */
    stride = dtype->elsize;
    for (idim = ndim-1; idim >= 0; --idim) {
        npy_intp i_perm = strideperm[idim].perm;
        strides[i_perm] = stride;
        stride *= shape[i_perm];
    }
}
else {
    int leading_dims = ndim - PyArray_NDIM(prototype);
    /* Use all the strides */
    PyArray_CreateSortedStridePerm(PyArray_NDIM(prototype),
                                   PyArray_STRIDES(prototype),
                                   strideperm);
    /* Build the new trailing strides */
    stride = dtype->elsize;
    for (idim = PyArray_NDIM(prototype)-1; idim >= 0; --idim) {
        npy_intp i_perm = strideperm[idim].perm + leading_dims;
        strides[i_perm] = stride;
        stride *= shape[i_perm];
    }
    /* Create remaining leading strides as C order. Note the loop starts
     * at leading_dims - 1: only axes 0..leading_dims-1 are leading axes;
     * starting at leading_dims would clobber a trailing stride. */
    for (idim = leading_dims - 1; idim >= 0; --idim) {
        strides[idim] = stride;
        stride *= shape[idim];
    }
}
```
Or trading branching for state:

```c
/* not sure if this is clearer via min/max */
int leading_src_dims = 0;  /* max(src.ndim - dst.ndim, 0) */
int leading_dst_dims = 0;  /* max(dst.ndim - src.ndim, 0) */
int shared_dims;           /* min(src.ndim, dst.ndim) */
if (PyArray_NDIM(prototype) >= ndim) {
    shared_dims = ndim;
    leading_src_dims = PyArray_NDIM(prototype) - ndim;
}
else {
    shared_dims = PyArray_NDIM(prototype);
    leading_dst_dims = ndim - PyArray_NDIM(prototype);
}

/* Use only the trailing strides from the source */
PyArray_CreateSortedStridePerm(shared_dims,
                               PyArray_STRIDES(prototype) + leading_src_dims,
                               strideperm);

/* Build the destination trailing strides. Only shared_dims entries of
 * strideperm are valid, so the loop is bounded by shared_dims. */
stride = dtype->elsize;
for (idim = shared_dims-1; idim >= 0; --idim) {
    npy_intp i_perm = strideperm[idim].perm + leading_dst_dims;
    strides[i_perm] = stride;
    stride *= shape[i_perm];
}

/* Create remaining leading strides as C order
 * (axes 0..leading_dst_dims-1, hence the - 1) */
for (idim = leading_dst_dims - 1; idim >= 0; --idim) {
    strides[idim] = stride;
    stride *= shape[idim];
}
```
An example of what I'd expect from the behavior of 1.ii is, I think:

```python
# note: number of * shows stride order
a = np.zeros((2, 3, 5, 7), dtype=np.uint8, order='F')
assert a.strides == (1, 1*2, 1*2*3, 1*2*3*5)
a_t = a.transpose((2, 0, 3, 1))
assert a_t.strides == (1*2*3, 1, 1*2*3*5, 1*2)

b = np.zeros_like(a, shape=a.shape[1:], dtype=a.dtype)
assert b.strides == (1, 1*3, 1*3*5)  # note: just remove the 2 factors
b_t = np.zeros_like(a_t, shape=a_t.shape[1:], dtype=a_t.dtype)
assert b_t.strides == (1, 1*2*3, 1*2)  # note: just remove the 5 factors

c = np.zeros_like(a, shape=(11,) + a.shape, dtype=a.dtype)
assert c.strides == (1*2*3*5*7,) + a.strides  # C order beyond matching dimensions
c_t = np.zeros_like(a_t, shape=(11,) + a_t.shape, dtype=a_t.dtype)
assert c_t.strides == (1*2*3*5*7,) + a_t.strides
```
I'm fine with this. These very detailed considerations on strides have not, AFAIK, been considered before in relation to the `*_like` functions. I would recommend merging this right now, and approving a future change without a deprecation cycle. Requiring changes almost ad infinitum before merging is potentially discouraging for contributors, especially if there's back and forth between maintainers who don't agree on small details.
I think we've actually managed to do a good job at doing better than picking between C/F
To be clear, I'm suggesting a performance dependency, not a behavior dependency.
Sounds good to me.
Great, thanks guys. Yes, merging as-is is good; I would have been fine with the error path for now, even. Should we discuss where to go next? I would be on board with the 1.ii choice (the undefined part is C order, except when everything is F order already, I suppose). There is maybe some question of what to do if the array is almost F (i.e. F order, but not contiguous). It seems the closest we will get to "keeping" some layout. Frankly, I doubt it will be used much (with someone caring about the result) anyway.
Thanks so much everyone for the reviews! FWIW, I tried testing the code you suggested as-is, @eric-wieser, and it didn't seem to work for all cases. I admit I don't really understand all this complex striding, so maybe I did something wrong. Feel free to ping me if I can help with any of the fixes.
Thanks for the contribution @pentschev! |
This solves #13043.
@shoyer @mrocklin @hameerabbasi