-
-
Notifications
You must be signed in to change notification settings - Fork 10.9k
ENH: Improved performance of PyArray_FromAny for sequences of array-like #13399
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
5cf5ee7
to
edcb829
Compare
edcb829
to
0187811
Compare
Running
shows a marked improvement. Before
After
|
return NULL; | ||
} | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For reviewers: the original code had a check clause at this point if (tmp != Py_NotImplemented){ }
which has been folded into the logic below instead
LGTM. Maybe could add a benchmark to |
I worry that this needs to handle broken ctypes buffer formats in the same way we do elsewhere. Will try to look at the diff later |
0187811
to
1b57708
Compare
Great, but that should be a follow-on PR, not part of this one, unless it is a one-line fix |
@eric-wieser since this PR reuses existing code, could we push off possible broken ctypes PEP 3118 handling to a different issue/PR? This seems like a clear win for more common memoryview and |
1b57708
to
9f8826a
Compare
@mattip: At a cursory glance, I was worried this introduced a regression in the following case:
Testing seems to show that the error is produced both before and after this patch, so there's no need to worry about it here. |
9f8826a
to
2363084
Compare
Prior to this commit np.array([array_like]) would recursively copy each element of array_like. This is due to the fact that setArrayFromSequence only special-cased lists of NumPy arrays, any other object was treated as a sequence even if it supported buffer or __array*__ interfaces. See tensorflow/tensorflow#27692 for details. The commit generalizes the special-case in setArrayFromSequence to any array-like, i.e. a buffer or an object with __array__, __array_interface__ __array_struct__.
2363084
to
71fc59d
Compare
I created issue #13628 about This looks ready to merge, any more comments? |
Thanks @superbobry |
This seems to have broken pandas, where |
Fixup of gh-13399 to avoid a regression with pandas. The actual output after this is the same as before, but incorrect with respect to the output shape. However, it seems a bit larger and an issue to look at after the 1.17 release. * BUG: test, fix regression for array([pandas.DataFrame()]) * BUG: refactor to make sure tmp is DECREFfed
…rray-like" This reverts commit 71fc59d which was part of numpygh-13399 and the followup commit ba53a63 from numpygh-13663. The issue is that we mix Sequence protocol and `__array__` usage when coercing arrays. It seems easiest to revert this for the 1.17.x release and make a larger breaking change in the 1.18.x release cycle. Since this only occurs for stranger array-likes. This revert is limited to the 1.17.x branch for now.
…rray-like" This reverts commit 71fc59d which was part of numpygh-13399 and the followup commit ba53a63 from numpygh-13663. The issue is that we mix Sequence protocol and `__array__` usage when coercing arrays. It seems easiest to revert this for the 1.17.x release and make a larger breaking change in the 1.18.x release cycle. Since this only occurs for stranger array-likes. This revert is limited to the 1.17.x branch for now.
…rray-like" This reverts commit 71fc59d which was part of numpygh-13399 and the followup commit ba53a63 from numpygh-13663. The issue is that we mix Sequence protocol and `__array__` usage when coercing arrays. It seems easiest to revert this for the 1.18.x release and make a larger breaking change in the 1.19.x release cycle. Since this only occurs for stranger array-likes.
Prior to this commit
would recursively copy each element of
array_like
. This is due to the fact thatsetArrayFromSequence
only special-cased lists of NumPy arrays, any otherobject was treated as a sequence even if it supported buffer or
__array*__
interfaces. See tensorflow/tensorflow#27692 for details.
The commit generalizes the special-case in
setArrayFromSequence
to anyarray-like, i.e. a buffer or an object with
__array__
,__array_interface__
__array_struct__
.