-
-
Notifications
You must be signed in to change notification settings - Fork 18.5k
REGR: np.argwhere on pd.Series raises ValueError #35334
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
the 0.25.x behavior was wrong |
that's whats returned by numpy on the underlying arr, see numpy/numpy#13610 (numpy/numpy#15555 (comment))
|
not broken anything so marking as ready for review. do we want an atomic PR to just address the np.argwhere or do we want to look at the other __array_wrap__ on Indexes as well and include more NumPy functions. |
I don't think those links are relevant. np.argwhere returns an array with different dimensions see, https://numpy.org/doc/stable/reference/generated/numpy.argwhere.html and so we cannot box the result in a Series. (unless we reshape the data and break consistency with NumPy) |
There is a problem (stemming from numpy only passing context for ufuncs) if the result dimensions and shape match the input, boxing can produce confusing results
here boxing in a DataFrame with the same labels makes no sense. without the context we don't know which function is being called. maybe we just raise if context is None instead or always return a Numpy array and push on with implementing __array_function__ where we know the calling function. |
@jreback closing this. There are issues with boxing the return of numpy operations eg. #35334 (comment) I think the best course of action is leave the wrapping as is to avoid breaking api changes. Any issues, such as #35331 will need to wait for array_function to be implemented. (even though that issue was a regression) The fundamental issues are that the semantics of an array and a DataFrame are significantly different (and hence the return of np.argwhere shouldn't be boxed) and numpy only passing context for ufuncs (hence we don't know np.argwhere is the function calling array_wrap) |
not sure what behaviour we want (will add tests after discussion). We should be thinking of removing __array_wrap__ in favour of __array_ufunc__ and __array_function__ anyhow.
from https://numpy.org/doc/stable/user/basics.subclassing.html?highlight=__array_wrap__#array-wrap-for-ufuncs-and-other-functions
in 0.25.3 we returned the NumPy array (albeit with a warning)