argwhere does not work with pandas Series #15555

Closed
drkarthi opened this issue Feb 12, 2020 · 10 comments

@drkarthi

drkarthi commented Feb 12, 2020

np.argwhere() does not work on a pandas Series with NumPy v1.18.1, although it works with the older v1.17.3. In addition, np.where() works on a pandas Series but np.argwhere() does not.

Reproducing code example:

import numpy as np
import pandas as pd

ser = pd.Series(np.random.randn(5), index=['a', 'b', 'c', 'd', 'e'])
np.argwhere(ser < 0)

Error message:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<__array_function__ internals>", line 6, in argwhere
  File "/Users/kramanathan/anaconda3/envs/py3.7/lib/python3.7/site-packages/numpy/core/numeric.py", line 584, in argwhere
    return transpose(nonzero(a))
  File "<__array_function__ internals>", line 6, in nonzero
  File "/Users/kramanathan/anaconda3/envs/py3.7/lib/python3.7/site-packages/numpy/core/fromnumeric.py", line 1896, in nonzero
    return _wrapfunc(a, 'nonzero')
  File "/Users/kramanathan/anaconda3/envs/py3.7/lib/python3.7/site-packages/numpy/core/fromnumeric.py", line 58, in _wrapfunc
    return _wrapit(obj, method, *args, **kwds)
  File "/Users/kramanathan/anaconda3/envs/py3.7/lib/python3.7/site-packages/numpy/core/fromnumeric.py", line 51, in _wrapit
    result = wrap(result)
  File "/Users/kramanathan/anaconda3/envs/py3.7/lib/python3.7/site-packages/pandas/core/generic.py", line 1918, in __array_wrap__
    return self._constructor(result, **d).__finalize__(self)
  File "/Users/kramanathan/anaconda3/envs/py3.7/lib/python3.7/site-packages/pandas/core/series.py", line 292, in __init__
    f"Length of passed values is {len(data)}, "
ValueError: Length of passed values is 1, index implies 5.

Numpy/Python version information:

1.18.1 3.7.4 (default, Aug 13 2019, 20:35:49)

@drkarthi changed the title from "argwhere does not work with pandas Series in the latest release" to "argwhere does not work with pandas Series" on Feb 12, 2020
@eric-wieser
Member

Likely caused by me in #13610

@drkarthi
Author

I have just updated the issue with the full error message

@drkarthi
Author

Also interestingly, it does not fail with Python 3.6.9 and numpy 1.18.1 but fails with Python 3.7.4 and numpy 1.18.1

@eric-wieser
Member

My guess is your pandas version is different between the two python versions?

@seberg
Member

seberg commented Feb 12, 2020

I think this may be a pandas issue, and the error is probably correct. Note that I get:

/usr/lib/python3/dist-packages/numpy/core/fromnumeric.py:61: FutureWarning: Series.nonzero() is deprecated and will be removed in a future version.Use Series.to_numpy().nonzero() instead

with my default pandas+numpy version. So I doubt it was a change inside NumPy that caused the change in behaviour.
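Following the hint in that warning, one possible workaround is to convert the Series to a plain NumPy array before calling np.argwhere, so that no Series has to be rebuilt from the result. This is only a sketch: the sample values are made up (fixed instead of random so the result is predictable), and it assumes a pandas version that provides Series.to_numpy().

```python
import numpy as np
import pandas as pd

# Fixed values (illustrative) instead of np.random.randn, so the result is stable.
ser = pd.Series([0.5, -1.2, 0.3, -0.7, 2.0], index=['a', 'b', 'c', 'd', 'e'])

# Convert the boolean Series to a plain ndarray before handing it to NumPy.
mask = (ser < 0).to_numpy()

positions = np.argwhere(mask)   # integer positions, one row per match
labels = ser.index[mask]        # the matching index labels, if those are wanted

print(positions.tolist())
print(list(labels))
```

Here positions holds integer offsets into the Series, not its labels, so ser.iloc (not ser.loc) would be the matching accessor.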

Now... the whole thing comes down to __array_wrap__ in pandas which cannot wrangle the (array) result back into a Series. That makes a lot of sense, but I am not sure whether we want to modify what NumPy does here.
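To make that failure mode concrete, here is a simplified model of the path shown in the traceback. It does not use NumPy's or pandas' internals: LabeledVector, rewrap and nonzero_then_rewrap are hypothetical names invented for this sketch, standing in for Series, __array_wrap__ and the fallback in fromnumeric.py respectively. The assumption is that nonzero() returns a tuple holding one index array, which becomes an array of length 1 along its first axis, and that this is what the container then fails to match against its five-element index.

```python
import numpy as np


class LabeledVector:
    """Hypothetical stand-in for a Series: values plus an index of equal length."""

    def __init__(self, values, index):
        values = np.asarray(values)
        if len(values) != len(index):
            raise ValueError(
                f"Length of passed values is {len(values)}, "
                f"index implies {len(index)}."
            )
        self.values = values
        self.index = list(index)

    def rewrap(self, result):
        # Plays the role of __array_wrap__: rebuild the container around
        # the result, reusing the original index.
        return LabeledVector(result, self.index)


def nonzero_then_rewrap(obj):
    # Simplified model of the fallback path: call nonzero on a plain
    # array, then hand the result back to the container.
    result = np.asarray(obj.values).nonzero()  # tuple holding one index array
    result = np.asarray(result)                # shape (1, k), so len() is 1
    return obj.rewrap(result)


vec = LabeledVector([0.5, -1.2, 0.3, -0.7, 2.0], index=["a", "b", "c", "d", "e"])
negative = LabeledVector(vec.values < 0, vec.index)

try:
    nonzero_then_rewrap(negative)
    message = None
except ValueError as exc:
    message = str(exc)

print(message)
```

In this model the re-wrapping step is what raises, not the nonzero computation itself, which matches the observation that the error surfaces inside the pandas constructor.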

@drkarthi
Author

Yes, my pandas versions are 0.25.3, where it does not fail, and 1.0.1, where it fails.

@drkarthi
Author

drkarthi commented Feb 12, 2020

If the issue is in converting an array to a Series, it seems like a pandas issue. I can take a look into why they chose to deprecate Series.nonzero(), since it provides an inconsistent user interface between some functions (np.where) and others (np.argwhere).

@drkarthi
Author

The reason I think this is an issue (either for pandas or numpy) is that the user does not necessarily know that np.argwhere uses the nonzero() function behind the scenes.
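The relationship between these functions can be checked on a plain array. The traceback above shows argwhere returning transpose(nonzero(a)), and the NumPy documentation describes single-argument np.where(condition) as shorthand for np.asarray(condition).nonzero(). A small sketch with an illustrative boolean mask:

```python
import numpy as np

# Illustrative mask; on a plain ndarray all three calls succeed.
mask = np.array([True, False, True, False, False])

via_argwhere = np.argwhere(mask)              # one row per True element
via_nonzero = np.transpose(np.nonzero(mask))  # what the traceback shows argwhere doing
via_where = np.where(mask)                    # tuple of index arrays, like nonzero

print(via_argwhere.tolist())
print(via_where[0].tolist())
```

So argwhere and where report the same positions in different layouts: argwhere gives an array with one row per match, where gives a tuple with one array per dimension.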

@miccoli
Contributor

miccoli commented Oct 26, 2023

Just for reference: I was not able to reproduce this bug with numpy 1.26 and pandas 2.1.1... maybe it is safe to close the issue.

@seberg seberg closed this as completed Oct 26, 2023