Skip to content

invert_permutation() function #9880

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
shoyer opened this issue Oct 17, 2017 · 5 comments
Open

invert_permutation() function #9880

shoyer opened this issue Oct 17, 2017 · 5 comments

Comments

@shoyer
Copy link
Member

shoyer commented Oct 17, 2017

The "obvious" way to get the rank of elements in a NumPy array is to use array.argsort().argsort():
https://stackoverflow.com/questions/5284646/rank-items-in-an-array-using-python-numpy

This is obviously slightly inefficient, due to sorting the array twice. In contrast, the optimal solution simply inverts the order of the array elements for the second argsort(), which can be done in a single pass using indexing assignment:

array = numpy.array([4,2,7,1])
temp = array.argsort()
ranks = numpy.empty(len(array), int)
ranks[temp] = numpy.arange(len(array))

I'd like to propose adding an invert_permutation() or inverse_permutation() helper function to take care of this logic, which could be modeled off this helper function in xarray:
https://github.com/pydata/xarray/blob/2949558b75a65404a500a237ec54834fd6946d07/xarray/core/nputils.py#L38-L55

As precedent, note that TensorFlow has such a function, too: https://www.tensorflow.org/versions/r1.0/api_docs/python/tf/invert_permutation

@eric-wieser
Copy link
Member

eric-wieser commented Oct 18, 2017

What about supporting the argsort(axis=...) equivalent? put_along_axis from #8714 would make that straightforward to implement:

def inverse_permutation(indices, axis=0):
    # I don't think axis == None has a useful interpretation unless indices.ndim == 1
    axis = np.core.multiarray.normalize_axis_index(axis, indices.ndim)
    perm = np.empty(indices.shape, dtype=np.intp)
    values_shape = [1] * indices.ndim
    values_shape[axis] = -1
    values = np.arange(indices.shape[axis], dtype=np.intp).reshape(values_shape)
    np.put_along_axis(perm, indices, values, axis=axis)
    return perm

@bashtage
Copy link
Contributor

pandas has a similar function named rank, although it is 1-based. It also returns floats since ties can generate non-integer values.

import pandas as pd
pd.DataFrame([[3],[2],[9],[-1],[0],[11],[-3]]).rank()
Out[3]: 
     0
0  5.0
1  4.0
2  6.0
3  2.0
4  3.0
5  7.0
6  1.0

@eric-wieser
Copy link
Member

Updated my comment above with the axis-generic implementation

@eric-wieser
Copy link
Member

Would lib.index_tricks be a sensible home for such a function?

@shoyer
Copy link
Member Author

shoyer commented Sep 6, 2018

Would lib.index_tricks be a sensible home for such a function?

Yes, probably -- do we expose this module as public API?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants