-
-
Notifications
You must be signed in to change notification settings - Fork 10.8k
ENH: Added atleast_nd #7804
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ENH: Added atleast_nd #7804
Conversation
numpy/core/shape_base.py
Outdated
""" | ||
res = [] | ||
for ary in arys: | ||
ary = asanyarray(ary) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As noted on the mailing list, if we go for this at all, you can just write
ary = array(ary, copy=False, subok=True, ndim=n)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not if I add an option to place the dimensions at the end. I may still work this in without the ndim
argument for that case though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did not use your suggestion literally because zero-dim arrays need special handling, but I did something quite similar, just without the ndmin
parameter (it's not ndim
for array
).
d432669
to
40b299a
Compare
38ca1ce
to
75a9e16
Compare
So yeah... |
bump |
can you implement the lower dimension variants in terms of this one? |
Not |
unless this function is significantly slower I would suggest to implement 1d and 2d in terms of this one for the extra test coverage and less code. |
75a9e16
to
7831134
Compare
Done. Reduced |
7831134
to
5b44313
Compare
☔ The latest upstream changes (presumably #7823) made this pull request unmergeable. Please resolve the merge conflicts. |
numpy/core/shape_base.py
Outdated
------- | ||
res : ndarray | ||
An array with ``a.ndim >= n``. Copies are avoided where | ||
possible, and a view with `n` or more dimensions is returned. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is a copy ever needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not know enough about the internals to know if res = array(ary, copy=False, subok=True)
will ever return a copy under normal circumstances. My guess is no, but I wanted to do some mild CYA in the docs, just in case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It will only make a copy for non numpy arrays (e.g., lists).
On Mon, Aug 29, 2016 at 8:12 AM Joseph Fox-Rabinovitz <
notifications@github.com> wrote:
In numpy/core/shape_base.py
#7804 (comment):
- ary : array_like
The input array. Non-array inputs are converted to arrays.
Arrays that already have `n` or more dimensions are preserved.
- n : scalar
The minimum number of dimensions required for each of the input
arrays.
- pos : {'start', 'end'}, optional
Where to add new dimensions. All new dimensions are added on the
same side of the shape. Default is 'start'.
- Returns
- res : ndarray
An array with `a.ndim >= n`. Copies are avoided where
possible, and a view with `n` or more dimensions is returned.
I do not know enough about the internals to know if res = array(ary,
copy=False, subok=True) will ever return a copy under normal
circumstances. My guess is no, but I wanted to do some mild CYA in the
docs, just in case.—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub
https://github.com/numpy/numpy/pull/7804/files/5b443132ec6c30aef974c73b30b510e6b60e8755#r76622781,
or mute the thread
https://github.com/notifications/unsubscribe-auth/ABKS1p0XcxxYEGc_49tnk41VdMQUHXgoks5qkvSFgaJpZM4JFvBv
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have updated the docstring to reflect that now.
8794a2e
to
0f13b53
Compare
☔ The latest upstream changes (presumably #7922) made this pull request unmergeable. Please resolve the merge conflicts. |
0f13b53
to
62b5724
Compare
see issue 8214 |
numpy/core/shape_base.py
Outdated
Arrays that already have `n` or more dimensions are preserved. | ||
n : scalar | ||
The minimum number of dimensions required for each of the input | ||
arrays. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there's only ever 1 array, for each of the input arrays doesn't seem to make sense
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good call. Fixed now.
This looks pretty solid to me. Did we ever discuss it on the mailing list? Two suggestions:
|
Mailing list thread start: https://mail.scipy.org/pipermail/numpy-discussion/2016-July/075759.html. There was no solid consensus to merge, but no strong objections either if I remember correctly. |
I think that suggestion 2 is adequately covered in the notes section already. I added another entry for suggestion 1. |
A couple of comments on the API:
|
47933b7
to
b7ed12d
Compare
@shoyer. I've squashed in all the changes you requested. |
numpy/core/shape_base.py
Outdated
The minimum number of dimensions required. | ||
pos : int, optional | ||
The index to insert the new dimensions. May range from | ||
``-ary.ndim`` to ``+ary.ndim``. Non-negative indices indicate |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not true - the lower bound is -ary.ndim - 1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the catch. I had a feeling at the back of my head that the indices should not be the same, but was in too much of a hurry to place it. Fixed now.
I agree
Can you give a concrete example of where you'd expect this? As you note, for the examples you've given so far, it doesn't matter. Given that we've come up with no convincing use cases for intermediate values, perhaps we should just allow only |
b7ed12d
to
d84ebd3
Compare
@eric-wieser The only use-case I can think of is something I did a while back where I needed to do a dot product over the last two dimensions in a 2D image that contained essentially small 2D matrices for each pixel, which would eventually be converted to color values (so a 4D array). After all the axis rolling, it turned out that the best way to broadcast to the particular RGB format I was interested in was to insert two unit axes at index 2. I honestly can't remember the exact details, and I'm sure there is a better way to do it, but I think it's better to split the existing dimensions rather than the new ones. I would also like to leave the ability to insert into an arbitrary index, not just 0 and -1. |
d84ebd3
to
d00e0b6
Compare
I forgot to remove the |
d00e0b6
to
849afcf
Compare
Rebased. Is everyone more or less happy with the state of this PR? |
Say I have a 1D array with shape (N,), and I want to interpret it as an image so I can pass it to my image processing tools that all expect 3D arrays where the first two dimensions are spatial and the last are channels. To ensure that inputs are in correct formats I would use atleast_nd with n=3 and pos=-1. In this case I would expect the new shape to be (N, 1, 1). A similar case might arise when working with 4D tensor libraries like pytorch. I don't think its an entirely common case, but I think it would exist and in that rare case I would expect new axis to be grouped. I think the key idea that illustrates my expectation is what happens when there are 3 new axes? Will there be two at the front and one at the end or the other way. It seems ambiguous to allow for any other case than grouped axes.
This is my preference, but I'm not attached to it. I just think its conceptually easier to understand document and use. It prevents users from shooting themselves in the foot. |
@Erotemic While I agree with you that grouping the new axes rather than the existing ones is by far the more sensible approach, I prefer to allow positions other than 0 and -1. For the work that I do, the color planes are stored separately, so it is often easier to have the channel number as the first rather than the last dimension. In that case it would be very helpful to insert the unit dimension into index -3. |
c74febd
to
c8a0caa
Compare
Bump? |
Function only accepts one array and an indexed placement spec. Masked arrays fully supported. atleast_1d and atleast_2d re-implemented in terms of new func. Tests included.
c8a0caa
to
8720da5
Compare
@eric-wieser Want to revisit this? You seemed the most critical. @madphysicist Was this discussed on the list at some point? |
@charris. At some length, but a very long time ago: https://mail.python.org/pipermail/numpy-discussion/2016-July/075722.html |
I only skimmed the discussions, but: As syntactic sugar, I think it's a good addition. I was just looking at code that would use this: The Suggestion for this PR: Optimize for the most common case where |
[[[[1, 2, 3]]]] | ||
""" | ||
ary = array(ary, copy=False, subok=True) | ||
if ary.ndim: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why this if?
Why not move the pos =
line into the if extra >0
block?
pos = normalize_axis_index(pos, ary.ndim + 1) | ||
extra = ndim - ary.ndim | ||
if extra > 0: | ||
ind = pos * (slice(None),) + extra * (None,) + (Ellipsis,) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This code reminds me it would be nice to have an axis
argument to put
: see issue #8765
This addition looks desirable. @madphysicist I'm going to close this, but feel free to rebase and resubmit. That will bring it closer to the head of the list :) |
Introduced generalized
atleast_nd
function. Similar toatleast_1d
,atleast_2d
,atleast_3d
. Also added for masked arrays. Tests included.This PR also fixes an issue with masked arrays that would crash the
atleast_*
functions if multiple scalars were input, e.g.np.ma.atleast_1d(1.0, 2.0, 3.0)
would raise an error.