Skip to content

ENH: Add dtype parameter to linspace and logspace functions. #3482

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Sep 16, 2013

Conversation

jjhelmus
Copy link
Contributor

Also fixed up PEP8 format in related files

@charris
Copy link
Member

charris commented Jun 28, 2013

So what does this do and why?

@jjhelmus
Copy link
Contributor Author

arange and other NumPy fuctions allows you to define the dtype of the returned type with a dtype parameter. This adds the same functionality to the logspace and linspace functions. You can get the same results with

linspace(0, 10).astype('float32') but I think linspace(0, 10, dtype='float32') is more in line with other NumPy functions.

@charris
Copy link
Member

charris commented Jun 28, 2013

OK, seems reasonable, but the decription needs to be part of the commit message. The first line would probably look better as

ENH: Add dtype parameter to linspace and logspace functions.

@@ -75,22 +79,23 @@ def linspace(start, stop, num=50, endpoint=True, retstep=False):
"""
num = int(num)
if num <= 0:
return array([], float)
return array([], dtype)
if endpoint:
if num == 1:
return array([float(start)])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

float needs to be replaced here.

@seberg
Copy link
Member

seberg commented Jun 28, 2013

Just to point out. If no dtype is given, we might want to check the dtype of start/stop, i.e. for better complex support (what I mean is, maybe: if dtype is None: np.result_type(start, stop, 0.), or so). The dtype of start/stop can actually still affect the dtype with the code as is, though I am not sure that matters much, since it preserves the floating point precision in any case.

@seberg
Copy link
Member

seberg commented Jun 28, 2013

Sorry, nvm the second point, I missed the y.astype at the end.

@jjhelmus
Copy link
Contributor Author

I would remove the float from the num <= return statement but it would break the TestLinspace:test_type as we would be returning a int type when num <= 0 and a float in other cases.

@seberg
Copy link
Member

seberg commented Jun 28, 2013

If you want to be overly clean, you could use: dtype = np.result_type(start, stop, dtype); np.array([start], dtype=dtype). But I don't quite understand, you should just move the cast into the dtype keyword argument.

if retstep:
return y, step
return y.astype(dtype), step
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After double checking, this seems not quite right. dtype(None) == np.float64 in other words, this casts to float no matter what the input is if None is given, which disagrees with the documentation.

@seberg
Copy link
Member

seberg commented Jun 28, 2013

hah, I am not thining straight sorry. The result type of course is nonsense, unless dtype is None (and in that case it should be result_type(start, stop, 0.).

@jjhelmus
Copy link
Contributor Author

Truthfully the default is to have a float64/float32 return type to maintain backwards compatibility, do we want to keep that or should we break compatibility and have linspace/logspace determine the most appropriate return type. My vote is to use a float return type by default, which means the docs should reflect this.

Also are you at the SciPy sprint? I'm in the skimage room. We could probably work this out much faster with a face-to-face discussion.

@seberg
Copy link
Member

seberg commented Jun 28, 2013

Well, for most cases the return type used to be whatever was calculated. Which is equivalent to what np.return_type(start, stop, 0.) or the dtype of start + stop + 0.. This was not true for the special cases of 0 or 1 elements though.

The issue is quite simple try np.linspace(1j, 4j) it works fine and someone might use it, but I think there is a test missing for this (it will misbehave for the 0 and 1 element special cases, but...). And it would seem that the PR as is would break it.

I am on the wrong continent for face to face discussion :).

@jjhelmus
Copy link
Contributor Author

The code now defaults to result_type(start, stop, float(num)) if the dtype is not provided. This will be float64 unless start or stop is complex. This will still fail if num is complex, but so will the int(num) line.

@seberg
Copy link
Member

seberg commented Jun 29, 2013

OK. I just started wondering whether arr.astype(None) should default to doing nothing!? (It should be a minimal change in the C-code only.) Anyone got an opinion on that?

@njsmith
Copy link
Member

njsmith commented Jun 29, 2013

.astype(None) looks like it should be an error to me, since None is not a
type. Is that what you mean?
On 29 Jun 2013 09:24, "seberg" notifications@github.com wrote:

OK. I just started wondering whether arr.astype(None) should default to
doing nothing!? (It should be a minimal change in the C-code only.) Anyone
got an opinion on that?


Reply to this email directly or view it on GitHubhttps://github.com//pull/3482#issuecomment-20226419
.

@seberg
Copy link
Member

seberg commented Jun 29, 2013

Good point. I see three options:

  1. arr.astype(None) is the same as arr.astype(np.dtype(None)) which is effectively arr.astype(np.float64). I.e. the "default" is np.float64 which is the case currently.
  2. arr.astype(None) should be an error, since it doesn't really have a default.
  3. arr.astype(None) defaults to dtype of arr, so it is like arr.astype(arr.dtype). The only use for that would be in code like this doing return result.astype(dtype, copy=False), since when an input dtype is None we should keep the dtype as it is. But I guess this should not be very typical.

I was first thinking 3., but... Like you said, maybe it should be changed to option 2. with a deprecation process. Especially considering that there are otherwise two possibilities and I don't see that there is one that is "obviously expected".

@charris
Copy link
Member

charris commented Jul 10, 2013

@seberg What is the status of this? I'm a bit worried as to what happens if dtype is not an inexact type. This also need a note in the doc/release/1.8.0-notes.rst if it goes in.

@seberg
Copy link
Member

seberg commented Jul 10, 2013

Hmmm, not sure about dtype being int having any traps, @jjhelmus what do you think? Even if it is a bit shaky though, dtype being not an inexact type is impossible unless you explicitely ask it to be. So I am not sure I would worry about it (who would set it to an inexact type ;)?)

@jjhelmus
Copy link
Contributor Author

@seberg I do not see an issue with int (or other non-inexact) dtype as long as it is only used when specially requested by a user. For example, I could see a user using something like:

np.linspace(1,10,10, dtype='int64')

which would return [1, 2, 3, 4..., 10] as int64. Obviously something like

np.linspace(1,10,11, dtype='int64')

Would truncate the values, but it was requested so I don't see an issue.

I do think adding a check for dtype is None to the logspace function like that in the linspace would be good. Then if the behavior of arr.astype(None) changes it would have no effect.

@seberg
Copy link
Member

seberg commented Aug 5, 2013

Sorry, busy and forgot about this. There is a bit of an issue with the truncation for logspace because casting to integer is done too early (i.e. after/within the linspace, not after the power operation), resulting in a loss of precision. For the linspace step within logspace, we might want to force an inexact type.

As charris noted, it should have a short mention in the release notes, too. Other then that, as far as I remember this should be ready to go in.

@charris
Copy link
Member

charris commented Aug 16, 2013

@jjhelmus Want to add something to the release notes? The 1.8 branch will be made this Sunday.

@charris
Copy link
Member

charris commented Aug 16, 2013

That would be in doc/release/1.8.0-notes.rst.

@jjhelmus
Copy link
Contributor Author

I fixed up the truncation of the logspace by adding the casting after the call to linspace. Also added a note about the new functionality in the release notes. Let me know if I should move this to 1.9.0-notes.rst since 1.8 has been branched. I'll need to rebase or merge with master to do this.

@seberg
Copy link
Member

seberg commented Sep 7, 2013

Hey, sorry was rather busy lately. If you don't want to lobby a lot for it, I would skip into 1.9. (if you do, do it fast ;)), even if it is my fault for holding this up. Doing the logspace like this seems fine to me (I don't want to argue about higher precision floats, since it all seems a bit tricky and probably not worth it, and it is documented what it does).

So, after rebasing I think this can be merged.

Many NumPy functions such as arange allow users to define the dtype of the
returned type with a dtype parameter.  This adds this same functionality
to the logspace and linspace functions.
@jjhelmus
Copy link
Contributor Author

Added in seberg return to logspace when dtype is None, rebased onto numpy/master, squashed commits into a single commit, and moved the documentation of the new functionality to 1.9.0. Should be ready to merge now.

@seberg
Copy link
Member

seberg commented Sep 16, 2013

Thanks, seems like it is time to merge this. :)

seberg added a commit that referenced this pull request Sep 16, 2013
ENH: Add dtype parameter to linspace and logspace functions.
@seberg seberg merged commit 4f1f9d2 into numpy:master Sep 16, 2013
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants