-
-
Notifications
You must be signed in to change notification settings - Fork 10.8k
ENH: Make 'low' optional in randint #7151
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Can someone take a look at this? |
@@ -34,11 +34,11 @@ class Randint(Benchmark): | |||
|
|||
def time_randint_fast(self): | |||
"""Compare to uint32 below""" | |||
np.random.randint(0, 2**30, size=10**5) | |||
np.random.randint(low=0, high=2**30, size=10**5) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you should keep the invocation as it was before to ensure backwards compatibility. The same goes for all the other benchmarks and tests you modified in a similar way, especially the ones that just use the function but don't explicitly test it, such as test_multiarray.py, test_mem_overlap.py, etc. If necessary, add a couple more to use the new functionality instead of changing the old ones.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I follow what backwards compatibility you are trying to enforce here. I see no reason why tests should continue to use a "dated" API.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The API is not dated since I am going to guess that most existing code calls this function as np.random.randint(somenumber)
and np.random.randint(somenumber, someothernumber)
. Just because the new version is a useful convenience does not mean that the old one is invalid or even less desirable. Removing the old calling convention from so many of the tests is just asking for trouble later down the line.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The API would become dated if this PR is merged because low
no longer is a positional argument. While it may be worth adding a test to ensure the non-keyword argument call still works, I don't think it should persist in the code-base with this change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
changing non-test code is imo ok, but when changing test code you need to make sure that no coverage of the old still supported way of calling it is lost. It is not obvious if that has been done.
Please add new test cases for the old way of calling the function if you have removed them.
I may have overlooked it but I also see no tests for the new None
cases
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@juliantaylor : The tests are here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm also not very enthusuastic about changing all the old unit tests to use the new keywords.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMHO unit tests should always use the most up-to-date versions of the API for clarity and consistency. That is why I put one test in this PR for backwards compatibility in which I call the function using the "old" way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think guaranteeing that old functionality hasn't broken is more important than checking that new functionality works. I favor virtually never changing old unit tests, unless there is an intentional backwards incompatible change.
I would be OK with you leaving the old unit tests alone, but duplicating some of them and adding keywords in the duplicates, if your new unit tests below don't cover those cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I specifically wrote a test here to address that concern. Maybe it should be expanded, but I don't follow why you would want to have all of your other tests testing the old functionality.
One other thing that bothers me that case #2 has non-intuitive behavior (the old behavior) compared to how the rest of the function now works. At the very least there should be a very emphatic note in the docs saying that the previous behavior is a special case for the new function, preserved for backwards compatibility. Alternatively, it may be better to introduce a wrapper for |
@@ -3680,7 +3680,7 @@ def luf(lamdaexpr, *args, **kwargs): | |||
|
|||
Examples | |||
-------- | |||
>>> x = np.random.randint(9, size=(3, 3)) | |||
>>> x = np.random.randint(low=9, size=(3, 3)) | |||
>>> x |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure your comment quite applies here because this isn't even documentation for np.randint
where you are commenting.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed. Removed.
I think I will go for the documentation emphasis on backwards incompatibility. I think the wrapper would add unnecessary complexity to already complicated function. |
Travis and Appveyor are both happy, so should be good to merge then. |
Can someone with merging capabilities take a look at this? I think it is ready to go. |
This is a good idea, and the behavior looks right. I agree with some of the comments above though, where I replied. |
where `lowbnd` is defined as previously. If only `high` is None, | ||
then if `low` > 0, results are from [0, `low`). This behaviour is | ||
intended to maintain **backwards compatibility**. Otherwise, results | ||
are from [`low`, `highbnd`), where `highbnd` is defined as previously. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with @madphysicist here that instead of saying "This behaviour is intended to maintain backwards compatibility", I would prefer to say something like "This is so that randint(n) will give a random integer from 0 to n".
I think most/many uses of randint are still going to use positional arguments, even after this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The latter explanation is confusing IMO: it seems very out of place when you compare it to the case when low=None
and high != None
. Backwards compatibility is the sole reason for me leaving the behavior as it is. Otherwise, I would have changed it.
Probably many will still use randint
with positional arguments, but nevertheless people should be encouraged to adapt to the new API since it technically isn't breaking. I'm not sure I follow your reasoning as to why function calls in the code-base should continue to follow positional arguments when using =
is a lot cleaner and clearer just because of ingrained user habits.
Added functionality to accept a combination of 'low' and 'high' parameter values such as (None, None) or (None, <number>). The (None, None) option is different from np.empty because it doesn't fill an array with 'Py_None' values.
Just a question: have you bounced this off the mailing list? If so, would you mind posting the link here for reference? This is a big change, even if it does nominally maintain backwards compatibility. While I am personally not a huge fan of this change, but I do not think that should stop it from happening if the consensus is that people would like to see it done. |
@madphysicist : There is an email out now, and the link is here |
I appreciate that. I think that if you get a lot of positive response, it will help convince the devs to merge your changes. For future reference, here is the archive link: https://mail.scipy.org/pipermail/numpy-discussion/2016-February/075018.html |
@madphysicist : Well certainly a positive response from the general mailing list couldn't hurt! :) |
Alas, after posting to the email list, there didn't seem to be much support for this, so closing. |
Added functionality to accept a combination of
low
andhigh
parameter values such as(None, None)
or(None, <number>)
. The(None, None)
option is different fromnp.empty
because it doesn't fill an array with justPy_None
values. Such options are useful for example if you want to generate numbers for various givendtype
arguments but don't want to have to keep passing inlow
every time.The resulting behavior is then defined as follows:
low == None
andhigh == None
Numbers are generated over the range
[lowbnd, highbnd)
, wherelowbnd = np.iinfo(dtype).min
, andhighbnd = np.iinfo(dtype).max
, wheredtype
is the provided integral type.low != None
andhigh == None
If
low >= 0
, numbers are still generated over the range[0, low)
, but iflow
< 0, numbersare generated over the range
[low, highbnd)
, wherehighbnd
is defined as above.low == None
andhigh != None
Numbers are generated over the range
[lowbnd, high)
, wherelowbnd
is defined as above.