ENH: Make 'low' optional in randint #7151

gfyoung · 2016-01-30T20:52:03Z

Added functionality to accept a combination of low and high parameter values such as
(None, None) or (None, <number>). The (None, None) option is different from np.empty because it doesn't fill an array with just Py_None values. Such options are useful for example if you want to generate numbers for various given dtype arguments but don't want to have to keep passing in low every time.

The resulting behavior is then defined as follows:

low == None and high == None

Numbers are generated over the range [lowbnd, highbnd), where lowbnd = np.iinfo(dtype).min, and highbnd = np.iinfo(dtype).max, where dtype is the provided integral type.

low != None and high == None

If low >= 0, numbers are still generated over the range [0, low), but if low < 0, numbers
are generated over the range [low, highbnd), where highbnd is defined as above.

low == None and high != None

Numbers are generated over the range [lowbnd, high), where lowbnd is defined as above.

gfyoung · 2016-02-04T00:20:10Z

Can someone take a look at this?

madphysicist · 2016-02-04T17:06:06Z

benchmarks/benchmarks/bench_random.py

@@ -34,11 +34,11 @@ class Randint(Benchmark):

    def time_randint_fast(self):
        """Compare to uint32 below"""
-        np.random.randint(0, 2**30, size=10**5)
+        np.random.randint(low=0, high=2**30, size=10**5)


I think you should keep the invocation as it was before to ensure backwards compatibility. The same goes for all the other benchmarks and tests you modified in a similar way, especially the ones that just use the function but don't explicitly test it, such as test_multiarray.py, test_mem_overlap.py, etc. If necessary, add a couple more to use the new functionality instead of changing the old ones.

I'm not sure I follow what backwards compatibility you are trying to enforce here. I see no reason why tests should continue to use a "dated" API.

The API is not dated since I am going to guess that most existing code calls this function as np.random.randint(somenumber) and np.random.randint(somenumber, someothernumber). Just because the new version is a useful convenience does not mean that the old one is invalid or even less desirable. Removing the old calling convention from so many of the tests is just asking for trouble later down the line.

The API would become dated if this PR is merged because low no longer is a positional argument. While it may be worth adding a test to ensure the non-keyword argument call still works, I don't think it should persist in the code-base with this change.

changing non-test code is imo ok, but when changing test code you need to make sure that no coverage of the old still supported way of calling it is lost. It is not obvious if that has been done.
Please add new test cases for the old way of calling the function if you have removed them.

I may have overlooked it but I also see no tests for the new None cases

@juliantaylor : The tests are here

I'm also not very enthusuastic about changing all the old unit tests to use the new keywords.

IMHO unit tests should always use the most up-to-date versions of the API for clarity and consistency. That is why I put one test in this PR for backwards compatibility in which I call the function using the "old" way.

I think guaranteeing that old functionality hasn't broken is more important than checking that new functionality works. I favor virtually never changing old unit tests, unless there is an intentional backwards incompatible change.

I would be OK with you leaving the old unit tests alone, but duplicating some of them and adding keywords in the duplicates, if your new unit tests below don't cover those cases.

I specifically wrote a test here to address that concern. Maybe it should be expanded, but I don't follow why you would want to have all of your other tests testing the old functionality.

madphysicist · 2016-02-04T18:49:42Z

low != None and high == None

If low >= 0, numbers are still generated over the range [0, low), but if low < 0, numbers
are generated over the range [low, highbnd), where highbnd is defined as above.

One other thing that bothers me that case #2 has non-intuitive behavior (the old behavior) compared to how the rest of the function now works. At the very least there should be a very emphatic note in the docs saying that the previous behavior is a special case for the new function, preserved for backwards compatibility.

Alternatively, it may be better to introduce a wrapper for randint that has completely consistent behavior for all the argument combinations. That is basically what your code is doing anyway. Instead of munging the parameters directly in randint, why not keep the original function as-is and have a consistent wrapper that knows what to pass in to it?

gfyoung · 2016-02-04T19:31:08Z

numpy/add_newdocs.py

@@ -3680,7 +3680,7 @@ def luf(lamdaexpr, *args, **kwargs):

    Examples
    --------
-    >>> x = np.random.randint(9, size=(3, 3))
+    >>> x = np.random.randint(low=9, size=(3, 3))
    >>> x


I'm not sure your comment quite applies here because this isn't even documentation for np.randint where you are commenting.

Agreed. Removed.

gfyoung · 2016-02-04T22:59:08Z

I think I will go for the documentation emphasis on backwards incompatibility. I think the wrapper would add unnecessary complexity to already complicated function.

gfyoung · 2016-02-10T20:14:07Z

Added test for old non-keyword behavior
Expanded documentation and examples for randint
Added low = None and high < 0 test for completeness

Travis and Appveyor are both happy, so should be good to merge then.

gfyoung · 2016-02-17T09:15:32Z

Can someone with merging capabilities take a look at this? I think it is ready to go.

ahaldane · 2016-02-17T14:23:27Z

This is a good idea, and the behavior looks right.

I agree with some of the comments above though, where I replied.

ahaldane · 2016-02-17T14:33:52Z

numpy/random/mtrand/mtrand.pyx

+        where `lowbnd` is defined as previously. If only `high` is None,
+        then if `low` > 0, results are from [0, `low`). This behaviour is
+        intended to maintain **backwards compatibility**. Otherwise, results
+        are from [`low`, `highbnd`), where `highbnd` is defined as previously.


I agree with @madphysicist here that instead of saying "This behaviour is intended to maintain backwards compatibility", I would prefer to say something like "This is so that randint(n) will give a random integer from 0 to n".

I think most/many uses of randint are still going to use positional arguments, even after this PR.

The latter explanation is confusing IMO: it seems very out of place when you compare it to the case when low=None and high != None. Backwards compatibility is the sole reason for me leaving the behavior as it is. Otherwise, I would have changed it.

Probably many will still use randint with positional arguments, but nevertheless people should be encouraged to adapt to the new API since it technically isn't breaking. I'm not sure I follow your reasoning as to why function calls in the code-base should continue to follow positional arguments when using = is a lot cleaner and clearer just because of ingrained user habits.

Added functionality to accept a combination of 'low' and 'high' parameter values such as (None, None) or (None, <number>). The (None, None) option is different from np.empty because it doesn't fill an array with 'Py_None' values.

madphysicist · 2016-02-17T14:55:29Z

Just a question: have you bounced this off the mailing list? If so, would you mind posting the link here for reference? This is a big change, even if it does nominally maintain backwards compatibility. While I am personally not a huge fan of this change, but I do not think that should stop it from happening if the consensus is that people would like to see it done.

gfyoung · 2016-02-17T15:01:51Z

@madphysicist : There is an email out now, and the link is here

madphysicist · 2016-02-17T15:08:46Z

I appreciate that. I think that if you get a lot of positive response, it will help convince the devs to merge your changes.

For future reference, here is the archive link: https://mail.scipy.org/pipermail/numpy-discussion/2016-February/075018.html

gfyoung · 2016-02-17T15:09:52Z

@madphysicist : Well certainly a positive response from the general mailing list couldn't hurt! :)

gfyoung · 2016-02-17T21:53:52Z

Alas, after posting to the email list, there didn't seem to be much support for this, so closing.

charris added 01 - Enhancement component: numpy.random labels Jan 31, 2016

madphysicist reviewed Feb 4, 2016
View reviewed changes

gfyoung reviewed Feb 4, 2016
View reviewed changes

ahaldane reviewed Feb 17, 2016
View reviewed changes

ENH: Made low optional in randint

527c354

Added functionality to accept a combination of 'low' and 'high' parameter values such as (None, None) or (None, <number>). The (None, None) option is different from np.empty because it doesn't fill an array with 'Py_None' values.

gfyoung closed this Feb 17, 2016

gfyoung deleted the randint_no_low branch February 17, 2016 21:53

gfyoung mentioned this pull request May 7, 2017

ENH: Allow Randint to Broadcast Arguments #6938

Closed

Uh oh!

ENH: Make 'low' optional in randint #7151

ENH: Make 'low' optional in randint #7151

Uh oh!

Conversation

gfyoung commented Jan 30, 2016

Uh oh!

gfyoung commented Feb 4, 2016

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

madphysicist commented Feb 4, 2016

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

gfyoung commented Feb 4, 2016

Uh oh!

gfyoung commented Feb 10, 2016

Uh oh!

gfyoung commented Feb 17, 2016

Uh oh!

ahaldane commented Feb 17, 2016

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

madphysicist commented Feb 17, 2016

Uh oh!

gfyoung commented Feb 17, 2016

Uh oh!

madphysicist commented Feb 17, 2016

Uh oh!

gfyoung commented Feb 17, 2016

Uh oh!

gfyoung commented Feb 17, 2016

Uh oh!

Uh oh!