
ENH: Allow Randint to Broadcast Arguments #6938


Closed
wants to merge 1 commit into from

Conversation

Contributor

@gfyoung gfyoung commented Jan 5, 2016

#6902, take two. Addresses issue #6745.

@gfyoung
Contributor Author

gfyoung commented Jan 5, 2016

  1. I can't seem to build numpy off master anymore because it keeps complaining about not being able to find the "Advapi32" library in any directories (an empty list). Is anyone else having this issue, or is it just me? I scanned the file in my System32 folder and ran sfc /scannow as Administrator, and everything came back clean.

  2. As we are starting fresh from BUG: Broadcast Arguments in Random Integer Generation #6902, I'll pose the question again: should we allow a function call like the one below?

np.random.randint(5, [10, None])
  3. I will be adding tests and updating documentation / release notes later. I'm hoping to get feedback on the overall organization of my current changes for now.

@njsmith
Member

njsmith commented Jan 5, 2016

I can't seem to build numpy off master anymore because it keeps complaining about not being able to find the "Advapi32" library in any directories (an empty list).

No idea, sorry :-( Very few devs use Windows; there's a reasonable chance that you're the only one watching this PR who does. You might try the mailing list; that has a broader readership...

should we allow a function call like [...] np.random.randint(5, [10, None])

No. You should think about the signature as being randint(low=0, high); the fact that None even appears as a possible value is a technical detail of how we hack that argument signature into working in a language that doesn't ordinarily allow optional arguments to precede mandatory arguments.
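A minimal sketch of that signature hack, using the stdlib random module rather than numpy's actual source:

import random

def randint(low, high=None, size=None):
    # high=None is only a sentinel meaning "a single bound was given";
    # reinterpret that bound as `high`, with `low` defaulting to 0.
    if high is None:
        low, high = 0, low
    if size is None:
        return random.randrange(low, high)
    return [random.randrange(low, high) for _ in range(size)]

randint(5)         # one draw from [0, 5)
randint(2, 10, 3)  # three draws from [2, 10)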

try:
    test = int(high)
    high_array = np.array([high])
except TypeError:
    high_array = np.array(high)
Member

I think these should just be high_array = np.array(high) and similarly for low -- wrapping integers into a list before passing to array is neither necessary nor desirable. (I guess in this case it doesn't actually cause any harm, because we already special-cased the scalar/scalar case above, so we know that on this path at least one of the inputs has >1 dimension and an array with shape (1,) will broadcast against any array with >1 dimension. But if we just call high = np.array(high), then not only is the code simpler but this branch becomes correct in general, even if unprotected by the scalar checks above.)
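A quick demonstration of why the plain call suffices, using nothing beyond numpy's standard broadcasting rules:

>>> import numpy as np
>>> np.array(5).shape                 # a scalar becomes a 0-d array
()
>>> np.array([5, 10]).shape
(2,)
>>> np.broadcast(np.array(5), np.array([5, 10])).shape  # 0-d broadcasts
(2,)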

Contributor Author

Ah, that is indeed true. I was a little wary myself about these inner try...except blocks. Thanks for pointing that out. Will make things a lot simpler! :)

@njsmith
Member

njsmith commented Jan 5, 2016

In fact it would probably be a good idea to benchmark the code with and without the scalar special case; I guess it might well be worth it in this case, but in general I am wary of such fast-paths because they add a lot of code and maintenance complexity and don't always help as much as you'd think...

@gfyoung
Contributor Author

gfyoung commented Jan 5, 2016

@njsmith : Regarding np.random.randint(5, [10, None]): sounds good. Just wanted to double check. Regarding my issue building numpy, I sent out an email about it. Hopefully, I can get some help. Google will have to be my guide FTTB.

@gfyoung
Contributor Author

gfyoung commented Jan 5, 2016

True, I did add a ton of code that is quite similar to what was already written in #6910. However, I was hoping to try to keep most of those changes intact FTTB. Also, it seemed consistent with what other functions did like RandomState.hypergeometric. Though if there's a faster and more condensed way of doing this, do let me know!

else:
    array = <ndarray>np.empty(size, np.bool_)
    array_data = <npy_bool *>PyArray_DATA(array)
    multi = <broadcast>PyArray_MultiIterNew(3, <void *>array, <void *>lo, <void *>hi)
Member

Huh, this is not at all how I expected the size argument to interact with the parameter arrays. I expected that if the parameter arrays broadcast to (2, 3), and size=(4, 5), you'd get an output with shape (2, 3, 4, 5).

If this is how random size arguments work in general then I guess there's nothing to be done, but wanted to double check...
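For reference, a quick check of how size interacts with broadcast parameters in the existing methods (hypergeometric, the model being followed here):

>>> import numpy as np
>>> ngood = np.array([[1, 2, 3], [4, 5, 6]])   # shape (2, 3)
>>> np.random.hypergeometric(ngood, 10, 5, size=(2, 3)).shape
(2, 3)
>>> np.random.hypergeometric(ngood, 10, 5, size=(4, 2, 3)).shape
(4, 2, 3)
>>> # size=(2, 3, 4, 5) raises: the parameters do not broadcast against it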

Contributor Author

Oh, interesting. I had not considered that behavior. I was actually trying to follow in the footsteps of RandomState.hypergeometric.

Member

Sorry, never mind about this -- now that I'm at a proper keyboard and can play around, I see that all the random methods use this (IMO suboptimal) interpretation of size=, so randint should do the same for consistency.

(What I find particularly confusing is that it means that size=None and size=() are totally different. OTOH if size acted 'outside' of the core sampling operation, so that the parameters were broadcast against each other to define a single multidimensional "sample", and then size specified how many of these samples to take and tile together, then the default would just be size=(). Also it would be easier to work with. But obviously the advantages here are too small to be worth breaking backcompat or consistency over. It might be worth going through and adding a new kwarg like repeat= that has the desired semantics without breaking compatibility, but it's obviously not super urgent and would be a different PR in any case.)
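A quick illustration of the size=None / size=() asymmetry described above:

>>> import numpy as np
>>> type(np.random.randint(1, 10, size=None))   # a plain Python scalar
<class 'int'>
>>> np.random.randint(1, 10, size=()).shape     # a 0-d array
()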

@njsmith
Member

njsmith commented Jan 5, 2016

Right, so, it may well be that the way you've written this is the best way, even with the code duplication between the scalar and array versions. But what I'm saying is that we should make that decision based on whether or not the code duplication actually gets us some valuable benefit. Probably "it's substantively faster in important cases" -- that's the usual one. OTOH "It lets us avoid touching existing code", and "it makes the code more similar to existing code" aren't very good reasons IMHO. I'm just saying we should check our assumptions while implementing things.

@gfyoung
Contributor Author

gfyoung commented Jan 6, 2016

@njsmith : Fair enough. I'll try to do some benchmarking once I can get these tests fixed.

@everyone : Could someone explain why / how I get the following output? This is why I'm getting test failures at the moment:

>>> import numpy as np
>>> high = np.array([np.iinfo(np.int64).max + 1])
>>> low = np.array([np.iinfo(np.int64).max])
>>> high
array([9223372036854775808], dtype=uint64)
>>> low
array([9223372036854775807], dtype=int64)
>>> high[0] > low[0]
False
>>> high[0] == low[0]
True
>>> high[0] > np.uint64(low[0])
True

FYI, 64-bit Ubuntu VM with Python 2.7.11 (using that FTTB until I can get my Windows issue resolved).

@njsmith
Member

njsmith commented Jan 6, 2016

Yeah, when you do uint64 > int64, it's trying to cast both to a common type, and since there's no integer type whose range is a superset of both uint64 and int64, it goes to float64 instead. And then in float64, 2**63 - 1 isn't exactly representable and gets rounded up to 2**63, so the two values compare equal rather than >.

This is a bit of an ugly mess, and it's not at all clear that the cast-to-float64 thing is the right design (maybe it should just error? I dunno), but that's how it works now.
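The promotion can be checked directly:

>>> import numpy as np
>>> np.promote_types(np.int64, np.uint64)   # no common integer type
dtype('float64')
>>> float(2**63 - 1) == float(2**63)        # both round to 2.0**63
True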


@gfyoung
Contributor Author

gfyoung commented Jan 6, 2016

Oh, okay. Good to know. I'll try to figure out some nice way to work around that FTTB. BTW, just for future reference, where is the code that does all that casting / comparison you just described? I might want to poke around there afterwards.

@gfyoung
Contributor Author

gfyoung commented Jan 6, 2016

Actually, it occurs to me now that this is a bug in randint as currently implemented in master, as demonstrated below:

>>> import numpy as np
>>> low = np.int64(np.iinfo(np.int64).max)
>>> high = np.uint64(np.iinfo(np.int64).max + 1)
>>> np.random.randint(low, high)
Traceback (most recent call last):
...
ValueError: low >= high
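Continuing the session above, one possible workaround: Python ints are arbitrary precision, so converting the bounds before comparing sidesteps the lossy float64 promotion:

>>> int(low) < int(high)   # exact comparison, no float64 promotion
True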

@njsmith
Member

njsmith commented Jan 6, 2016

array > array dispatches to the ndarray comparison operator, which in turn calls the ufunc np.greater. The casting logic is then the standard casting logic that's built into the generic ufunc machinery, somewhat scattered around the umath/ directory.
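That is, the comparison above is literally the np.greater ufunc, and its promotion can be inspected directly:

>>> import numpy as np
>>> a = np.array([2**63 - 1], dtype=np.int64)
>>> b = np.array([2**63], dtype=np.uint64)
>>> np.greater(b, a)          # same operation as b > a
array([False])
>>> np.result_type(a, b)      # the promotion behind the surprise
dtype('float64')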


@gfyoung
Contributor Author

gfyoung commented Jan 6, 2016

@njsmith : Thanks! Has this complication been brought up before? Otherwise, I'll go log it as a potential issue.

@njsmith
Member

njsmith commented Jan 6, 2016

It's certainly been discussed on and off, but possibly only in passing in situations like this... I don't know if there's a central issue for it. It's one of those squirrelly things where it's not even clear whether or not there is a bug, or what the next step is.


multi = <broadcast>PyArray_MultiIterNew(3, <void *>array, <void *>lo, <void *>hi)
if (multi.size != PyArray_SIZE(array)):
    raise ValueError("size is not compatible with inputs")
with nogil:
Member

I think you can avoid duplicating this whole block of code, and put it after the if-else block, simply by making array the third, instead of the first, argument to PyArray_MultiIterNew.

If we modified, which we should, PyArray_MultiIterNew to also take multi-iters in, like np.broadcast does, it may even be possible to make this a little nicer, but that's a whole different story...
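A Python-level sketch of that restructuring (names lo, hi, out are hypothetical, and np.broadcast stands in for PyArray_MultiIterNew): create the output in either branch, then do one broadcast after the if/else with the output as the last argument:

import numpy as np

lo, hi = np.array([1, 2, 3]), np.array(10)
size = None  # or an explicit output shape

if size is None:
    out = np.empty(np.broadcast(lo, hi).shape, dtype=np.bool_)
else:
    out = np.empty(size, dtype=np.bool_)

# one broadcast covers both paths; the Cython code would call
# PyArray_MultiIterNew(3, lo, hi, out) here instead
multi = np.broadcast(lo, hi, out)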

Contributor Author

I'll give that a shot once I can get a working implementation down!

@jaimefrio
Member

I'm also +1 on investigating ways of merging the scalar and array versions into a single one. Having 8 almost identical functions is already a big enough maintainability problem without making it twice as big.

@gfyoung
Contributor Author

gfyoung commented Jan 6, 2016

@everyone: Indeed, I am totally in favor of condensing this code. Once I can get a working implementation down, that's certainly the next step for this PR!

@gfyoung
Contributor Author

gfyoung commented Jan 8, 2016

Okay, tests are passing now! I had to do a major overhaul to condense the code and get it to work properly. @jaimefrio : your suggestion to use np.broadcast actually fixed the weird corner cases the C API for some reason could not handle. In any case, I'm going to do some further condensing later, but it would be great if I could get some more feedback on what has been done now that Travis and Appveyor are happy.

@jaimefrio
Member

It seems your changes to the private functions' signature broke a ton of code that was accessing them directly.

@rkern
Member

rkern commented Jan 8, 2016

I don't think anything outside of numpy.random is accessing them directly. It's just randint().

@gfyoung
Contributor Author

gfyoung commented Jan 8, 2016

Whoops. I forgot to put is_scalar in randint's call to rand_func. Not sure if that will fix all the test failures though. In any case, at least it's a separate commit, so if it all blows up, I'll have something to fall back on. Not sure how necessary the is_scalar parameter is. I could easily determine it within each rand_func, though it seems redundant in light of what I do in randint.

@gfyoung
Contributor Author

gfyoung commented Jan 8, 2016

Massive segfaulting on Travis - resetting to try again

@gfyoung
Contributor Author

gfyoung commented Jan 9, 2016

So I've tried condensing the code, but whatever I've done, it's certainly slowed down Travis. At least no seg-faulting though! Besides perhaps reverting to C API calls OR creating a fast-track for the purely scalar inputs, does anyone have suggestions for speeding the code up?

@gfyoung
Contributor Author

gfyoung commented Jan 9, 2016

Also, my changes have broken several tests. Suggestions about where my code is going wrong are also welcome (I'm not quite sure what is going wrong at the moment), especially with regards to the repeatability test @charris wrote.

@gfyoung
Contributor Author

gfyoung commented Jan 9, 2016

I just ran some timeit tests, and it does seem like the fast-track (@njsmith) is indeed much faster. I ran python -m timeit -n 100 "import numpy as np;np.random.randint(0, 20, size=1000000)" and got the following:

master branch:
100 loops, best of 3: 27.4 msec per loop

My branch (with scalar / array_like computations all condensed into one method):
100 loops, best of 3: 356 msec per loop

Thus, I think I will roll back my most recent commit but port over the changes I made to the documentation as well as the tests, if that's okay with everyone.

@jaimefrio
Member

I seem to recall this was discussed somewhere else, but cannot find exactly where... Say I have a low array of shape (3,), and a high array of shape (4,), and I want to get 5 samples from each of the 3 * 4 = 12 possible combinations. Should I call it like this:

np.random.randint(low[:, np.newaxis], high, (5,)) # 5 instead of (5,) should also work

Or do I have to do

np.random.randint(low[:, np.newaxis], high, (3, 4, 5))
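For the record, with the size semantics discussed above (as eventually implemented in a recent numpy), neither spelling works as written; the number of samples has to extend the broadcast shape on the left:

>>> import numpy as np
>>> low = np.arange(3)         # shape (3,)
>>> high = np.arange(10, 14)   # shape (4,)
>>> np.random.randint(low[:, np.newaxis], high, size=(5, 3, 4)).shape
(5, 3, 4)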

@homu
Contributor

homu commented Apr 29, 2017

☔ The latest upstream changes (presumably #9015) made this pull request unmergeable. Please resolve the merge conflicts.

@charris
Member

charris commented May 7, 2017

I think this will work as is, but don't want to put it in without modification. The reason is that I think it can be made more efficient for the broadcasting case, and, once it goes in, we will need to keep it pretty much the same in order to maintain backward compatibility of the generated sequences. The improvements I have in mind are:

  1. Treat the case where low and high have size one as the scalar case, as long as the shapes are compatible. That would be more efficient for the array case and also ensure that the generated sequence would be the same in those cases, probably the result of least surprise.
  2. Put the loop for the non-scalar case into the called C function. I think that will require new rk_* functions. In the true broadcasting case, we might also want to pass the masks as well as the rng and off so that the masks do not need regeneration each time through the inner loop, although I suppose that could be done inside the function itself if storage for the masks is passed. One could then loop through the inner rng loop until the output is filled. Note that, because random integers are acquired from the rng in 32 bit chunks with left over bits discarded, putting the loop in the called function instead of using multiple function calls will change the generated sequence.
  3. The generated sequences should be checked in a few broadcasting cases to help ensure that they remain the same (see the sketch after this list).
  4. The lack of an easy full range option is a bit bothersome. Not sure what to do about that, either live with it or what.
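A minimal sketch of the stream-stability check suggested in point 3, assuming the broadcasting behavior this PR adds; in a real test the reference draws would be captured once and hard-coded:

import numpy as np

def sample():
    np.random.seed(1234)
    return np.random.randint([1, 1, 1], [10, 20, 30], size=(2, 3))

# Capture the reference output once; any later change to how the rng is
# consumed in the broadcasting path will make this comparison fail.
expected = sample()
assert np.array_equal(sample(), expected)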

I also have a quick question: what determines the integer type, the dtype or the low/high type?

@gfyoung
Contributor Author

gfyoung commented May 7, 2017

@charris : To answer your question about what determines the integer type: it's dtype (you implemented that yourself IINM). As to the other questions, I'll address your points one at a time:

  1. I'm not sure I follow you here. Could you clarify what you mean by this?

  2. Okay, I think I understand. I can take a stab at transferring that for-loop code into C.

  3. I'm not sure I fully understand this statement. Could you clarify what you're suggesting? It seems like you're suggesting an additional test in which we check the consistency of numbers generated under different broadcasting conditions, but I'm not sure.

  4. I in fact actually made an attempt to do this back in ENH: Make 'low' optional in randint #7151 but received major push-back for it.

@eric-wieser
Member

eric-wieser commented May 7, 2017

I in fact actually made an attempt to do this back in #7151 but received major push-back for it.

That PR is referring to [intmin, intmax), whereas I think @charris is concerned about [intmin, intmax] or [intmin, intmax+1), which you address for only the scalar case in #8846. There doesn't seem to be an easy way to extend this to the broadcast case.

@gfyoung
Copy link
Contributor Author

gfyoung commented May 7, 2017

@eric-wieser : Fair enough, but if I were to modify the PR to just set highbnd = intmax + 1, the patch is essentially the same. The question is whether or not such a change is desirable.

@charris charris modified the milestones: 1.14.0 release, 1.13.0 release May 7, 2017
@charris
Member

charris commented May 7, 2017

@gfyoung For the size one case, I think low=[1], high=[3] should produce the same sequence as low=1, high=3 when the output shape is the same and compatible. For the scalar case the loop filling the output is in the rk_* function, which will produce a different sequence than if the rk_* function is called multiple times to fill the output array.

@gfyoung
Contributor Author

gfyoung commented May 7, 2017

Okay...but we do want to differentiate how we return the data (scalar or array-like), which is why we treat [1] and [3] differently in the first place. I suppose then we would have to check if we had array-like inputs to determine the return type?

Also, I'm not sure I fully follow your concern here about sequence generation:

>>> import numpy as np
>>> randint = np.random.randint
>>>
>>> np.random.seed(123456789)
>>> for _ in range(10):
...     print(randint(1, 500))
185
157
307
474
347
286
493
264
251
229
>>>
>>> np.random.seed(123456789)
>>> for _ in range(10):
...     print(randint([1], [500]))
array([185])
array([157])
array([307])
array([474])
array([347])
array([286])
array([493])
array([264])
array([251])
array([229])

@eric-wieser
Member

I'm not sure I fully follow your concern here about sequence generation:

Can you try that with other dtypes?

@gfyoung
Contributor Author

gfyoung commented May 7, 2017

@eric-wieser : See this script I wrote: generate.txt

Feel free to change LOW, HIGH, and DTYPE as you please. I didn't see any differences.

@charris
Member

charris commented May 7, 2017

The options I see for dealing with full range bounds are to add a closed option or to require the user to use a type of sufficient range.

@gfyoung
Contributor Author

gfyoung commented May 7, 2017

The options I see for dealing with full range bounds are to add a closed option or to require the user to use a type of sufficient range.

Is there any reason why just specifying dtype (with no bounds) can't allow us to generate the full range? It's not necessary to add closed because you can explicitly pass in the full range [intmin, intmax + 1), and this just adds unnecessary args to the signature IMO.
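Concretely, the explicit full-range spelling already works today, since high is exclusive; the question is whether a bare dtype should imply it:

>>> import numpy as np
>>> info = np.iinfo(np.int8)
>>> np.random.randint(info.min, info.max + 1, size=5, dtype=np.int8).dtype
dtype('int8')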

@charris
Member

charris commented May 7, 2017

You need to use a smaller dtype. To illustrate the difference between single and multiple calls:

In [1]: randint = np.random.randint

In [2]: np.random.seed(123456789)

In [3]: a = randint(1, 20, 10, dtype=np.int8)

In [4]: np.random.seed(123456789)

In [5]: b = np.array([randint(1, 20, dtype=np.int8) for _ in range(10)])

In [6]: a
Out[6]: array([ 8,  9, 15, 19, 14,  9,  4, 18, 19,  3], dtype=int8)

In [7]: b
Out[7]: array([ 8, 15, 19,  4, 18,  4, 13,  8, 15,  5], dtype=int8)

Note that only the first elements agree, because both cases start with the same 32 bit int, but in the first case each 32 bit int supplies 4 bytes, whereas in the second case each 32 bit int supplies only one.

@gfyoung
Contributor Author

gfyoung commented May 7, 2017

@charris : What you're showing is not the same invocation as before (in fact, you can see this on master IINM). Perhaps this for-loop inconsistency should be patched before this PR?

@charris
Member

charris commented May 7, 2017

Is there any reason why just specifying dtype (with no bounds) can't allow us to generate the full range?

That would work for the scalar case, but not be backwards compatible. It is the broadcast case that makes things difficult because of the conversion to arrays.

@gfyoung
Contributor Author

gfyoung commented May 7, 2017

That would work for the scalar case, but not be backwards compatible.

I'm not sure I follow why that is the case. We're allowing users to not have to pass in a parameter that used to be required.

@homu
Contributor

homu commented May 9, 2017

☔ The latest upstream changes (presumably #9026) made this pull request unmergeable. Please resolve the merge conflicts.

@homu
Contributor

homu commented May 18, 2017

☔ The latest upstream changes (presumably #9106) made this pull request unmergeable. Please resolve the merge conflicts.

Adds functionality for randint to broadcast
arguments, regardless of dtype. Also automates
randint helper generation to make this section
of the codebase more manageable.

Closes gh-6745.
@gfyoung
Contributor Author

gfyoung commented Jul 19, 2017

I was hoping to be able to return to this at some point, but I've gotten too caught up in other (repository) work. Closing for now. If others would like to provide assistance to push this through or would like to pick it up themselves, they are more than welcome!

For reference, here is the comment from which anyone should start to see where the PR needs to go.

@gfyoung gfyoung closed this Jul 19, 2017
@gfyoung gfyoung deleted the rand_int_arg_broadcast branch July 19, 2017 08:16
bashtage added a commit to bashtage/numpy that referenced this pull request Aug 31, 2017
Remove old rand int helpers
Add tests for array versions
Remove mask generator

xref numpy#6938
xref numpy#6902
xref numpy#6745