Skip to content

BUG: Large overhead in some random functions #15460

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
seberg opened this issue Jan 28, 2020 · 6 comments · Fixed by #15511
Closed

BUG: Large overhead in some random functions #15460

seberg opened this issue Jan 28, 2020 · 6 comments · Fixed by #15511

Comments

@seberg
Copy link
Member

seberg commented Jan 28, 2020

This issue is extracted from gitter (by @andyfaff – I saw it only randomly, so creating this before we forget).

Some random generator functions such as random() have a huge overhead compared to their legacy RandomState versions:

rs = np.random.RandomState()
rg = np.random.default_rng()

%timeit rs.random()                                       
# 432 ns ± 9.18 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

%timeit rg.random()                                       
# 5.19 µs ± 61.6 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

The reason for this is the support of the dtype= keyword argument. A secondary reason (and maybe second speed issue) is that np.dtype.name is a very slow operations (it could plausibly be cached).

However, the solution here will be to simply avoid the whole np.dtype(dtype).name construct as much as possible and maybe adding a fastpath for the case when dtype is not used. np.dtype(dtype).type is np.float64 may be a solution, or dtype is np.float64 or np.dtype(dtype) is np.float64 to speed up the default branch.


Checking that we have benchmarks – or adding simple small array ones in a first commit – for this would be good when this is fixed.

@eric-wieser
Copy link
Member

avoid the whole np.dtype(dtype).name construct as much as possible

Agreed, hopefully the slowness is enough to motivate people not to use it ;)

@andyfaff
Copy link
Member

Is a similar speedup possible for Generator.uniform, as well as Generator.random?. It seems that they're probably going to be doing exactly the same, except the former does the optional scaling.

>>> %timeit rs.random()
308 ns ± 7.4 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

>>> %timeit rg.uniform()
2.11 µs ± 73.4 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

>>> %timeit rg.random()
3.37 µs ± 85.8 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

We're only talking a couple of microseconds, but it's worth it if it's an easy win.

@seberg
Copy link
Member Author

seberg commented Jan 29, 2020

I consider it a speed bug, since the added overhead is a small factor larger than a typical (fairly simple) python function call (and almost a factor of 10 of what it should be).
This is probably a fairly straight forward issue to solve, so PRs welcome.

@przemb
Copy link
Contributor

przemb commented Feb 4, 2020

I will try to speed it up ;) WIP

@mattip
Copy link
Member

mattip commented Feb 4, 2020

@przemb you may find the comments starting here from gh-15463 useful

@przemb
Copy link
Contributor

przemb commented Feb 4, 2020

@mattip Thank you so much :) I tried to follow mentioned comments, in case of any potential improvements please let me know.
PR: #15511

przemb added a commit to przemb/numpy that referenced this issue Feb 5, 2020
slow calls to np.dtype.name replaced with np.dtype,
mtrand.pyx and _generator.pyx updated,
test test_warns_byteorder updated

before:
%timeit rs.random(): 520 ns ± 33.1 ns per loop
%timeit rg.random(): 6.36 µs ± 222 ns per loop

after:
%timeit rs.random(): 453 ns ± 6.95 ns per loop
%timeit rg.random(): 594 ns ± 9.66 ns per loop
charris pushed a commit to charris/numpy that referenced this issue Mar 1, 2020
slow calls to np.dtype.name replaced with np.dtype,
mtrand.pyx and _generator.pyx updated,
test test_warns_byteorder updated

before:
%timeit rs.random(): 520 ns ± 33.1 ns per loop
%timeit rg.random(): 6.36 µs ± 222 ns per loop

after:
%timeit rs.random(): 453 ns ± 6.95 ns per loop
%timeit rg.random(): 594 ns ± 9.66 ns per loop
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants