Skip to content

MNT: more benchmark cleanup #26639

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jun 11, 2024
Merged

Conversation

ngoldbaum
Copy link
Member

In the vein of #26638, this makes more of the benchmarks reproducible. It also deletes some dead code.

@ngoldbaum ngoldbaum force-pushed the benchmark-investigation branch from 8d56a60 to 5e0dded Compare June 7, 2024 03:49
@@ -13,7 +13,8 @@ class MeshGrid(Benchmark):
timeout = 10

def setup(self, size, ndims, ind, ndtype):
self.grid_dims = [(np.random.ranf(size)).astype(ndtype) for
rnd = np.random.RandomState(1864768776)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we are already touching these, should we use the recommended

rng = np.random.default_rng(1864768776)
rng.random(size, dtype)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My reading of NEP 19 is that we're supposed to use np.random.RandomState for stable, reproducible RNG sequences across numpy verisons, which is why I used it explicitly in this PR. I guess it's been quite a while since NEP 19 - is that no longer the best recommendation?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ngoldbaum you're correct here. The testing use case (which applies to benchmarking too) for RandomState hasn't changed.

@charris
Copy link
Member

charris commented Jun 7, 2024

RandomState has the advantage that the bit stream is guaranteed to stay the same, we don't make that promise for the new generators, so I think it is still the right choice for testing. Hmm, I wonder if that stability guarantee has implications for the threading work?

@mattip
Copy link
Member

mattip commented Jun 9, 2024

Hmm. Not this PR, but maybe we should document the use-cases explicitly:

For stable benchmarking, use np.random.RandomState since it is guaranteed to be stable across versions. When the goal is to obtain close-to-theoretical random distribution of values, then `np.random.Generator should be used.

array_class = array_type[0]
self.arr = getattr(SortGenerator, array_class)(self.ARRAY_SIZE, dtype, *array_type[1:])
self.arr = getattr(SortGenerator, array_class)(self.ARRAY_SIZE, dtype, *array_type[1:], rnd)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The linter is annoyed at this line

Copy link
Member

@rgommers rgommers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All LGTM, let's get this in. Thanks @ngoldbaum & reviewers.

Linter complaint is irrelevant, so ignoring. @mattip's suggestion to document the RandomState use case better in a follow-up sounds like a good idea to me.

@rgommers rgommers merged commit db63ee8 into numpy:main Jun 11, 2024
63 of 68 checks passed
@rgommers rgommers added this to the 2.1.0 release milestone Jun 11, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants