BENCH: consistently test benchmarks (specifically argmax/argmin) #20785

Open
mattip opened this issue Jan 11, 2022 · 3 comments

Comments

@mattip
Member

mattip commented Jan 11, 2022

In PR #17989, changes to unrelated code significantly changed the argmax/argmin benchmark results. We should be more consistent about benchmarking PRs so we have a baseline. Perhaps the benchmark itself is unstable, or perhaps this PR somehow changed compiler branching. It would be nice to set up daily/weekly/per-PR benchmark runs and compare them before merging PRs.
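As a first sanity check (a minimal sketch; the array size and repeat counts are arbitrary assumptions, not our benchmark configuration), one could time `np.argmax` repeatedly on a fixed array and look at the spread, to separate benchmark noise from a real regression:

```python
import statistics
import timeit

import numpy as np

# Time np.argmax on the same array many times and report the spread.
# A large stdev here would point at the benchmark itself being unstable,
# independently of any code change under test.
a = np.random.default_rng(0).random(1_000_000)
runs = [timeit.timeit(lambda: np.argmax(a), number=100) for _ in range(20)]
print(f"min={min(runs):.4f}s  median={statistics.median(runs):.4f}s  "
      f"max={max(runs):.4f}s  stdev={statistics.stdev(runs):.4f}s")
```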

For this particular problem, we should try different dtypes and benchmark orderings to see what is causing the change in argmax/argmin.
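For example, an asv-style benchmark parametrized over dtype and data ordering could look like the sketch below (the dtypes, array size, and orderings are guesses at what might matter, not an agreed-on benchmark). The position of the extremum changes how often the running maximum gets updated inside the scan loop, which can affect branch prediction:

```python
import numpy as np


class ArgMaxMin:
    # Sketch of an asv benchmark parametrized over dtype and data ordering.
    params = [
        ["float32", "float64", "int64"],
        ["random", "ascending", "max-first", "max-last"],
    ]
    param_names = ["dtype", "order"]

    def setup(self, dtype, order):
        data = np.random.default_rng(1234).random(1_000_000)
        if order == "ascending":
            # running maximum is updated on every element
            data = np.sort(data)
        elif order == "max-first":
            # running maximum is updated once, then never again
            data[0] = 2.0
        elif order == "max-last":
            data[-1] = 2.0
        # (for int64 the values collapse to 0s and a single 2,
        # which still exercises the scan loop)
        self.a = data.astype(dtype)

    def time_argmax(self, dtype, order):
        np.argmax(self.a)

    def time_argmin(self, dtype, order):
        np.argmin(self.a)
```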

@mattip
Member Author

mattip commented Jan 11, 2022

I thought this might be a duplicate, but I could not find an existing issue about integrating asv more tightly (the benchmarks seem to have last run in 2020) or codespeed (used by CPython and PyPy). Related issues: #15987 and #15992

@seberg
Member

seberg commented Jan 20, 2022

After the recent discussion on the mailing list, we should probably set some mallopt parameters when running benchmarks, just to remove allocator behavior from the possible causes of fluctuation.
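For instance (a glibc-only sketch; the 16 MiB threshold is an arbitrary assumption), the benchmark harness could pin the mmap threshold via ctypes so malloc does not silently switch allocation strategies between runs:

```python
import ctypes
import ctypes.util

# Parameter code from glibc's <malloc.h>.
M_MMAP_THRESHOLD = -3

libc = ctypes.CDLL(ctypes.util.find_library("c"))
# Pin the threshold above which malloc uses mmap for large blocks.
# By default glibc adjusts this threshold dynamically as blocks are
# freed, so otherwise-identical runs can take different allocation
# paths; fixing it removes one source of run-to-run fluctuation.
libc.mallopt(M_MMAP_THRESHOLD, 16 * 1024 * 1024)
```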

Having the benchmarks run regularly/automatically again would indeed be nice. And if those random fluctuations go away, even "regression notifications" could be nice to have again.
(I have no big interest in pushing this, but I would be open to ideas such as applying for a small-development grant to get a good system up and running.)

@Illviljan
Contributor

Inspired by https://labs.quansight.org/blog/2021/08/github-actions-benchmarks/, xarray has added a workflow that you can probably copy/paste pretty easily; see pydata/xarray#5796.
