Description
In PR #17989, changes to unrelated code significantly changed the argmax/argmin benchmark results. We should benchmark PRs more consistently so that we have a baseline to compare against. Perhaps the benchmark itself is unstable, or perhaps this PR somehow changed the compiler's branching. It would be nice to set up daily, weekly, or per-PR benchmark runs and compare the results before merging PRs.
For this particular problem, we should try different dtypes and benchmark ordering to see what is causing the change in argmax/argmin.
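As a starting point for that experiment, here is a minimal sketch of a standalone timing script, not the project's actual benchmark suite. The helper names `bench_case` and `run_matrix`, the array size, and the repeat counts are all assumptions chosen for illustration. It times `np.argmax` and `np.argmin` for several dtypes and visits the cases in a shuffled order, so that an ordering effect would show up as different timings between seeds.

```python
import random
import timeit

import numpy as np


def bench_case(func, dtype, size=100_000, repeat=5, number=20):
    """Return the best per-call time in seconds for func on an array of dtype."""
    rng = np.random.default_rng(0)
    # Values in [0, 100) are representable in every dtype used below.
    arr = rng.integers(0, 100, size=size).astype(dtype)
    timings = timeit.repeat(lambda: func(arr), repeat=repeat, number=number)
    # The minimum is the least noisy estimate of the achievable time.
    return min(timings) / number


def run_matrix(dtypes, funcs, seed=None, **kwargs):
    """Time every (func, dtype) pair, visiting the pairs in a shuffled order."""
    cases = [(f, dt) for f in funcs for dt in dtypes]
    random.Random(seed).shuffle(cases)  # hypothetical way to vary the ordering
    return {
        (f.__name__, np.dtype(dt).name): bench_case(f, dt, **kwargs)
        for f, dt in cases
    }


if __name__ == "__main__":
    dtypes = [np.bool_, np.int32, np.int64, np.float32, np.float64]
    # Two different seeds give two different execution orders to compare.
    for seed in (0, 1):
        results = run_matrix(dtypes, [np.argmax, np.argmin], seed=seed)
        print(f"ordering seed={seed}")
        for (name, dtype_name), seconds in sorted(results.items()):
            print(f"  {name:7s} {dtype_name:8s} {seconds * 1e6:9.2f} us")
```

If the timings for a given dtype differ noticeably between the two orderings on the same build, that points to benchmark instability rather than to the PR. If they are stable across orderings but differ between builds before and after the PR, the code change is the more likely cause.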