
Refactor benchmarking suite to use Criterion and add microbenchmarks #2367


Merged · 16 commits · Dec 14, 2020

Conversation

@orf (Contributor) commented Dec 11, 2020

This PR refactors the benchmark suite to use Criterion (https://docs.rs/criterion/0.3.3/criterion/).

I've completely rewritten the benchmarks, which I don't think were running correctly before. Now every file inside benchmarks/ is both parsed and executed by rustpython and cpython, and the results are compared. Adding a new benchmark is as simple as adding a file to that directory, and any exceptions raised while executing the code are displayed on the console rather than just causing a panic.

I've also added an extra benchmark that runs pystone three times, each with a different step count, under both cpython and rustpython and compares the results.
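To make that layout concrete, here's a rough sketch of what a per-file comparison group could look like with Criterion. The `run_with_cpython` and `run_with_rustpython` helpers are placeholders standing in for "execute this source under the respective interpreter"; they are not the actual functions from this PR.

```rust
use criterion::{criterion_group, criterion_main, Criterion};
use std::fs;

// Placeholder helpers: in the real suite these execute `source` under
// CPython and RustPython respectively.
fn run_with_cpython(source: &str) {
    let _ = source.len();
}

fn run_with_rustpython(source: &str) {
    let _ = source.len();
}

fn benchmark_files(c: &mut Criterion) {
    // Every file dropped into benchmarks/ becomes its own comparison group.
    for entry in fs::read_dir("benchmarks").unwrap() {
        let path = entry.unwrap().path();
        let name = path.file_name().unwrap().to_string_lossy().into_owned();
        let source = fs::read_to_string(&path).unwrap();

        let mut group = c.benchmark_group(name);
        group.bench_function("cpython", |b| b.iter(|| run_with_cpython(&source)));
        group.bench_function("rustpython", |b| b.iter(|| run_with_rustpython(&source)));
        group.finish();
    }
}

criterion_group!(benches, benchmark_files);
criterion_main!(benches);
```

Because both measurements live in the same group, Criterion's report shows the cpython and rustpython timings for each file side by side.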

I added the ability to run microbenchmarks quickly and easily. For example, to benchmark context-manager exceptions you could add:

```python
from contextlib import contextmanager

@contextmanager
def try_catch(*args, **kwargs):
    try:
        yield
    except RuntimeError:
        pass

# ---

with try_catch():
    raise RuntimeError()
```

Only the part after the `# ---` marker will be benchmarked. Here are the results:

[Screenshot: microbenchmark results, 2020-12-13]
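For reference, here's a minimal sketch of how the `# ---` split could be handled in the harness. The `run_python` helper is again a placeholder for executing Python source, and the real code in this PR may do this differently.

```rust
use criterion::Criterion;
use std::fs;

// Placeholder: stands in for executing `source`; in a real harness the setup
// and the timed part would share interpreter state.
fn run_python(source: &str) {
    let _ = source.len();
}

fn bench_microbenchmark(c: &mut Criterion, path: &str) {
    let file = fs::read_to_string(path).unwrap();

    // Everything before the "# ---" marker is setup; only what follows it is timed.
    let (setup, timed) = match file.split_once("# ---") {
        Some((setup, timed)) => (setup.to_string(), timed.to_string()),
        None => (String::new(), file.clone()),
    };

    run_python(&setup); // run the setup code outside the measurement
    c.bench_function(path, |b| b.iter(|| run_python(&timed)));
}
```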

We could then also use this GitHub Action to run them as part of the CI steps on pull requests.

@orf (Contributor, Author) commented Dec 11, 2020

Pystone comparison:

[Screenshot: pystone comparison]

Parsing:

[Screenshot: parsing benchmark results]

File comparisons:

[Screenshots: per-file benchmark comparisons]

And the full report:

criterion.zip

@orf changed the title from "Refactor benchmarking suite to use Criterion" to "Refactor benchmarking suite to use Criterion and add microbenchmarks" on Dec 13, 2020
@coolreader18 (Member) left a comment

Wow, this is really great!! Definitely much cleaner than our old setup. I pushed a commit to still allow running pystone outside the context of the benchmarker; otherwise, I think this is good to go!

@coolreader18 (Member) commented

The main thing I use benchmarks for is determining whether a change I make is actually faster, and by how much. Usually, I just `cargo build --release` on master and move it to `benches/rustpython-norm`, then I do the same on my feature branch and move it to `benches/rustpython-opt`, and then I run something along the lines of `hyperfine -L X norm,opt './rustpython-{X} benchmarks/pystone.py' -w2`. Do you know if criterion has anything for that, benchmarking 2 versions of the program against each other?

@orf (Contributor, Author) commented Dec 13, 2020

Thank you 😊! I wasn't really sure what to add to the micro-benchmarks, so I kind of just guessed, but I figured we can easily add more in the future.

The benchmarks are not perfect right now - the cpython side includes parsing overhead while the rustpython side does not. If you're happy with that, we could merge this and work on fixing it later; I don't think the overhead is that significant.

@orf (Contributor, Author) commented Dec 13, 2020

> The main thing I use benchmarks for is determining whether a change I make is actually faster, and by how much. Usually, I just `cargo build --release` on master and move it to `benches/rustpython-norm`, then I do the same on my feature branch and move it to `benches/rustpython-opt`, and then I run something along the lines of `hyperfine -L X norm,opt './rustpython-{X} benchmarks/pystone.py' -w2`. Do you know if criterion has anything for that, benchmarking 2 versions of the program against each other?

It works similarly to the standard bench library. You'd run the benchmarks on master, then again on your branch, and it would tell you the changes in throughput or latency between the two. Any subsequent changes you make would also be benchmarked.

Honestly, I'm not sure how it would work with your specific workflow, but I do think it's more extensible and provides a more intuitive way of exploring results than the standard bench library.

Plus it doesn't need nightly, so... 🎉🎉

@coolreader18 (Member) commented

Yeah, it definitely sucks that there's no easy API in the cpython crate to compile and then run a code object. Maybe we could pass it as a local variable to a script that runs exec(code), or maybe we could just use the lower-level APIs from cpython.
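A minimal sketch of that first idea with the `cpython` crate, assuming the benchmark source is available as a string (the names here are illustrative, not code from this PR):

```rust
use cpython::{PyDict, Python};

fn time_exec_only(source: &str) {
    let gil = Python::acquire_gil();
    let py = gil.python();

    let locals = PyDict::new(py);
    locals.set_item(py, "src", source).unwrap();

    // Compile once, outside the timed region, using Python's builtin compile().
    py.run("code = compile(src, '<benchmark>', 'exec')", None, Some(&locals))
        .unwrap();

    // Inside a Criterion iteration this would be the only timed call.
    py.run("exec(code)", None, Some(&locals)).unwrap();
}
```

Keeping the compile step out of the timed loop would also remove the parsing-overhead asymmetry mentioned above.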

Yeah, I think criterion is definitely better than the nightly benching APIs, but I wasn't able to use those for this workflow and I probably won't be able to use criterion for it either. Although, maybe I could compile with `cargo bench --no-run` and then compare those binaries' outputs.

@coolreader18 merged commit c9a332c into RustPython:master on Dec 14, 2020
@coolreader18 (Member) commented

Thanks for contributing!

@mireille-raad (Member) commented

If you want, I can create a web page on rustpython.github.io called "benchmarks".

If you can automate the benchmark to run and spit out some images into a directory in https://github.com/RustPython/rustpython.github.io, then it would be trivial to show them on a web page.

Is this possible/neat, or unnecessary (and demotivating)?

@youknowone (Member) commented

@mireille-raad I think that would be a fun feature, so one upvote. But it can also be demotivating :p

@coolreader18 (Member) commented

I also think that would be a good idea; it would be a good way of tracking the performance improvements we make over time. In retrospect, it would've been nice to collect data on the perf changes I've been making over the last couple of months, but hindsight is 20/20 🙃

@mireille-raad (Member) commented

Well, let's do this then :) I will start a new issue :D
