
[RFC] Create a benchmarking repository #3000


Open
DimitrisJim opened this issue Sep 1, 2021 · 2 comments
Labels
C-discussion A discussion RFC Request for comments

Comments

@DimitrisJim
Member

Summary

Move the current code for benchmarking into a dedicated repository and replace it with a slimmer, faster to execute suite.

Detailed Explanation

Currently, running the benchmarks is dreadfully slow and doesn't really allow us to easily check for differences between branches.

The issue isn't the execution of the tests so much as compiling with lto enabled. Though this seems like the correct approach (the system CPython against which the comparison is made should be compiled with lto too), it currently makes the already quite time-consuming compilation even longer. In general, lto should be used for the final comparison against CPython but not when quickly comparing two versions of RustPython (or, at least, it should be optional).
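As a sketch of how that opt-out could look (this assumes lto is currently set in the release profile; the extra profile name is made up for illustration), Cargo's custom profiles would let lto stay on for the final CPython comparison while quick runs skip it:

```toml
# Keep lto for the final, release-grade comparison against CPython.
[profile.release]
lto = true

# Illustrative extra profile for quick branch-vs-branch runs;
# custom profiles require Cargo 1.57+.
[profile.bench-quick]
inherits = "release"
lto = false
```

A quick build would then use `cargo build --profile bench-quick` and skip the lto pass.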

In addition to this, the current benchmarking code is hard-coded to always bench against CPython. Again, though this is the correct approach, we really don't need that information when trying to improve the performance of RustPython relative to itself. What's needed is a quicker and easier way to answer the question 'Are my changes making things faster?'

A separate repository for benchmarking isn't a new concept: CPython [1], PyPy [2], and Nuitka [3] all have one (and they attempt to share a uniform set of benchmarks). So this would just follow the nice precedent they've set. How the benchmark results are presented is a separate question, and an issue for it already exists.

The benches directory can then contain benchmarks (micro and normal) that only evaluate RustPython. Criterion still sounds like the best option since it also appears to support profiling.
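As a purely illustrative sketch of what one such file could look like (the file name and function are assumptions, not existing code), a microbenchmark would be a plain Python file that exercises one narrow operation, so regressions are easy to localize:

```python
# Hypothetical benches/micro/dict_insert.py: any Python implementation
# can run it; a harness (e.g. criterion) does the actual timing.
def bench_dict_insert(n=10_000):
    d = {}
    for i in range(n):
        d[i] = i
    return len(d)

if __name__ == "__main__":
    # Smoke check when run directly; timing is the harness's job.
    assert bench_dict_insert() == 10_000
```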

Drawbacks and Alternatives

More work; the alternative is not changing anything and leaving benchmarking a slight pain to run.

Unresolved Questions

If moved, should the repository use criterion? Though it does the job and the feedback it presents is useful, we could use Python libraries for benchmarking/visualization. There's currently test_benchmarks.py, which uses pytest-benchmark, but I'm not sure how rigorous its measurements are. PyPerformance is probably the end goal, but RustPython doesn't seem to be able to run it quite yet.

@DimitrisJim DimitrisJim added RFC Request for comments C-discussion A discussion labels Sep 1, 2021
@mireille-raad
Member

I love the idea of benchmarking RustPython against itself and making the process fast/easy to use. As you said, it will be really good to be able to answer the question 'Are my changes making things faster?'

From what you explained, separating things seems to be the right approach. I don't have a strong opinion, so I leave it to others to discuss/decide.

You mentioned in the drawbacks that it is a pain because it is more work. I can help, especially if doing this once means that, over time, benchmarks are faster and it is easier to improve RustPython. IMHO, it is better to deal with something once than to have a smaller amount of pain inflicted over time in batches.

For presenting the benchmark info, I imagine that we would want to build a command-line tool that tells someone whether their changes make RustPython faster, and by how much.

For a web UI, there is https://rustpython.github.io/benchmarks -- I commented/updated that issue and submitted a PR to add the link to the homepage.
We can either add a header that lets you choose what to compare (CPython or different versions of RustPython), or we could make a new page that shows RustPython performance over time; we could do something with a timeline or animated images.

@DimitrisJim
Member Author

So I've been recently working on this and I think we should be able to get pyperformance to work by tweaking it a bit. This has a number of nice perks:

  • It's the official suite of Python benchmarks.
  • It has built-in commands for running/comparing different branches/implementations.
  • It has the ability to run benchmarks in an isolated environment so as to minimize external noise.

We're almost there: pyperf (what pyperformance essentially wraps) is a couple of issues away from being workable on RustPython; I believe it's mostly some time-module-related things I need to add.

So this thankfully answers the 'running the benchmarks' part; a README with the appropriate commands would take care of that.
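For reference, those commands would look roughly like this (a sketch based on pyperformance's documented workflow; the RustPython binary path is illustrative and exact flags may vary between versions):

```
# Run the suite under each interpreter, saving results as JSON.
pyperformance run --python=./target/release/rustpython -o rustpython.json
pyperformance run --python=python3 -o cpython.json

# Compare the two result files.
pyperformance compare cpython.json rustpython.json
```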

As for presenting the results: the long-term goal is having RustPython results presented on speed.python.org along with other implementations (currently only CPython and PyPy, if I'm not mistaken). Until we get there, presenting the benchmarks on our github.io is definitely the way to go.

> We can either add a header that lets you choose what to compare (CPython or versions of RustPython) or we could do a new page to show RustPython performance over time, we could do something with timeline or animated images.

Exactly my thoughts, it's also what speed.python.org currently does to some extent.
