
[RFC] Create a benchmarking repository #3000


Open
DimitrisJim opened this issue Sep 1, 2021 · 2 comments
Labels
C-discussion A discussion RFC Request for comments

Comments

@DimitrisJim
Member

Summary

Move the current code for benchmarking into a dedicated repository and replace it with a slimmer, faster to execute suite.

Detailed Explanation

Currently, running the benchmarks is dreadfully slow and doesn't really allow us to easily check for differences between branches.

The issue isn't the execution of the tests so much as compiling with lto enabled. Though this seems like the correct approach (the system CPython against which the comparison is made should be compiled with lto too), it currently makes the already quite time-consuming compilation even longer. In general, lto should be used for the final comparison against CPython but not when quickly comparing two versions of RustPython (or, at least, it should be optional).
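As a sketch of how that opt-out could look (this assumes lto is currently set in the release profile; the extra profile name is made up for illustration), Cargo's custom profiles would let lto stay on for the final CPython comparison while quick runs skip it:

```toml
# Keep lto for the final, release-grade comparison against CPython.
[profile.release]
lto = true

# Illustrative extra profile for quick branch-vs-branch runs;
# custom profiles require Cargo 1.57+.
[profile.bench-quick]
inherits = "release"
lto = false
```

A quick build would then use `cargo build --profile bench-quick` and skip the lto pass.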

In addition to this, the current benchmarking code is hard-coded to always bench against CPython. Again, though this is the correct approach, we really don't need that information when trying to improve the performance of RustPython relative to itself. What's needed is a quicker and easier way to answer the question 'Are my changes making things faster?'

A separate repository for benchmarking isn't a new concept: CPython [1], PyPy [2], and Nuitka [3] all have one (and they attempt to share a uniform set of benchmarks). So this would just follow the nice precedent they've set. How the benchmark results are presented is a separate question, and an issue for it already exists.

The benches directory can then contain benchmarks (micro and normal) that only evaluate RustPython. Criterion still sounds like the best option since it also appears to support profiling.
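As a purely illustrative sketch of what one such file could look like (the file name and function are assumptions, not existing code), a microbenchmark would be a plain Python file that exercises one narrow operation, so regressions are easy to localize:

```python
# Hypothetical benches/micro/dict_insert.py: any Python implementation
# can run it; a harness (e.g. criterion) does the actual timing.
def bench_dict_insert(n=10_000):
    d = {}
    for i in range(n):
        d[i] = i
    return len(d)

if __name__ == "__main__":
    # Smoke check when run directly; timing is the harness's job.
    assert bench_dict_insert() == 10_000
```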

Drawbacks and Alternatives

More work; the alternative is not changing anything and leaving benchmarking a slight pain to run.

Unresolved Questions

If moved, should the repository use criterion? Though it does the job and the feedback it presents is useful, we could use Python libraries for benchmarking/visualization. There's currently test_benchmarks.py, which uses pytest-benchmark, but I'm not sure how rigorous its measurements are. PyPerformance is probably the end goal, but RustPython doesn't seem to be able to run it quite yet.

@DimitrisJim DimitrisJim added RFC Request for comments C-discussion A discussion labels Sep 1, 2021
@mireille-raad
Member

I love the idea of benchmarking RustPython against itself and making the process fast/easy to use. As you said, it will be really good to be able to answer the question 'Are my changes making things faster?'

From what you explained, separating things seems to be the right approach. I don't have a strong opinion, so I leave it to others to discuss/decide.

You mentioned in the drawbacks that it is a pain because it is more work. I can help, especially if doing this once means that, over time, benchmarks are faster and it is easier to improve RustPython. IMHO, it is better to deal with something once than to have a smaller amount of pain inflicted over time in batches.

For presenting the benchmark info, I imagine that we would want to build a command-line tool that tells someone whether their changes make RustPython faster, and by how much.

For a web UI, there is https://rustpython.github.io/benchmarks -- I commented/updated that issue and submitted a PR to add the link to the homepage.
We can either add a header that lets you choose what to compare (CPython or different versions of RustPython), or we could make a new page that shows RustPython performance over time; we could do something with a timeline or animated images.

@DimitrisJim
Member Author

So I've been recently working on this and I think we should be able to get pyperformance to work by tweaking it a bit. This has a number of nice perks:

  • It's the official suite of Python benchmarks.
  • It has built-in commands for running/comparing different branches/implementations.
  • It has the ability to run benchmarks in an isolated environment so as to minimize external noise.

We're almost there: pyperf (what pyperformance essentially wraps) is a couple of issues away from being workable on RustPython; I believe it's mostly some time-module-related things I need to add.

So this thankfully answers the 'running the benchmarks' part; a README with the appropriate commands would take care of that.
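For reference, those commands would look roughly like this (a sketch based on pyperformance's documented workflow; the RustPython binary path is illustrative and exact flags may vary between versions):

```
# Run the suite under each interpreter, saving results as JSON.
pyperformance run --python=./target/release/rustpython -o rustpython.json
pyperformance run --python=python3 -o cpython.json

# Compare the two result files.
pyperformance compare cpython.json rustpython.json
```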

As for presenting the results: the long-term goal is having RustPython results presented on speed.python.org along with other implementations (currently only CPython and PyPy, if I'm not mistaken). Until we get there, presenting the benchmarks on our github.io is definitely the way to go.

> We can either add a header that lets you choose what to compare (CPython or versions of RustPython) or we could do a new page to show RustPython performance over time, we could do something with timeline or animated images.

Exactly my thoughts, it's also what speed.python.org currently does to some extent.
