[proof of concept] Implement parts of the core in Python bytecode #5025
This is an interesting concept. Did you happen to measure performance differences? |
No, didn't get to that yet. I guess it wouldn't be too hard to do it for |
I did some benchmarking, running a simple loop like this:

    def test():
        x = range(1000)
        for _ in range(300):
            sum(x)

Running on a PYBD-SF2 @ 120MHz I get:
So bytecode is half the speed of the existing C implementation. Then I used mpy-cross to compile the Python implementation of sum to native Python code, and put that in instead of the frozen bytecode. The results for the benchmark of this native Python vs C were:
That's not too bad, the native code generator is close to the C version! See latest commit for the native code blob, for both x86-64 and Thumb2. The size of these blobs is a bit bigger than the C version, but with a bit of optimisation in the native emitter these blobs could become a bit smaller and faster, possibly getting close to C.

There is a lot of scope here for further work. As shown, it is possible to reimplement parts of the core in Python, which is compiled to either bytecode (small but slow) or native machine code (fast but bigger, and potentially on par with the existing C implementation).

A benefit of writing in Python rather than C is that the underlying architecture of the system (of MicroPython, the VM, etc) is hidden. For example, whether exceptions are implemented using NLR or simple return codes is irrelevant to the Python implementation of a function (eg

This is very meta and gets into the territory of PyPy (MicroPyPy!), where the interpreter is written in itself (RPython, a reduced dialect of Python). |
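The comparison described in this comment could be reproduced with a small timing harness along the following lines. This is only a sketch under assumptions: `py_sum`, `py_sum_native`, `bench` and `time_ms` are hypothetical names, and the `@micropython.native` decorator is used here as a stand-in for the approach in the PR, which compiled the function with mpy-cross and embedded the resulting blob. On CPython the decorator falls back to a no-op, so all variants run as ordinary Python there.

```python
import time

try:
    import micropython  # only available when running on MicroPython
    native = micropython.native
except ImportError:
    def native(func):
        # No-op stand-in so the sketch also runs on CPython.
        return func


def py_sum(iterable, start=0):
    # Pure Python sum, executed as bytecode.
    for item in iterable:
        start = start + item
    return start


@native
def py_sum_native(iterable, start=0):
    # Same logic, compiled by the native emitter when on MicroPython.
    for item in iterable:
        start = start + item
    return start


def bench(sum_func, loops=300):
    # Same shape as the loop in the comment: sum 1000 integers, 300 times.
    x = range(1000)
    for _ in range(loops):
        sum_func(x)


def time_ms(func, *args):
    # Prefer MicroPython's wrap-safe tick counters when they exist.
    if hasattr(time, "ticks_ms"):
        t0 = time.ticks_ms()
        func(*args)
        return time.ticks_diff(time.ticks_ms(), t0)
    t0 = time.perf_counter()
    func(*args)
    return (time.perf_counter() - t0) * 1000


results = {}
for name, func in (("builtin", sum), ("bytecode", py_sum), ("native", py_sum_native)):
    results[name] = time_ms(bench, func)
    print(name, results[name], "ms")
```

The absolute numbers depend entirely on the board and build, so only the ratios between the three variants are meaningful.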
This is a very cool demonstration Damien. I'd like to see arguments, eventually, to both the make process for building MicroPython, and to mpy-cross to favour speed vs. size, that would set a MACRO that we can use throughout the code base to prefer for example Python vs C, like you have here. Perhaps an option to favour less heap or more heap too? |
Yes, that makes sense, speed vs size (like
It might be tricky to have that option orthogonal to speed-vs-size. In general the code always attempts to reduce heap usage. |
This is an automated heads-up that we've just merged a Pull Request See #13763 A search suggests this PR might apply the STATIC macro to some C code. If it Although this is an automated message, feel free to @-reply to me directly if |
I will close this because it's not really going anywhere. Maybe it can be revisited one day. |
This is a proof-of-concept to show how it's possible to reimplement parts of the MicroPython core in Python bytecode. So far in this PR the builtin sum(iterable, start=0) function is changed to a pure Python implementation and "hand frozen" in to the firmware.

The main reason to do this is to reduce code size while retaining the same functionality. The code size change with this PR is:
That's only a small decrease but this principle would scale to reimplementing a lot of built in functionality (eg string methods) to get a decent reduction in code size.
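For illustration, a pure Python version of sum(iterable, start=0) might look like the sketch below. The name `py_sum` is hypothetical and this is not necessarily the exact source that was frozen in the PR; it only shows how little Python is needed to match the builtin's behaviour for ordinary inputs.

```python
def py_sum(iterable, start=0):
    # Accumulate left to right, beginning from the optional start value.
    total = start
    for item in iterable:
        total = total + item
    return total


# Quick equivalence checks against the builtin.
print(py_sum(range(1000)) == sum(range(1000)))
print(py_sum([1.5, 2.5], 1) == sum([1.5, 2.5], 1))
print(py_sum([]) == sum([]))
```

Unlike the C builtin, this sketch does no argument validation of its own; it relies on the VM to raise if the argument is not iterable.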
Some points to note:
sum() will crash the VM and needs a small check added to fix this (would add a bit of code but only once)
sum() was generated with mpy-cross and can actually be optimised down by one byte
sum() includes a default argument