py/persistentcode: Only store a single instance of source_file in mpy. #4472

Closed

dpgeorge
Member

@dpgeorge dpgeorge commented Feb 7, 2019

This is a relatively simple change to only store a single instance of the source file name in a .mpy file. It assumes that all code blocks (classes, functions, lambdas, etc) in a single .mpy file come from the same .py source file, which I think is a fair assumption (and is the case for the output of mpy-cross).

No extra RAM is needed to save or load the new .mpy files. Loading should also be faster, because the source file name is read only once, at the start of loading the .mpy.

On average it reduces .mpy file size by about 5%.
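
To illustrate where the saving comes from, here is a minimal sketch in Python. It uses a made-up toy layout (length-prefixed strings), not the real .mpy format, and all function names are hypothetical. It compares repeating the source file name in every code block with writing it once at the start of the file.

```python
import struct


def encode_bytes(data):
    # Toy encoding (an assumption, not the real .mpy encoding):
    # a 16-bit little-endian length followed by the raw bytes.
    return struct.pack("<H", len(data)) + data


def save_per_block(source_file, blocks):
    # Toy "old" layout: every code block repeats the source file name.
    out = bytearray()
    for name, bytecode in blocks:
        out += encode_bytes(source_file.encode("utf-8"))
        out += encode_bytes(name.encode("utf-8"))
        out += encode_bytes(bytecode)
    return bytes(out)


def save_single(source_file, blocks):
    # Toy "new" layout: the source file name is written once, up front.
    out = bytearray(encode_bytes(source_file.encode("utf-8")))
    for name, bytecode in blocks:
        out += encode_bytes(name.encode("utf-8"))
        out += encode_bytes(bytecode)
    return bytes(out)


def load_single(data):
    # Read the source file name once, then attach it to every block.
    pos = 0

    def read_bytes():
        nonlocal pos
        (n,) = struct.unpack_from("<H", data, pos)
        pos += 2
        chunk = data[pos:pos + n]
        pos += n
        return chunk

    source_file = read_bytes().decode("utf-8")
    blocks = []
    while pos < len(data):
        name = read_bytes().decode("utf-8")
        bytecode = read_bytes()
        blocks.append((source_file, name, bytecode))
    return blocks
```

In this toy layout the saving is one encoded file name per code block after the first, so files with many small functions benefit most.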

@dpgeorge
Member Author

dpgeorge commented Feb 7, 2019

For background see #3054

@pfalcon
Contributor

pfalcon commented Feb 7, 2019

> It assumes that all code blocks (classes, functions, lambdas, etc) in a single .mpy file come from the same .py source file, which I think is a fair assumption (and is the case for the output of mpy-cross).

Well, one obvious optimization would be to allow merging several .mpy files into one. But to get real benefits from that, the format would have to allow sharing, e.g., the same constant table among different functions, i.e. the serialization format would need to change anyway.

If that is the plan, the apparent way to do it is to introduce a bitflag field before each code object, with bits meaning "read a new value" or "use the same as the previous one" for the filename, constant table, etc.
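
A minimal sketch of such a bitflag scheme, in Python. The flag values, the record layout, and the function names are all assumptions made for illustration; nothing here reflects an actual .mpy format. Each code object is modelled as a `(filename, consts, body)` tuple, where `consts` is a tuple of byte strings.

```python
import struct

# Hypothetical flag bits: when a bit is set, a new value follows in the
# stream; when it is clear, the value from the previous code object is reused.
NEW_FILENAME = 0x01
NEW_CONSTS = 0x02


def _pack(data):
    # 16-bit little-endian length prefix followed by the raw bytes.
    return struct.pack("<H", len(data)) + data


def encode(code_objects):
    out = bytearray()
    prev_file = prev_consts = None
    for filename, consts, body in code_objects:
        flags = 0
        if filename != prev_file:
            flags |= NEW_FILENAME
        if consts != prev_consts:
            flags |= NEW_CONSTS
        out.append(flags)
        if flags & NEW_FILENAME:
            out += _pack(filename.encode("utf-8"))
        if flags & NEW_CONSTS:
            out += struct.pack("<H", len(consts))
            for const in consts:
                out += _pack(const)
        out += _pack(body)
        prev_file, prev_consts = filename, consts
    return bytes(out)


def decode(data):
    pos = 0

    def take():
        nonlocal pos
        (n,) = struct.unpack_from("<H", data, pos)
        pos += 2
        chunk = data[pos:pos + n]
        pos += n
        return chunk

    result = []
    cur_file = cur_consts = None
    while pos < len(data):
        flags = data[pos]
        pos += 1
        if flags & NEW_FILENAME:
            cur_file = take().decode("utf-8")
        if flags & NEW_CONSTS:
            (count,) = struct.unpack_from("<H", data, pos)
            pos += 2
            cur_consts = tuple(take() for _ in range(count))
        result.append((cur_file, cur_consts, take()))
    return result
```

With this layout, a run of code objects from the same file that share a constant table costs one flag byte each on top of the body, while a merged file can still switch filename or constant table at any code object.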

@dpgeorge
Member Author

dpgeorge commented Feb 8, 2019

> one obvious optimization would be to allow merging several .mpy files into one

I'm not sure how that would work: a file has meaning to Python import (import x corresponds to x.py) so if multiple files are in one .mpy then it must all be imported as one unit, corresponding to a single filename.

@pfalcon
Contributor

pfalcon commented Feb 8, 2019

> I'm not sure how that would work: a file has meaning to Python import (import x corresponds to x.py) so if multiple files are in one .mpy then it must all be imported as one unit, corresponding to a single filename.

Apparently, that would need to be addressed somehow by whoever implements that optimization, e.g. by adding init code which creates entries in sys.modules manually. Conceptually, it's the same thing that Android's .dex format did to individual Java .class files.
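
A rough sketch of what such init code could look like, written for CPython (it relies on `types.ModuleType`, which a given MicroPython port may not provide). The `register_bundle` name and the bundle layout, a dict mapping module names to source text, are hypothetical; a real merged .mpy would carry compiled code rather than source.

```python
import sys
import types


def register_bundle(bundle):
    # `bundle` maps module names to their source text and is assumed to be
    # listed in dependency order, so a module's imports are already present
    # in sys.modules by the time its code runs.
    for name, source in bundle.items():
        module = types.ModuleType(name)
        module.__file__ = name + ".py"
        # Register the module first so that later `import name` statements
        # find it in sys.modules instead of searching the filesystem.
        sys.modules[name] = module
        exec(compile(source, module.__file__, "exec"), module.__dict__)
```

After `register_bundle(...)` has run, a plain `import` of any bundled name resolves from `sys.modules`, even though no separate file exists for that module.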

@dpgeorge
Member Author

dpgeorge commented Mar 5, 2019

Superseded by #4549

@dpgeorge dpgeorge closed this Mar 5, 2019
@dpgeorge dpgeorge deleted the py-mpy-single-source-file branch March 5, 2019 05:34
2 participants