mpremote: Support bytecode raw paste for 'mpremote run module.mpy' #8744

Closed

projectgus
Contributor

@projectgus projectgus commented Jun 9, 2022

For resource-constrained devices without mpremote mount support, it is useful to be able to run precompiled bytecode directly (mpremote run) when developing, rather than having to send source code each time.

Currently this is possible using pyboard.py - the implementation loads some loader Python and injects the bytecode as a variable. This approach reduces RAM usage further by using the "raw paste" window to paste into the same code path that loads and executes .mpy files (thanks Damien for this tip).
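To make the cost of the "injected variable" approach concrete, here is a rough sketch (not the actual pyboard.py loader) of turning bytecode into Python source that assigns it to a variable. The helper name and the variable name `_injected_buf` are illustrative assumptions. The point is only that the device must receive and parse a text literal that is larger than the bytecode itself, which is the overhead the raw paste path avoids.

```python
def injected_variable_source(mpy_bytes: bytes, name: str = "_injected_buf") -> str:
    """Build Python source that recreates mpy_bytes as a bytes literal.

    Hypothetical helper for illustration only; the real loader code also has to
    feed this buffer into the .mpy import machinery on the device.
    """
    return f"{name} = {mpy_bytes!r}\n"


# A made-up six-byte payload containing low control bytes.
sample = bytes([0x4D, 0x06, 0x00, 0x1F, 0x03, 0x04])
source = injected_variable_source(sample)

# Control bytes become multi-character escapes such as \x03, so the text
# sent to the device is several times larger than the binary payload.
print(len(sample), len(source))
```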

Summary of changes

  • Adds a new raw paste command 'B' for 'raw paste bytecode' (raw paste command 'A' is the current/default paste command)
  • Adds an escaping mechanism (<char> is escaped as Ctrl-F <char + 8>) during raw paste so byte values less than 8 can be sent without triggering Ctrl-C or Ctrl-D handlers. The two-byte escape sequence still counts as one byte in the paste window.
  • Adds relevant support to mpremote.py
  • Adds the same "loader Python" approach to mpremote.py that is already in pyboard.py, for devices without raw paste support. It's unclear how useful this is; I think it's only needed when using a newer mpremote.py with older firmware, but it needs no device-side changes and only minimal host changes.
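The escaping rule in the summary above can be sketched on the host side as follows. This is a minimal illustration based only on the description (Ctrl-F followed by the byte value plus 8, for byte values below 8); the function names are made up here and are not taken from the PR's code.

```python
CTRL_F = 0x06       # escape prefix, as described in the summary
ESCAPE_BELOW = 8    # byte values below this are escaped
ESCAPE_OFFSET = 8   # an escaped byte is sent as (value + 8)


def escape_raw_paste(data: bytes) -> bytes:
    """Escape low byte values so Ctrl-C (0x03) and Ctrl-D (0x04) never appear raw."""
    out = bytearray()
    for b in data:
        if b < ESCAPE_BELOW:
            out.append(CTRL_F)
            out.append(b + ESCAPE_OFFSET)
        else:
            out.append(b)
    return bytes(out)


def unescape_raw_paste(wire: bytes) -> bytes:
    """Inverse of escape_raw_paste, roughly what the device side has to do."""
    out = bytearray()
    it = iter(wire)
    for b in it:
        if b == CTRL_F:
            out.append(next(it) - ESCAPE_OFFSET)
        else:
            out.append(b)
    return bytes(out)


def window_cost(wire: bytes) -> int:
    """Count window bytes: an escape pair counts once, so skip the prefix bytes.

    After escaping, 0x06 only ever appears as the prefix, because a literal
    Ctrl-F is itself sent as 0x06 0x0E.
    """
    return sum(1 for b in wire if b != CTRL_F)
```

Because Ctrl-F is itself below 8, a literal 0x06 in the payload is escaped too, which is why a host that sends a raw Ctrl-F to firmware with this handler would see different behaviour.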

Impact

Building for PYBV11 with default settings, .text segment +128 bytes.

TODOs

  • My understanding might be wrong, but this still currently streams the entire module into a memory buffer (mp_raw_code_load, mp_make_function_from_raw_code) and then executes it (mp_call_function_0). Maybe this can be made to execute it in chunks as each Python statement completes, meaning less RAM usage for modules which execute some statements on import and don't need to keep those around. But I'm very new to this and haven't dug into it properly, so I could be way off. (I think the soft reset before run makes this a bit of a moot point, although there are still potential savings for 'run'-ing a bytecode that doesn't define any code.)
  • Set up some protocol macro defines in the code so all of the possible command sequences are grouped together, to make the protocol easier to understand from the top down.
  • Test this works with MICROPY_REPL_EVENT_DRIVEN (Seems only JavaScript port uses this method, at least by default, and it doesn't support mpremote - if I should test this on a different port, let me know.)
  • Test this works with 'Ctrl-K inject file'
  • Set up compilation guards for ports which don't support .mpy loading
  • Check for regressions due to the new escape sequences anywhere that pyboard.py calls raw paste routines
  • Implement sending of bytecode without raw paste in mpremote (can add the 'injected variable' approach from pyboard.py into mpremote)
  • Measure actual memory usage (currently testing on a board with no mem_info)
  • Check for memory leaks (I'm not yet across how this rawcode buffer gets cleaned up if you load another one...)

Compatibility

I believe this approach is compatible between different mpremote.py and MicroPython versions:

  • Newer mpremote.py will detect that older MicroPython can't support the "bytecode raw paste" command ("B")
  • Old mpremote.py will not try to send this data at all.

The exception is anyone who was sending a raw Ctrl-F character as-is in a raw paste: that will stop working as expected if the mpremote.py and firmware versions don't match (due to the Ctrl-F escape handler).
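A host-side sketch of the detection step might look like the following. It assumes the 'B' command is answered the same way the existing 'A' raw paste command is documented to be answered (`R\x01` for supported, `R\x00` for understood but unsupported, anything else for firmware that does not know the command). That reuse is an assumption for illustration, and the names below are hypothetical.

```python
from enum import Enum


class PasteSupport(Enum):
    SUPPORTED = "supported"
    UNSUPPORTED = "understood but not supported"
    UNKNOWN_COMMAND = "command not understood"


def classify_init_response(resp: bytes) -> PasteSupport:
    """Interpret the two bytes read back after sending a raw paste init command."""
    if resp == b"R\x01":
        return PasteSupport.SUPPORTED
    if resp == b"R\x00":
        return PasteSupport.UNSUPPORTED
    return PasteSupport.UNKNOWN_COMMAND


def choose_transfer(resp: bytes) -> str:
    """Pick a transfer strategy: bytecode raw paste if offered, else the loader fallback."""
    if classify_init_response(resp) is PasteSupport.SUPPORTED:
        return "bytecode raw paste"
    return "loader fallback"
```

With this shape, older firmware simply falls through to the loader path, which matches the compatibility claim above.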

@projectgus
Contributor Author

@jimmo @andrewleech @dpgeorge Do you folks have any suggestions about this?

@dpgeorge dpgeorge added the py-core (Relates to py/ directory in source) and tools (Relates to tools/ directory in source, or other tooling) labels Jun 9, 2022
@dpgeorge
Member

dpgeorge commented Jun 9, 2022

I think @andrewleech had an idea (and maybe somewhere in a PR) to build .mpy files on the fly in mpremote (calling out to mpy-cross). That would fit well here.

@codecov-commenter

Codecov Report

Merging #8744 (a933f69) into master (5bb2a85) will decrease coverage by 0.00%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master    #8744      +/-   ##
==========================================
- Coverage   98.31%   98.31%   -0.01%     
==========================================
  Files         155      156       +1     
  Lines       20304    20326      +22     
==========================================
+ Hits        19962    19983      +21     
- Misses        342      343       +1     
| Impacted Files | Coverage Δ |
|---|---|
| py/compile.c | 99.69% <0.00%> (-0.07%) ⬇️ |
| py/objmodule.c | 100.00% <0.00%> (ø) |
| extmod/modurandom.c | 100.00% <0.00%> (ø) |
| ports/unix/moduos.c | 18.91% <0.00%> (ø) |
| ports/unix/mpconfigport.h | 100.00% <0.00%> (ø) |


@andrewleech
Contributor

I've got an open draft MR which adds support for manifest files to mpremote - with this any python files referenced in a manifest.py are bytecompiled into a local folder with mpy-cross, exposed via mount, then added to sys.path so that the device can easily import everything.
I want to refactor this such that when using regular mpremote mount, any time the device tries to import blah.mpy from the mount, mpremote would check for blah.py, mpy-cross it, then deliver it to the device instead.
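As a rough sketch of that idea (not code from the draft mentioned above), the host could resolve a request for `blah.mpy` by compiling `blah.py` into a cache directory on demand. The function names and the cache layout are assumptions; the only external call is `mpy-cross -o <out> <src>`, and the compiler is injectable so the logic can be exercised without it.

```python
import subprocess
from pathlib import Path


def run_mpy_cross(src: Path, dst: Path) -> None:
    """Compile src to dst using the mpy-cross executable found on PATH."""
    subprocess.run(["mpy-cross", "-o", str(dst), str(src)], check=True)


def resolve_mpy_request(root: Path, requested: str, cache: Path, compile_fn=run_mpy_cross):
    """Return the file to serve for `requested`, compiling from .py if needed.

    Hypothetical helper: returns None when neither the requested file nor a
    matching .py source exists under root.
    """
    req = root / requested
    if req.is_file():
        return req
    src = req.with_suffix(".py")
    if req.suffix == ".mpy" and src.is_file():
        out = cache / requested
        out.parent.mkdir(parents=True, exist_ok=True)
        # Recompile only when the cached output is missing or older than the source.
        if not out.is_file() or out.stat().st_mtime < src.stat().st_mtime:
            compile_fn(src, out)
        return out
    return None
```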

So this would reduce ram use and increase import speed (less data to transfer) but still requires mount. These changes wouldn't need any new support within micropython either, but don't address all of your needs - specifically directly running an mpy file without mount support.

Would it be feasible to include vfs support in a build without any specific filesystem libraries? That way mount could be used without needing the bulk of any one of the fs formats. That being said, while I haven't looked at your code in detail I presume a raw paste mode can be added with less code than the VFS framework.

I haven't looked into the ram buffer size needs while importing a mpy file over mount but expect that bytecode would be read "line by line" so to speak by the vm? I don't know if it buffers the entire mpy file in ram when running from a (mount) filesystem?

@peterhinch
Contributor

I want to refactor this such that when using regular mpremote mount, any time the device tries to import blah.mpy from the mount, mpremote would check for blah.py, mpy-cross it, then deliver it to the device instead.

That sounds awesome. Well worth doing!

A problem could occur with "incompatible mpy format" messages. Would it be possible to check the firmware version on the target against the cross-compiler version? A mismatch could prevent pre-compilation and issue a warning.
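One way such a check could be sketched on the host: compare the version byte in the .mpy header with the version the device reports. This assumes the .mpy file starts with the byte `M` followed by a version byte, and that the device value comes from something like `sys.implementation._mpy`, whose low 8 bits hold the .mpy version. The attribute name varies between firmware releases, so treat both details as assumptions. Sub-version and architecture bits are ignored here.

```python
def mpy_file_version(header: bytes) -> int:
    """Return the version byte from the start of an .mpy file."""
    if len(header) < 2 or header[0] != ord("M"):
        raise ValueError("not an .mpy file")
    return header[1]


def is_mpy_compatible(header: bytes, device_mpy_value: int) -> bool:
    """Compare the file's version with the low 8 bits of the device-reported value."""
    return mpy_file_version(header) == (device_mpy_value & 0xFF)


# Made-up values for illustration: a version-6 file and a device value
# that carries other flags in its upper bits.
header = bytes([ord("M"), 6, 0x00, 0x1F])
device_value = (5 << 10) | 6
print(is_mpy_compatible(header, device_value))
```

A mismatch could then skip pre-compilation and print a warning, as suggested above.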

@projectgus projectgus force-pushed the feature/mpremote_run_mpy branch from a933f69 to 717a1bd Compare July 6, 2022 23:26
@projectgus projectgus changed the title from "mpremote: Experiment with bytecode raw paste for 'mpremote run module.mpy'" to "mpremote: Support bytecode raw paste for 'mpremote run module.mpy' and 'mpremote repl --inject-file'" Jul 6, 2022
@projectgus projectgus changed the title from "mpremote: Support bytecode raw paste for 'mpremote run module.mpy' and 'mpremote repl --inject-file'" to "mpremote: Support bytecode raw paste for 'mpremote run module.mpy'" Jul 6, 2022
@projectgus projectgus force-pushed the feature/mpremote_run_mpy branch from 717a1bd to 249cb20 Compare July 7, 2022 02:16
- Adds a new raw paste command 'B' for 'raw paste bytecode' ('A' is
'paste source')

- Adds an escaping mechanism (Ctrl-F <char + 8>) during raw paste so
bytes less than 8 can be sent without triggering Ctrl-C or Ctrl-D
handlers. The two-byte escape sequence still counts as one byte in
the paste window.

- Adds relevant support to mpremote.py

Signed-off-by: Angus Gratton <gus@projectgus.com>
The raw repl language is gradually getting more complex. To structure
the code a bit more, introduce some names for the different control
characters and the "init command" sequence which is currently
only used for starting a paste.

Signed-off-by: Angus Gratton <gus@projectgus.com>
@projectgus projectgus force-pushed the feature/mpremote_run_mpy branch from 249cb20 to 02826f0 Compare July 7, 2022 02:49
@projectgus
Contributor Author

Hi @andrewleech,

Sorry I didn't reply to these earlier on:

Would it be feasible to include vfs support in a build without any specific filesystem libraries? That way mount could be used without needing the bulk of any one of the fs formats. That being said, while I haven't looked at your code in detail I presume a raw paste mode can be added with less code than the VFS framework.

This is a good question! Actually the target I'm focusing on, B_L072Z_LRWAN1 (32KB RAM), has the necessary base vfs support, but on master it runs out of memory when mpremote mount tries to push the _fs_hook_code which implements the dummy filesystem! This is 5124 bytes of Python source.

With this change, plus a hacky patch on top to pre-compile the _fs_hook_code, I can run mpremote mount successfully, but it still uses almost half the available heap once the FS class is loaded:

❯ mpremote mount .
Local directory . is mounted at /remote
Connected to MicroPython at /dev/ttyACM0
Use Ctrl-] to exit this shell
>
MicroPython v1.19.1-106-g249cb207d-dirty on 2022-07-07; B-L072Z-LRWAN1 with STM32L072CZ
Type "help()" for more information.
>>> import gc, micropython; gc.collect(); micropython.mem_info()
stack: 588 out of 3072
GC: total: 12096, used: 5360, free: 6736
 No. of 1-blocks: 76, 2-blocks: 11, max blk sz: 19, max free sz: 279

Compared to a plain REPL:

❯ mpremote
Connected to MicroPython at /dev/ttyACM0
Use Ctrl-] to exit this shell
MicroPython v1.19.1-106-g249cb207d-dirty on 2022-07-07; B-L072Z-LRWAN1 with STM32L072CZ
Type "help()" for more information.
>>> import gc, micropython; gc.collect(); micropython.mem_info()
stack: 588 out of 3072
GC: total: 12096, used: 352, free: 11744
 No. of 1-blocks: 6, 2-blocks: 2, max blk sz: 5, max free sz: 725

So I'm thinking "mpremote mount" may not ever be viable on these very small systems.
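The arithmetic behind "almost half the available heap" can be checked directly from the two `GC:` lines quoted above. This is just a small host-side sketch for parsing those lines; the helper name is made up.

```python
import re


def parse_gc_line(line: str) -> dict:
    """Extract total/used/free byte counts from a mem_info() 'GC:' line."""
    m = re.search(r"GC: total: (\d+), used: (\d+), free: (\d+)", line)
    if m is None:
        raise ValueError("not a GC line")
    total, used, free = (int(g) for g in m.groups())
    return {"total": total, "used": used, "free": free}


mounted = parse_gc_line("GC: total: 12096, used: 5360, free: 6736")
plain = parse_gc_line("GC: total: 12096, used: 352, free: 11744")

extra = mounted["used"] - plain["used"]         # heap cost attributable to the mount
fraction = mounted["used"] / mounted["total"]   # share of the heap in use when mounted

print(f"mount overhead: {extra} bytes, heap in use: {fraction:.1%}")
# prints: mount overhead: 5008 bytes, heap in use: 44.3%
```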

I haven't looked into the ram buffer size needs while importing a mpy file over mount but expect that bytecode would be read "line by line" so to speak by the vm? I don't know if it buffers the entire mpy file in ram when running from a (mount) filesystem?

Honestly I don't fully understand this yet, I'm going to take another look through it soon.

@projectgus
Contributor Author

projectgus commented Jul 7, 2022

I haven't looked into the ram buffer size needs while importing a mpy file over mount but expect that bytecode would be read "line by line" so to speak by the vm? I don't know if it buffers the entire mpy file in ram when running from a (mount) filesystem?

My reading of persistentcode.c is that it reads all of the byte code into memory before it executes. This seems probably necessary to keep complexity down as the bytecode may contain jumps, calls to child scopes, etc. that don't happen linearly.

One interesting thing I noticed is that the "outer scope" of bytecode (i.e. the code which is executed as the module imports) is kept in memory after the execution/import completes (for any call that goes via mp_raw_code_load, which includes both importing from a source or bytecode file or repl --inject-file). As far as I can tell, this bytecode isn't needed again so the rc->fun_data buffer could be freed after execution/import completes.

The downside is that in most cases freeing this won't save much: for example, in _fs_hook_code it's 75 bytes of bytecode out of a 3000-byte file. It would only be significant for modules which execute a lot of one-time code on import.

@dpgeorge @jimmo Does that sound right, or am I missing something?

@projectgus projectgus marked this pull request as ready for review July 8, 2022 00:13
RetiredWizard pushed a commit to RetiredWizard/micropython that referenced this pull request Dec 30, 2023
@projectgus
Contributor Author

This is an automated heads-up that we've just merged a Pull Request
that removes the STATIC macro from MicroPython's C API.

See #13763

A search suggests this PR might apply the STATIC macro to some C code. If it
does, then the next time you rebase the PR (or merge from master), please
replace all the STATIC keywords with static.

Although this is an automated message, feel free to @-reply to me directly if
you have any questions about this.

Labels
py-core Relates to py/ directory in source tools Relates to tools/ directory in source, or other tooling

5 participants