Dynamic native modules v2 #1627

dpgeorge · 2015-11-16T15:56:53Z

Now that persistent bytecode is supported (well, at least the beginnings of it are there, still pending #1619) the dynamic-native-modules branch is stale, since it used a different format for the loadable .mpy file.

This PR improves upon the dynamic-native-modules branch by adding support for loading .mpy files that contain position-independent code compiled from C (or any other language). Features of this new version are:

Same .mpy format for bytecode and loadable C code.
Framework for dynamic loadable C code is now much more comprehensive and gives access to the full set of runtime functions that the native emitter has (which is enough to do anything within uPy runtime, albeit not always in the most efficient way).
Much more efficient "linking" of local qstrs.
More realistic path to allowing code to be compiled either as a dynamic module, or statically (ie compiled into the uPy binary). This can be achieved by changing some macros in py/persistnative.h.

At the moment things are working and you can test it by building modx.mpy in the examples/modx directory (just type "make"), then building the unix port, and then doing "import modx" (make sure modx.mpy is in the current directory, or on the module search path).

Building the modx example with "make CROSS=1" will allow modx.mpy to be loaded on Thumb2 archs (eg pyboard, wipy).

As usual, naming things is hard. There is now the MICROPY_PERSISTENT_NATIVE config variable to enable this dynamic loadable C module feature, along with some structures and functions named with "persistent_native". It's a long name, but I can't think of anything better. And it's only going to get more confusing because one day there will be support for persistent native code generated by the native emitter (eg @micropython.native or @micropython.viper). Anyway, what we have here is a start and we will need to say that things are subject to change.

pfalcon · 2015-11-16T19:04:21Z

As usual, naming things is hard. There is now the MICROPY_PERSISTENT_NATIVE config variable to enable this dynamic loadable C module feature, along with some structures and functions named with "persistent_native"

Ok, why not call it "loadable native" after all?

Otherwise, I looked through the code. In my current state, I can't understand it well ;-). Well, I see that it reuses a lot of machinery from persistent bytecode and native codegens, which caused you to select the "persistent native" term and the ".mpy" extension. I'm not sure I agree with either. Regardless of underlying implementation, cached (byte)code and code purposely written in C are 2 rather different things, and I don't think they should be mixed up at the terminology level or at the external user interface level. I'd suggest a ".mpd" extension for compiled modules (following ".pyd" for CPython). Unless you have arguments why it should be .mpy. For example, I'd give +0.5 to the argument that it saves a file lookup. But we'll definitely end up confusing users (for example, a "cached" .mpy can always be deleted and will be regenerated from source, not so for an "implemented in C" .mpy).

pfalcon · 2015-11-30T21:49:56Z

Some random thoughts: a nice way to move forward with this would be to build some real-world code with it. Like sqlite ;-).

And no, examples/modx/modx.c here in the patch doesn't give warm fuzzy feelings - it's very different from how built-in modules are coded. I'd set that as one of main goals to be able to code modules so they were able to be used both statically and dynamically (efficiently!). That will certainly require resurrecting my idea of having custom preprocessor for stuff like QSTR(foo) (for dynamic module that would resolve to something like __qstr_arr[QSTR_foo] and code to init __qstr_arr).

dpgeorge · 2015-11-30T22:13:42Z

And no, examples/modx/modx.c here in the patch doesn't give warm fuzzy feelings - it's very different from how built-in modules are coded. I'd set that as one of main goals to be able to code modules so they were able to be used both statically and dynamically (efficiently!).

I went to a lot of effort with this new version to provide this feature. See these macros:
https://github.com/micropython/micropython/pull/1627/files#diff-f66815f6580dbaa8ad2614d01ad1dad3R34

QSTR(foo) (for dynamic module that would resolve to something like __qstr_arr[QSTR_foo]

That's how it works at the moment.
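As a rough illustration of the QSTR(foo) idea discussed above, here is a minimal C sketch of a module that reaches its qstrs only through a local table filled in at load time. Everything in it is an assumption made for illustration: the toy `intern()` pool stands in for the runtime's qstr interning, `qstr_arr` plays the role of the `__qstr_arr` mentioned above (renamed because identifiers with a leading double underscore are reserved in C), and `module_link_qstrs()` is a hypothetical load hook. It is not the code in py/persistnative.h.

```c
#include <assert.h>
#include <string.h>

typedef unsigned short qstr_t;

/* Toy interning pool: a stand-in for the runtime's qstr pool (assumption). */
#define POOL_MAX 16
static const char *pool[POOL_MAX];
static qstr_t pool_len;

/* Return the id of s, adding it to the pool if it is not there yet. */
static qstr_t intern(const char *s) {
    for (qstr_t i = 0; i < pool_len; i++) {
        if (strcmp(pool[i], s) == 0) {
            return i;
        }
    }
    assert(pool_len < POOL_MAX);
    pool[pool_len] = s;
    return pool_len++;
}

/* Module side: local indices, the strings they name, and the link table. */
enum { QSTR_foo, QSTR_bar, QSTR_COUNT };
static const char *const qstr_names[QSTR_COUNT] = { "foo", "bar" };
static qstr_t qstr_arr[QSTR_COUNT];

/* Dynamic build: QSTR(foo) is a table load. A static build could instead
 * define QSTR(name) as a compile-time constant, leaving module code unchanged. */
#define QSTR(name) (qstr_arr[QSTR_##name])

/* Hypothetical load hook: resolve each local qstr against the runtime pool. */
static void module_link_qstrs(void) {
    for (int i = 0; i < QSTR_COUNT; i++) {
        qstr_arr[i] = intern(qstr_names[i]);
    }
}
```

After `module_link_qstrs()` has run, `QSTR(foo)` yields whatever id the runtime assigned to "foo", so the module code never embeds a runtime-specific qstr value.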

For dynamic modules everything must go through a table: qstrs, constants, and calls to the runtime (unless you want to provide a proper linker in uPy to link external symbols, but that's going to require a lot of code, and specific code for each arch).

With this option enabled MicroPython supports loading of .mpy files that contain code compiled directly from C (ie dynamic loadable modules). Position independent code is enabled by having 2 "link" tables: one for runtime functions and constants, and one for qstrs that are local to the loaded code. The runtime function table is shared with that used by the native emitter. The qstr table needs to be populated by the loaded module on loading of this module.
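To make the runtime function table concrete, here is a small C sketch of a module whose only link to the interpreter is a pointer to a table of functions, handed over at load time. The names (`fun_table_t`, `rt_new_int`, `rt_add`, `module_load`, `modx_add1`) and the toy `obj_t` type are hypothetical; the real table is the one shared with the native emitter and has a different layout.

```c
#include <assert.h>
#include <stddef.h>

typedef long obj_t; /* toy object type, not MicroPython's mp_obj_t */

/* Table of runtime entry points, owned by the interpreter. */
typedef struct {
    obj_t (*new_int)(long value);
    obj_t (*add)(obj_t a, obj_t b);
} fun_table_t;

/* Interpreter side: toy implementations and the table itself. */
static obj_t rt_new_int(long value) { return value; }
static obj_t rt_add(obj_t a, obj_t b) { return a + b; }
static const fun_table_t runtime_fun_table = { rt_new_int, rt_add };

/* Module side: no external symbols, only a pointer set by the loader. */
static const fun_table_t *mod_fun_table;

/* Hypothetical load hook called once by the loader. */
static void module_load(const fun_table_t *ft) {
    assert(ft != NULL);
    mod_fun_table = ft;
}

/* A module function: every runtime call is an indirect call via the table. */
static obj_t modx_add1(obj_t x) {
    return mod_fun_table->add(x, mod_fun_table->new_int(1));
}
```

Because the module refers to the runtime only through `mod_fun_table`, no relocation of external symbols is needed when the code is loaded at an arbitrary address.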

dpgeorge · 2017-09-20T02:21:03Z

This was rebased on top of current master and force pushed. @aykevl you may want to try it out.

aykevl · 2017-09-20T20:21:31Z

Thanks! It doesn't seem to work out of the box (probably due to the age), I'm now fixing a few things. Will share when I have something working.

aykevl · 2017-09-26T15:50:29Z

Thank you for the Python 2 fix. I use Debian which uses Python 2 by default. It now works on my side.

adritium · 2017-11-13T22:12:51Z

For an MCU with no filesystem and if qstrings are not used, is it possible to load new modules or native code by modifying the table static const mp_rom_map_elem_t mp_builtin_module_table[] and the other tables that you'd normally modify when creating a new module?

If it's not possible now, could this become possible with some linker file modification or is this fundamentally impossible for micropython? Or fundamentally impossible for any executable?

I can imagine that with qstrings, this would be impossible or very hard because the qstring table would not be populated with the new module.

@dpgeorge @pfalcon @aykevl

aykevl · 2017-11-13T22:23:41Z

For an MCU with no filesystem and if qstrings are not used

So what would be the use case for dynamically loadable modules, then? Why not just integrate them into the ROM?
I'm curious, because I've seen such mentions before and I'm not sure in which way dynamically loadable modules could be useful when there is no filesystem.

It might be possible if you make mp_builtin_module_table[] not const and leave some room at the end for newly loaded modules.
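A minimal sketch of that suggestion, assuming a simplified table rather than the real `mp_rom_map_elem_t` layout: the table is made writable, a few spare slots are reserved at the end, and a hypothetical `register_module()` fills them at runtime. The entry type, slot counts, and function names are all invented for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for a module table entry (assumption). */
typedef struct {
    const char *name;
    const void *module;
} module_entry_t;

#define BUILTIN_COUNT 2
#define SPARE_SLOTS 4
#define TABLE_SIZE (BUILTIN_COUNT + SPARE_SLOTS)

/* Placeholder module objects. */
static const int mod_sys = 0, mod_time = 0;

/* Not const: the spare slots at the end start out zeroed. */
static module_entry_t module_table[TABLE_SIZE] = {
    { "sys", &mod_sys },
    { "time", &mod_time },
};
static size_t module_count = BUILTIN_COUNT;

/* Add a module to a spare slot; returns 0 on success, -1 if the table is full. */
static int register_module(const char *name, const void *module) {
    assert(name != NULL);
    if (module_count >= TABLE_SIZE) {
        return -1;
    }
    module_table[module_count].name = name;
    module_table[module_count].module = module;
    module_count++;
    return 0;
}

/* Linear lookup by name; returns NULL if the module is unknown. */
static const void *lookup_module(const char *name) {
    for (size_t i = 0; i < module_count; i++) {
        if (strcmp(module_table[i].name, name) == 0) {
            return module_table[i].module;
        }
    }
    return NULL;
}
```

The cost of this approach is that the table moves from ROM to RAM and the number of extra modules is fixed at build time. It also does not address how qstrs used by the new module would be registered.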

adritium · 2017-11-13T22:47:26Z

So what would be the use case for dynamically loadable modules, then? Why not just integrate them into the ROM?

We want to give our customers the ability to extend the micropython installation but only allow them to write to certain areas of flash (so they can't brick the module).

The micropython installation is very integrated into our product so to flash everything as part of a monolithic .elf, we'd have to give out our .objs for them to link to their .objs . . . which we don't want to do.

I'm not saying it's a compelling enough feature for the community here to spend oodles of time on it; I just want to know whether it's possible to do this assuming you don't use qstrings.

Though, if all it takes is adding a #define MAKE_MODULES_EXTENSIBLE along with adding some number of blank entries in some tables . . . it seems like it'd pass the ROI smell test.

aykevl · 2017-11-13T23:05:19Z

We want to give our customers the ability to extend the micropython installation but only allow them to write to certain areas of flash (so they can't brick the module).

Ah, that's certainly a use case, though I wonder if such a feature could be useful for open source projects. But I'm not a maintainer so can't say anything about whether it will be done.

I think the harder problem is disabling qstrs. I suspect they're integrated so deeply they can't simply be disabled - e.g. they're used for fast comparisons between any two strings.

pfalcon · 2019-01-24T22:53:59Z

So, I'm looking into this again. And for the life of me I cannot understand what were the ideas and requirements which went into this "persistent native" stuff. The only reason I may imagine is desire to prototype something for saving @micropython.native/@micropython.viper code into .mpy. Because there's no other explanation why would all this over-engineering, all these parallel hierarchies of "persistent native" functions and types be required to implement just "dynamically loadable modules". (Heck, it's not clear why they would be needed even for persisting - there's already types for native and viper functions).

Perhaps I'm just stupid.

But real fun just begins. The included modx.c doesn't have any (global) variables. That's not realistic, any more or less non-trivial code will have variables, data structures, etc. Trying to add those, I see [rip+xx] in the .o file. But within the produced .elf, there're direct addressing/immediate values instead! Stupid modern compiler smartasses! Ok, adding ld -fPIC. It complains that it's possible only with -shared. Ok, building that, and trying to comprehend the resulting code. Remembering that few things in the world are as overengineered and bloated as ELF shared libraries, and the only way to not get nausea with them is -Bsymbolic. Ok, but the code in .mpy doesn't disassemble well. After enough peering into it, becomes clear that objcopy -O binary manages to corrupt the section content when copying it out of shlib. That's the modern compiler infrastructure again - automagically "relaxing" PIC code into non-PIC without anybody asking, and being unable to copy a hundred bytes verbatim without corruption.

Looking at the generated code again, the persistence worthy of Sisyphus becomes apparent:

 226:   48 8d 2d f3 00 00 00    lea    rbp,[rip+0xf3]        # 320 <_ctx>
 22d:   53                      push   rbx
 22e:   48 8b 45 00             mov    rax,QWORD PTR [rbp+0x0]
 232:   ff 50 08                call   QWORD PTR [rax+0x8]
 235:   48 89 c3                mov    rbx,rax
 238:   48 8b 45 00             mov    rax,QWORD PTR [rbp+0x0]
 23c:   4c 89 e7                mov    rdi,r12
 23f:   ff 50 08                call   QWORD PTR [rax+0x8]
 242:   48 8d 3c 03             lea    rdi,[rbx+rax*1]
 246:   48 8b 45 00             mov    rax,QWORD PTR [rbp+0x0]
 24a:   5b                      pop    rbx
 24b:   5d                      pop    rbp
 24c:   41 5c                   pop    r12
 24e:   48 8b 00                mov    rax,QWORD PTR [rax]

Look how carefully it loads the pointer to the function table again and again, again and again - instead of just caching it in a register! After jerking back and forth, this one becomes apparent too - wonderful semantics of the C language, where every function is suspected of being able to modify a global. Wait, this should work, right:

restrict const mod_ctx_t *_ctx;

?

No! It's 2019, but in C, it's possible to only declare var as ever-changing (volatile) or fully constant-down-to-being-immediate-value. Ok, after some thinking (which included abusing .got table to do the needful), solution was found of declaring it extern const mod_ctx_t * const _ctx; in one compilation unit, so there it was treated as cachable constant, and in another - as const mod_ctx_t *_ctx;, so it could be actually initialized at runtime, not compile-time.
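For comparison, here is a single-file C sketch of a simpler and different workaround than the two-declaration trick described above: copy the global pointer into a local `const` variable at the top of each function. Since the local's address is never taken, a compiler is free to keep it in a register across the indirect calls. The struct layout and the names `mod_ctx`, `get_value_impl`, `sum_twice_reload` and `sum_twice_cached` are assumptions for illustration (the global is called `mod_ctx` here rather than `_ctx` because file-scope identifiers with a leading underscore are reserved).

```c
#include <assert.h>

/* Hypothetical context: a table with one runtime function. */
typedef struct {
    int (*get_value)(void);
} mod_ctx_t;

static int get_value_impl(void) { return 21; }
static const mod_ctx_t ctx_impl = { get_value_impl };

/* Writable global pointer, as it would be if the loader set it at runtime. */
const mod_ctx_t *mod_ctx = &ctx_impl;

/* Each call goes through the global, so the compiler may have to reload
 * mod_ctx after the first call in case the callee changed it. */
int sum_twice_reload(void) {
    return mod_ctx->get_value() + mod_ctx->get_value();
}

/* The global is read once into a local that no callee can modify. */
int sum_twice_cached(void) {
    const mod_ctx_t *const ctx = mod_ctx;
    assert(ctx != 0);
    return ctx->get_value() + ctx->get_value();
}
```

Both functions return the same result; the difference is only in how often the generated code is allowed to assume the pointer is unchanged. The cost is one extra line in every function that uses the table, which the approach described above avoids.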

Summing up: the idea to use -fPIC seemed bright, but actually is brittle like hell. And I'm looking at the most popular arch. Something like Xtensa will just crumble down. Anyway, I'm proceeding ;-).

pfalcon · 2019-01-25T00:44:57Z

After enough peering into it, becomes clear that objcopy -O binary manages to corrupt the section content when copying it out of shlib.

Heh, that was done by elftompy.py (of course, I'm changing module format). Poor binutils slandered by me!

pfalcon · 2019-02-28T08:47:54Z

Some more notes on the design of dynaloaded modules format: #4535 (comment)

dpgeorge · 2019-10-16T01:59:27Z

This PR is well and truly superseded by #5083

pfalcon force-pushed the master branch 6 times, most recently from 9167980 to 1cc81ed · April 10, 2016 22:16

pfalcon force-pushed the master branch from 91ecff0 to 56e7ebf · January 28, 2017 09:08

dpgeorge mentioned this pull request Feb 28, 2017

Add method/function for calling non-python functions by function pointer #2894

Closed

dpgeorge added 4 commits September 20, 2017 12:19

py: For native emitter, make pointers in function table constant.

e0bed55

example/modx: Add example of persistent native module.

be3325e

To build just type "make". To build for Thumb2 target, use "make CROSS=1". Then modx.mpy is ready for importing.

unix: Enable loading of persistent native .mpy modules.

b02cecc

dpgeorge force-pushed the dynamic-native-modules-v2 branch from 162a024 to b02cecc · September 20, 2017 02:19

tools/mpy-tool.py: Make it work with Python2 again.

b19c51c

aykevl added 3 commits October 18, 2017 00:53

examples/modx: Make the modx example work with the Cortex M0

f87506b

examples/modx: Ignore build directories.

b7efae3

py/nativeglue: Fix MICROPY_PERSISTENT_NATIVE without MICROPY_EMIT_NATIVE

a85c780

Smaller microcontrollers may want the former without the latter.

aykevl mentioned this pull request Jun 14, 2018

Implement a module system for external C modules. #3871

Closed

aykevl mentioned this pull request Aug 1, 2018

[discussion] Sharing custom C modules is hard #4001

Closed

dpgeorge mentioned this pull request Feb 21, 2019

Add support to save native, viper and inline-asm code to .mpy files #4535

Closed

dhalbert mentioned this pull request Apr 5, 2019

MicroPython external C modules adafruit/circuitpython#1752

Closed

dpgeorge mentioned this pull request Jul 10, 2019

trying to build mpy with dynamic libs for esp32 with xtensa-gcc #4916

Closed

dpgeorge mentioned this pull request Oct 16, 2019

Tool to generate native .mpy files from a .elf file (dynamically loadable native code) #5083

Merged

dpgeorge closed this Oct 16, 2019

dpgeorge added the py-core Relates to py/ directory in source label Oct 16, 2019

dpgeorge deleted the dynamic-native-modules-v2 branch July 8, 2022 13:29
