-
-
Notifications
You must be signed in to change notification settings - Fork 8.2k
RFC: Allow registration of modules at their definition. #4289
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
py/obj.h
Outdated
@@ -816,6 +816,9 @@ static inline mp_obj_dict_t *mp_obj_module_get_globals(mp_obj_t module) { | |||
// check if given module object is a package | |||
bool mp_obj_is_package(mp_obj_t module); | |||
|
|||
// declare a module as a builtin, handled by makemoduledefs.py | |||
#define MP_REGISTER_MODULE(name, ref, guard) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd try to look for better self-describing name than "guard". Likewise, for other params. And maybe even then comment should describe what are they.
py/objmodule.c
Outdated
@@ -226,6 +225,11 @@ STATIC const mp_rom_map_elem_t mp_builtin_module_table[] = { | |||
|
|||
// extra builtin modules as defined by a port | |||
MICROPY_PORT_BUILTIN_MODULES | |||
|
|||
#ifdef MICROPY_REGISTERED_MODULES |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please use indented #ifdef, per the latest code conventions.
py/makemoduledefs.py
Outdated
|
||
|
||
if __name__ == '__main__': | ||
main() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As github shows, EOL is missing.
+1, this is definitely move in the right direction. I'm not sure if it could be done simpler (e.g. without yet another preprocessor script), but I assume it's not. I personally wouldn't go further until this patch is merged, or explicitly acked by the maintainer (like, "this goes in, convert other modules in this same PR"). |
I also feel like adding more python pre-processors seems a bit excessive, but I haven't been able to find a better solution either. gcc preprocessor does now have the ability to push/pop defines, so you can build a structure that appends to existing defines and build it up that way, however it still requires all the definitions to be appended to be in the include path (headers) with the destination (objmodule.c) being at the end and be the only thing to expand the final macro. It was looking harder to achieve cleanly. I did consider using the preprocessor on the c files first to know what should or shouldn't be included rather than needing the #if statement to be provided, but that would definitely be a lot slower than this pure python/regex method. Other than all that, this script does seem to run faster than I can notice it, and is hopefully fairly immune to extra whitespaces / line breaks in the macro usage. Most importantly, this will not result in any extra ram/code/runtime needed, unlike any kind of dynamic registration of enabled modules. I definitely agree the python script needs cleaning up, comments and/or documenting etc. I also don't like the arg names, I just haven't spent any time trying to structure it better (it grew out of the script used in cmodules branch). The current implementation is intended as a proof of concept to raise this PR as a discussion start. |
Another approach that can be used is to use linker sections. Each module would put the mp_obj_module_t into a named section. The linker winds up concatenating these together. At startup some code would walk through each of the module objects in the named section and register them. |
That's too expensive. However, statically creating the whole of |
Actually, I think it could be done with zero startup cost by simply having the qstr/pointers to the modules added to the existing mp_builtin_module_table. |
I like the idea, the thought of bending the linker into giving us a registration array never occurred to me. If each row of the |
Yes, would give troubles to people with adhoc toolchain, but actually would give issues even for unix and windows ports. So, I'd say skip it for now, given that preprocessor is already written. |
Yeah, true, and it would require duplicating a suitable linker definition into each ports' linker files, with changes for each different kind of compiler. T |
3244181
to
cae619a
Compare
I've updated the branch to be a bit cleaner, the script more descriptive. For references, the generated // Automatically generated by makemoduledefs.py.
#if (MICROPY_PY_ARRAY)
extern const struct _mp_obj_module_t mp_module_array;
#define MODULE_DEF_MP_QSTR_ARRAY { MP_ROM_QSTR(MP_QSTR_array), MP_ROM_PTR(&mp_module_array) },
#else
#define MODULE_DEF_MP_QSTR_ARRAY
#endif
#define MICROPY_REGISTERED_MODULES \
MODULE_DEF_MP_QSTR_ARRAY \
// MICROPY_REGISTERED_MODULES |
cae619a
to
5e87dca
Compare
py/makemoduledefs.py
Outdated
#!/usr/bin/env python | ||
|
||
# This pre-processor parses provided objects' c files for | ||
# `MP_REGISTER_MODULE(module_name, obj_module, enabled_define)` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure that (markdown?) markup here makes sense. At least, we don't use it anywhere else.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, yeah, I do find myself doing that a bit these days...
py/makemoduledefs.py
Outdated
import argparse | ||
|
||
|
||
def find_c_file(vpath, obj_file): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That can be a nitpick, by first impression I have in my head looking at this line is question "why args in that order?" ;-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I hadn't really thought about it.
On one hand, vpath
had a list of root paths, with the obj_file being appending on the end of one of them, so the two are joined together in that visual order.
On the other hand, obj_file
is kind of the point of the function, with vpath being the list of paths to search in.
Thanks for the updates. Well, now a one-line commit message stands out as a bare pole, can you add something there (for the lack of real documentation, commit message serves as documentation). And to clarify - you don't scan each byte of the entire source with this preprocessing script, do you? |
py/makemoduledefs.py
Outdated
|
||
def find_c_file(vpath, obj_file): | ||
""" | ||
Search the provided vpaths for the c file that matches the provided object_file |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't docstring style be something more standard like
""" Short description which ends with period.
Rest of description
"""
or
""" Short description which ends with period. """
Just checking, don't know if this is set in stone somewhere for uPy.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've never loved the first line starting on the same line as the opening """, but it does seem to be the more standard convention in PEP 257.
py/makemoduledefs.py
Outdated
def find_c_file(vpath, obj_file): | ||
""" | ||
Search the provided vpaths for the c file that matches the provided object_file | ||
:param List[str} vpath: List of base paths, similar to gcc vpath |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
List[str}
is that correct? Also uses different captialization than the line below it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That one is a typo, thanks. I'm not completely sure about the later ones, being a list of tuples.
""" | ||
c_file = None | ||
relative_c_file = os.path.splitext(obj_file)[0] + ".c" | ||
relative_c_file = relative_c_file.lstrip('/\\') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since relative_c_file this is passed to os.path.join this should not be needed I think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is definitely needed, as the files being passed in from makefile with the build folder stripped off come in with a "/" at the start.
Passing this in as the right hand arg of os.path.join
, it gets treated as an absolute path and the left hand arg (from vpath) is ignored.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it gets treated as an absolute path
Well, that's new to me. Probably because the implementation is inconsistent between windows/linux and the type of slash used :(
py/makemoduledefs.py
Outdated
:param c_file: path to c file to check | ||
:return: List((module_name, obj_module, enabled_define)) | ||
""" | ||
pattern = re.compile( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be closer to where it's used, or maybe don't use seperate pattern at all
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah in previous structure this pattern compilation happened outside the for loop iterating over files, so should be a decent efficiency improvement. It's now essentially moved inside the loop though so the benefit has been lost.
I'm not sure just how much is gained by only compiling this once up front, rather than once per file... so not too sure if it's worth moving it to global scope to keep it separate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry should have been clearer, I meant: perhaps it's clearer and more concies to simply not compile to a pattern at all, but just pass the regex string as first argument to re.findall.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh no I agree it's somewhat clearer to just include the pattern in the findall, however it's faster to compile the pattern once and reuse for all the files looped over, so a tradeoff between readable and speed. How much speed though, I'm not sure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
perhaps it's clearer and more concies to simply not compile to a pattern at all, but just pass the regex string as first argument to re.findall.
Well, nope. Compiling patterns is the best practice, shortcuts like that exist only for once-off scripts.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I prefer the once off compilation, my last push moved the pattern compilation to global scope to ensure it's a once off.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How much speed though, I'm not sure.
Yes that's what struck me first time. De-clarifying code just for the sake of 'might be faster' in some specific Python version in a program which is likely going to be limited by disk I/O, don't know. Well, it won't hurt either.
py/makemoduledefs.py
Outdated
return [] | ||
|
||
with open(c_file) as c_file_handle: | ||
c_text = c_file_handle.read() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why not just return list(re.findall(pattern, c_file_handle.read())) here?
Minor nitpick: open() doesn't really return a 'handle' like there's handles in C, but it's called a 'file object'
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Splitting it up was useful while debugging to get the pattern right, less important now!
for mod_def in mod_defs: | ||
print(" {mod_def} \\".format(mod_def=mod_def)) | ||
|
||
print("// MICROPY_REGISTERED_MODULES") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems spurious
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is somewhat, however without this the last line of the file ends with \
so it kind of looks like the file is unfinished or has been truncated.
Well... it does run a regex over the text of every c file in the current make pattern (files being built for the port, not every c file). |
During make, makemoduledefs.py parses the compiled ports' c files for MP_REGISTER_MODULE(module_name, obj_module, enabled_define) These are used to generate a header with the required entries for "mp_rom_map_elem_t mp_builtin_module_table[]" in py/objmodule.c
5e87dca
to
11e7507
Compare
The current method of adding a new builtin/global modules is to have the module object defined in the c file for said module, but the python module name and matching
#define
enable guard defined in the centralobjmodule.c
file.This PR provides a mechanism for moving this name and guard definition to the actual module file, removing the need to edit the
objmodule.c
file whenever you want to add a new module.There is sort of fake macro provided
MP_REGISTER_MODULE(name, obj, enabled)
for use where modules are being defined.These are parsed by the included python script during make (not the c the preprocessor) to generate the header of required entries in the builtins modules list in
objmodule.c
This was originally written to replace my json based method of declaring modules in #4195 but I think it could have wider appeal across the micropython codebase.
If you're interested in this system, I'm happy to replace the other existing module definitions with the "macro", similar to the array one I've used as reference. I'll also update my cmodule branch to suit.
I'd also like to see a similar system used for module init (early and late) and de-init functions, so that modules can themselves declare the functions needed for their set up or tear down in
main.c
rather than having to maintain such a list of functions manually in all the separate ports main files. There could be some added complexity there with regard to dependencies and ordering, but that's a separate issue to this PR.