
[RFC] split rustpython_stdlib from rustpython_vm #3102


Closed
youknowone opened this issue Sep 21, 2021 · 11 comments
Labels
RFC Request for comments

Comments

@youknowone
Member

Summary

Split rustpython_stdlib out of rustpython_vm, which currently contains Rust modules that the vm itself does not need.

Detailed Explanation

Advantages:

  • We will have pressure to define public APIs by dogfooding, because rustpython_stdlib cannot access the private items of rustpython_vm.
  • Compiling will be faster by dividing the crates, at least for rustpython_stdlib.
  • stdlib can be a feature. Without it, the binary size will be smaller than it is now.

Drawbacks, Rationale, and Alternatives

Unresolved Questions

@DimitrisJim
Member

DimitrisJim commented Sep 25, 2021

Question on this since #3132 got me thinking.

Doesn't vm require minimal support from the stdlib? sys seems like it defines things that are also used in vm/, importlib should be around for importing things, and io and encodings are also required, if I recall correctly.

I can see how a subset of the stdlib could be separated, but I'm not really seeing how all of it could be yanked and placed in a new crate so that there could be a nostdlib feature; a limited-stdlib, maybe. This is just a quick thought on it, though. What's your idea on this?

@youknowone
Member Author

I think modules defined in sys.builtin_module_names must be included in vm. Other modules will be split out.

I also think the name vm::stdlib doesn't fit anymore.
Maybe vm::module could include the builtin modules (since they are defined in sys.builtin_module_names),
and they would be re-exported through rustpython_stdlib.

@coolreader18
Member

coolreader18 commented Sep 27, 2021

Something I've been thinking about - oh, which it seems @DimitrisJim also brought up - some things should be core to the interpreter, right? Like _signal, or _weakref, or _codecs - imo, anything that's required to start up rustpython (and to be honest, I thought it was a given that sys/builtins would stay in the vm). CPython makes this distinction via "built-in" modules vs dlls. _io is <module 'io' (built-in)>, but zlib is <module 'zlib' from '/usr/lib/python3.9/lib-dynload/zlib.cpython-39-x86_64-linux-gnu.so'>. Obviously we don't have extension modules, but I think it's more productive/useful to put those ancillary modules in a separate crate while keeping the actually core ones in rustpython-vm.

@coolreader18
Member

Oh, right - like sys.builtin_module_names. These all seem preeetty core to the interpreter, except for things like functools/itertools but those are still very heavily used in all kinds of python code, they aren't like zlib which is domain-specific.

('_abc', '_ast', '_codecs', '_collections', '_functools', '_imp', '_io',
 '_locale', '_operator', '_peg_parser', '_signal', '_sre', '_stat', '_string',
 '_symtable', '_thread', '_tracemalloc', '_warnings', '_weakref', 'atexit',
 'builtins', 'errno', 'faulthandler', 'gc', 'itertools', 'marshal', 'posix',
 'pwd', 'sys', 'time', 'xxsubtype')
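
For anyone who wants to check which side of the built-in line a given module falls on, here is a minimal sketch for CPython. The helper name module_kind is made up for illustration and is not part of any RustPython or CPython API; it only relies on sys.builtin_module_names and importlib.util.find_spec.

```python
import importlib.util
import sys


def module_kind(name):
    """Classify a top-level module as 'built-in', 'frozen', 'file', or 'missing'.

    Hypothetical helper for illustration only.
    """
    # Modules compiled into the interpreter itself.
    if name in sys.builtin_module_names:
        return "built-in"
    try:
        spec = importlib.util.find_spec(name)
    except (ImportError, ValueError):
        # find_spec can raise for broken parents or modules without a spec.
        return "missing"
    if spec is None:
        return "missing"
    if spec.origin == "frozen":
        return "frozen"
    # Anything else is loaded from somewhere on disk (.py, .so, .pyd, ...).
    return "file"


for candidate in ("sys", "_io", "itertools", "zlib", "json"):
    print(candidate, "->", module_kind(candidate))
```

The exact answers depend on how the interpreter was built, so zlib may show up as built-in on some platforms and as a file on others.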

@coolreader18
Member

Maybe the distinction we could have, if we kept these in rustpython-vm, is corelib instead of stdlib? Obviously a different set of priorities than Rust's core vs std, but a similar concept imo.

@youknowone
Member Author

youknowone commented Sep 27, 2021

#3129 is now a working version. It solves native modules, but not pylib.

@youknowone
Member Author

youknowone commented Sep 28, 2021

I have another public stdlib api suggestion in #3165

@youknowone
Member Author

I don't remember which PR it was, but Jim commented about sys.stdlib_module_names.

@coolreader18 In the sense of contrasting stdlib_module_names with builtin_module_names in the Python API, I think corelib sounds closer to the original concept. What do you think about it?
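
The contrast between the two attributes can be looked at directly in CPython. This is only a rough sketch: sys.stdlib_module_names exists on Python 3.10 and later, so the code falls back to the built-in set on older versions, and the function name split_core_and_rest is an illustrative assumption, not an existing API.

```python
import sys


def split_core_and_rest():
    """Return (core, rest): built-in names, and stdlib names that are not built-in.

    Hypothetical helper for illustration only.
    """
    builtin = set(sys.builtin_module_names)
    # sys.stdlib_module_names was added in Python 3.10; fall back to the
    # built-in set on older interpreters so the sketch still runs.
    stdlib = set(getattr(sys, "stdlib_module_names", builtin))
    core = sorted(builtin)
    rest = sorted(stdlib - builtin)
    return core, rest


core, rest = split_core_and_rest()
print(len(core), "corelib candidates")
print(len(rest), "stdlib modules outside the built-in set")
```

The first list is roughly what a corelib would cover in this proposal, and the second is what could move behind a feature flag or into a separate crate.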

@jamestwebber
Contributor

As an interested (very) minor contributor, I really like the idea of defining a corelib for a minimal RustPython: only what's needed for the language itself. What is considered the CPython stdlib could be a feature, or even split into multiple feature flags. Particularly esoteric libraries would be good candidates to be moved out into external packages.

The list from sys.builtin_module_names seems like a good core, with the caveat that xxsubtype is a CPython-specific module (not tested, see #3250). Similarly, marshal's format is deliberately undocumented, and it is only used for CPython-internal stuff like bytecode.

So RustPython might need its own versions of those modules, but trying to port them doesn't seem useful, and they shouldn't block a corelib PR.

@DimitrisJim
Member

Running Python (3.8) with increased verbosity (-vv) does give a list of imports during start-up. For normal file invocations, this is:

_imp _thread _warnings _weakref _frozen_importlib_external _io marshal posix time zipimport 
_codecs codecs encodings.aliases encodings encodings.utf_8 _signal encodings.latin_1 
_abc abc io _stat stat _collections_abc genericpath posixpath os _sitebuiltins _locale 
_bootlocale types warnings importlib importlib.machinery importlib.abc _operator operator keyword 
_heapq heapq itertools reprlib _collections collections _functools functools contextlib importlib.util

Launching the REPL does entail some additional imports: readline, atexit, rlcompleter
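
A similar list can be gathered without reading the -vv output, by asking a fresh interpreter what ended up in sys.modules. This is a sketch, not the method used above: startup_modules is a hypothetical helper, and sys.modules only shows modules that were actually loaded, so the result can differ from the verbose trace.

```python
import subprocess
import sys


def startup_modules(python=sys.executable):
    """Return the sorted module names loaded by a fresh interpreter at start-up.

    Hypothetical helper for illustration only.
    """
    # The child prints one module name per line.
    probe = "import sys; print('\\n'.join(sorted(sys.modules)))"
    result = subprocess.run(
        [python, "-c", probe], capture_output=True, text=True, check=True
    )
    return result.stdout.split()


loaded = startup_modules()
builtin = set(sys.builtin_module_names)
# Split the start-up set into built-in modules and everything else.
startup_builtin = [name for name in loaded if name in builtin]
startup_other = [name for name in loaded if name not in builtin]
print("built-in at start-up:", startup_builtin)
print("other at start-up:", startup_other)
```

Modules in the second list are the ones a vm-only build would still have to find somewhere in order to start up.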

@youknowone
Member Author

This is finished by #3620

4 participants