[RFC] Support embedding and sandboxing untrusted code #4210
Comments
What are your requirements for the embedding? It sounds more like sandboxing to run arbitrary third-party code than a simple embedding to run first- or second-party code. For example, limiting running time is not a common requirement of embedding. Sharing more details about your usage will give us a clearer picture of the kind of requirements you have. I agree that we should have sandboxing support to leverage our wasm support. Here are a few more answers to the questions. import cannot be disabled, but importlib can be disabled. I recently added You seem to want a memory quota. This is not easy in Python. Detecting that a quota has been exceeded is comparatively easy, but preventing it from being exceeded is not. guess you tried
Thank you for your quick answer! Yes, our need is indeed more about sandboxing untrusted code than about simple embedding. Regarding our use case, we are creating an educational game creation platform (https://cand.li) that currently has its own custom visual language and is written in a mix of TypeScript and Rust (compiled to WASM). We are progressively migrating more and more of our code base to Rust, and in the medium term Python support is something our customers (teachers and students) are asking for. The idea is that they can start with the visual language, then write their own advanced blocks in Python, and even share them with less advanced students. However, sandboxing is critical:
Regarding the memory quota, I imagine that all internal Python data could go through a gate that checks the quota at creation time and simply fails execution if the quota is reached (for us that would be enough). Of course, I guess it would be an optional feature. I'm under the impression that it touches on a similar problem to garbage collection, and maybe the two can be designed at the same time. Regarding the intrinsics that can allocate an arbitrary amount of memory, like Regarding importing, allowing imports from a restricted set of pure Python or Rust-based modules defined outside the sandbox would be very good; just being able to prevent access to the computer's filesystem would be critical.
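The "gate" idea above can be pictured with a minimal sketch. Everything here is hypothetical and standalone: `MemoryQuota`, `QuotaExceeded` and `make_list` are illustrative names, not RustPython APIs, and a real implementation would have to hook every object creation inside the VM.

```rust
/// Hypothetical error returned when a request would push the sandbox over its quota.
#[derive(Debug, PartialEq, Eq)]
pub struct QuotaExceeded {
    pub requested: usize,
    pub remaining: usize,
}

/// Hypothetical byte-accounting gate (an assumption, not part of RustPython).
#[derive(Debug)]
pub struct MemoryQuota {
    limit: usize,
    used: usize,
}

impl MemoryQuota {
    pub fn new(limit: usize) -> Self {
        Self { limit, used: 0 }
    }

    /// Reserve `bytes` before an object is created.
    /// On failure the accounting is left unchanged.
    pub fn charge(&mut self, bytes: usize) -> Result<(), QuotaExceeded> {
        let remaining = self.limit - self.used;
        if bytes > remaining {
            return Err(QuotaExceeded { requested: bytes, remaining });
        }
        self.used += bytes;
        Ok(())
    }

    /// Give bytes back when an object is freed.
    pub fn release(&mut self, bytes: usize) {
        self.used = self.used.saturating_sub(bytes);
    }

    pub fn used(&self) -> usize {
        self.used
    }
}

/// Toy stand-in for an allocating intrinsic: check the quota first, then allocate.
pub fn make_list(quota: &mut MemoryQuota, len: usize) -> Result<Vec<u64>, QuotaExceeded> {
    // Treat an overflowing size computation as "too large" rather than wrapping.
    let bytes = len
        .checked_mul(std::mem::size_of::<u64>())
        .unwrap_or(usize::MAX);
    quota.charge(bytes)?;
    Ok(vec![0u64; len])
}
```

Because the check happens before the allocation, the request is rejected up front instead of being detected after the fact, which is the "fail execution if the quota is reached" behaviour described above.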
The IO parts look to be natively blocked by the wasm environment.
Yes, but at least in our case, we would like to have it on native as well, as we plan to make a native app at some point. I imagine that other game projects would have a similar need.
Good to hear that you are interested! I'm happy to provide input from the use-case side of things whenever it is helpful!
Hey, I was just wondering if this is possible in the latest version of RustPython. I wish to run a native app that can run user-generated Python code. Ideally, it would be as feature-rich as possible but without access to IO of any type. An alternative I have thought of is to use a wasm build and call the interpreter through that to do sandboxing.
I'm very interested in using this library in my projects because Python is such a popular language in data science, but stability in the face of arbitrary code is very important to me. I wouldn't want a user to be able to accidentally crash their browser or lock up their machine. To limit the amount of memory (or even the rate at which memory is allocated), perhaps the library could support custom allocators? Rust uses the For execution duration, obviously a multi-threaded application could just use a dedicated thread for the vm and terminate it whenever it wants, but a single-threaded application won't have that luxury. Is there a way to get an iterator from the vm instead of trusting that
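For the single-threaded case, one way to picture host-driven execution is a toy interpreter that exposes a `step` method, so the caller owns the loop and can stop after a fixed number of instructions. This is only a sketch under stated assumptions: `ToyVm`, `Op`, `StepResult` and `RunOutcome` are made up for illustration and do not exist in RustPython; the `InstructionBudgetExceeded` name is borrowed from the proposal in this issue.

```rust
/// Instruction set of a hypothetical toy interpreter.
#[derive(Debug, Clone, Copy)]
pub enum Op {
    Push(i64),
    Add,
    Jump(usize),
    Halt,
}

/// Result of executing a single instruction.
#[derive(Debug, PartialEq, Eq)]
pub enum StepResult {
    Continue,
    Finished(Option<i64>),
}

/// Result of a budgeted run, as seen by the host.
#[derive(Debug, PartialEq, Eq)]
pub enum RunOutcome {
    Finished(Option<i64>),
    InstructionBudgetExceeded,
}

pub struct ToyVm {
    code: Vec<Op>,
    pc: usize,
    stack: Vec<i64>,
}

impl ToyVm {
    pub fn new(code: Vec<Op>) -> Self {
        Self { code, pc: 0, stack: Vec::new() }
    }

    /// Execute exactly one instruction and hand control back to the host.
    pub fn step(&mut self) -> StepResult {
        let op = match self.code.get(self.pc) {
            Some(op) => *op,
            // Running off the end of the program counts as finishing.
            None => return StepResult::Finished(self.stack.last().copied()),
        };
        match op {
            Op::Push(v) => {
                self.stack.push(v);
                self.pc += 1;
            }
            Op::Add => {
                // Missing operands default to 0 to keep the toy VM total.
                let b = self.stack.pop().unwrap_or(0);
                let a = self.stack.pop().unwrap_or(0);
                self.stack.push(a.wrapping_add(b));
                self.pc += 1;
            }
            Op::Jump(target) => self.pc = target,
            Op::Halt => return StepResult::Finished(self.stack.last().copied()),
        }
        StepResult::Continue
    }

    /// Run at most `budget` instructions, then give up.
    pub fn run_with_budget(&mut self, budget: u64) -> RunOutcome {
        for _ in 0..budget {
            if let StepResult::Finished(value) = self.step() {
                return RunOutcome::Finished(value);
            }
        }
        RunOutcome::InstructionBudgetExceeded
    }
}
```

With this shape, a program such as `vec![Op::Jump(0)]` never finishes on its own, yet `run_with_budget` still returns to the caller. The same `step` method could also be wrapped in an `Iterator` if the host prefers to interleave execution with its own event loop.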
Did you find any way to do it?
Did you try that?
@NoelJacob I did try it, and it worked, however it is complicated. I cannot disclose all the details, since it was for work.
Summary
As far as I have seen, RustPython is not yet suitable for "safe" embedding, meaning that executed Python code can block or hurt the caller code, because:
Detailed Explanation
I wish to use RustPython as a scripting language within a game engine, running third-party user code. A requirement for me is that this code is run in a safe way. As far as I have seen (but I might have missed some elements), it is currently not the case:
ExecutingFrame in its execution loop. Maybe the ExecutionResult type could be extended with an InstructionBudgetExceeded variant or similar (which could later be expanded to support step-by-step interactive debugging).
os module to WASM, maybe a feature flag would be a good addition. Similarly, it should be possible to not enable some Windows-specific code and to fully disable or control IO (including network) and side-effect functions (such as delay) regardless of the target platform.
Drawbacks, Rationale, and Alternatives
The rationale is to use RustPython as an embedding language within larger software, such as game engines. In these, the software must fully control the scripting environment's limits.
The main drawback is increased code complexity within RustPython, but I believe it can be done cleanly, with some work of course. The split of the Std library (#3102) was already a step in the direction of embedding.
The alternatives are to not implement this feature, or do it in a fork. A similar issue exists (#3090), but it is more of a question, so I thought an RFC-style new issue is better.
Unresolved Questions
There are obviously quite a few design questions, but I guess one should first agree on whether this overall feature makes sense for the project; the design can then be worked out. A unified way to control embedding would probably be elegant.
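As a starting point for that discussion, a unified control surface might look like a single policy value handed to the interpreter at construction time. The sketch below is purely illustrative: `SandboxPolicy` and all of its fields and methods are hypothetical names, not an existing RustPython API, and it only models the decisions (import allow-list, IO switch, optional limits), not their enforcement.

```rust
use std::collections::HashSet;

/// Hypothetical bundle of sandbox limits (an assumption, not a RustPython type).
#[derive(Debug, Clone)]
pub struct SandboxPolicy {
    /// Top-level module names the guest code may import.
    pub allowed_modules: HashSet<String>,
    /// Whether filesystem and network access is permitted at all.
    pub allow_io: bool,
    /// Maximum number of instructions per run, if limited.
    pub instruction_budget: Option<u64>,
    /// Maximum number of bytes the guest may allocate, if limited.
    pub memory_limit_bytes: Option<usize>,
}

impl SandboxPolicy {
    /// Deny-by-default policy: no imports, no IO, no limits configured yet.
    pub fn locked_down() -> Self {
        Self {
            allowed_modules: HashSet::new(),
            allow_io: false,
            instruction_budget: None,
            memory_limit_bytes: None,
        }
    }

    /// Builder-style helper that adds one module to the allow-list.
    pub fn allow_module(mut self, name: &str) -> Self {
        self.allowed_modules.insert(name.to_string());
        self
    }

    /// Decide an import by its top-level package, so "os.path" is judged as "os".
    pub fn may_import(&self, dotted_name: &str) -> bool {
        let top_level = dotted_name.split('.').next().unwrap_or(dotted_name);
        self.allowed_modules.contains(top_level)
    }
}
```

Starting from a deny-by-default value and opting in to individual modules keeps the host in charge of what the guest can reach, which matches the rationale given above.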