Skip to content

CPython compatible Drop for PyObjectRc #2128

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 10 commits into from
Sep 18, 2020

Conversation

youknowone
Copy link
Member

No description provided.

@youknowone youknowone force-pushed the objectarc branch 5 times, most recently from 3bc9b78 to 6b479e1 Compare August 19, 2020 01:42
@youknowone youknowone marked this pull request as ready for review August 19, 2020 01:43
@BenLewis-Seequent
Copy link

I'm wondering if it would be better/simpler to define PyObjectRc as something like:

struct PyObjectRc {
    inner: Arc<...>
}

impl Drop for PyObjectRc {
    fn drop(&mut self) {
        if self.inner.strong_count() == 1 {
            // call __del__
        }
        // drop Arc
    }
}

The only problem i see with this is that there is a race condition if two references are dropped at the same time, but I think a lock(possibly a global lock) just needs to be added to ensure the drop of the Arc and checking of strong count are atomic. But we only need to do that if the object has a del method in the first place.

@coolreader18
Copy link
Member

Yeah, that's what I was thinking -- we could put a ManuallyDrop around the inner arc, and either ManuallyDrop::drop() or ManuallyDrop::take() the field.

@youknowone
Copy link
Member Author

@Skinny121 thanks, idk why I didn't think there must be public interface for that.
@coolreader18 I have no idea how manually drop help it. could you tell me more about the idea? as i know, ManuallyDrop is good tool to reorder the fields drop but container and fields are always drop as reverse order of construction. So container first and fields later - it seems nowhere to use ManuallyDrop between wrapper and Arc. Probably you are thinking about somewhere else?

@coolreader18
Copy link
Member

The key would be for ManuallyDrop::take(); I don't think it's really feasible to use the thread local VM and call __del__ directly inside drop(), since a PyObject could be stored somewhere else and dropped while the VM isn't active; I think we should try to do something like have a queue of objects that need to be destructed, and check it periodically like we do check_signals

@coolreader18
Copy link
Member

coolreader18 commented Aug 23, 2020

And even if we do go with the thread local VM, it's tricky to call __del__ while you still have to clone the inner field. If you take() it and pass it to __del__ then the reference count of self while inside the destructor will be 1.

Edit: Nevermind, that would probably lead to issues with infinitely recurring into a destructor

@youknowone
Copy link
Member Author

youknowone commented Aug 24, 2020

Ok, I got what you meant about ManuallyDrop.

About __del__ timing, it is easy to avoid by one of below:

  • Easy way: increase count, run __del__, decrease count
  • Harder way: Create arc with count 2 and drop when it goes to count 2. decrease 1 more and drop arc

The thread_local problem:
I agree current way is not ideal. I didn't look up exact point to safely set the shared value. I think we at least need a thread local variable regardless it is a vm, an executing frame or a vec (for sweeping). I set it on frame because It was the safetest place for every choice, but it can be improved in future if the value is vm. things will be easier when a thread is always bound by a vm.

I don't know well about vm, gc and their performance that much.
But I am thinking about PyPy's pros and cons by its design.

I believe there is an opportunity of better performance for collecting and release.

I am thinking of making RustPython as a drop-in-replacement of CPython.
For this purpose, it requires a sort of 'hard compatibility' of CPython.
I don't think it is specific future of RustPython, but at lease we can consider a mode like that. Hard-compatibilty will give potential opportunity between CPython and PyPy.

One of the common problem is resource releasing timing. Some (wrong) python code is depending of CPython's drop timing, which only can simulated by calling __del__ at exact timing to go to rc=0. So, I want to have this one as a possible memory management model, unless we can prove RustPython never can implements c/c++ extension interface.

@youknowone youknowone changed the title __del__ with PyObjectArc CPython compatible Drop for PyObjectRc Sep 12, 2020
@youknowone youknowone force-pushed the objectarc branch 2 times, most recently from ff0b199 to a3d1d18 Compare September 12, 2020 23:23
@youknowone youknowone force-pushed the objectarc branch 4 times, most recently from 93e8cfd to 06bc718 Compare September 14, 2020 00:41
@coolreader18
Copy link
Member

Ah, it's an issue with stuff being dropped while in vm.initialize(), I'll fix that

@coolreader18 coolreader18 self-requested a review September 16, 2020 15:32
@coolreader18
Copy link
Member

@Skinny121 what do you think?

vm/src/import.rs Outdated
@@ -14,45 +14,57 @@ use crate::vm::{InitParameter, VirtualMachine};
#[cfg(feature = "rustpython-compiler")]
use rustpython_compiler::compile;

pub fn init_importlib(vm: &mut VirtualMachine, initialize_parameter: InitParameter) -> PyResult {
pub fn init_importlib(
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is receiving vm as an argument but requries to call enter_vm. this is only called in initialization step, so it will not make any problem. but this is making a bit uncomfortable feeling.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should I make init_importlib pub(crate)?

Copy link
Member

@coolreader18 coolreader18 Sep 17, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And yeah, that's definitely a tricky part, I original wanted to have init_importlib take a &vm, but then it wouldn't be able to set vm.import_func

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i didn't think about pub(crate) at that time, but it looks like to be. Even if there is a need to manally init importlib, it must be a method of vm or interpreter rather than this kind of function. I think we'd better to list up pub apis. Sometimes internal stuffs are marked as pub just to be shared in the crate.

@@ -1537,29 +1577,76 @@ impl<'vm> Drop for ReprGuard<'vm> {
}
}

pub struct Interpreter {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does Interpreter matche to vm 1-to-1 relationship or is there any other usage of vm?
If it is 1-to-1, I don't feel like we really need a new type for that; we only need the new methods.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The separate struct is so that you can't access the VM except through putting it in the VM stack, otherwise, it wouldn't be guaranteed that the VM would be in stack while using it

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In that case, will there be any difference if we put interpreters in VM stack instead of VMs?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I initially did that, but then it was tricky passing an Interpreter to initialize(&mut interp.vm, &interp). Definitely could make it happen though, especially if we're making init_importlib crate-private

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah, got it. interface visibility is a good reason.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This actually feels like a 1-to-n relationship as each thread has it's own vm.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it seems each thread has its own vm and also its own interpreter. so still vm and interpreter is 1-to-1.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was originally thinking that Interpreter would cover all vms across threads for that interpreter instance; like it could have a thread-local value to hold the vm, and it could be Send+Sync and whichever thread you called it from it would get that thread's vm, but I'm not sure how actually useful that would be.

@BenLewis-Seequent
Copy link

Looks pretty good. PyObjectRc's looks like they can still escape crate::vm::thread::enter_vm though, which means that this doesn't guarantee that a vm is always available in the drop. The way to ensure this is to add a lifetime parameter to PyObjectRc, though I'm unsure how feasible that would actually be though.

@coolreader18
Copy link
Member

I added another commit to rework vm initialization a bit; not sure if it's totally on-topic but I think it makes more sense with how the Interpreter struct works now, and I think it's good to have the InitParameter be more explicit, since either way having an expectation for how it should work by default isn't great. If you expect it by default not to have access to external imports but it does, that could be a security risk, and if you expect it by default to have access to external imports but it doesn't, that might be confusing why something like import os fails.

@coolreader18
Copy link
Member

Should we merge this?

@youknowone
Copy link
Member Author

If the design is good enough, let's go and fix remained problems later. this pr is growing bigger. bigger patch will cause more frequent conflicts for other PRs

@youknowone youknowone merged commit a0d7294 into RustPython:master Sep 18, 2020
@youknowone youknowone deleted the objectarc branch September 18, 2020 05:38
@youknowone
Copy link
Member Author

youknowone commented Sep 18, 2020

I think the comments here is related to #1940

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants