-
Notifications
You must be signed in to change notification settings - Fork 747
Enable pythonnet to survive in the Unity3d editor environment. #752
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Enable pythonnet to survive in the Unity3d editor environment. #752
Conversation
The Unity editor unloads the C# domain and creates a new domain every time it recompiles C# sources. This caused two crashes. 1. After a domain unload, python is now pointing to a bunch of objects whose C# side is now garbage. Solution: Shutdown() the engine on domain reload, which calls Py_Finalize. 2. After a domain unload, Py_Finalize, and a new Py_Intialize, python still keeps pointers in gc for any python objects that were leaked. And pythonnet leaks. This means that python's gc will be calling tp_traverse, tp_clear, and tp_is_gc on types that are implemented in C#. This crashes because the implementation was freed. Solution: implement those calls in code that is *not* released on domain unload. Side effect: we leak a page on every domain unload. I suspect but didn't test that python gc performance will be slightly faster. Changes required to implement and test: 3. Use python's platform package to determine what we're running on, so we can use the right mmap/mprotect or VirtualAlloc/VirtualProtect routines, the right flags for mmap, and (theoretically) the right native code. Side effect: to port to a new platform requires updating some additional code now. 4. New unit tests. (a) A new test for domain reload. It doesn't run under .NET Core because you can't create and unload a domain in .NET Core. (b) New unit tests to help us find where to add support for new platforms.
Now we see if I flubbed the merge... |
@benoithudson just to be clear this enables pythonnet survival in Unity3D editor even when AppDomains are not available on that platform? Is AppDomain only used for testing? |
Codecov Report
@@ Coverage Diff @@
## master #752 +/- ##
==========================================
- Coverage 77.18% 77.12% -0.06%
==========================================
Files 62 62
Lines 5596 5688 +92
Branches 897 903 +6
==========================================
+ Hits 4319 4387 +68
- Misses 983 1004 +21
- Partials 294 297 +3
Continue to review full report at Codecov.
|
Exactly the other way: Unity uses app domains, loads and unloads them frequently. And that caused a crash. CI runs tests in a setting without app domains (and also with app domains). So the patch works either way. You don't need domains for the fix I made; only the test of whether the system survives domain unload needs domains. |
@denfromufa Good for merge? |
I have not finished reviewing yet
…On Tue, Oct 16, 2018, 4:10 AM Benedikt Reinartz ***@***.***> wrote:
@denfromufa <https://github.com/denfromufa> Good for merge?
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#752 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AHgZ5aoZMpLRzbYAcj8E4Uh9MapmIJFIks5ulaJvgaJpZM4XaILg>
.
|
@benoithudson I'm mostly done reviewing the code, this is monumental achievement! One thing that bothers me is if the bytes/assembly code can be replaced with a call to a compiled C code as part of pythonnet? @filmor @amos402 what do you think? the native dll can even be embedded as a resource: |
The beauty of this solution is that it does not need all of the boilerplate that you have when using a proper DLL or .so. The code for the two functions is OS independent, it only depends on the CPU architecture. I'm okay with what is here, I would just like us to look into alternative solutions that don't leak the memory page (or at the very least only leak it once). |
It seems it's complex to do that. amos402/cpython@3fb76b5 |
@benoithudson Can you check whether @amos402 solution works for you as well before we merge? |
Thanks! @amos402 should get credit for debugging this crash to the fact that python keeps its gc list around. The main reason I didn't go with a .so/.dll/.dylib that we could dlopen is from failing to figure out the build system on the various platforms. I was also initially trying to stick to compile once, run anywhere so we could deploy a single .net dll. I tried and failed to figure out how to figure out how to allocate the page once, stuff the address somewhere, and reuse it later. By design you can't really do that in .net ; and python also doesn't see any reason to make that easy. (In py27 you can via Py_SetProgramHome and friends, but that's gross, doesn't work if anyone else is pulling the same hack, and fails in py3x where they copy the strings.) @amos402's trick of modifying python itself looks great if it can pass the cpython test suite and get past review on cpython. I love the C# monkey-patch for backporting the fix into already-deployed cpython. I'm just worried about unintended consequences. |
I've triggered the build for #754 again, it seems to not be quite sound with regard to memory handling (though that doesn't necessarily have to be due to this PR). I'm currently leaning towards merging this one (#752) first such that we have a chance of preparing a 2.4.0 release before end of the year and afterwards look into #754 again. @denfromufa @benoithudson @amos402 do you agree? |
@filmor : I'm in favour of your plan. I won't be sad if my code gets buried as soon as we have a better alternative. |
Yes. I still need more tests for fixing that PR. |
👍 |
Thanks a lot for your and your colleagues efforts :) |
The Unity editor unloads the C# domain and creates a new domain every time it recompiles C# sources.
This caused two crashes.
After a domain unload, python is now pointing to a bunch of objects whose C# side is now garbage. Solution: Shutdown() the engine on domain reload, which calls Py_Finalize.
After a domain unload, Py_Finalize, and a new Py_Intialize, python still keeps pointers in gc for any python objects that were leaked. And pythonnet leaks. This means that python's gc will be calling tp_traverse, tp_clear, and tp_is_gc on types that are implemented in C#. This crashes because the implementation was freed. Solution: implement those calls in code that is not released on domain unload. Side effect: we leak a page on every domain unload. I suspect but didn't test that python gc performance will be slightly faster.
Changes required to implement and test:
3. Use python's platform package to determine what we're running on, so we can use the right mmap/mprotect or VirtualAlloc/VirtualProtect routines, the right flags for mmap, and (theoretically) the right native code. Side effect: to port to a new platform requires updating some additional code now.
Does this close any currently open issues?
#714 ; see also #745 and #751.