Skip to content

Enable pythonnet to survive in the Unity3d editor environment. #752

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Oct 18, 2018

Conversation

benoithudson
Copy link
Contributor

The Unity editor unloads the C# domain and creates a new domain every time it recompiles C# sources.

This caused two crashes.

  1. After a domain unload, python is now pointing to a bunch of objects whose C# side is now garbage. Solution: Shutdown() the engine on domain reload, which calls Py_Finalize.

  2. After a domain unload, Py_Finalize, and a new Py_Intialize, python still keeps pointers in gc for any python objects that were leaked. And pythonnet leaks. This means that python's gc will be calling tp_traverse, tp_clear, and tp_is_gc on types that are implemented in C#. This crashes because the implementation was freed. Solution: implement those calls in code that is not released on domain unload. Side effect: we leak a page on every domain unload. I suspect but didn't test that python gc performance will be slightly faster.

Changes required to implement and test:
3. Use python's platform package to determine what we're running on, so we can use the right mmap/mprotect or VirtualAlloc/VirtualProtect routines, the right flags for mmap, and (theoretically) the right native code. Side effect: to port to a new platform requires updating some additional code now.

  1. New unit tests. (a) A new test for domain reload. It doesn't run under .NET Core because you can't create and unload a domain in .NET Core. (b) New unit tests to help us find where to add support for new platforms.

Does this close any currently open issues?

#714 ; see also #745 and #751.

The Unity editor unloads the C# domain and creates a new domain every time it recompiles C# sources.

This caused two crashes.
1. After a domain unload, python is now pointing to a bunch of objects whose C# side is now garbage. Solution: Shutdown() the engine on domain reload, which calls Py_Finalize.

2. After a domain unload, Py_Finalize, and a new Py_Intialize, python still keeps pointers in gc for any python objects that were leaked. And pythonnet leaks. This means that python's gc will be calling tp_traverse, tp_clear, and tp_is_gc on types that are implemented in C#. This crashes because the implementation was freed. Solution: implement those calls in code that is *not* released on domain unload. Side effect: we leak a page on every domain unload. I suspect but didn't test that python gc performance will be slightly faster.

Changes required to implement and test:
3. Use python's platform package to determine what we're running on, so we can use the right mmap/mprotect or VirtualAlloc/VirtualProtect routines, the right flags for mmap, and (theoretically) the right native code. Side effect: to port to a new platform requires updating some additional code now.

4. New unit tests. (a) A new test for domain reload. It doesn't run under .NET Core because you can't create and unload a domain in .NET Core. (b) New unit tests to help us find where to add support for new platforms.
@benoithudson
Copy link
Contributor Author

Now we see if I flubbed the merge...

@den-run-ai
Copy link
Contributor

den-run-ai commented Oct 12, 2018

@benoithudson just to be clear this enables pythonnet survival in Unity3D editor even when AppDomains are not available on that platform? Is AppDomain only used for testing?

@codecov
Copy link

codecov bot commented Oct 12, 2018

Codecov Report

Merging #752 into master will decrease coverage by 0.05%.
The diff coverage is 83.96%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #752      +/-   ##
==========================================
- Coverage   77.18%   77.12%   -0.06%     
==========================================
  Files          62       62              
  Lines        5596     5688      +92     
  Branches      897      903       +6     
==========================================
+ Hits         4319     4387      +68     
- Misses        983     1004      +21     
- Partials      294      297       +3
Flag Coverage Δ
#setup_linux 69.42% <ø> (ø) ⬆️
#setup_windows 76.31% <83.96%> (-0.04%) ⬇️
Impacted Files Coverage Δ
src/runtime/extensiontype.cs 79.16% <ø> (+5.09%) ⬆️
src/runtime/classbase.cs 81.37% <ø> (+0.42%) ⬆️
src/runtime/pythonengine.cs 78.69% <100%> (+0.37%) ⬆️
src/runtime/typemanager.cs 82.65% <79.68%> (-1.39%) ⬇️
src/runtime/runtime.cs 91.15% <89.47%> (-0.25%) ⬇️
src/runtime/metatype.cs 64.65% <0%> (-6.9%) ⬇️
src/runtime/nativecall.cs 95.55% <0%> (-4.45%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 05a1451...072022b. Read the comment docs.

@benoithudson
Copy link
Contributor Author

benoithudson commented Oct 15, 2018

@benoithudson just to be clear this enables pythonnet survival in Unity3D editor even when AppDomains are not available on that platform? Is AppDomain only used for testing?

Exactly the other way: Unity uses app domains, loads and unloads them frequently. And that caused a crash.

CI runs tests in a setting without app domains (and also with app domains). So the patch works either way.

You don't need domains for the fix I made; only the test of whether the system survives domain unload needs domains.

@filmor
Copy link
Member

filmor commented Oct 16, 2018

@denfromufa Good for merge?

@den-run-ai
Copy link
Contributor

den-run-ai commented Oct 16, 2018 via email

@den-run-ai
Copy link
Contributor

@benoithudson I'm mostly done reviewing the code, this is monumental achievement!

One thing that bothers me is if the bytes/assembly code can be replaced with a call to a compiled C code as part of pythonnet? @filmor @amos402 what do you think?

the native dll can even be embedded as a resource:

https://stackoverflow.com/a/768429/2230844

@filmor
Copy link
Member

filmor commented Oct 18, 2018

The beauty of this solution is that it does not need all of the boilerplate that you have when using a proper DLL or .so. The code for the two functions is OS independent, it only depends on the CPU architecture. I'm okay with what is here, I would just like us to look into alternative solutions that don't leak the memory page (or at the very least only leak it once).

@filmor filmor added this to the 2.4.0 milestone Oct 18, 2018
@filmor filmor added the next label Oct 18, 2018
@amos402
Copy link
Member

amos402 commented Oct 18, 2018

It seems it's complex to do that.
To prevent Unity crash, only just need to reset the gc.

amos402/cpython@3fb76b5
I made a modify in cpython before like above, it works well. I already use this for months.
I just made up a pure C# version by trick for reseting th GC. #754
I still havn't test it fully for this new version.

@filmor
Copy link
Member

filmor commented Oct 18, 2018

@benoithudson Can you check whether @amos402 solution works for you as well before we merge?

@benoithudson
Copy link
Contributor Author

Thanks! @amos402 should get credit for debugging this crash to the fact that python keeps its gc list around.

The main reason I didn't go with a .so/.dll/.dylib that we could dlopen is from failing to figure out the build system on the various platforms. I was also initially trying to stick to compile once, run anywhere so we could deploy a single .net dll.

I tried and failed to figure out how to figure out how to allocate the page once, stuff the address somewhere, and reuse it later. By design you can't really do that in .net ; and python also doesn't see any reason to make that easy. (In py27 you can via Py_SetProgramHome and friends, but that's gross, doesn't work if anyone else is pulling the same hack, and fails in py3x where they copy the strings.)

@amos402's trick of modifying python itself looks great if it can pass the cpython test suite and get past review on cpython. I love the C# monkey-patch for backporting the fix into already-deployed cpython. I'm just worried about unintended consequences.

@filmor
Copy link
Member

filmor commented Oct 18, 2018

I've triggered the build for #754 again, it seems to not be quite sound with regard to memory handling (though that doesn't necessarily have to be due to this PR). I'm currently leaning towards merging this one (#752) first such that we have a chance of preparing a 2.4.0 release before end of the year and afterwards look into #754 again.

@denfromufa @benoithudson @amos402 do you agree?

@benoithudson
Copy link
Contributor Author

@filmor : I'm in favour of your plan. I won't be sad if my code gets buried as soon as we have a better alternative.

@amos402
Copy link
Member

amos402 commented Oct 18, 2018

I've triggered the build for #754 again, it seems to not be quite sound with regard to memory handling (though that doesn't necessarily have to be due to this PR). I'm currently leaning towards merging this one (#752) first such that we have a chance of preparing a 2.4.0 release before end of the year and afterwards look into #754 again.

@denfromufa @benoithudson @amos402 do you agree?

Yes. I still need more tests for fixing that PR.

@den-run-ai
Copy link
Contributor

👍

@filmor filmor merged commit 784190a into pythonnet:master Oct 18, 2018
@filmor
Copy link
Member

filmor commented Oct 18, 2018

Thanks a lot for your and your colleagues efforts :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants