Skip to content

Assemblymanager thread safety #277

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 14 commits into from
Nov 17, 2016

Conversation

ArvidJB
Copy link
Contributor

@ArvidJB ArvidJB commented Oct 27, 2016

We observed a spurious crash when a spawned .NET thread loaded some additional assemblies triggered the AppDomain.AssemblyLoad event handler. This modified the namespaces Dictionary and led to an exception

System.InvalidOperationException: Collection was modified; enumeration operation may not execute.
at System.ThrowHelper.ThrowInvalidOperationException(ExceptionResource resource)
at System.Collections.Generic.Dictionary`2.KeyCollection.Enumerator.MoveNext()
at Python.Runtime.AssemblyManager.GetNames(String nsname) in C:\git\pythonnet\src\runtime\assemblymanager.cs:line 442

when we called import clr again from the Python thread. I was able to write a test which reproduces this (most of the time).
As a fix I propose to make namespaces a ConcurrentDictionary. The same issue should also apply to the assemblies list. Unfortunately there is no ConcurrentList, so I added a lock. All the operations under the assemblies lock seem simple enough to not cause any deadlocks by triggering the same codepath on a different thread.

@den-run-ai
Copy link
Contributor

This is failing on Travis CI (Linux) for 2 Python versions on your added assert test.

Concurrent list is not possible by design, but can't we use anything else from System.Collections.Concurrent or at least subclass list with locked methods?

http://stackoverflow.com/a/9995339/2230844

This is to prevent manual lock maintenance, especially when people would not be aware of multi-threading support.

@ArvidJB
Copy link
Contributor Author

ArvidJB commented Oct 27, 2016

Thanks for the suggestions: I will fix the test to be more predictable and not have any collateral impact on other tests and create a little wrapper for List<>.

@den-run-ai
Copy link
Contributor

@tonyroberts @filmor @vmuriart this pull request looks good to me. please review!

@ArvidJB is the list with locks your own implementation or you copied it from somewhere else?

@ArvidJB
Copy link
Contributor Author

ArvidJB commented Oct 27, 2016

@denfromufa that List is my own implementation (although I did read through the stackoverflow answers and some related discussions to come up with it).

@den-run-ai
Copy link
Contributor

then is it possible to avoid using SupportsRecursion? It is not recommended
according to msdn:

https://msdn.microsoft.com/en-us/library/system.threading.readerwriterlockslim.aspx

On Thu, Oct 27, 2016 at 12:46 PM, ArvidJB notifications@github.com wrote:

@denfromufa https://github.com/denfromufa that List is my own
implementation (although I did read through the stackoverflow answers and
some related discussions to come up with it).


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#277 (comment),
or mute the thread
https://github.com/notifications/unsubscribe-auth/AHgZ5Y-9RYloM5ha_9O-SDurCxFinP9lks5q4OOFgaJpZM4Kh7Fs
.

@ArvidJB
Copy link
Contributor Author

ArvidJB commented Oct 28, 2016

@denfromufa good point, I removed SupportsRecursion. I also added a read lock around _list.Count which I had missed earlier.

@den-run-ai
Copy link
Contributor

@ArvidJB can you add more imports in the test? "import clr" is very special compared to imports of Python modules and .NET assemblies.

@ArvidJB
Copy link
Contributor Author

ArvidJB commented Oct 28, 2016

@denfromufa sure, added some more imports. I only added "import clr" because that's what caused our problems with its call to GetNames(), but the other code paths potentially could also have race conditions.

Copy link
Contributor

@den-run-ai den-run-ai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

now this fails on py2.7 on travis ci

@@ -66,7 +66,8 @@ def testModuleInterface(self):
self.assertEquals(type(System.__dict__), type({}))
self.assertEquals(System.__name__, 'System')
# the filename can be any module from the System namespace (eg System.Data.dll or System.dll)
self.assertTrue(fnmatch(System.__file__, "*System*.dll"))
self.assertTrue(fnmatch(System.__file__, "*System*.dll"),
"unexpected System.__file__" + System.__file__)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ArvidJB can you add a comment why this was added in test?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The System namespace is spread out across a couple of DLLs: mscorlib.dll, System.dll, and all the other System.*.dlls. In ModuleObject we iterate through all the known assemblies for a namespace and pick the last one for the filename. The iteration order would be the on the hash of the assembly, so it should have always been unpredictable whether you get mscorlib.dll or one of the others.
Let me change the test such that mscorlib.dll is also accepted.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PS: I added the error message because I could not reproduce this on my machine and wanted to know what value it got in the integration tests.

@den-run-ai den-run-ai closed this Oct 30, 2016
@den-run-ai den-run-ai reopened this Oct 30, 2016
@den-run-ai
Copy link
Contributor

@tonyroberts @filmor @vmuriart this pull request looks good to me. please review, at least one of you!

@ArvidJB thanks for putting this together!

@ArvidJB
Copy link
Contributor Author

ArvidJB commented Nov 1, 2016

@denfromufa thanks for reviewing!
@tonyroberts @filmor @vmuriart, is this good to go in?

@vmuriart
Copy link
Contributor

vmuriart commented Nov 1, 2016

The last commit confused me, did we merge master into the pr before the pr is merged into master? Otherwise looks fine to me.

@den-run-ai
Copy link
Contributor

I updated this branch with the master to re-run the tests.

On Tue, Nov 1, 2016, 6:09 PM Vik notifications@github.com wrote:

The last commit confused me, did we merge master into the pr before the pr
is merged into master? Otherwise looks fine to me.


You are receiving this because you were mentioned.

Reply to this email directly, view it on GitHub
#277 (comment),
or mute the thread
https://github.com/notifications/unsubscribe-auth/AHgZ5VN67rCl70gvvfaCBGPW92uYc9MZks5q58aogaJpZM4Kh7Fs
.

@den-run-ai
Copy link
Contributor

@ArvidJB I will wait couple days during release of pythonnet v2.2.0 before merging this. This also gives some time to @filmor @tonyroberts in case they would like to review this. I also need to check embedding tests, which are currently not setup in CI.

@filmor
Copy link
Member

filmor commented Nov 8, 2016

I did review this, didn't see any glaring errors. But this threading stuff is hard :)

How about using SynchronizedCollection instead of rolling your own implementation?

@ArvidJB
Copy link
Contributor Author

ArvidJB commented Nov 8, 2016

@filmor I would have preferred not to write this myself but I did not find a good match in the .NET framework. SynchronizedCollection is a weird class, it's in System.ServiceModel for some reason. And it does not provide thread-safe enumeration:
https://referencesource.microsoft.com/#System.ServiceModel/System/ServiceModel/SynchronizedCollection.cs,3a568f4594ad20f5,references
It only grabs the lock when you get the enumerator but when you iterate you are still stuck with the iterator of the underlying List which can be modified concurrently.

@filmor filmor merged commit 3b9e18a into pythonnet:master Nov 17, 2016
@ArvidJB ArvidJB deleted the assemblymanager_thread_safety branch June 25, 2018 20:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants