Skip to content

Modernize import hook #1369

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 21 commits into from
Jun 18, 2021
Merged

Conversation

BadSingleton
Copy link
Contributor

What does this implement/fix? Explain your changes.

This replaces the current import hook replacement with a MetaPathFinder, letting python handle non-CLR module import.
Does this close any currently open issues?

#727 #1245 #547 #111 and probably others.
Any other comments?

This fix also helps working around a mono runtime behaviour where a mono owns all threads that execute .NET code.
Checklist

Check all those that are applicable and complete.

Make sure to include one or more tests for your change
If an enhancement PR, please create docs and at best an example
Add yourself to AUTHORS
Updated the CHANGELOG

@BadSingleton BadSingleton force-pushed the modernize-import-hook branch 2 times, most recently from 200ba3d to f502b8e Compare January 25, 2021 19:50
var finder_inst = Runtime.PyDict_GetItemString(mod_dict, "finder_inst").DangerousGetAddressOrNull();
Runtime.XIncref(finder);
var metapath = Runtime.PySys_GetObject("meta_path");
Runtime.PyList_Append(metapath, finder);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These four lines look fishy: you don't use finder_inst and you are doing an incref on finder for no obvious reason.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh.. that should be finder_inst, not finder

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW, it makes sense to add an overload for PyList_Append with proper reference types, then avoid manual refcounting.

Same for PySys_GetObject - you can rename the original one to Dangerous_PySys_GetObject to avoid refactoring other call sites.

This is a NIT though. Should come up with a better plan to switch everything.

BorrowedReference fromList = default;
var fromlist = false;
if (num_args >= 4)
var newset = Runtime.PySet_New(new BorrowedReference(IntPtr.Zero));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You Dispose in a finally; should this be a using var ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NIT: I'd replace new BorrowedReference(IntPtr.Zero) with either BorrowedReference.Null or simply default.

{
if (IsLoadAll(fromList))
var nsSet = Runtime.PyDict_GetItemString(new BorrowedReference(root.dict), availableNsKey);
if (!nsSet.IsNull || nsSet.DangerousGetAddress() != Runtime.PyNone)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should BorrowedReference have an IsNullOrNone property? It seems likely to be a common idiom.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd keep separate IsNull and IsNone. Thought there's already IsNone.

@BadSingleton BadSingleton force-pushed the modernize-import-hook branch from 0f1219b to f9a3974 Compare June 1, 2021 15:28
@BadSingleton BadSingleton force-pushed the modernize-import-hook branch from 2860632 to 970a189 Compare June 2, 2021 15:08
Don't piggyback on AssemblyManager's AssemblyLoadHandler method because
it may be called from other threads, leading to deadlocks if it is
called while Python code is executing
Copy link
Member

@lostmsu lostmsu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only reviewed in regards to usage of references and exceptions.

@filmor should still sign off on functionality.

@@ -9,6 +9,7 @@ namespace Python.Runtime
{
readonly IntPtr pointer;
public bool IsNull => this.pointer == IntPtr.Zero;
public bool IsNone => this.pointer == Runtime.PyNone;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should not be directly on BorrowedReference. None is a concept of the currently running runtime, and BorrowedReference might be a valid non-None reference from the previous runtime incarnation.

storage.GetValue("py_clr_module", out py_clr_module);
var rootHandle = storage.GetValue<IntPtr>("root");
root = (CLRModule)ManagedType.GetManagedObject(rootHandle);
BorrowedReference dict = Runtime.PyImport_GetModuleDict();
Runtime.PyDict_SetItemString(dict.DangerousGetAddress(), "clr", py_clr_module);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd add a property ClrModuleReference that returns a BorrowedReference of py_clr_module, and then used it in PyDict_SetItemString overload, that works with BorrowedReferences to avoid using Dangerous* methods.

var mod_dict = Runtime.PyModule_GetDict(import_hook_module);
Runtime.XIncref(mod_dict.DangerousGetAddress());
Runtime.PyTuple_SetItem(args.DangerousGetAddress(), 1, mod_dict.DangerousGetAddress());
Runtime.PyObject_Call(exec.DangerousGetAddress(), args.DangerousGetAddress(), IntPtr.Zero);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is better to make an overload with the new reference types when possible without rewriting existing code, than to call Dangerous* methods. See for example

internal static IntPtr PyObject_Type(IntPtr op)

One can also be made for PyTuple_SetItem using the new StolenReference.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This applies to a few Runtime methods across the PR. Basically anything where you can add a new overload.

/// </summary>
static void SetupNamespaceTracking()
{
var newset = Runtime.PySet_New(default);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

using var newset = ... would remove the need for try ... finally ...

var pyname = Runtime.PyString_FromString(name);
try
{
if (Runtime.PyList_Append(new BorrowedReference(__all__), pyname) != 0)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Frankly, I screwed up PyList_Append. Its only overload schizophrenically mixes references and raw pointers. Can you change it to either have two overloads with only references and with only pointers, or just one that takes only references?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are only a few calls to PyList_Append throughout the code base, I changed all of them to use only references.

/// This is needed because the import mechanics need
/// to set a few attributes
/// </summary>
[ForbidPythonThreads]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you have to forbid threads here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because there's a call to Runtime.PyDict_SetItem. I might be missing something and this is always called from CPython and the GIL is guaranteed to be held.

Comment on lines 600 to 606
ModuleObject mod = null;
using (var modname = spec.GetAttr("name"))
{
mod = ImportHook.Import(modname.ToString());
}
return mod;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NIT:

using var  modname = ...;
return ...;

@@ -1944,6 +1944,11 @@ internal static string PyModule_GetFilename(IntPtr module)


internal static IntPtr PyImport_Import(IntPtr name) => Delegates.PyImport_Import(name);
internal static int PyModule_AddObject(BorrowedReference module, string name, BorrowedReference stolenObject)
Copy link
Member

@lostmsu lostmsu Jun 3, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like stolenObject parameter should be of type StolenReference here and below (present in the latest master).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

StolenReference here and below (present in the latest master)

🎉

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Although I'm reviewing the semantics, and this function only steals on success, so by using StolenReference, we leak on failure.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dang you are right. Why did they not make it consistent? PyTuple_SetItem for example does steal even when there's an error. ugh...

Maybe we should make a wrapper, that always steals?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that we should not use BorrowedReference for the last parameter here. It makes false suggestion, that signature is valid, and borrowed reference is always enough when it is really not. I'd keep it IntPtr for now, and explain the behavior in the parameter doc.

Or alternatively, make a wrapper, that takes either BorrowedReference or StolenReference and handles refcounting internally, and keep the real PInvoke private.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm leaning towards the IntPtr parameter to be as "consistant" as possible with the CPython API.

@BadSingleton BadSingleton requested a review from filmor June 8, 2021 19:51
@BadSingleton BadSingleton requested a review from lostmsu June 8, 2021 20:05
// pyHandle as is, do not convert.
if (value is ModuleObject modobj)
{
return modobj.pyHandle;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You need to return a new reference here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@BadSingleton what about this one?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My bad missed that one.

@@ -1944,6 +1944,11 @@ internal static string PyModule_GetFilename(IntPtr module)


internal static IntPtr PyImport_Import(IntPtr name) => Delegates.PyImport_Import(name);
internal static int PyModule_AddObject(BorrowedReference module, string name, BorrowedReference stolenObject)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that we should not use BorrowedReference for the last parameter here. It makes false suggestion, that signature is valid, and borrowed reference is always enough when it is really not. I'd keep it IntPtr for now, and explain the behavior in the parameter doc.

Or alternatively, make a wrapper, that takes either BorrowedReference or StolenReference and handles refcounting internally, and keep the real PInvoke private.

Copy link
Member

@lostmsu lostmsu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@filmor I'm good with the code. Please see if you are OK with the approach.

@filmor
Copy link
Member

filmor commented Jun 16, 2021

I'll review this today.

@hiaselhans
Copy link

great news! :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants