Skip to content

gh-129250: allow pickle instances of generic classes #129446

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 21 commits into
base: main
Choose a base branch
from

Conversation

tom-pytel
Copy link
Contributor

@tom-pytel tom-pytel commented Jan 29, 2025

This is a maybe PR, not sure if is not overkill for original issue #129250 which may be too niche?

Fixes OP issue by allowing pickling of anonymous type parameters - those which do not exist at module scope but are defined through generic class and function syntax class cls[T]: ... or def func[T](): .... This is done by adding a weakref owner variable which is set on creation for anonymous type params and provides a context for pickling so that on unpickle the type param can be recreated identically. TypeVar, ParamSpec and TypeVarTuple are modifed to work in this manner. Owner is only ever set at creation for type params which have __module__ == typing and the name either does not exist in typing or if it does is not exactly this typeparam.

If this is accepted as the way to go, then here are some questions:

  • TypeAlias does not currently support weak references for some reason so can't use this method to pickle the anon typeparams in type t[A, *B, **C] = .... This is probably not a problem as a TypeAlias can't instantiate objects with its anon typeparams like a func or class can? Can add support anyways now if weakrefs on TypeAlias will be added at some point in the future?

  • What to do on explicit assign to __type_params__, add owner to assigned typeparams where applicable? Remove from old?

  • Only path supported to anonymous type param is sequence index in owner __type_params__, are other paths possible/desired?

  • If builtin___build_class__() is not a good place to set new class typeparams owner, there are other options.

return 0;
}

PyObject *typing = PyImport_ImportModule("typing");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We shouldn't need to do this, we know when type params come directly from a PEP 695 generic.

An alternative implementation approach could be to save some information on TypeVar creation (when INTRINSIC_TYPEVAR is called). At that point, we don't have the function object yet, but we know what its module, qualname, and index are going to be, so we could store that information on the TypeVar, which should be enough to unpickle it.

Copy link
Contributor Author

@tom-pytel tom-pytel Jan 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TL;DR: Two possible approaches, both require a new INTRINSC. The first requires a minimal instruction emit added to compile.c whereas the second requires a bit more (peek future scope __qualname__). The second is the one currently up here.

  Disassembly of <code object <generic parameters of cls> ...
  ...
  1:0-2:8            LOAD_DEREF               2 (.type_params)
  1:0-2:8            CALL_INTRINSIC_1        10 (INTRINSIC_SUBSCRIPT_GENERIC)
  1:0-2:8            STORE_FAST_LOAD_FAST    17 (.generic_base, .generic_base)
  1:0-2:8            CALL                     3
+ 1:0-2:8            CALL_INTRINSIC_1        12 (INTRINSIC_SET_TYPE_PARAMS_OWNER)
  1:0-2:8            RETURN_VALUE

Or:

  Disassembly of <code object <generic parameters of func> at 0x7f812b61d360, file "../y.py", line 13>:
    --                   COPY_FREE_VARS           1

  13:0-13:0              RESUME                   0
  13:13-13:14            LOAD_CONST               0 ('T')
  13:13-13:14            CALL_INTRINSIC_1         7 (INTRINSIC_TYPEVAR)
  13:13-13:14            COPY                     1
  13:13-13:14            STORE_FAST               1 (T)
+ 13:13-13:14            LOAD_GLOBAL              0 (__name__)
+ 13:13-13:14            LOAD_CONST               1 ('cls.func')
+ 13:13-13:14            LOAD_SMALL_INT           0
+ 13:13-13:14            BUILD_TUPLE              3
+ 13:13-13:14            CALL_INTRINSIC_2         6 (INTRINSIC_SET_TYPEPARAM_OWNER)
  13:13-13:14            BUILD_TUPLE              1
  13:4-13:23             LOAD_CONST               2 (<code object func at 0x7f812b6cc9a0, file "../y.py", line 13>)
  13:4-13:23             MAKE_FUNCTION
  13:4-13:23             SWAP                     2
  13:4-13:23             CALL_INTRINSIC_2         4 (INTRINSIC_SET_FUNCTION_TYPE_PARAMS)
  13:4-13:23             RETURN_VALUE

@tom-pytel tom-pytel requested a review from carljm as a code owner February 6, 2025 12:24
Copy link
Member

@JelleZijlstra JelleZijlstra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you! It makes me a little sad to add this much complexity, but this does fix a real bug and the runtime overhead should be quite small. Let's add some more tests and run it through the buildbots to make sure there are no refleaks, and then we can land it.

@tom-pytel
Copy link
Contributor Author

tom-pytel commented Mar 1, 2025

I do want to state that the code owner of compile.c should really look at specifically how I did the __qualname__ peek just to make sure I didn't break any implicit rules or assumptions during compilation.

EDIT: To minimize changes the first method is preferable, not he one that is currently up.

@picnixz picnixz self-requested a review March 1, 2025 14:32
Copy link
Member

@picnixz picnixz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just some questions from the Devil's advocate.

Copy link
Member

@picnixz picnixz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you addressed most of my concerns. I can't find a direct attack vector for crashing the interpreter, but if I've some time, I'll try to do it on your PR or post-merge.

@tom-pytel
Copy link
Contributor Author

If worried about the extra compiler complexity, consider the first approach which is much simpler at the cost of no pickle of the TypeAlias type[T] = ... generic.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants