Specify methods that are allowed for inference of partially initialized generics #1989

Open
vnmabus opened this issue May 1, 2025 · 1 comment
Labels
topic: feature Discussions about new features for Python's type annotations

vnmabus commented May 1, 2025

Generic type inference for classes in most type checkers assumes that every type parameter can be inferred from the arguments to __init__. This is not always the case. For example, it is a common pattern to create an empty container and add elements to it later, e.g.:

a = []
a.append("hello world")

Currently, for built-in classes such as list or dict, this idiom is understood by mypy but not by Pyright. Moreover, even mypy cannot apply this kind of inference to custom types (see python/mypy#13134). As mentioned in that issue, the problem is not limited to empty containers. It also affects some widely used APIs, such as the Estimator API from scikit-learn and related projects, in which the type of the data is only known when the fit method is called.
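To make the problem concrete, here is a minimal sketch of an estimator-like class whose type parameter never appears in __init__. The class name Recorder and its attribute seen_ are made up for illustration; they are not taken from scikit-learn.

```python
from typing import Generic, List, Optional, TypeVar

T = TypeVar("T")


class Recorder(Generic[T]):
    """Hypothetical estimator-like class: nothing in __init__ mentions T."""

    def __init__(self) -> None:
        self.seen_: Optional[List[T]] = None

    def fit(self, data: List[T]) -> "Recorder[T]":
        # The element type only shows up here, well after construction.
        self.seen_ = list(data)
        return self


est = Recorder()          # no argument a checker could infer T from
est.fit([1.0, 2.0, 3.0])  # the data type becomes known only at this call
```

At the construction site there is simply no information about T, which is the gap this proposal is about.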

My proposal is to add functionality to the type system for declaring which methods, apart from __init__, can be used to infer generic types. This could be done, for example, with a decorator (e.g. allow_generic_inference), so that one can define a method such as:

@allow_generic_inference
def append(self, value: T):
    ...

In this case the value of the class generic parameter T could be inferred from append if it was undefined before.
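As a rough sketch of how this might look in a full class: allow_generic_inference does not exist in typing or in any type checker, so it is defined below as a hypothetical no-op marker, and Bag is an illustrative class name. At runtime the decorator does nothing; any effect would have to come from type checker support.

```python
from typing import Callable, Generic, List, TypeVar

T = TypeVar("T")
F = TypeVar("F", bound=Callable[..., object])


def allow_generic_inference(method: F) -> F:
    """Hypothetical marker decorator; returns the method unchanged."""
    return method


class Bag(Generic[T]):
    def __init__(self) -> None:
        self._items: List[T] = []

    @allow_generic_inference
    def append(self, value: T) -> None:
        self._items.append(value)


bag = Bag()
bag.append("hello world")  # under the proposal, T would be inferred as str here
```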

I think this proposal may be similar to the previously suggested TypeAssert mentioned by @erictraut in #1013 (comment). However, I am not completely sure that TypeAssert could do what I explained here, nor that the proposed TypeAssert syntax is better than a decorator.

I hope you too consider this proposal useful, and I look forward to your feedback.

@vnmabus vnmabus added the topic: feature Discussions about new features for Python's type annotations label May 1, 2025
Collaborator

erictraut commented May 1, 2025

> In this case the value of the class generic parameter T could be inferred from append if it was undefined before.

I don't think that would work. The type of an object must be established at the time it is constructed. There's no such thing as "partial instantiation", and the type of an instantiated object cannot change over time. There are several reasons why it's important to enforce this rule.

When you access a method or attribute of a generic class using the . operator, that performs a "binding" operation. In your example above, a.append binds the append method to object a. Part of that binding operation involves partial specialization of the method (or attribute) where all class-scoped type variables are specialized (replaced) with the corresponding type arguments from the object's type. When a static type checker performs this binding operation, it also validates that the self or cls parameter is compatible with the object. This is important in cases where these parameters have an explicit type annotation — something seen often with overloaded methods. In other words, evaluating the type of a.append in your example above requires a type checker to know the type of a. If a is an instance of a generic class, then the values of its type arguments must be known at that time. This all occurs prior to the evaluation of the call expression — that is, the subexpression a.append is necessarily evaluated prior to the expression a.append("").
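The ordering described here can be observed at runtime as well. The following is only an illustration of the runtime analogue of binding (the class Box is made up); it says nothing about how a particular type checker is implemented.

```python
class Box:
    def append(self, item):
        return (self, item)


a = Box()

# Attribute access alone produces a bound method; no call has happened yet.
bound = a.append
assert bound.__self__ is a
assert bound.__func__ is Box.append

# The call is a separate, later step applied to the already-bound method.
result = bound("x")
```

A static checker mirrors this order: it needs the type of a to evaluate a.append before it ever looks at the call arguments.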

Another reason that object types must be established at the time of construction is that any time after construction, they can be aliased by another variable through assignment. Consider the following:

a = []
b = a
a.append("hello world")
b.append(1)
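Running the snippet shows why aliasing is a problem for deferred inference: both names refer to a single list, so the two append calls would each suggest a different element type for the same object.

```python
a = []
b = a  # b is an alias, not a copy

a.append("hello world")
b.append(1)

assert a is b
print(a)  # ['hello world', 1]
```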

Type variable defaults (introduced in PEP 696) also make it important that class-scoped type variables receive their values at construction time.


Here's a potential solution that doesn't violate the rule above because it constructs a new object with a distinct type. It also doesn't require any new type system constructs. It leverages a sentinel to distinguish between an empty and non-empty container.

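from __future__ import annotations  # lets "self: Container[_Empty]" be written inside the class body

from typing import Iterable, Self, cast, final, overload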

@final
class _Empty: ...

class Container[T = _Empty]:
    def __init__(self, i: Iterable[T] | None = None) -> None:
        if not i:
            self._is_empty = True
        else:
            self._is_empty = False
            self._items = [x for x in i]

    @overload
    def append[S](self: Container[_Empty], item: S) -> Container[S]: ...
    @overload
    def append(self, item: T) -> Self: ...

    def append[S](self, item: S) -> Self | Container[S]:
        if self._is_empty:
            # Construct a new container with a new type.
            return Container[S]([item])

        self._items.append(cast(T, item))
        return self

a = Container()
reveal_type(a)  # type: Container[_Empty]
b = a

a = a.append("hello")
reveal_type(a)  # type: Container[str]

b = b.append(1)
reveal_type(b)  # type: Container[int]

A downside of this solution is that append returns a value that must be received by the caller. This is admittedly less ergonomic than the append method defined by the list class.
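To illustrate the ergonomic cost, here is a simplified, untyped stand-in for the runtime behavior of that pattern. SketchContainer is a made-up name and deliberately omits the overloads and sentinel typing; it only shows what happens when the return value is dropped.

```python
class SketchContainer:
    """Untyped stand-in: append on an empty container returns a new object."""

    def __init__(self, items=None):
        self._items = list(items) if items is not None else []

    def append(self, item):
        if not self._items:
            # Empty: build and return a fresh container, leaving self untouched.
            return SketchContainer([item])
        self._items.append(item)
        return self

    def __len__(self):
        return len(self._items)


a = SketchContainer()
a.append("hello")      # result discarded, so the item is lost
assert len(a) == 0

a = a.append("hello")  # rebinding is required for the first append
assert len(a) == 1

a.append("world")      # later appends mutate in place
assert len(a) == 2
```

The first append silently does nothing useful unless the caller rebinds the name, which is exactly the habit that list.append never asks for.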
