Skip to content

Introduce an Intersection #213

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
ilevkivskyi opened this issue May 6, 2016 · 255 comments
Open

Introduce an Intersection #213

ilevkivskyi opened this issue May 6, 2016 · 255 comments
Labels
topic: feature Discussions about new features for Python's type annotations

Comments

@ilevkivskyi
Copy link
Member

This question has already been discussed in #18 long time ago, but now I stumble on this in a practical question: How to annotate something that subclasses two ABC's. Currently, a workaround is to introduce a "mix" class:

from typing import Iterable, Container

class IterableContainer(Iterable[int], Container[int]):
    ...

def f(x: IterableContainer) -> None: ...

class Test(IterableContainer):
    def __iter__(self): ...
    def __contains__(self, item: int) -> bool: ...

f(Test())

but mypy complains about this

error: Argument 1 of "__contains__" incompatible with supertype "Container"

But then I have found this code snippet in #18

def assertIn(item: T, thing: Intersection[Iterable[T], Container[T]]) -> None:
    if item not in thing:
        # Debug output
        for it in thing:
            print(it)

Which is exactly what I want, and it is also much cleaner than introducing an auxiliary "mix" class. Maybe then introducing Intersection is a good idea, @JukkaL is it easy to implement it in mypy?

@JukkaL
Copy link
Contributor

JukkaL commented May 6, 2016

Mypy complains about your code because __contains__ should accept an argument of type object. It's debatable whether this is the right thing to do, but that's how it's specified in typeshed, and it allows Container to be covariant.

I'm worried that intersection types would be tricky to implement in mypy, though conceptually it should be feasible. I'd prefer supporting structural subtyping / protocols -- they would support your use case, as IterableContainer could be defined as a protocol (the final syntax might be different):

from typing import Iterable, Container

class IterableContainer(Iterable[int], Container[int], Protocol):
    ...

def f(x: IterableContainer) -> None: ...

class Test:
    def __iter__(self): ...
    def __contains__(self, item: int) -> bool: ...

f(Test())  # should be fine (except for the __contains__ argument type bit)

@ilevkivskyi
Copy link
Member Author

It would be really cool to implement protocols. Still, in this case intersection could be added as a "syntactic sugar", since there would be a certain asymmetry: Assume you want a type alias for something that implements either protocol, then you write:
IterableOrContainer = Union[Iterable[int], Container[int]]
But if you want a type alias for something that implements both, you would write:
class IterableContainer(Iterable[int], Container[int], Protocol): ...
I imagine such asymmetry could confuse a novice. Intersection could then be added (very roughly) as:

class _Intersection:
    def __getitem__(self, bases):
        full_bases = bases+(Protocol,)
        class Inter(*full_bases): ...
        return Inter

Intersection = _Intersection()

then one could write:
IterableContainer = Intersection[Iterable[int], Container[int]]

@JukkaL
Copy link
Contributor

JukkaL commented May 6, 2016

Intersection[...] gets tricky once you consider type variables, callable types and all the other more special types as items. An intersection type that only supports protocols would be too special purpose to include, as it's not even clear how useful protocols would be.

@ilevkivskyi
Copy link
Member Author

I understand what you mean. That could be indeed tricky in general case.

Concerning protocols, I think structural subtyping would be quite natural for Python users, but only practice could show whether it will be useful. I think it will be useful.

@gvanrossum
Copy link
Member

This keeps coming up, in particular when people have code that they want to support both sets and sequences -- there is no good common type, and many people believe Iterable is the solution, but it isn't (it doesn't support __len__).

@ilevkivskyi
Copy link
Member Author

I think Intersection is a very natural thing (at least if one thinks about types as sets, as I usually do). Also, it naturally appears when one wants to support several ABCs/interfaces/protocols.

I don't think that one needs to choose between protocols and Intersection, on the contrary they will work very well in combination. For example, if one wants to have something that supports either "old-style" reversible protocol (i.e. has __len__ and __iter__ methods) or "new-style" (3.6+) reversible protocol (i.e. has __reversed__ method), then the corresponding type is Union[Reversible, Intersection[Sized, Iterable]].

It is easy to add Intersection to PEP 484 (it is already mentioned in PEP 483) and to typing.py, the more difficult part is to implement it in mypy (although @JukkaL mentioned this is feasible).

@gvanrossum
Copy link
Member

For cross-reference from #2702, this would be useful for type variables, e.g. T = TypeVar('T', bound=Intersection[t1, t2]).

@jeffkaufman
Copy link

Intersection[FooClass, BarMixin] is something I found myself missing today

@matthiaskramm
Copy link
Contributor

If we had an intersection class in typing.py, what would we call it?

Intersection is linguistically symmetric with Union, but it's also rather long.
Intersect is shorter, but it's a verb. Meet is the type-theoretic version and also nice and short, but, again, you'd expect Union to be called Join if you call Intersection Meet.

@jeffkaufman
Copy link

As a data point, I first looked for Intersection in the docs.

@ilevkivskyi
Copy link
Member Author

Just as a random idea I was thinking about All (it would be more clear if Union would be called Any, but that name is already taken). In general, I don't think long name is a big issue, I have seen people writing from typing import Optional as Opt or even Optional as O depending on their taste. Also generic aliases help in such cases:

T = TypeVar('T')
CBack = Optional[Callable[[T], None]]

def process(data: bytes, on_error: CBack[bytes]) -> None:
    ...

@mitar
Copy link
Contributor

mitar commented Oct 18, 2017

I just opened #483 hoping for exactly the same thing. I literally named it the same. I would be all for Intersection or All to allow to require a list of base classes.

@ilevkivskyi
Copy link
Member Author

Requests for Intersection appear here and there, maybe we should go ahead and support it in mypy? It can be first put in mypy_extensions or typing_extensions. It is a large piece of work, but should not be too hard. @JukkaL @gvanrossum what do you think?

@gvanrossum
Copy link
Member

gvanrossum commented Oct 21, 2017 via email

@ilevkivskyi
Copy link
Member Author

ilevkivskyi commented Oct 22, 2017

@gvanrossum

I think we should note the use cases but not act on it immediately -- there are other tasks that IMO are more important.

OK, I will focus now on PEP 560 plus related changes to typing. Then later we can get back to Intersection, this can be a separate (mini-) PEP if necessary.

Btw, looking at the milestone "PEP 484 finalization", there are two important issues that need to be fixed soon: #208 (str/bytes/unicode) and #253 (semantics of @overload). The second will probably require some help from @JukkaL.

@JukkaL
Copy link
Contributor

JukkaL commented Oct 23, 2017

I agree that now's not the right time to add intersection types, but it may make sense later.

@Kentzo
Copy link

Kentzo commented Nov 14, 2017

(been redirected here from the mailing list)

I think the Not type needs to be added in addition to Intersection:

Intersection[Any, Not[None]]

Would mean anything but None.

@rinarakaki
Copy link

How about the expression Type1 | Type2 and Type1 & Type2 alternative to Union and Intersection respectively.

example:

x: int & Const = 42

@emmatyping
Copy link
Member

@rnarkk these have already been proposed many times, but have not been accepted.

@JelleZijlstra
Copy link
Member

JelleZijlstra commented Nov 15, 2017

The Not special form hasn't been proposed before to my knowledge. I suppose you could equivalently propose a Difference[Any, None] type.

What's the use case for that? It's not something I've ever missed in a medium-sized typed codebase at work and in lots of work on typeshed.

nhusung added a commit to OxiDD/oxidd that referenced this issue May 29, 2024
Switch back to protocols such that we can require multiple protocols (as
long as there are no
[intersection types](python/typing#213) and
`TypeVar`s cannot have multiple bounds).
nhusung added a commit to OxiDD/oxidd that referenced this issue May 29, 2024
Switch back to protocols such that we can require multiple protocols (as
long as there are no
[intersection types](python/typing#213) and
`TypeVar`s cannot have multiple bounds).
@tylerlaprade
Copy link

@mark-todd , ordered sounds far too easy to make mistakes with. What benefits does it bring?

@CarliJoy
Copy link

@tylerlaprade pls follow the discussions in this repo. This issue is not suitable for further discussions. Don't post here unless there is general progress in the intersection.

In short the ordered nature was thought about while handling the Any case. See this issue.

@LeeeeT
Copy link

LeeeeT commented Mar 23, 2025

For anyone curious, here's an encoding of general intersection types using continuation-passing style.

Turns out type checkers already have the capability of checking for intersections: checking if C is a subtype of the intersection of A and B is equivalent to checking if Callable[[Callable[[C], R]], R] is a subtype of Callable[[Callable[[A], R] | Callable[[B], R]], R].

from collections.abc import Callable
from typing import Protocol, reveal_type


# A ≈ (A -> R) -> R

# A & B -> R ≈ (A -> R) | (B -> R)

# A & B ≈ ((A & B) -> R) -> R ≈ (A -> R) | (B -> R) -> R


class Continuation[T](Protocol):
    def __call__[R](self, x: Callable[[T], R], /) -> R: ...


def intro[T](value: T) -> Continuation[T]:
    return lambda x: x(value)


class Intersection[A, B](Protocol):
    def __call__[R](self, x: Callable[[A], R] | Callable[[B], R], /) -> R: ...


def left[A](intersection: Intersection[A, object]) -> A:
    return intersection(lambda x: x)


def right[B](intersection: Intersection[object, B]) -> B:
    return intersection(lambda x: x)


class A:
    foo: int


class B:
    bar: str


class C(A, B):
    pass


def f(x: Intersection[A, B]) -> None:
    reveal_type(left(x).foo)  # int
    reveal_type(right(x).bar)  # str


f(intro(C()))

EDIT:
Huh, you can do the same thing in a much simpler way using tuples instead of CPS. Checking if C is a subtype of A & B is equivalent to checking if tuple[C, C] is a subtype of tuple[A, B].

def intro[T](value: T) -> tuple[T, T]:
    return value, value


class A:
    foo: int


class B:
    bar: str


class C(A, B):
    pass


def f(x: tuple[A, B]) -> None:
    reveal_type(x[0].foo)  # int
    reveal_type(x[1].bar)  # str


f(intro(C()))

Both approaches are based on the same idea: you represent A & B as two values A and B separate from each other. These values are the same object at run time, but for a type checker they are different. Yes, there's this inconvenient wrapping and unwrapping, but this is the best we've got.

@kmillikin
Copy link

@LeeeeT I don't think either of those encodings are true intersection types.

For the first encoding, it's not the case that A & B -> R ≈ (A -> R) | (B -> R) (unless I don't understand the squiggly equals). Try to prove that A & B -> R is a subtype of (A -> R) | (B -> R).

The subtype relation holds in the other direction, (A -> R) | (B -> R) is a subtype of A & B -> R.

This makes sense operationally: the implementation of a function of type A & B -> R might (probably does) require its argument to be both an A and a B. So you can't use that as either a function of type A -> R or one of type B -> R.

For the second encoding, you want A & B to be a subtype of A for all B, but it's clearly not the case that tuple[A, B] is a subtype of tuple[A, A] for all B. Again, the subtype relation holds in only one direction.

@LeeeeT
Copy link

LeeeeT commented Mar 24, 2025

@kmillikin

I don't think either of those encodings are true intersection types

Yes, these are not true intersection types, but they trigger the same kinda check in a type checker that you would expect for true intersection types and therefore can be useful to prevent certain bugs in your program.


For the first encoding, it's not the case that A & B -> R ≈ (A -> R) | (B -> R) (unless I don't understand the squiggly equals)

means that both sides are subtypes of each other.

  • A & B -> R is a subtype of (A -> R) | (B -> R). Given a function that takes an A or a function that takes a B you don't know exactly which one you're given, so you're forced to call it with something that is both an A and a B. (Unless you can pattern match on that union, which you can't with Python unions).
  • (A -> R) | (B -> R) is a subtype of A & B -> R. Obviously, if you can give me something that is both an A and a B, I can ask you for an A or I can ask you for a B.

This makes sense operationally: the implementation of a function of type A & B -> R might (probably does) require its argument to be both an A and a B. So you can't use that as either a function of type A -> R or one of type B -> R

So, let's talk about the case where you can pattern match on that union (for example, when implementing a union as an ADT). I'm not sure how to show that A & B -> R is a subtype of (A -> R) | (B -> R) (that's an interesting question), but I can show that the full intersection type (A & B) is a subtype of (A -> R) | (B -> R) -> R. Given a value of type A & B you want to work with it as it's an A and a B at the same time. You can do so by extracting the A part using left and extracting the B part using right.

You're not limited to calling only one of left and right, you can call both of them for the same value (observe both parts). That's what makes it different from the case where you only have one function (A -> R) | (B -> R) (in that case, you can't observe the B part after observing the A part and vice versa).


For the second encoding, you want A & B to be a subtype of A for all B, but it's clearly not the case that tuple[A, B] is a subtype of tuple[A, A] for all B

Yes, A & B is clearly a subtype of A & A but tuple[A, B] is not a subtype of tuple[A, A]. Although, we can transform a tuple[A, B] into a tuple[A, A], so it is compatible but not directly. The fact that tuple[...] does not directly enjoy commutativity and idempotency unlike & is a disadvantage of this approach. That's why one might want to stick with the first approach which is better in that regard. But the second approach still triggers the check in a type checker that might be useful to you.

@kmillikin
Copy link

@LeeeeT this is the wrong forum to discuss this and I don't want to get into a back and forth argument with you. I just didn't want the misleading encoding of intersections to stand.

Here's a semantic (i.e., types are sets of values, subtyping is subsetting, and unions types are set unions) proof that (A -> R) | (B -> R) is a subtype of A & B -> R: The union type (A -> R) | (B -> R) is the set union of the set of all functions of type A -> R and the set of all functions of type B -> R. Consider an arbitrary element f of this set. There are two cases:

Case 1: f is in the set of all functions of type A -> R. Then f has type A -> R. The type A -> R is a subtype of A & B -> R because A & B is a subtype of A and function subtyping is contravariant in the argument types. So therefore f is in the set of functions of type A & B -> R.

Case 2: f is not in the set of all functions of type A -> R. By the definition of the union type, it must be in the set of all functions of type B -> R. Similar reasoning to case 1 leads to the conclusion that f is i the set of functions of type A & B -> R.

Therefore, every function in the set of functions (A -> R) | (B -> R) is in the set of functions A & B -> R and so the former is a subtype of the latter.

A & B -> R is a subtype of (A -> R) | (B -> R). Given a function that takes an A or a function that takes a B you don't know exactly which one you're given, so you're forced to call it with something that is both an A and a B. (Unless you can pattern match on that union, which you can't with Python unions).

That's not a proof, its an informal argument and it shows the opposite of what you think it does. You're claiming that all the functions in (A -> R) | (B -> R) have the functional interface A & B -> R which is what I just demonstrated above. As you note, it has to be this way for soundness, otherwise you could call a function in A -> R with a non-A (and vice versa for B).

The question for if A & B -> R is a subtype of (A -> R) | (B -> R) is: is there anything in A & B -> R that is not in (A -> R) | (B -> R). The answer is yes. As above, everything in (A -> R) | (B -> R) is either in A -> R or in B -> R. Consider a function def f(x: A & B) -> R: .... This is not in A -> R (for the same reason that a function of type int -> None is not in object -> None: object is not a subtype of int, and A is not a subtype of A & B). Likewise, it is not in B -> R. So there exists functions in A & B -> R that are not in either of A -> R or B -> R.

I'm by no means an expert but that reasoning does seem correct to me.

@mark-todd
Copy link

@CarliJoy did add a post for this, but just pointing out - anyone feel free to open an issue here for discussion if there are points to be resolved: https://github.com/CarliJoy/intersection_examples/issues

There hasn't been much movement on intersection for the past year ish but would be cool to get something specified if interest has increased.

@LeeeeT
Copy link

LeeeeT commented Apr 13, 2025

@kmillikin There are many logics and type systems with different sets of laws. I think in Python specifically, given you can't pattern match on the union type, the equivalence between A & B -> R and (A -> R) | (B -> R) is sound. I can't imagine a situation where you assign a value of type A & B -> R to a variable of type (A -> R) | (B -> R) and get a run time error later on. Can you?

@mikeshardmind
Copy link

mikeshardmind commented Apr 13, 2025

@LeeeeT Those aren't equivalent. This has been gone over in great detail elsewhere. I'm not going to go link to each piece of the discussion nor fully re-explain it, as you're responding to someone who already did explain. If you don't understand the given explanation, the simplest mental model, without touching the nuances around gradual typing that have to be preserved is "Given an intersection A & B, things assignable to the intersection must be usable as if they are both A and B, whereas a union between A and B only asks that they are usable as at least one of A or B"

I'm preparing a draft that builds on the years of discussion, for intersections to be added to the type system, By far, the longest section in it is handling method assignability, and why various shortcuts are not equivalent. (many shortcuts were attempted along the way to make this easier to implement, none of them work)

@kmillikin
Copy link

@LeeeeT is saying that even though A & B -> R and (A -> R) | (B -> R) are definitely not equivalent semantically (the second one is a proper subtype of the first one), we could actually add the rule A & B -> R <: (A -> R) | (B -> R) to Python with intersections and the type system would still be sound.

This is a property of Python's union and arrow types --- you can't distinguish dynamically if you have a value of type A -> R.

The reason we might do such a thing is so that we can encode intersection types as union types by CPS transforming the types and using this "equivalence" to get a union instead of intersection.

Of course, we don't have a complete formal semantics of Python typing so we can't actually prove soundness of Python with intersections and this extra rule, so I'd be a little uncomfortable about it. The trick is cute, and maybe it might answer some questions about intersections (because you can ask the question of unions instead, which might have more intuitive answers). I'm skeptical of that too, because I have a feeling that there are interesting questions exactly in the gap between A & B -> R and (A -> R) | (B -> R). Because note: the values missing from the latter type are exactly the functions whose domain actually needs to have an intersection type.

@tylerlaprade
Copy link

@LeeeeT

class A:
    aaa: int

class B:
    bbb: int

def f(x: A & B) -> int:
    return x.aaa + x.bbb

@mikeshardmind
Copy link

mikeshardmind commented Apr 14, 2025

@kmillikin We already have answers to this here: CarliJoy/intersection_examples#22 (comment)

These answers are also consistent with the ongoing work to add gradual typing to Elixir: https://elixir-lang.org/blog/2023/09/20/strong-arrows-gradual-typing/

Thinking in terms of the Union in CPS form gives us the safe applicative domain of the function, the return type requires further analysis, and is more precise the more information we have at the time. Thus, a given call has more information than all possible calls of the function. We can examine and negate function domains without tranforming to CPS style. That interacts better with the language, especially if overloads exist in the operands, and gives more precise information.

My draft proposal will be ready this week barring catastrophe. While I'd normally encourage further discussion to iron out details, it may be more productive to wait to discuss a specific draft proposal.

@mikeshardmind
Copy link

Status update on the above: This is almost ready to propose officially. Details, and a place for anyone who wants to comment on specific things about the latest draft can be found here

Currently soliciting feedback on which examples make the most sense to highlight to make the case for inclusion the strongest, but if you have other concerns about the proposal, feel free to comment on them as well there.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
topic: feature Discussions about new features for Python's type annotations
Projects
None yet
Development

No branches or pull requests