Skip to content

extmod: Implement a minimal typing module. #15911

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

stinos
Copy link
Contributor

@stinos stinos commented Sep 25, 2024

Summary

The compiler already ignores type hints but to improve runtime support a minimal typing (and abc and typing_extensions) module is implemented in C because code-size wise that is the cheapest to do. Also see micropython/micropython-lib#584 for background.

Note @Josverl asked to support __future__ and collections.abc as well but I did not (yet) do that because:

  • __future__ is easy to add as an alias but the result would be that anything you ask is possible,i.e. if __future__.just_inventing_something: would pass. Unclear if we want that.
  • I'm not sure how to implement collections.abc correctly in the presence of micropython-lib's collections module

Testing

A basic test is added.

Trade-offs and Alternatives

  • a Python implementation; see typing: Add typing module micropython-lib#584, and the Python code mentioned in modtyping.c. Works as well, clearer, but even when frozen the code size is larger.
  • ignoring typing-related imports in the compiler: might even be smaller in code size, possibly faster as well, but it doesn't practically seem possible to make type aliases (Nested = Iterable[Tuple[MyAlias, ...]]) work because the compiler cannot distinguish them from normal Python code.
  • ignoring everything typing-related in the compile as long as it's compatible with Python 3.12: since Python 3.12 got support for the type statement which is easy to distinguish so the compiler could ignore that, and hence type aliases using that syntax become possible. I'd say this is the way forward, but it's just too soon for that.

Note I really have no idea how many people would actually use this so maybe makes more sense to disable this by default in all but the PC ports.

Also note that dir(typing) or REPL completion on typing. will list all qstrs :)

Copy link

codecov bot commented Sep 25, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 98.57%. Comparing base (17d8234) to head (090a41a).

Additional details and impacted files
@@           Coverage Diff           @@
##           master   #15911   +/-   ##
=======================================
  Coverage   98.57%   98.57%           
=======================================
  Files         164      165    +1     
  Lines       21336    21344    +8     
=======================================
+ Hits        21031    21039    +8     
  Misses        305      305           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link

github-actions bot commented Sep 25, 2024

Code size report:


@stinos stinos force-pushed the builtintypingmodule branch from e63ca72 to 80a1dfa Compare September 25, 2024 12:52
@stinos
Copy link
Contributor Author

stinos commented Sep 25, 2024

Test for failing Codecov added.

@stinos stinos force-pushed the builtintypingmodule branch from 80a1dfa to 320c464 Compare September 25, 2024 13:07
@dhalbert
Copy link
Contributor

We added from __future__ import annotations (and nothing else) in CircuitPython a while ago: adafruit#6117. We had to implement it as a native module for the reasons mentioned in the PR.

@Josverl
Copy link
Contributor

Josverl commented Sep 25, 2024

Agree that currently the most important use case for __futures__ is indeed annotations.

I see a lot of downloads of the type stubs, and it's growing steadily. Thr use will only increase when the stubs are integrated with the documentation. So I would prefer this to be enabled on the major ports, and disabled on the memory constrained ports.

For collections.abc, could that be a .mpy module that imports the singleton from typing ?

@stinos stinos force-pushed the builtintypingmodule branch 2 times, most recently from 23816a9 to 090a41a Compare October 2, 2024 13:37
@stinos
Copy link
Contributor Author

stinos commented Oct 2, 2024

  • __future__ is easy to add as an alias but the result would be that anything you ask is possible,i.e. if __future__.just_inventing_something: would pass. Unclear if we want that.

On second thought I don't think __future__ is ever used like that anyway, so added it as well.

  • I'm not sure how to implement collections.abc correctly in the presence of micropython-lib's collections module
    For collections.abc, could that be a .mpy module that imports the singleton from typing ?

Essentially, yes. It requires that collections/abc.py is added in micropython-lib and contains just from typing import __getattr__. If micropython-lib/collections is not used at all, having the option MICROPY_MODULE_BUILTIN_SUBPACKAGES allows collections.abc with just one extra entry in collection's globals (see updated implementation).

Likewise abc should be removed from micropython-lib or adjusted.

@Josverl
Copy link
Contributor

Josverl commented Oct 7, 2024

Hi @stinos, I build and tested a RPI_PICO image with

// enable typing module in C
#define MICROPY_PY_TYPING           (1)
#define MICROPY_PY_TYPING_EXTRA_MODULES (1)

I can the same notebook with tests that I used to validate the .py versions ( that have some changes compared to your originals)
and I find that modtyping.c and its aliasses do help , but they are significantly less compatible with the typing.mpy implementation.

a short breakdown:

category .mpy versions modtyping.c comment
Pass 42 28
Fail 0 6 failing runtime checks : isinstance , reveal_type , no_type_check, NewType
collection.abc 0 9 similar tests to abc, expect that they can be made to pass
Unsupported, python 3.12 Fail 5 5

of the failing runtime checks I think that reveal_type is not that important,
I think

  • isinstance is a must have
  • NewType and @no_type_check would be very good to have.

for complete test results see : verify_type_checking.ipynb

for the tests that have different results between CPython , the .mpy amd the .c implementation I have added the results in
comments. One such example is :

# %%micropython --reset
from typing import TYPE_CHECKING


from abc import ABC, abstractmethod
from math import pi


class Shape(ABC):
    @abstractmethod
    def get_area(self) -> float:
        pass

    @abstractmethod
    def get_perimeter(self) -> float:
        pass


class Circle(Shape):
    def __init__(self, radius) -> None:
        self.radius = radius

    def get_area(self) -> float:
        return pi * self.radius**2

    def get_perimeter(self) -> float:
        return 2 * pi * self.radius


class Square(Shape):
    def __init__(self, side) -> None:
        self.side = side

    def get_area(self) -> float:
        return self.side**2

    def get_perimeter(self) -> float:
        return 4 * self.side


c1 = Circle(5)
s1 = Square(5)

for shape in [c1, s1]:
    a = shape.get_area()
    p = shape.get_perimeter()

    print(a, p)
    print(f"{type(a)=}")
    print(f"{type(p)=}")

    assert isinstance(a, (float, int)), "Area should be a float"
    assert isinstance(p, (float, int)), "Perimeter should be a float"

# CPython
# ----------------------
# 78.53981633974483 31.41592653589793
# type(a)=<class 'float'>
# type(p)=<class 'float'>
# 25 20
# type(a)=<class 'int'>
# type(p)=<class 'int'>

# MicroPython typing.py
# ----------------------
# 78.53982 31.41593
# type(a)=<class 'float'>
# type(p)=<class 'float'>
# 25 20
# type(a)=<class 'int'>
# type(p)=<class 'int'>

# Micropython - modtyping.c
# ----------------------
# <any_call> <any_call>
# type(a)=<class 'any_call'>
# type(p)=<class 'any_call'>
# WARNING |Traceback (most recent call last):
# WARNING |  File "<stdin>", line 52, in <module>
# ERROR   |AssertionError: Area should be a float
#

@projectgus projectgus self-requested a review October 8, 2024 06:48
@stinos
Copy link
Contributor Author

stinos commented Oct 8, 2024

Great, thanks for testing! Do you mind checking the code size the .c implementation takes (pico isn't included in the ci build code size report)? Thing is still: the closer it gets to just freezing the .py implementations (and for whatever reason on the unix build it seems already way over it), the less sense it makes to follow this path.

for complete test results see : verify_type_checking.ipynb

I think that repository is private since I get a 404 trying to access it.

isinstance is a must have

Well, it's a bit more than just 'isinstance does not work', this is not good.. I didn't even test that because I had no idea the ability to derive from a native type could even be broken and assumed it would first look in the derived class for attributes and only then in the base class.

edit implementation details: core reason is that from typing import ABC returns an instance of any_call, of which Circle is then derived. That's obviously no good, it should instead return the type of any_call, plus it's might not be possible to get away with always returning a singleton any_call instance, because then all instances of Circle would have the same object as base.

collection.abc

The implementation is likely not active; either you have to make sure there's no collections module like the one from micropython-lib is in your module search path and #define MICROPY_MODULE_BUILTIN_SUBPACKAGES ( 1 ) or you need a collections/abc.py which has a from typing import __getattr__

@Josverl
Copy link
Contributor

Josverl commented Oct 8, 2024

  • 404 🤐, indeed private , now public. I assumed it was all along
  • While adding the test cases I did many iterations on trying to match the CPython runtime behavior (within reason)
    for abc the simplest I have found is from typing import __Ignore as ABC and same in in collections.abc
  • there are a few more tweaks
    ( I use __Ignore as I find that a bit clearer than AnyCall and is saves a few bytes compared to _TypingIgnore

Firmware size.
Im not quite sure which figures I should compare,
and when I compare the size of the added frozen modules (547 bytes total) , to the growth of the firmware.elf (892) - it does not match , so Im probably looking at the incorrect number, or there is additional overhead / module
here is wat I collected:

##### No Typing
   text    data     bss     dec     hex filename
 332704       0    5524  338228   52934 /home/builder/micropython/ports/rp2/build-RPI_PICO/firmware.elf

#### C typing
   text    data     bss     dec     hex filename
 332704       0    5524  338228   52934 /home/builder/micropython/ports/rp2/build-RPI_PICO/firmware.elf
 
##### .py typing - frozen 

   text    data     bss     dec     hex filename
 334048       0    5524  339572   52e74 /home/builder/micropython/ports/rp2/build-RPI_PICO/firmware.elf

##### .py typing - frozen -O3 

   text    data     bss     dec     hex filename
 333792       0    5524  339316   52d74 /home/builder/micropython/ports/rp2/build-RPI_PICO_T/firmware.elf

##### .py typing - frozen -O3 - reduced 
no abc but including collections.abc
removed the less useful reveal_type, get_origin, get_args from typing.py

   text    data     bss     dec     hex filename
 333596       0    5524  339120   52cb0 /home/builder/micropython/ports/rp2/build-RPI_PICO_T/firmware.elf

Signed-off-by: stijn <stijn@ignitron.net>
@stinos stinos force-pushed the builtintypingmodule branch from 090a41a to a32e6d6 Compare October 9, 2024 10:40
@stinos
Copy link
Contributor Author

stinos commented Oct 9, 2024

for abc the simplest I have found is from typing import __Ignore as ABC and same in in collections.abc

Yes that works, but you'd have to repeat that for every single type, whereas from typing import __getattr__ would do that automatically I think?

Firmware size. Im not quite sure which figures I should compare,

Essentially what you're posted, except that 'No Typing' and 'C typing' have the same size so they're both 'C typing' probably?

Anyway while fixing the implementation (code sample you posted works now) I found a blocker, see your no_type_check test for instance: it is not possible to distinguish between any_call (or TypeIgnore for that matter) being called as a decorator/function (because @decorator is merely syntactic sugar for func = decorator(func), and because @decorator(1) is func = decorator(1)(func)) or being called to create a new instance.

So that means our typing implementation, whatever it is, must have all possible typing-related decorators implemented explicitly (and they definitely do not currently) otherwise the runtime behavior of code using an unlisted decorator is incorrect because the current implementations which make the module's __getattr__ the fallback will turn the decorated thing into a TypeIgnore instance, and the decorated thing vanishes as if it were never even there.

No-one probably really thought this through so far, but this is a rather serious issue.
So serious, that I now think that our current implementations might simply be too dangerous to use (even if we list all decorators now, we'd have to be constantly checking CPythone doesn't introduce new ones). And reverting to an implementation without __getattr__ but instead listing all types manually, so that at least if something isn't known it cannot run, is the only sane thing to do.

@Josverl
Copy link
Contributor

Josverl commented Oct 10, 2024

whereas from typing import getattr would do that automatically I think?

No __getattr__ will produce the instance __ignore rather than the Type ___Ignore
we could add logic though to check for Uppercase in the name, but a few classes/types may use lowercase
and yes , I did think that through , hence the testing 🤔

'No Typing' and 'C typing' have the same size so they're both 'C typing' probably?

yeah, noticed that, may have fumbled the definitions, here are numbers for Variants I created of the PIMORONI_PICOLIPO-FLASH

 text    data     bss     dec     hex filename
 330824       0    5524  336348   521dc /home/jos/micropython/ports/rp2/build-PIMORONI_PICOLIPO-FLASH_16M/firmware.elf
 331096       0    5524  336620   522ec /home/jos/micropython/ports/rp2/build-PIMORONI_PICOLIPO-TYPING_C/firmware.elf
 331968       0    5524  337492   52654 /home/jos/micropython/ports/rp2/build-PIMORONI_PICOLIPO-TYPING_PY/firmware.elf 
  • .c increase = 272
  • .mpy increate = 1.144 for a .mpy payload increase of 700

The .c implementation , even if it is for part of the functionality, and currently does not pass all tests, is significantly lower.

.mpy Package size breakdown :

module size (B)
__future__.mpy 50
abc.mpy 70
typing_extensions.mpy 66
typing.mpy 394
collections/__init__.mpy 38
collections/abc.mpy 82
TOTAL 700
mysterious overhead 444

implementations might simply be too dangerous to use

I think the risk is acceptable, as with the implementation of the typing @decorators, we are actually 'unbreaking' code, rather than breaking it.
And to further reduce the risk, I think the best ways it to have the typing modules as:
optional , easy to add (mpremote mip installable) python packages, even at the cost of 700 bytes on the filesystem (and ram)
those who want to freeze them to reduce RAM use still can.

The way I have been using __getattr__ and __Ignore aka _Any_Call are to optimize for the size, to reduce from the 1.685 bytes for just typing.mpy otherwise.
But fully agree that we need some guardrails to protect against unwanted scenarios and tests to validate these, even for an opt-in approach.

@stinos
Copy link
Contributor Author

stinos commented Oct 10, 2024

Thanks for the elaborate info.

No getattr will produce the instance __ignore rather than the Type ___Ignore

Right, I changed that already in the most recent version though.

The .c implementation , even if it is for part of the functionality, and currently does not pass all tests, is significantly lower.

Yeah, I kind of expected this.

I think the risk is acceptable

I have to say I'm not convinced, at least not for our production code :)

easy to add (mpremote mip installable) python packages,

Ok but if that is a hard requirement, the C implementation is a no-go?

tests to validate these

I made a start with implementing it in C the other way around, so no __getattr__ but explicitly adding all types, mostly just to see where it goes code-size wise. In doing that I'm going to add a test for every single type/function. Once that is done, its going to be a lot easier to validate implementations.

@Josverl
Copy link
Contributor

Josverl commented Oct 11, 2024

easy to add (mpremote mip installable) python packages,

Ok but if that is a hard requirement, the C implementation is a no-go?

The general rule in MicroPython is that if something can be reasonable implemented in Python, then it should. that is why there are several _something in .c combined with something.py modules in micropython.lib
That is actually also true for the CPython version of the _typing and typing module(s) .

As static and runtime typing in Python is still rapidly developing, my worry is that a size optimized C-implementation will be harder to maintain and extend, than a Python based solution. Also the mip installable approach adds a degree of freedom, as it can be installed on (almost) any port/board/family of micropython.

perhaps we need to rank/ weight what is, or we find, important, in order to make a good-enough choice wrt to the implementation:

  • type checking: support type annotations [ list of typing features] with documented exceptions, verified with pyright and mypy
  • runtime : adherance to CPython runtime behaviour , with documented exceptions
  • flash size overhead
  • maintanability
  • simple to add to a port/board

and then score the different solutions.

@dpgeorge dpgeorge added the extmod Relates to extmod/ directory in source label Jan 23, 2025
@dpgeorge dpgeorge added this to the release-1.27.0 milestone Aug 16, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
extmod Relates to extmod/ directory in source
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants