Skip to content

gh-137530: generate an __annotate__ function for dataclasses __init__ #137711

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 14 commits into
base: main
Choose a base branch
from

Conversation

DavidCEllis
Copy link
Contributor

@DavidCEllis DavidCEllis commented Aug 13, 2025

This is one approach to resolving #137530

It generates a new __annotate__ function to attach to the __init__ method and removes the in-line annotations that are now unused.

It's possible to keep the original source annotations if necessary, but they provide a second possibly incorrect source of the information and they will probably also cause additional issues with #135228 as they may keep the original class around when ForwardRef values are generated.

The generated __annotate__ will also keep these references and need to be regenerated. I can extend this or follow up to handle these references but I didn't want to do too much in one PR.

elif isinstance(v, annotationlib.ForwardRef):
string_annos[k] = v.evaluate(format=annotationlib.Format.STRING)
else:
string_annos[k] = annotationlib.type_repr(v)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a bit unfortunate because we could get better STRING annotations by calling with the STRING format on the original class (and its base classes).

An approach to do this would be to make __init__.__annotate__ delegate to calling get_annotations() on the class itself, and on any base classes that are contributing fields, with the appropriate format, then processing the result to add "return": None and filter out any fields that don't correspond to __init__ parameters.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did think about that, but wasn't certain about the extra complexity, but I've since noticed that if there's a hint that contains a ForwardRef (like list[undefined]) the result leaks all the ForwardRef details.

I've added a test for that example to demonstrate why this is necessary.

I've taken a slightly different approach, in that I'm gathering all inherited annotations and using them to update the values from the fields. The original source annotation logic this replaces wasn't __init__ specific so I've tried to keep the __annotate__ generator non-specific too.


There are additional commits as this change meant that __annotate__ now (always) has a reference to the original class which broke #135228 - this now also includes an additional test and fix for the issue I brought up there as I already needed to update the fields for this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also unintentionally discovered that the __annotate__ functions CPython generates for methods have an incorrect __qualname__ while fixing the __qualname__ for the function generated here.

@DavidCEllis
Copy link
Contributor Author

On doing some other work, I think the logic for STRING is also needed for FORWARDREF. I'm not sure how to resolve VALUE.

Problem for both:

from annotationlib import get_annotations
from dataclasses import dataclass

@dataclass
class Example:
    a: list[defined_late]

defined_late = int

print(get_annotations(Example.__init__))
{'a': list[ForwardRef('defined_late', is_class=True, owner=<class '__main__.Example'>)], 'return': None}

For Format.FORWARDREF this is resolvable in the same way as STRING. However for VALUE it doesn't quite work out to do the same thing, as the annotations on the class could contain a name that doesn't exist yet - but if init=False that name may not be required for the __init__ annotations.

@DavidCEllis
Copy link
Contributor Author

Ah, while this is resolvable for FORWARDREF - it requires VALUE to fail or get_annotations will use VALUE instead anyway.

I'm leaning towards making VALUE fail if any of the class annotations fail with VALUE. I can't currently think of a clean way otherwise to both make FORWARDREF evaluate as far as possible and not leak ForwardRef values into VALUE annotations.

Also split out the single large test into multiple smaller tests
@DavidCEllis
Copy link
Contributor Author

DavidCEllis commented Aug 15, 2025

Thoughts:

  • With the change to collect class annotations, all of the values from the annotations argument are no longer used apart from the return type.
    • Perhaps _make_annotate_function should take (cls, annotation_fields, return_type) instead of (cls, annotations), where annotation_fields is just a list of relevant field names?
    • _FuncBuilder.add_fn would regain return_type and replace annotations with annotation_fields
  • I'd like to make the VALUE annotations work in test_init_false_forwardref but I don't currently see a clean way to do so.

Edit: I realised changing the signature would make _add_slots simpler and is probably worthwhile.

Use `__class__` as the first argument name so `_update_func_cell_for__class__` can update it.
@DavidCEllis
Copy link
Contributor Author

Ok, I'm going to try to leave this alone for some review.

The only functional1 thing I know I'm not completely happy with is this example from the tests2:

from dataclasses import dataclass, field
from annotationlib import get_annotations

@dataclass
class F:
    not_in_init: list[undefined] = field(init=False, default=None)
    in_init: int

annos = annotationlib.get_annotations(F.__init__)  # NameError 

I don't have a good approach for this. I don't think you can use FORWARDREF annotations as they are without potentially leaking actual ForwardRefs?

Footnotes

  1. The first parameter of _make_annotate_function being named __class__ in order to reuse the cell update function could maybe be cleaner.

  2. Ignore the incorrect default type, initially the annotation had | None and I forgot to remove the default when I removed it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants