Skip to content

gh-130907: Treat all module-level annotations as conditional #131550

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 13 commits into from
Apr 28, 2025

Conversation

JelleZijlstra
Copy link
Member

@JelleZijlstra JelleZijlstra commented Mar 21, 2025

@JelleZijlstra JelleZijlstra changed the title gh-130907: Error when accessing __annotations__ on a partially defined module gh-130907: Treat all module-level annotations as conditional Apr 9, 2025
@JelleZijlstra
Copy link
Member Author

This version makes it so all module-level annotations are treated as conditional and annotations are not cached if accessed while the module is still initializing. This means that if you access annotations in the middle of module execution, you'll see all annotations that have been executed so far, and if you access it after execution finishes, you'll see all annotations. This mostly restores the pre-3.14 behavior, except that it's no longer the same dictionary being updated (so if you store the annotations dict halfway through, it won't get updated with annotations that are added later).

This also required moving the code that sets __annotate__ to the beginning of the module code, which required some new functionality to splice the instruction sequences together. Not the most elegant code but I couldn't think of anything better.

@@ -383,9 +383,10 @@ def test_var_annot_simple_exec(self):
gns = {}; lns = {}
exec("'docstring'\n"
"x: int = 5\n", gns, lns)
self.assertNotIn('__annotate__', gns)

gns.update(lns) # __annotate__ looks at globals
self.assertEqual(lns["__annotate__"](annotationlib.Format.VALUE), {'x': int})
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test started failing because __annotate__ now always accesses __conditional_annotations__ from the globals, and the separate globals and locals namespaces break that. This would have already failed if the annotations referred to any name defined in the module (e.g. type x = int; y: x), so I don't think that's a problem.

@@ -66,6 +66,8 @@ _PyJumpTargetLabel _PyInstructionSequence_NewLabel(_PyInstructionSequence *seq);
int _PyInstructionSequence_ApplyLabelMap(_PyInstructionSequence *seq);
int _PyInstructionSequence_InsertInstruction(_PyInstructionSequence *seq, int pos,
int opcode, int oparg, _Py_SourceLocation loc);
int _PyInstructionSequence_PrependSequence(_PyInstructionSequence *seq, int pos,
_PyInstructionSequence *nested);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re naming:

  • if you give a pos then it's not prepend but something like inject.
  • we already use "nested sequence" for something else in this space.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll rename the function to "InjectSequence" and the parameter to "injected". Thanks for the review!

Python/codegen.c Outdated
@@ -792,6 +792,17 @@ codegen_process_deferred_annotations(compiler *c, location loc)
return SUCCESS;
}

instr_sequence *old_instr_seq = INSTR_SEQUENCE(c);
instr_sequence *nested_instr_seq = NULL;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason why we can't reorder the code in this function so that we first emit the instructions for annotations and then for the code (so we don't need to do this injection)?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I see, it's stuff from outside this function.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I considered an alternative where we generate the code first, then later edit the LOAD_CONST instruction to the correct code object we generate later, but that seemed like it'd be worse.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we could merge the annotation code into the main code in _PyCfg_OptimizedCfgToInstructionSequence.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or rather, _PyCfg_FromInstructionSequence.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done! Wasn't too bad after all.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool. So now we shouldn't need _PyCompile_SetInstrSequence and nested_instr_seq, right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We still need it right now because we need to make the extra instrseq that we can later stitch in the right place. The only other way to set the compiler's instrseq appears to be _PyCompile_EnterScope, but we don't want to enter a new scope here; this code is in the parent scope.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. I think I would have moved this into the compiler. Instead of adding _PyCompile_SetInstrSequence, add functions to "begin annotations block" and "end annotations block" and let the compiler manage its data structures by itself.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@AlexWaygood AlexWaygood removed their request for review April 10, 2025 10:49
if (nested_instr_seq == NULL) {
bool need_separate_block = scope_type == COMPILE_SCOPE_MODULE;
if (need_separate_block) {
if (_PyCompile_StartAnnotationSetup(c) == ERROR) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great.

We could go one step further: Call these functions _PyCompile_EnterAnnotationsSubscope/_PyCompile_ExitAnnotationsSubscope and let the compiler check whether we are in COMPILE_SCOPE_MODULE (so need to stash) or not. Up to you.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll likely drop the module condition soon for another (more minor) issue, so I'm going to skip that for now. Thanks!

@JelleZijlstra JelleZijlstra merged commit 922049b into python:main Apr 28, 2025
66 of 68 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants