Why do we have two counters for function versions? #116916

Closed
@gvanrossum

Description

In the interpreter state there are two separate counters that are used to generate unique versions for code and function objects. Both counters share the following properties:

  • The counter is an unsigned 32-bit int initialized to 1
  • When a new version is needed, the current value of the counter is used, and unless the counter is zero, the counter is incremented
  • When the counter overflows to 0, it remains zero, and no new versions can be allocated (in this case the specializer will just give up -- hopefully this never happens in real code, creating four billion code objects is just hard to imagine)
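The counter behaviour described above can be modelled in a few lines. This is only a sketch of the semantics as stated in this issue, not the actual C code; the class name `VersionCounter` and its `allocate()` method are hypothetical.

```python
UINT32_MAX = 0xFFFFFFFF  # the real counters are unsigned 32-bit ints


class VersionCounter:
    """Hypothetical model of one of the two version counters."""

    def __init__(self, start=1):
        # Counters are initialized to 1; 0 is reserved for "no version".
        self.next_version = start & UINT32_MAX

    def allocate(self):
        # Hand out the current value; increment only if it is non-zero.
        version = self.next_version
        if version != 0:
            # Wrapping from UINT32_MAX yields 0, after which the counter
            # stays at 0 and no further versions can be allocated.
            self.next_version = (version + 1) & UINT32_MAX
        return version


fresh = VersionCounter()
first, second = fresh.allocate(), fresh.allocate()

exhausted = VersionCounter(start=UINT32_MAX)
last_valid = exhausted.allocate()   # the final usable version
after_overflow = exhausted.allocate()  # counter has wrapped to 0
still_zero = exhausted.allocate()   # and it stays there
```

A return value of 0 corresponds to the case where the specializer gives up.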

There are two ways that a function object obtains a version:

  • The usual path is when a function object is created by the MAKE_FUNCTION bytecode. In this case it gets the version from the code object. (See below.)
  • When an assignment is made to func.__code__ or a few other attributes, the function version is reset, and if the specializer later needs a function version, a new version number is taken from interp->func_state.next_version by _PyFunction_GetVersionForCurrentState().

Code objects obtain their version from interp->next_func_version when they are created.

I believe there's a scenario where two unrelated function objects could end up with the same version:

  • one function is created with a version derived from its code object, i.e. from interp->next_func_version;
  • the other has had its __code__ attribute modified, and then obtains a fresh version from interp->func_state.next_version, which happens to equal a value already handed out by the other counter. This should be possible because the two counters are managed independently.
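To make the scenario concrete, here is a simplified model of the two counters and the two version paths. It is a sketch based only on the description in this issue: the classes `Code` and `Function` and the method `set_code()` are hypothetical stand-ins, overflow handling is omitted, and function versions are not observable from Python code in a real interpreter.

```python
from itertools import count

# Hypothetical stand-ins for the two independent counters.
code_versions = count(1)  # models interp->next_func_version
func_versions = count(1)  # models interp->func_state.next_version


class Code:
    def __init__(self):
        # Code objects take a version at creation time.
        self.version = next(code_versions)


class Function:
    def __init__(self, code):
        # MAKE_FUNCTION path: the version is copied from the code object.
        self.code = code
        self.version = code.version

    def set_code(self, code):
        # Models assignment to func.__code__: the version is reset.
        self.code = code
        self.version = 0

    def get_version_for_current_state(self):
        # Models _PyFunction_GetVersionForCurrentState(): allocate lazily
        # from the *other* counter if the version was reset.
        if self.version == 0:
            self.version = next(func_versions)
        return self.version


f = Function(Code())  # version 1, derived from its code object
g = Function(Code())  # version 2, derived from its code object
g.set_code(Code())    # version reset to 0

f_version = f.get_version_for_current_state()
g_version = g.get_version_for_current_state()
```

In this model both `f_version` and `g_version` end up as 1 even though `f` and `g` are unrelated, which is the collision described above.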

I looked a bit into the history, and it looks like in 3.11 there was only a static global next_func_version in funcobject.c, which was used by _PyFunction_GetVersionForCurrentState(). In 3.12 the current structure exists, except next_func_version is still a static global (func_state however is an interpreter member). It seems that this was introduced by gh-98525 (commit: fb713b2). Was it intentional to use two separate counters? If so, what was the rationale? (If it had to do with deepfreeze, that rationale is now gone, as we no longer use it.)
