Skip to content

bpo-46409: Make generators in bytecode #30633

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 20 commits into from
Jan 20, 2022
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
20 commits
Select commit Hold shift + click to select a range
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
15 changes: 15 additions & 0 deletions Doc/library/dis.rst
Original file line number Diff line number Diff line change
Expand Up @@ -942,6 +942,13 @@ All of the following opcodes use their arguments.
Set bytecode counter to *target*.


.. opcode:: JUMP_NO_INTERRUPT (target)

Set bytecode counter to *target*. Do not check for interrupts.

.. versionadded:: 3.11


.. opcode:: FOR_ITER (delta)

TOS is an :term:`iterator`. Call its :meth:`~iterator.__next__` method. If
Expand Down Expand Up @@ -1220,6 +1227,14 @@ All of the following opcodes use their arguments.
.. versionadded:: 3.11


.. opcode:: RETURN_GENERATOR

Create a generator, coroutine, or async generator from the current frame.
Clear the current frame and return the newly created generator.

.. versionadded:: 3.11


.. opcode:: HAVE_ARGUMENT

This is not really an opcode. It identifies the dividing line between
Expand Down
1 change: 0 additions & 1 deletion Include/cpython/genobject.h
Original file line number Diff line number Diff line change
Expand Up @@ -13,7 +13,6 @@ extern "C" {
and coroutine objects. */
#define _PyGenObject_HEAD(prefix) \
PyObject_HEAD \
/* Note: gi_frame can be NULL if the generator is "finished" */ \
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a question about gi_iframe. Below (L31) it is defined as an array of length 1 of object pointers. But everywhere it's used, it is cast to InterpreterFrame *. That's not an object or object pointer AFAICT. So what's going on here? Is the declaration wrong? An intentional lie?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is a lie.

We don't want to expose InterpreterFrame in public headers, so we can't do the sensible thing and declare gi_frame as:

    InterpreterFrame gi_frame;

as C won't allow incomplete types in structs, even as the last member.

We could improve this by breaking up the header, so the public API sees a different definition. Still a lie, but a more elegant one.
Public header:

typedef struct {
    _PyGenObject_HEAD(gi)
} PyGenObject;

Private header:

typedef struct {
    _PyGenObject_HEAD(gi)
    InterpreterFrame gi_frame;
} PyGenObject;

Probably best done in a different PR, though.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gotcha. Any type other than PyObject* would be better though :-). And if you replace the casts with a macro it’s easier to fix later. I have an idea for the macro but it’s too painful to type on a phone.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The first field of InterpreterFrame is PyFunctionObject * and the first few fields are all PyObject *, so declaring gi_frame as PyObject * gi_frame[1] seemed safe. Do let me know what your macro is, once you at a PC.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do let me know what your macro is, once you at a PC.

See https://bugs.python.org/issue40120#msg365465

/* The code object backing the generator */ \
PyCodeObject *prefix##_code; \
/* List of weak reference. */ \
Expand Down
4 changes: 2 additions & 2 deletions Include/internal/pycore_frame.h
Original file line number Diff line number Diff line change
Expand Up @@ -41,12 +41,12 @@ typedef struct _interpreter_frame {
PyObject *f_locals; /* Strong reference, may be NULL */
PyCodeObject *f_code; /* Strong reference */
PyFrameObject *frame_obj; /* Strong reference, may be NULL */
PyObject *generator; /* Borrowed reference, may be NULL */
struct _interpreter_frame *previous;
int f_lasti; /* Last instruction if called */
int stacktop; /* Offset of TOS from localsplus */
PyFrameState f_state; /* What state the frame is in */
bool is_entry; // Whether this is the "root" frame for the current CFrame.
bool is_generator;
PyObject *localsplus[1];
} InterpreterFrame;

Expand Down Expand Up @@ -100,10 +100,10 @@ _PyFrame_InitializeSpecials(
frame->f_locals = Py_XNewRef(locals);
frame->stacktop = nlocalsplus;
frame->frame_obj = NULL;
frame->generator = NULL;
frame->f_lasti = -1;
frame->f_state = FRAME_CREATED;
frame->is_entry = false;
frame->is_generator = false;
}

/* Gets the pointer to the locals array
Expand Down
28 changes: 15 additions & 13 deletions Include/opcode.h

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

3 changes: 2 additions & 1 deletion Lib/importlib/_bootstrap_external.py
Original file line number Diff line number Diff line change
Expand Up @@ -380,6 +380,7 @@ def _write_atomic(path, data, mode=0o666):
# Python 3.11a4 3472 (bpo-46009: replace GEN_START with POP_TOP)
# Python 3.11a4 3473 (Add POP_JUMP_IF_NOT_NONE/POP_JUMP_IF_NONE opcodes)
# Python 3.11a4 3474 (Add RESUME opcode)
# Python 3.11a5 3475 (Add RETURN_GENERATOR opcode)

# Python 3.12 will start with magic number 3500

Expand All @@ -393,7 +394,7 @@ def _write_atomic(path, data, mode=0o666):
# Whenever MAGIC_NUMBER is changed, the ranges in the magic_values array
# in PC/launcher.c must also be updated.

MAGIC_NUMBER = (3474).to_bytes(2, 'little') + b'\r\n'
MAGIC_NUMBER = (3475).to_bytes(2, 'little') + b'\r\n'
_RAW_MAGIC_NUMBER = int.from_bytes(MAGIC_NUMBER, 'little') # For import.c

_PYCACHE = '__pycache__'
Expand Down
12 changes: 6 additions & 6 deletions Lib/inspect.py
Original file line number Diff line number Diff line change
Expand Up @@ -1819,11 +1819,11 @@ def getgeneratorstate(generator):
"""
if generator.gi_running:
return GEN_RUNNING
if generator.gi_suspended:
return GEN_SUSPENDED
if generator.gi_frame is None:
return GEN_CLOSED
if generator.gi_frame.f_lasti == -1:
return GEN_CREATED
return GEN_SUSPENDED
return GEN_CREATED


def getgeneratorlocals(generator):
Expand Down Expand Up @@ -1861,11 +1861,11 @@ def getcoroutinestate(coroutine):
"""
if coroutine.cr_running:
return CORO_RUNNING
if coroutine.cr_suspended:
return CORO_SUSPENDED
if coroutine.cr_frame is None:
return CORO_CLOSED
if coroutine.cr_frame.f_lasti == -1:
return CORO_CREATED
return CORO_SUSPENDED
return CORO_CREATED


def getcoroutinelocals(coroutine):
Expand Down
3 changes: 2 additions & 1 deletion Lib/opcode.py
Original file line number Diff line number Diff line change
Expand Up @@ -94,6 +94,7 @@ def jabs_op(name, op):

def_op('GET_AWAITABLE', 73)
def_op('LOAD_ASSERTION_ERROR', 74)
def_op('RETURN_GENERATOR', 75)

def_op('LIST_TO_TUPLE', 82)
def_op('RETURN_VALUE', 83)
Expand Down Expand Up @@ -155,7 +156,7 @@ def jabs_op(name, op):

def_op('MAKE_FUNCTION', 132) # Flags
def_op('BUILD_SLICE', 133) # Number of items

jabs_op('JUMP_NO_INTERRUPT', 134) # Target byte offset from beginning of code
def_op('MAKE_CELL', 135)
hasfree.append(135)
def_op('LOAD_CLOSURE', 136)
Expand Down
4 changes: 2 additions & 2 deletions Lib/test/test_compile.py
Original file line number Diff line number Diff line change
Expand Up @@ -954,7 +954,7 @@ def return_genexp():
x
in
y)
genexp_lines = [None, 1, 3, 1]
genexp_lines = [1, 3, 1]

genexp_code = return_genexp.__code__.co_consts[1]
code_lines = [ None if line is None else line-return_genexp.__code__.co_firstlineno
Expand All @@ -967,7 +967,7 @@ async def test(aseq):
async for i in aseq:
body

expected_lines = [None, 0, 1, 2, 1]
expected_lines = [0, 1, 2, 1]
code_lines = [ None if line is None else line-test.__code__.co_firstlineno
for (_, _, line) in test.__code__.co_lines() ]
self.assertEqual(expected_lines, code_lines)
Expand Down
2 changes: 1 addition & 1 deletion Lib/test/test_generators.py
Original file line number Diff line number Diff line change
Expand Up @@ -897,7 +897,7 @@ def b():
>>> type(i)
<class 'generator'>
>>> [s for s in dir(i) if not s.startswith('_')]
['close', 'gi_code', 'gi_frame', 'gi_running', 'gi_yieldfrom', 'send', 'throw']
['close', 'gi_code', 'gi_frame', 'gi_running', 'gi_suspended', 'gi_yieldfrom', 'send', 'throw']
>>> from test.support import HAVE_DOCSTRINGS
>>> print(i.__next__.__doc__ if HAVE_DOCSTRINGS else 'Implement next(self).')
Implement next(self).
Expand Down
4 changes: 2 additions & 2 deletions Lib/test/test_sys.py
Original file line number Diff line number Diff line change
Expand Up @@ -1386,7 +1386,7 @@ class C(object): pass
def func():
return sys._getframe()
x = func()
check(x, size('3Pi3c8P2ic?P'))
check(x, size('3Pi3c7P2ic??P'))
# function
def func(): pass
check(func, size('14Pi'))
Expand All @@ -1403,7 +1403,7 @@ def bar(cls):
check(bar, size('PP'))
# generator
def get_gen(): yield 1
check(get_gen(), size('P2P4P4c8P2ic?P'))
check(get_gen(), size('P2P4P4c7P2ic??P'))
# iterator
check(iter('abc'), size('lP'))
# callable-iterator
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,6 @@
Add new ``RETURN_GENERATOR`` bytecode to make generators.
Simplifies calling Python functions in the VM, as they no
longer any need to special case generator functions.

Also add ``JUMP_NO_INTERRUPT`` bytecode that acts like
``JUMP_ABSOLUTE``, but does not check for interrupts.
10 changes: 7 additions & 3 deletions Objects/frameobject.c
Original file line number Diff line number Diff line change
Expand Up @@ -242,6 +242,7 @@ mark_stacks(PyCodeObject *code_obj, int len)
break;
}
case JUMP_ABSOLUTE:
case JUMP_NO_INTERRUPT:
j = get_arg(code, i);
assert(j < len);
if (stacks[j] == UNINITIALIZED && j < i) {
Expand Down Expand Up @@ -625,7 +626,7 @@ frame_dealloc(PyFrameObject *f)
{
/* It is the responsibility of the owning generator/coroutine
* to have cleared the generator pointer */
assert(f->f_frame->generator == NULL);
assert(!f->f_frame->is_generator);

if (_PyObject_GC_IS_TRACKED(f)) {
_PyObject_GC_UNTRACK(f);
Expand Down Expand Up @@ -698,8 +699,11 @@ frame_clear(PyFrameObject *f, PyObject *Py_UNUSED(ignored))
"cannot clear an executing frame");
return NULL;
}
if (f->f_frame->generator) {
_PyGen_Finalize(f->f_frame->generator);
if (f->f_frame->is_generator) {
assert(!f->f_owns_frame);
size_t offset_in_gen = offsetof(PyGenObject, gi_iframe);
PyObject *gen = (PyObject *)(((char *)f->f_frame) - offset_in_gen);
_PyGen_Finalize(gen);
}
(void)frame_tp_clear(f);
Py_RETURN_NONE;
Expand Down
Loading