bpo-46841: Use *inline* caching for BINARY_OP #31543

Merged
merged 8 commits into from
Feb 25, 2022
35 changes: 28 additions & 7 deletions Doc/library/dis.rst
@@ -24,6 +24,12 @@ interpreter.
Use 2 bytes for each instruction. Previously the number of bytes varied
by instruction.

+.. versionchanged:: 3.11

Member:
Maybe don't document this until we are sure it's what we want?

Member Author:
I'd prefer to keep it, and change it if/when dis changes (which is easy enough). That way we don't forget to document it.

Member:
OK. Once we've started doing this we will need to complete it before the beta release anyway.

+Some instructions are accompanied by one or more inline cache entries,
+which take the form of :opcode:`CACHE` instructions. These instructions
+are hidden by default, but can be shown by passing ``show_caches=True`` to
+any :mod:`dis` utility.
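
A minimal sketch of the behaviour the added paragraph describes: the same function is disassembled twice, once with cache entries hidden and once with them shown. The function `myfunc` here is a hypothetical stand-in (the documentation's own `myfunc` is not part of this hunk), and the `inspect.signature` guard is an assumption added so the sketch also runs on interpreters that predate the `show_caches` parameter.

```python
import dis
import inspect
import io


def myfunc(a, b):
    # Hypothetical stand-in; the docs' own myfunc is not shown in this hunk.
    return a + b


def disassemble(func, show_caches=False):
    """Return the dis.dis() listing for func as a string."""
    kwargs = {}
    # Pass show_caches only when this interpreter's dis.dis() accepts it.
    if "show_caches" in inspect.signature(dis.dis).parameters:
        kwargs["show_caches"] = show_caches
    buffer = io.StringIO()
    dis.dis(func, file=buffer, **kwargs)
    return buffer.getvalue()


hidden = disassemble(myfunc)
shown = disassemble(myfunc, show_caches=True)
print(shown)
```

On an interpreter with inline caches, `shown` is expected to contain extra `CACHE` rows after instructions such as `BINARY_OP`; the exact listing varies between releases, so no output is reproduced here.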


Example: Given the function :func:`myfunc`::

@@ -54,7 +60,7 @@ The bytecode analysis API allows pieces of Python code to be wrapped in a
:class:`Bytecode` object that provides easy access to details of the compiled
code.

-.. class:: Bytecode(x, *, first_line=None, current_offset=None)
+.. class:: Bytecode(x, *, first_line=None, current_offset=None, show_caches=False)


Analyse the bytecode corresponding to a function, generator, asynchronous
@@ -74,7 +80,7 @@ code.
disassembled code. Setting this means :meth:`.dis` will display a "current
instruction" marker against the specified opcode.

-.. classmethod:: from_traceback(tb)
+.. classmethod:: from_traceback(tb, *, show_caches=False)

Construct a :class:`Bytecode` instance from the given traceback, setting
*current_offset* to the instruction responsible for the exception.
@@ -100,6 +106,9 @@ code.
.. versionchanged:: 3.7
This can now handle coroutine and asynchronous generator objects.

+.. versionchanged:: 3.11
+Added the ``show_caches`` parameter.

Example::

>>> bytecode = dis.Bytecode(myfunc)
@@ -153,7 +162,7 @@ operation is being performed, so the intermediate analysis object isn't useful:
Added *file* parameter.


-.. function:: dis(x=None, *, file=None, depth=None)
+.. function:: dis(x=None, *, file=None, depth=None, show_caches=False)

Disassemble the *x* object. *x* can denote either a module, a class, a
method, a function, a generator, an asynchronous generator, a coroutine,
@@ -183,8 +192,11 @@ operation is being performed, so the intermediate analysis object isn't useful:
.. versionchanged:: 3.7
This can now handle coroutine and asynchronous generator objects.

+.. versionchanged:: 3.11
+Added the ``show_caches`` parameter.


-.. function:: distb(tb=None, *, file=None)
+.. function:: distb(tb=None, *, file=None, show_caches=False)

Disassemble the top-of-stack function of a traceback, using the last
traceback if none was passed. The instruction causing the exception is
@@ -196,9 +208,12 @@ operation is being performed, so the intermediate analysis object isn't useful:
.. versionchanged:: 3.4
Added *file* parameter.

+.. versionchanged:: 3.11
+Added the ``show_caches`` parameter.


-.. function:: disassemble(code, lasti=-1, *, file=None)
-disco(code, lasti=-1, *, file=None)
+.. function:: disassemble(code, lasti=-1, *, file=None, show_caches=False)
+disco(code, lasti=-1, *, file=None, show_caches=False)

Disassemble a code object, indicating the last instruction if *lasti* was
provided. The output is divided in the following columns:
@@ -220,8 +235,11 @@ operation is being performed, so the intermediate analysis object isn't useful:
.. versionchanged:: 3.4
Added *file* parameter.

+.. versionchanged:: 3.11
+Added the ``show_caches`` parameter.


-.. function:: get_instructions(x, *, first_line=None)
+.. function:: get_instructions(x, *, first_line=None, show_caches=False)

Return an iterator over the instructions in the supplied function, method,
source code string or code object.
@@ -236,6 +254,9 @@ operation is being performed, so the intermediate analysis object isn't useful:

.. versionadded:: 3.4

+.. versionchanged:: 3.11
+Added the ``show_caches`` parameter.
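
As a rough illustration of the new `get_instructions` parameter, the sketch below collects opcode names with and without `show_caches` and counts the `CACHE` entries. The helper `opnames` and the function `add` are hypothetical names introduced only for this example, and the signature guard and warning filter are assumptions added so the sketch stays runnable on releases where the parameter is missing or behaves differently.

```python
import dis
import inspect
import warnings


def add(a, b):
    # Hypothetical example function containing one binary operation.
    return a + b


def opnames(func, show_caches=False):
    """Return the opcode names that dis.get_instructions() yields for func."""
    kwargs = {}
    # Pass show_caches only when this interpreter accepts it.
    if "show_caches" in inspect.signature(dis.get_instructions).parameters:
        kwargs["show_caches"] = show_caches
    with warnings.catch_warnings():
        # Other releases may warn about this flag; keep the sketch quiet.
        warnings.simplefilter("ignore")
        return [ins.opname for ins in dis.get_instructions(func, **kwargs)]


plain = opnames(add)
with_caches = opnames(add, show_caches=True)
cache_count = with_caches.count("CACHE")
print(f"{cache_count} CACHE entries out of {len(with_caches)} instructions")
```

Filtering the `CACHE` names out of `with_caches` should give back `plain`, which is a quick way to check that the flag only adds entries and never reorders the real instructions.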


.. function:: findlinestarts(code)

2 changes: 1 addition & 1 deletion Include/cpython/code.h
@@ -5,7 +5,7 @@
/* Each instruction in a code object is a fixed-width value,
* currently 2 bytes: 1-byte opcode + 1-byte oparg. The EXTENDED_ARG
* opcode allows for larger values but the current limit is 3 uses
- * of EXTENDED_ARG (see Python/wordcode_helpers.h), for a maximum
+ * of EXTENDED_ARG (see Python/compile.c), for a maximum
* 32-bit value. This aligns with the note in Python/compile.c
* (compiler_addop_i_line) indicating that the max oparg value is
* 2**32 - 1, rather than INT_MAX.
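
The arithmetic behind the comment above can be sketched in Python. This is not code from the PR: `combine_oparg` is a hypothetical helper, and it assumes that each `EXTENDED_ARG` prefix contributes the next more significant byte of the final oparg, which is what a 1-byte oparg plus at most three prefixes adding up to a 32-bit value implies.

```python
def combine_oparg(arg_bytes):
    """Fold 1 to 4 one-byte opargs into a single value.

    arg_bytes lists the EXTENDED_ARG opargs first (most significant byte
    first) and ends with the oparg of the real instruction.
    """
    if not 1 <= len(arg_bytes) <= 4:
        raise ValueError("expected 1 to 4 bytes (at most 3 EXTENDED_ARG prefixes)")
    value = 0
    for byte in arg_bytes:
        if not 0 <= byte <= 0xFF:
            raise ValueError("each oparg must fit in one byte")
        # Shift the bytes seen so far up and append the next one.
        value = (value << 8) | byte
    return value


largest = combine_oparg([0xFF, 0xFF, 0xFF, 0xFF])
print(largest == 2**32 - 1)  # prints True
```

With three prefixes and every byte set to `0xFF`, the result is `2**32 - 1`, matching the maximum stated in the comment.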
9 changes: 8 additions & 1 deletion Include/internal/pycore_code.h
@@ -64,6 +64,13 @@ typedef union {

#define INSTRUCTIONS_PER_ENTRY (sizeof(SpecializedCacheEntry)/sizeof(_Py_CODEUNIT))

+typedef struct {
+    _Py_CODEUNIT counter;
+} _PyBinaryOpCache;
+
+#define INLINE_CACHE_ENTRIES_BINARY_OP \
+    (sizeof(_PyBinaryOpCache) / sizeof(_Py_CODEUNIT))

/* Maximum size of code to quicken, in code units. */
#define MAX_SIZE_TO_QUICKEN 5000
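
To make the `INLINE_CACHE_ENTRIES_BINARY_OP` arithmetic concrete, here is a Python-side model built with `ctypes`. It is only a sketch: `CodeUnit` and `BinaryOpCache` are hypothetical stand-ins for `_Py_CODEUNIT` and `_PyBinaryOpCache`, and the 16-bit width of a code unit is an assumption drawn from the "2 bytes per instruction" note in `Include/cpython/code.h`.

```python
import ctypes

# Assumed stand-in for _Py_CODEUNIT: one 2-byte code unit.
CodeUnit = ctypes.c_uint16


class BinaryOpCache(ctypes.Structure):
    # Models the single-field _PyBinaryOpCache struct added in this hunk.
    _fields_ = [("counter", CodeUnit)]


# Same division as the C macro: cache size measured in code units.
INLINE_CACHE_ENTRIES_BINARY_OP = ctypes.sizeof(BinaryOpCache) // ctypes.sizeof(CodeUnit)
print(INLINE_CACHE_ENTRIES_BINARY_OP)  # prints 1
```

Expressing the count as a `sizeof` ratio means that adding a field to the struct would raise the number of cache entries without anyone having to update a hard-coded constant.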

@@ -276,7 +283,7 @@ int _Py_Specialize_Call(PyObject *callable, _Py_CODEUNIT *instr, int nargs,
int _Py_Specialize_Precall(PyObject *callable, _Py_CODEUNIT *instr, int nargs,
PyObject *kwnames, SpecializedCacheEntry *cache, PyObject *builtins);
void _Py_Specialize_BinaryOp(PyObject *lhs, PyObject *rhs, _Py_CODEUNIT *instr,
-SpecializedCacheEntry *cache);
+int oparg);
void _Py_Specialize_CompareOp(PyObject *lhs, PyObject *rhs, _Py_CODEUNIT *instr, SpecializedCacheEntry *cache);
void _Py_Specialize_UnpackSequence(PyObject *seq, _Py_CODEUNIT *instr,
SpecializedCacheEntry *cache);
145 changes: 75 additions & 70 deletions Include/opcode.h

Some generated files are not rendered by default.