Commit 404bca3

brandtbucher authored and asvetlov committed
bpo-46841: Use *inline* caching for BINARY_OP (GH-31543)
1 parent ed1e6d6 commit 404bca3

19 files changed: 429 additions and 351 deletions

Doc/library/dis.rst

+28 -7
@@ -24,6 +24,12 @@ interpreter.
 Use 2 bytes for each instruction. Previously the number of bytes varied
 by instruction.
 
+.. versionchanged:: 3.11
+Some instructions are accompanied by one or more inline cache entries,
+which take the form of :opcode:`CACHE` instructions. These instructions
+are hidden by default, but can be shown by passing ``show_caches=True`` to
+any :mod:`dis` utility.
+
 
 Example: Given the function :func:`myfunc`::
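Reviewer note: the ``show_caches`` flag documented in this hunk can be tried with a short sketch like the one below. The function ``add`` and the helper ``disassemble_text`` are hypothetical names chosen for illustration. The helper probes for the parameter instead of assuming it, because ``show_caches`` only exists on Python 3.11 and later, and the exact listing differs between interpreter versions.

```python
import dis
import inspect
import io


def add(a, b):
    # Hypothetical sample function; its body compiles to a binary-add instruction.
    return a + b


def disassemble_text(func, show_caches=True):
    """Return the dis.dis() listing for func as a string.

    CACHE entries are requested only when this interpreter's dis.dis()
    accepts the show_caches keyword (Python 3.11+).
    """
    kwargs = {}
    if "show_caches" in inspect.signature(dis.dis).parameters:
        kwargs["show_caches"] = show_caches
    buffer = io.StringIO()
    dis.dis(func, file=buffer, **kwargs)
    return buffer.getvalue()


listing = disassemble_text(add)
print(listing)
```

On an interpreter that includes this change, the listing should contain ``CACHE`` lines after the binary-operation instruction; on older versions the helper falls back to the plain listing.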

@@ -54,7 +60,7 @@ The bytecode analysis API allows pieces of Python code to be wrapped in a
 :class:`Bytecode` object that provides easy access to details of the compiled
 code.
 
-.. class:: Bytecode(x, *, first_line=None, current_offset=None)
+.. class:: Bytecode(x, *, first_line=None, current_offset=None, show_caches=False)
 
 
 Analyse the bytecode corresponding to a function, generator, asynchronous
@@ -74,7 +80,7 @@ code.
 disassembled code. Setting this means :meth:`.dis` will display a "current
 instruction" marker against the specified opcode.
 
-.. classmethod:: from_traceback(tb)
+.. classmethod:: from_traceback(tb, *, show_caches=False)
 
 Construct a :class:`Bytecode` instance from the given traceback, setting
 *current_offset* to the instruction responsible for the exception.
@@ -100,6 +106,9 @@ code.
 .. versionchanged:: 3.7
 This can now handle coroutine and asynchronous generator objects.
 
+.. versionchanged:: 3.11
+Added the ``show_caches`` parameter.
+
 Example::
 
 >>> bytecode = dis.Bytecode(myfunc)
@@ -153,7 +162,7 @@ operation is being performed, so the intermediate analysis object isn't useful:
 Added *file* parameter.
 
 
-.. function:: dis(x=None, *, file=None, depth=None)
+.. function:: dis(x=None, *, file=None, depth=None, show_caches=False)
 
 Disassemble the *x* object. *x* can denote either a module, a class, a
 method, a function, a generator, an asynchronous generator, a coroutine,
@@ -183,8 +192,11 @@ operation is being performed, so the intermediate analysis object isn't useful:
 .. versionchanged:: 3.7
 This can now handle coroutine and asynchronous generator objects.
 
+.. versionchanged:: 3.11
+Added the ``show_caches`` parameter.
+
 
-.. function:: distb(tb=None, *, file=None)
+.. function:: distb(tb=None, *, file=None, show_caches=False)
 
 Disassemble the top-of-stack function of a traceback, using the last
 traceback if none was passed. The instruction causing the exception is
@@ -196,9 +208,12 @@ operation is being performed, so the intermediate analysis object isn't useful:
 .. versionchanged:: 3.4
 Added *file* parameter.
 
+.. versionchanged:: 3.11
+Added the ``show_caches`` parameter.
+
 
-.. function:: disassemble(code, lasti=-1, *, file=None)
-disco(code, lasti=-1, *, file=None)
+.. function:: disassemble(code, lasti=-1, *, file=None, show_caches=False)
+disco(code, lasti=-1, *, file=None, show_caches=False)
 
 Disassemble a code object, indicating the last instruction if *lasti* was
 provided. The output is divided in the following columns:
@@ -220,8 +235,11 @@ operation is being performed, so the intermediate analysis object isn't useful:
 .. versionchanged:: 3.4
 Added *file* parameter.
 
+.. versionchanged:: 3.11
+Added the ``show_caches`` parameter.
+
 
-.. function:: get_instructions(x, *, first_line=None)
+.. function:: get_instructions(x, *, first_line=None, show_caches=False)
 
 Return an iterator over the instructions in the supplied function, method,
 source code string or code object.
@@ -236,6 +254,9 @@ operation is being performed, so the intermediate analysis object isn't useful:
 
 .. versionadded:: 3.4
 
+.. versionchanged:: 3.11
+Added the ``show_caches`` parameter.
+
 
 .. function:: findlinestarts(code)
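Reviewer note: the ``show_caches`` parameter added to ``get_instructions`` above also makes it possible to count how many inline cache entries follow each instruction. The sketch below is illustrative only; ``add`` and ``cache_counts`` are hypothetical names, and the helper passes ``show_caches`` only where the running interpreter accepts it. On versions whose iterator does not yield separate ``CACHE`` items, every count is simply zero.

```python
import dis
import inspect


def add(a, b):
    # Hypothetical sample function containing one binary operation.
    return a + b


def cache_counts(func):
    """Return (offset, opname, cache_entries) for each non-CACHE instruction."""
    kwargs = {}
    # show_caches was added in Python 3.11; pass it only where it exists.
    if "show_caches" in inspect.signature(dis.get_instructions).parameters:
        kwargs["show_caches"] = True
    rows = []
    for instr in dis.get_instructions(func, **kwargs):
        if instr.opname == "CACHE" and rows:
            # Attribute the cache entry to the instruction that precedes it.
            offset, opname, count = rows[-1]
            rows[-1] = (offset, opname, count + 1)
        else:
            rows.append((instr.offset, instr.opname, 0))
    return rows


rows = cache_counts(add)
for offset, opname, count in rows:
    print(f"{offset:>4} {opname:<20} cache entries: {count}")
```

Keying the rows by offset rather than by opcode name keeps repeated opcodes separate, so each occurrence reports its own cache size.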

Include/cpython/code.h

+1 -1
@@ -5,7 +5,7 @@
 /* Each instruction in a code object is a fixed-width value,
  * currently 2 bytes: 1-byte opcode + 1-byte oparg. The EXTENDED_ARG
  * opcode allows for larger values but the current limit is 3 uses
- * of EXTENDED_ARG (see Python/wordcode_helpers.h), for a maximum
+ * of EXTENDED_ARG (see Python/compile.c), for a maximum
  * 32-bit value. This aligns with the note in Python/compile.c
  * (compiler_addop_i_line) indicating that the max oparg value is
  * 2**32 - 1, rather than INT_MAX.
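Reviewer note: the arithmetic in the comment above (a 1-byte oparg extended by at most 3 ``EXTENDED_ARG`` prefixes, giving a 32-bit maximum) can be checked with a small sketch. ``combine_oparg`` is a hypothetical helper written for this note, not a CPython function; it assumes each prefix contributes one byte, most significant first.

```python
def combine_oparg(extended_bytes, final_byte):
    """Fold up to three EXTENDED_ARG bytes and the final oparg byte into one value."""
    if len(extended_bytes) > 3:
        raise ValueError("at most 3 EXTENDED_ARG prefixes are allowed")
    value = 0
    for byte in (*extended_bytes, final_byte):
        if not 0 <= byte <= 0xFF:
            raise ValueError("each oparg is a single byte")
        # Each byte shifts the accumulated value left by 8 bits.
        value = (value << 8) | byte
    return value


small = combine_oparg([], 0x2A)
one_prefix = combine_oparg([0x01], 0x00)
largest = combine_oparg([0xFF, 0xFF, 0xFF], 0xFF)
print(small, one_prefix, largest)
```

With three prefixes and all bytes set to ``0xFF``, the result is ``2**32 - 1``, which matches the limit stated in the comment.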

Include/internal/pycore_code.h

+8 -1
@@ -64,6 +64,13 @@ typedef union {
 
 #define INSTRUCTIONS_PER_ENTRY (sizeof(SpecializedCacheEntry)/sizeof(_Py_CODEUNIT))
 
+typedef struct {
+    _Py_CODEUNIT counter;
+} _PyBinaryOpCache;
+
+#define INLINE_CACHE_ENTRIES_BINARY_OP \
+    (sizeof(_PyBinaryOpCache) / sizeof(_Py_CODEUNIT))
+
 /* Maximum size of code to quicken, in code units. */
 #define MAX_SIZE_TO_QUICKEN 5000
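Reviewer note: the ``INLINE_CACHE_ENTRIES_BINARY_OP`` macro above is a size ratio, the cache struct measured in code units. The sketch below mirrors that computation with ``ctypes``. It assumes ``_Py_CODEUNIT`` is a 16-bit unsigned integer, consistent with the 2-byte instruction width described in ``Include/cpython/code.h``; the Python names ``CodeUnit`` and ``BinaryOpCache`` are stand-ins invented for this note.

```python
import ctypes

# Assumption: a code unit is 16 bits wide (1-byte opcode + 1-byte oparg).
CodeUnit = ctypes.c_uint16


class BinaryOpCache(ctypes.Structure):
    """Stand-in for _PyBinaryOpCache: a single counter stored in one code unit."""

    _fields_ = [("counter", CodeUnit)]


# Same shape as the C macro: sizeof(cache struct) / sizeof(code unit).
inline_cache_entries_binary_op = ctypes.sizeof(BinaryOpCache) // ctypes.sizeof(CodeUnit)
print(inline_cache_entries_binary_op)
```

Under that assumption the ratio is 1, meaning a single ``CACHE`` code unit follows each ``BINARY_OP`` instruction.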

@@ -276,7 +283,7 @@ int _Py_Specialize_Call(PyObject *callable, _Py_CODEUNIT *instr, int nargs,
 int _Py_Specialize_Precall(PyObject *callable, _Py_CODEUNIT *instr, int nargs,
     PyObject *kwnames, SpecializedCacheEntry *cache, PyObject *builtins);
 void _Py_Specialize_BinaryOp(PyObject *lhs, PyObject *rhs, _Py_CODEUNIT *instr,
-    SpecializedCacheEntry *cache);
+    int oparg);
 void _Py_Specialize_CompareOp(PyObject *lhs, PyObject *rhs, _Py_CODEUNIT *instr, SpecializedCacheEntry *cache);
 void _Py_Specialize_UnpackSequence(PyObject *seq, _Py_CODEUNIT *instr,
     SpecializedCacheEntry *cache);

Include/opcode.h

+75 -70
Some generated files are not rendered by default.
