From 01bed137e425db966adb81e2ba00e0f5534ed45e Mon Sep 17 00:00:00 2001 From: Petr Viktorin Date: Wed, 6 Aug 2025 18:04:21 +0200 Subject: [PATCH 1/9] Initial rewording; add table of ASCII source characters Co-authored-by: Blaise Pabon Co-authored-by: Adam Turner <9087854+AA-Turner@users.noreply.github.com> --- Doc/reference/lexical_analysis.rst | 179 ++++++++++++++++++++++++++--- 1 file changed, 163 insertions(+), 16 deletions(-) diff --git a/Doc/reference/lexical_analysis.rst b/Doc/reference/lexical_analysis.rst index e320eedfa67a27..d60ea8608d6366 100644 --- a/Doc/reference/lexical_analysis.rst +++ b/Doc/reference/lexical_analysis.rst @@ -1351,27 +1351,25 @@ Formally, imaginary literals are described by the following lexical definition: imagnumber: (`floatnumber` | `digitpart`) ("j" | "J") +.. _delimiters: .. _operators: -Operators -========= +Operators and delimiters +======================== .. index:: single: operators -The following tokens are operators: +The following tokens are operators -- they can be put between two +:ref:`expressions ` (or in front of an expression) +to make a larger expression. .. code-block:: none - + - * ** / // % @ - << >> & | ^ ~ := + + - * ** / // % + << >> & | ^ ~ < > <= >= == != - - -.. _delimiters: - -Delimiters -========== + @ := . .. index:: single: delimiters @@ -1380,9 +1378,11 @@ The following tokens serve as delimiters in the grammar: .. code-block:: none ( ) [ ] { } - , : ! . ; @ = + , : ! ; = -> + . @ -The period can also occur in floating-point and imaginary literals. +The period can also occur in floating-point and imaginary literals; +the period and the ``@`` can also serve as operators. .. _lexical-ellipsis: @@ -1393,13 +1393,13 @@ A sequence of three periods has a special meaning as an ... -The following *augmented assignment operators* serve +The following :ref:`augmented assignment ` operators serve lexically as delimiters, but also perform an operation: .. 
code-block:: none - -> += -= *= /= //= %= - @= &= |= ^= >>= <<= **= + += -= *= **= /= //= %= + <<= >>= &= |= ^= @= The following printing ASCII characters have special meaning as part of other tokens or are otherwise significant to the lexical analyzer: @@ -1415,3 +1415,150 @@ occurrence outside string literals and comments is an unconditional error: $ ? ` + +Summary of source characters +============================ + +.. list-table:: + :widths: auto + :header-rows: 1 + + * * No. + * Symbol + * Meaning in Python + * * 0 + * ``'\x00'`` + * not a source character + * * 1-8 + * + * control characters + * * 9 + * ``'\t'`` + * whitespace + * * 10 + * ``'\n'`` + * newline + * * 11 + * + * control character + * * 12 + * ``'\f'`` + * whitespace + * * 13 + * ``'\r'`` + * newline + * * 14-31 + * + * control characters + * * 32 + * space + * whitespace + * * 33 + * ``!`` + * part of ``!=`` + * * 34 + * ``"`` + * string quote + * * 35 + * ``#`` + * comment + * * 36 + * ``$`` + * unused symbol + * * 37 + * ``%`` + * operator + * * 38 + * ``&`` + * operator + * * 39 + * ``'`` + * string quote + * * 40 + * ``(`` + * delimiter + * * 41 + * ``)`` + * delimiter + * * 42 + * ``*`` + * operator + * * 43 + * ``+`` + * operator + * * 44 + * ``,`` + * delimiter + * * 45 + * ``-`` + * operator + * * 46 + * ``.`` + * operator, delimiter, part of number syntax + * * 47 + * ``/`` + * operator + * * 48-57 + * ``0`` to ``9`` + * part of number syntax, part of name syntax + * * 58 + * ``:`` + * delimiter + * * 59 + * ``;`` + * delimiter + * * 60 + * ``<`` + * operator + * * 61 + * ``=`` + * operator + * * 62 + * ``>`` + * operator + * * 63 + * ``?`` + * unused symbol + * * 64 + * ``@`` + * operator, delimiter + * * 65-90 + * ``A`` to ``Z`` + * part of name syntax + * * 91 + * ``[`` + * delimiter + * * 92 + * ``\`` + * operator + * * 93 + * ``]`` + * delimiter + * * 94 + * ``^`` + * operator + * * 95 + * ``_`` + * part of name syntax + * * 96 + * .. 
this uses zero-width joiner characters to get a + literal backtick: + + ``‍`‍`` + + * unused symbol + * * 97-122 + * ``a`` to ``z`` + * part of name syntax + * * 123 + * ``{`` + * delimiter + * * 124 + * ``|`` + * operator + * * 125 + * ``}`` + * delimiter + * * 126 + * ``~`` + * operator From 5efbd335bd34d16dbd786878b75398c7bd51e094 Mon Sep 17 00:00:00 2001 From: Petr Viktorin Date: Wed, 13 Aug 2025 16:29:04 +0200 Subject: [PATCH 2/9] Remove the Summary of source characters --- Doc/reference/lexical_analysis.rst | 148 ----------------------------- 1 file changed, 148 deletions(-) diff --git a/Doc/reference/lexical_analysis.rst b/Doc/reference/lexical_analysis.rst index d60ea8608d6366..e8a47d4d49a853 100644 --- a/Doc/reference/lexical_analysis.rst +++ b/Doc/reference/lexical_analysis.rst @@ -1414,151 +1414,3 @@ occurrence outside string literals and comments is an unconditional error: .. code-block:: none $ ? ` - - -Summary of source characters -============================ - -.. list-table:: - :widths: auto - :header-rows: 1 - - * * No. 
- * Symbol - * Meaning in Python - * * 0 - * ``'\x00'`` - * not a source character - * * 1-8 - * - * control characters - * * 9 - * ``'\t'`` - * whitespace - * * 10 - * ``'\n'`` - * newline - * * 11 - * - * control character - * * 12 - * ``'\f'`` - * whitespace - * * 13 - * ``'\r'`` - * newline - * * 14-31 - * - * control characters - * * 32 - * space - * whitespace - * * 33 - * ``!`` - * part of ``!=`` - * * 34 - * ``"`` - * string quote - * * 35 - * ``#`` - * comment - * * 36 - * ``$`` - * unused symbol - * * 37 - * ``%`` - * operator - * * 38 - * ``&`` - * operator - * * 39 - * ``'`` - * string quote - * * 40 - * ``(`` - * delimiter - * * 41 - * ``)`` - * delimiter - * * 42 - * ``*`` - * operator - * * 43 - * ``+`` - * operator - * * 44 - * ``,`` - * delimiter - * * 45 - * ``-`` - * operator - * * 46 - * ``.`` - * operator, delimiter, part of number syntax - * * 47 - * ``/`` - * operator - * * 48-57 - * ``0`` to ``9`` - * part of number syntax, part of name syntax - * * 58 - * ``:`` - * delimiter - * * 59 - * ``;`` - * delimiter - * * 60 - * ``<`` - * operator - * * 61 - * ``=`` - * operator - * * 62 - * ``>`` - * operator - * * 63 - * ``?`` - * unused symbol - * * 64 - * ``@`` - * operator, delimiter - * * 65-90 - * ``A`` to ``Z`` - * part of name syntax - * * 91 - * ``[`` - * delimiter - * * 92 - * ``\`` - * operator - * * 93 - * ``]`` - * delimiter - * * 94 - * ``^`` - * operator - * * 95 - * ``_`` - * part of name syntax - * * 96 - * .. 
this uses zero-width joiner characters to get a - literal backtick: - - ``‍`‍`` - - * unused symbol - * * 97-122 - * ``a`` to ``z`` - * part of name syntax - * * 123 - * ``{`` - * delimiter - * * 124 - * ``|`` - * operator - * * 125 - * ``}`` - * delimiter - * * 126 - * ``~`` - * operator From df5a1d273f94f530017d58ec813e2273db5d8638 Mon Sep 17 00:00:00 2001 From: Petr Viktorin Date: Wed, 13 Aug 2025 16:36:55 +0200 Subject: [PATCH 3/9] More rewording --- Doc/reference/lexical_analysis.rst | 25 ++++++++++++++----------- 1 file changed, 14 insertions(+), 11 deletions(-) diff --git a/Doc/reference/lexical_analysis.rst b/Doc/reference/lexical_analysis.rst index e8a47d4d49a853..c8303291712588 100644 --- a/Doc/reference/lexical_analysis.rst +++ b/Doc/reference/lexical_analysis.rst @@ -1359,9 +1359,8 @@ Operators and delimiters .. index:: single: operators -The following tokens are operators -- they can be put between two -:ref:`expressions ` (or in front of an expression) -to make a larger expression. +The following tokens are :dfn:`operators` -- they are used to combine +:ref:`expressions `. .. code-block:: none @@ -1369,11 +1368,12 @@ to make a larger expression. + - * ** / // % << >> & | ^ ~ < > <= >= == != - @ := . + . @ := .. index:: single: delimiters -The following tokens serve as delimiters in the grammar: +The following tokens are :dfn:`delimiters` -- simple tokens that +are not operators: .. code-block:: none @@ -1381,20 +1381,23 @@ The following tokens serve as delimiters in the grammar: , : ! ; = -> . @ -The period can also occur in floating-point and imaginary literals; -the period and the ``@`` can also serve as operators. +The period (``.``) and at-sign (``@``) can serve either as operators +or delimiters. + +The period can also occur in :ref:`floating-point ` and +:ref:`imaginary` literals. .. 
_lexical-ellipsis: -A sequence of three periods has a special meaning as an -:py:data:`Ellipsis` literal: +A sequence of three periods (without whitespace between them) has a special +meaning as an :py:data:`Ellipsis` literal: .. code-block:: none ... -The following :ref:`augmented assignment ` operators serve -lexically as delimiters, but also perform an operation: +The following tokens are :ref:`augmented assignment ` operators: +they serve lexically as delimiters, but also perform an operation: .. code-block:: none From 9a2301703af23f12f6f0d3c8b5ad50ddd93820a3 Mon Sep 17 00:00:00 2001 From: Petr Viktorin Date: Wed, 13 Aug 2025 16:57:36 +0200 Subject: [PATCH 4/9] Add notes about where else symbols appear --- Doc/reference/lexical_analysis.rst | 11 +++++++++++ 1 file changed, 11 insertions(+) diff --git a/Doc/reference/lexical_analysis.rst b/Doc/reference/lexical_analysis.rst index c8303291712588..f0a2a5331e005c 100644 --- a/Doc/reference/lexical_analysis.rst +++ b/Doc/reference/lexical_analysis.rst @@ -1370,6 +1370,9 @@ The following tokens are :dfn:`operators` -- they are used to combine < > <= >= == != . @ := +Plus (``+``) and minus (``-``) signs can also occur in +:ref:`floating-point ` and :ref:`imaginary` literals. + .. index:: single: delimiters The following tokens are :dfn:`delimiters` -- simple tokens that @@ -1387,6 +1390,10 @@ or delimiters. The period can also occur in :ref:`floating-point ` and :ref:`imaginary` literals. +The symbols ``{``, ``}``, ``!`` and ``:`` have special meaning in +:ref:`formatted string literals ` and +:ref:`template string literals `. + .. _lexical-ellipsis: A sequence of three periods (without whitespace between them) has a special @@ -1404,6 +1411,10 @@ they serve lexically as delimiters, but also perform an operation: += -= *= **= /= //= %= <<= >>= &= |= ^= @= +See :ref:`operator and delimiter tokens ` +in the :mod:`!token` module documentation for names of the operator and +delimiter tokens. 
+ The following printing ASCII characters have special meaning as part of other tokens or are otherwise significant to the lexical analyzer: From 58a414f65e50163888ac5c3a779fcfdde7e11cf6 Mon Sep 17 00:00:00 2001 From: Petr Viktorin Date: Wed, 20 Aug 2025 17:32:10 +0200 Subject: [PATCH 5/9] Reword the OP section once again --- Doc/reference/lexical_analysis.rst | 111 ++++++++++++----------------- 1 file changed, 47 insertions(+), 64 deletions(-) diff --git a/Doc/reference/lexical_analysis.rst b/Doc/reference/lexical_analysis.rst index f0a2a5331e005c..70b1578b8e847f 100644 --- a/Doc/reference/lexical_analysis.rst +++ b/Doc/reference/lexical_analysis.rst @@ -1353,78 +1353,61 @@ Formally, imaginary literals are described by the following lexical definition: .. _delimiters: .. _operators: +.. _lexical-ellipsis: Operators and delimiters ======================== -.. index:: single: operators - -The following tokens are :dfn:`operators` -- they are used to combine -:ref:`expressions `. - -.. code-block:: none - - - + - * ** / // % - << >> & | ^ ~ - < > <= >= == != - . @ := - -Plus (``+``) and minus (``-``) signs can also occur in -:ref:`floating-point ` and :ref:`imaginary` literals. - -.. index:: single: delimiters - -The following tokens are :dfn:`delimiters` -- simple tokens that -are not operators: - -.. code-block:: none - - ( ) [ ] { } - , : ! ; = -> - . @ - -The period (``.``) and at-sign (``@``) can serve either as operators -or delimiters. - -The period can also occur in :ref:`floating-point ` and -:ref:`imaginary` literals. - -The symbols ``{``, ``}``, ``!`` and ``:`` have special meaning in -:ref:`formatted string literals ` and -:ref:`template string literals `. - -.. _lexical-ellipsis: - -A sequence of three periods (without whitespace between them) has a special -meaning as an :py:data:`Ellipsis` literal: - -.. code-block:: none - - ... +.. 
index:: + single: operators + single: delimiters -The following tokens are :ref:`augmented assignment ` operators: -they serve lexically as delimiters, but also perform an operation: +The following grammar defines :dfn:`operator` and :dfn:`delimiter` tokens, +that is, the generic :data:`~token.OP` token type: -.. code-block:: none +.. grammar-snippet:: + :group: python-grammar - += -= *= **= /= //= %= - <<= >>= &= |= ^= @= + OP: + | arithmetic_operator + | bitwise_operator + | comparison_operator + | enclosing_delimiter + | other_delimiter + | assignment_operator + | other_op + | "..." + + arithmetic_operator: "+" | "-" | "*" | "**" | "/" | "//" | "%" + bitwise_operator: "&" | "^" | "~" | "<<" | ">>" + assignment_operator: "+=" | "-=" | "*=" | "**=" | "/=" | "//=" | "%=" | + "&=" | "|=" | "^=" | "<<=" | ">>=" | "@=" | ":=" + comparison_operator: "<" | ">" | "<=" | ">=" | "==" | "!=" + enclosing_delimiter: "(" | ")" | "[" | "]" | "{" | "}" + other_delimiter: "," | ":" | "!" | ";" | "=" | "->" + other_op: "." | "@" + +.. note:: + + Generally, *operators* are used to combine :ref:`expressions `, + while *delimiters* serve other purposes. + However, there is no clear, formal distinction between the two categories. + + Some tokens can serve as either operators or delimiters, depending on usage. + For example, ``*`` is both the multiplication operator and a delimiter used + for sequence unpacking, and ``@`` is both the matrix multiplication operator and + a delimiter that introduces decorators. + + For some tokens, the distinction is unclear. + For example, some people consider ``. ( )`` to be delimiters, while others + see the :py:func:`getattr` operator and the function call operator(s). + + Some of Python's operators, like ``and``, ``or``, and ``not in``, use + :ref:`keyword ` tokens rather than "symbols" (operator tokens). + +A sequence of three consecutive periods (``...``) has a special +meaning as an :py:data:`Ellipsis` literal. 
See :ref:`operator and delimiter tokens ` in the :mod:`!token` module documentation for names of the operator and delimiter tokens. - -The following printing ASCII characters have special meaning as part of other -tokens or are otherwise significant to the lexical analyzer: - -.. code-block:: none - - ' " # \ - -The following printing ASCII characters are not used in Python. Their -occurrence outside string literals and comments is an unconditional error: - -.. code-block:: none - - $ ? ` From 15bf2602c442dfc7e2abf0ce767809ac0df9f5c7 Mon Sep 17 00:00:00 2001 From: Petr Viktorin Date: Wed, 27 Aug 2025 16:16:47 +0200 Subject: [PATCH 6/9] Move the reference to alternative list up --- Doc/reference/lexical_analysis.rst | 7 +++---- 1 file changed, 3 insertions(+), 4 deletions(-) diff --git a/Doc/reference/lexical_analysis.rst b/Doc/reference/lexical_analysis.rst index 70b1578b8e847f..7e348c6be69be8 100644 --- a/Doc/reference/lexical_analysis.rst +++ b/Doc/reference/lexical_analysis.rst @@ -1363,7 +1363,9 @@ Operators and delimiters single: delimiters The following grammar defines :dfn:`operator` and :dfn:`delimiter` tokens, -that is, the generic :data:`~token.OP` token type: +that is, the generic :data:`~token.OP` token type. +A :ref:`list of these tokens and their names ` +is also available in the :mod:`!token` module documentation. .. grammar-snippet:: :group: python-grammar @@ -1408,6 +1410,3 @@ that is, the generic :data:`~token.OP` token type: A sequence of three consecutive periods (``...``) has a special meaning as an :py:data:`Ellipsis` literal. -See :ref:`operator and delimiter tokens ` -in the :mod:`!token` module documentation for names of the operator and -delimiter tokens. 
From 9de89614b26afbe561621a3829c6c9df046ecdaa Mon Sep 17 00:00:00 2001 From: Petr Viktorin Date: Wed, 27 Aug 2025 16:19:57 +0200 Subject: [PATCH 7/9] Add forgotten operator --- Doc/reference/lexical_analysis.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Doc/reference/lexical_analysis.rst b/Doc/reference/lexical_analysis.rst index 7e348c6be69be8..98abe6c043ccb3 100644 --- a/Doc/reference/lexical_analysis.rst +++ b/Doc/reference/lexical_analysis.rst @@ -1381,7 +1381,7 @@ is also available in the :mod:`!token` module documentation. | "..." arithmetic_operator: "+" | "-" | "*" | "**" | "/" | "//" | "%" - bitwise_operator: "&" | "^" | "~" | "<<" | ">>" + bitwise_operator: "&" | "|" | "^" | "~" | "<<" | ">>" assignment_operator: "+=" | "-=" | "*=" | "**=" | "/=" | "//=" | "%=" | "&=" | "|=" | "^=" | "<<=" | ">>=" | "@=" | ":=" comparison_operator: "<" | ">" | "<=" | ">=" | "==" | "!=" From 59019bb1b10f5d8b5f8b9c90463d1caf7593b29d Mon Sep 17 00:00:00 2001 From: Petr Viktorin Date: Wed, 3 Sep 2025 17:10:25 +0200 Subject: [PATCH 8/9] Update Doc/reference/lexical_analysis.rst Co-authored-by: Adam Turner <9087854+AA-Turner@users.noreply.github.com> --- Doc/reference/lexical_analysis.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Doc/reference/lexical_analysis.rst b/Doc/reference/lexical_analysis.rst index 98abe6c043ccb3..a98adfacfa441a 100644 --- a/Doc/reference/lexical_analysis.rst +++ b/Doc/reference/lexical_analysis.rst @@ -1401,7 +1401,7 @@ is also available in the :mod:`!token` module documentation. a delimiter that introduces decorators. For some tokens, the distinction is unclear. - For example, some people consider ``. ( )`` to be delimiters, while others + For example, some people consider ``.``, ``(``, and ``)`` to be delimiters, while others see the :py:func:`getattr` operator and the function call operator(s). 
Some of Python's operators, like ``and``, ``or``, and ``not in``, use From d0911f35cb18045795e2dc65642154f5da24fb4e Mon Sep 17 00:00:00 2001 From: Petr Viktorin Date: Wed, 3 Sep 2025 17:15:47 +0200 Subject: [PATCH 9/9] Reorder the tokens to put longer ones first --- Doc/reference/lexical_analysis.rst | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/Doc/reference/lexical_analysis.rst b/Doc/reference/lexical_analysis.rst index a98adfacfa441a..83db7646f1673f 100644 --- a/Doc/reference/lexical_analysis.rst +++ b/Doc/reference/lexical_analysis.rst @@ -1371,22 +1371,22 @@ is also available in the :mod:`!token` module documentation. :group: python-grammar OP: - | arithmetic_operator + | assignment_operator | bitwise_operator | comparison_operator | enclosing_delimiter | other_delimiter - | assignment_operator - | other_op + | arithmetic_operator | "..." + | other_op - arithmetic_operator: "+" | "-" | "*" | "**" | "/" | "//" | "%" - bitwise_operator: "&" | "|" | "^" | "~" | "<<" | ">>" assignment_operator: "+=" | "-=" | "*=" | "**=" | "/=" | "//=" | "%=" | "&=" | "|=" | "^=" | "<<=" | ">>=" | "@=" | ":=" - comparison_operator: "<" | ">" | "<=" | ">=" | "==" | "!=" + bitwise_operator: "&" | "|" | "^" | "~" | "<<" | ">>" + comparison_operator: "<=" | ">=" | "<" | ">" | "==" | "!=" enclosing_delimiter: "(" | ")" | "[" | "]" | "{" | "}" other_delimiter: "," | ":" | "!" | ";" | "=" | "->" + arithmetic_operator: "+" | "-" | "**" | "*" | "//" | "/" | "%" other_op: "." | "@" .. note::
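Reviewer note, not part of the patch: the reworded section says that operators and delimiters all share the generic ``token.OP`` type, and the last commit orders longer alternatives first. One way to check both claims against a running interpreter is the standard ``tokenize`` module. The sketch below is only an illustration; the helper name ``op_tokens`` is hypothetical and does not appear in the patch series.

```python
import io
import token
import tokenize


def op_tokens(source):
    """Return (string, exact type name) pairs for every OP token in source.

    Hypothetical helper for illustration; it relies only on the documented
    tokenize.generate_tokens() API and the TokenInfo.exact_type property.
    """
    readline = io.StringIO(source).readline
    return [
        (tok.string, token.tok_name[tok.exact_type])
        for tok in tokenize.generate_tokens(readline)
        if tok.type == token.OP  # operators and delimiters share this type
    ]


if __name__ == "__main__":
    # "=" is a delimiter, "@" and "**" are operators; all are reported as OP.
    for string, name in op_tokens("x = a @ b ** 2\n"):
        print(f"{string!r:8} {name}")
```

For the sample line this yields ``=`` as ``EQUAL``, ``@`` as ``AT`` and ``**`` as ``DOUBLESTAR``. An input such as ``a **= b // c`` produces a single ``**=`` token rather than ``*``, ``*`` and ``=``, which is consistent with listing the longer alternatives first in the grammar snippet.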