From 9c0ef2654cc79f915d4e07ab0bea24b45578154b Mon Sep 17 00:00:00 2001 From: jb2170 Date: Fri, 11 Apr 2025 19:20:43 +0100 Subject: [PATCH 01/19] First main commit to PEP 791 --- peps/pep-0791.rst | 360 ++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 360 insertions(+) create mode 100644 peps/pep-0791.rst diff --git a/peps/pep-0791.rst b/peps/pep-0791.rst new file mode 100644 index 00000000000..f9718208d39 --- /dev/null +++ b/peps/pep-0791.rst @@ -0,0 +1,360 @@ +PEP: 791 +Title: Precision and Modulo-Precision Flag format specifiers for integer fields +Author: Jay Berry +Sponsor: Pending +Status: Draft +Type: Standards Track +Created: 04-Apr-2025 +Python-Version: Pending +Post-History: `14-Feb-2025 `__ + + +Abstract +======== + +This PEP proposes implementing the standard format specifier ``.`` of :pep:`3101` as "precision" for integers formatted with the binary, octal, decimal, and hexadecimal presentation types, and implementing the standard format specifier ``z`` as a "modulo" flag for integers formatted with precision and the binary, octal, and hexadecimal presentation types, which first reduces the integer into ``range(base ** precision)``, resulting in a predictable two's complement style formatting. + +Both "precision" (``.``), and the "modulo-precision" flag (``z``) are presented in this PEP, as the alternative rejected implementations entail combinations of both. + +This PEP amends the clause of :pep:`3101` which states "[t]he precision is ignored for integer conversions". + + +Rationale +========= + +When string formatting integers in binary octal and hexadecimal, one often desires the resulting string to contain a guaranteed minimum number of digits. For unsigned integers of known machine-width bounds (eg 8 bit bytes) this often also ends up the exact resulting number of digits. 
This has previously been implemented in the old-style ``%`` formatting using the ``.`` "precision" format specifier, closely related to that of the C programming language. + +.. code-block:: python + + >>> "0x%.2x" % 15 + '0x0f' # two hex digits, ideal for displaying an unsigned byte + >>> "0o%.3o" % 18 + '0o022' # three octal digits, ideal for displaying a umask or file permissions + +When :pep:`3101` new-style formatting was first introduced, used in ``str.format`` and ``f``\ strings, the `format specification `_ was simple enough that the behavior of "precision" could be trivially emulated with the ``width`` format specifier. Precision therefore was left unimplemented and forbidden for ``int`` fields. However, as time has progressed and new format specifiers have been added, whose interactions with ``width`` noticeably diverge its behavior away from emulating precision, the readmission of precision as its own format specifier, ``.``, is sufficiently warranted. + +The ``width`` format specifier guarantees a minimum length of the entire replacement field, not just the number of digits in a formatted integer. For example, the wonderful ``#`` specifier that prepends the prefix of the corresponding presentation type consumes from ``width``: + +.. code-block:: python + + >>> x = 12 + >>> f"0x{x:02x}" # manually specifying '0x' prefix + '0x0c' # two hex digits :) + >>> f"{x:#02x}" # use '#' format specifier to output '0x' automatically + '0xc' # only one hex digit :( + >>> f"{x:#08b}" + '0b001100' # we wanted 8 bits, not 6 :( + +One could attempt to argue that since the length of a prefix is known to always be 2, it can be accounted for manually by adding 2 to the desired number of digits. Consider however the following demonstrations of why this is a bad idea: + +* By correcting the second example to ``f"{x:#04x}"``, at a glance this looks like it may produce four hex digits, but it only produces two. This is bad for readability. 
``4`` is thus too much of a 'magic number', and trying to counter that by being overly explicit with ``f"{x:#0{2+2}x}"`` looks ridiculous. +* In the future it is possible that a type specifier may be added with a prefix not of length 2, meaning the programmer has to calculate the prefix length, rather than Python's internal string formatting code handling that automatically. +* Things get more complicated when using the ``sign`` format specifier, ``f"{x: #0{1+2+2}x}"`` required to produce ``' 0x0c'``. +* Things get *even more* complicated when introducing a ``grouping_option``, for example formatting an integer into ``k`` 'word' segments joined by ``_``: ``x = 3735928559; k = 2; f"{x: #0{1+2+4*k+(k - 1)}_x}"`` is required to produce ``' 0xdead_beef'``. Surely this would be easier to write with precision as ``f"{x: #_.8x}"``? + +It is clear at this point that the reduction of complexity that would be provided by precision's implementation for ``int`` fields would be beneficial to any user. Nor is this proposal a new special-case behavior being demanded exclusively at the behest of ``int`` fields: the precision token ``.`` is already implemented as prescribed in :pep:`3101` for ``str`` data to truncate the field's length, and for ``float`` data to ensure that there are a fixed number of digits after the decimal point, eg ``f"{0.1+0.2: .4f}"`` producing ``' 0.3000'``. Thus no new tokens need adding to the `format specification `_ because of this proposal, maintaining its modest size. + +For the sake of completion, and lack of any reasonable objection, we propose that precision shall work also in decimal, base 10. Explicitly, the integer presentation types laid out in :pep:`3101` that are permitted to implement precision are ``'b'``, ``'d'``, ``'o'``, ``'x'``, ``'X'``, ``'n'``, and ``'' (None)``. 
The only presentation type not permitted is ``c`` ('character'), whose purpose is to format an integer to a single Unicode character, or an appropriate replacement for non-printable characters, for which it does not make sense to implement precision. In the event that new integer presentation types are added in the future, such as ``'B'`` and ``'O'`` which mutatis-mutandis could provide the same behavior as ``'X'`` (that is a capitalized prefix and digits), their addition should appropriately consider whether precision should be implemented or not. In the case of ``'B'`` and ``'O'`` as described here it would be correct to implement precision. A ``ValueError`` shall be raised when precision is attempted to be used for invalid integer presentation types. + + +Precision For Negative Numbers +------------------------------ + +So far in this PEP we have cautiously avoided talking about the formatting of negative numbers with precision, which we shall now discuss. + + +Short Verdict +''''''''''''' + +We desire two behaviors, which motivates the implementation of a flag ``z`` to toggle on the latter's behavior: + +* For precision without the ``z`` flag, a negative integer ``x`` shall be formatted with a negative sign and the digits of ``-x``'s formatting. This is the same friendly behavior as old-style ``%`` formatting. + + For example ``f"{-12:#.2x}"`` shall produce ``'-0x0c'``, equivalent to ``"%#.2x" % -12``. + +* For precision with the ``z`` flag, ``q, r = divmod(x, base ** n)`` is first taken when formatting ``f"{x:z.{n}{base_char}}"``, and ``r`` is passed on to precision, the resulting string being equivalent to ``f"{r:.{n}{base_char}}"``. Because ``r`` is in ``range(base ** n)`` the number of digits will always be exactly ``n``, resulting in a predictable two's complement style formatting, which is useful to the end user in environments that deal with machine-width oriented integers such as :mod:`struct`. 
+ + For example in formatting ``f"{-1:z#.2x}"``, ``-1`` is reduced modulo ``256``, since ``divmod(-1, 256) == (-1, 255)``, the resulting string being equivalent to ``f"{255:#.2x}"``, which is ``'0xff'``. + + The ``z`` flag shall only be implemented for presentation types corresponding to bases that are powers of two, specifically at present binary, octal, and hexadecimal. Whilst reduction of integers modulo powers of ten is computationally possible, a 'ten's complement?' has no demand and so the ``z`` flag is unimplemented for decimal presentation types. The ``z`` flag shall work for all integers, not just negatives. + + The syntax choice of ``z`` is again out of respect for maintaining the modest size of the `format specification `_. ``z`` was introduced to the format specification in :pep:`682` as a flag for normalizing negative zero to positive zero for the ``float`` and ``Decimal`` types. It is currently unimplemented for the ``int`` type, and since integers never have a 'negative zero' situation it seems uncontroversial to repurpose ``z``, again as a flag. If one squints hard enough, the ``z`` looks like a ``2`` for two's complement! + + +Long Introspection +'''''''''''''''''' + +We first present some observations about the binary representations of *signed* integers in two's complement. This leads us to a couple of alternative formulations of formatting negative numbers. + +Observe that one can always extend a signed number's binary representation by extending the leading digit as a prefix: + +.. code-block:: text + + 45 (8 bit) 00101101 + 45 (9 bit) 000101101 + -19 (8 bit) 11101101 + -19 (9 bit) 111101101 + +For non-negative numbers this is obvious. For negative numbers this is because the erstwhile leading column of an ``n`` bit representation goes from having a value of ``-2 ** (n-1)``, to ``+2 ** (n-1)``, with a new ``n+1``\ th column of value ``-2 ** n`` prefixed on, the overall sum unaffected. 
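The extension rule above can be checked with a short sketch in current Python. The helper names ``twos_complement_bits`` and ``sign_extend`` are hypothetical and exist only for this illustration; they are not part of the proposal or of any library.

```python
def twos_complement_bits(x: int, n: int) -> str:
    """Return the n-bit two's complement bit string of x (illustrative helper)."""
    # x must fit in n signed bits, i.e. lie in range(-2 ** (n-1), 2 ** (n-1)).
    if not -(1 << (n - 1)) <= x < (1 << (n - 1)):
        raise ValueError(f"{x} does not fit in {n} signed bits")
    # Reducing modulo 2 ** n maps a negative x onto its two's complement pattern.
    return format(x % (1 << n), f"0{n}b")


def sign_extend(bits: str, n: int) -> str:
    """Widen a two's complement bit string to n digits by repeating its leading digit."""
    return bits[0] * (n - len(bits)) + bits


# Extending the 8-bit strings by one digit gives the 9-bit strings from the table.
assert twos_complement_bits(45, 8) == "00101101"
assert twos_complement_bits(-19, 8) == "11101101"
assert sign_extend(twos_complement_bits(45, 8), 9) == twos_complement_bits(45, 9)
assert sign_extend(twos_complement_bits(-19, 8), 9) == twos_complement_bits(-19, 9)
```

Repeating the leading digit, rather than always padding with ``0``, is what keeps the value unchanged for negative inputs.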
+ +Extending the leading digit in this way is what C's ``printf`` does, working with powers of two as the numbers of digits: + +.. code-block:: C + + printf("%#hhb\n", -19); // 0b11101101 + printf("%#hho\n", -19); // 0355 + printf("%#hhx\n", -19); // 0xed + + printf("%#b\n", -19); // 0b11111111111111111111111111101101 + printf("%#o\n", -19); // 037777777755 + printf("%#x\n", -19); // 0xffffffed + +Conversely it should be clear that one can losslessly truncate a signed number's binary representation to have only one leading ``0`` if it is non-negative, and one leading ``1`` if it is negative: + +.. code-block:: text + + 45 (8 bit) 00101101 + 45 (7 bit) 0101101 + -19 (8 bit) 11101101 + -19 (7 bit) 1101101 + +If one were to truncate another digit off of these examples, then both would end up as ``101101``, 45 indistinguishable from -19 when using only 6 binary digits because they are both the same modulo ``2 ** 6 = 64``. Therefore to losslessly and unambiguously represent a signed integer ``x`` as a binary string which is rendered to the end user, we have a de facto 'minimal width' representation convention, using ``n`` digits, where ``n`` is the smallest integer such that ``x`` is in ``range(-2 ** (n-1), 2 ** (n-1))``. + +For rendering octal and hexadecimal strings one has to extend the definition of the 'minimal width' representation convention to be sufficiently unambiguous. 383's minimal width binary string is ``0101111111``, and -129's is ``101111111``, a suffix of the former's. A naive, incorrect, implementation of hexadecimal string formatting would render both as ``'0x17f'`` by *padding* both binary representations to ``000101111111``. 
The method was correct to desire a number of binary digits (12) that is divisible by the number of bits in the base (4 bits in base 16) so that the binary representation can be segmented up into (hex) digits, but it was incorrect in *padding*; the method should have instead *extended* as we have observed previously, 383 extended to ``000101111111``, and -129 extended to ``111101111111``, whence 383 is rendered as ``'0x17f'`` and -129 as ``'0xf7f'``. + +Thus the generalized definition of our 'minimal width' representation convention is: for an integer ``x`` to be rendered in base ``base``, produce ``n`` digits, where ``n`` is the smallest integer such that ``x`` is in ``range(-base ** n / 2, base ** n / 2)``. + +This leads on to the rejected alternatives. + + +Rejected Alternatives +===================== + +Behavior of ``z`` +----------------- + +The desired implementation of ``z``, the two's complement style formatting flag, has split into two main camps of opinions, disagreeing over lossless vs lossy presentation. The lossless camp believes that the formatted strings corresponding to integers should all be distinct from each other, uniqueness preserved by the minimal width representation convention; precision with ``z`` enabled should still be only a *minimum* number of digits requested, as it is without ``z``. The lossy camp believes that precision with ``z`` enabled should first reduce the integer using modular arithmetic, which then produces *exactly* the number of digits requested, equivalent to left-truncating the minimal width representation string. + +We endeavor to conclude in the following section that the former camp, lossless formatting, has no use cases, and is thus a rejected idea, whence this PEP proposes the latter, lossy, behavior. + + +Minimal Width Representation Convention +''''''''''''''''''''''''''''''''''''''' + +This idea was fiercely entertained only due to its lossless behavior, however it is an obstacle to ergonomics in every candidate use case. 
These arguments about the aesthetics of string rendering are not irrational or about personal taste, but rather they are crucial in how information is communicated to the end user. + +In a program in which signed-ness of integers is critical to communicate, any implementation of ``z`` should not be used, as the average user will be expecting to see a negative sign ``-``. The alternative of using minimal width representation convention requires one to be uncomfortably vigilant looking for leading digits of numbers belonging to the upper half of the base's range whenever a negative number is present (``1`` for binary, ``4-7`` for octal, and ``8-f`` for hex). Any end user who is not aware of this de facto convention, and even those who are but are not expecting it to be present in a program, would have a hard time: + +Under the minimal width representation convention, the formatting of 128 and -128 using ``f"{x:z#.2x}"`` would produce ``'0x080'`` and ``'0x80'`` respectively. It is the PEP author's opinion that there is a 0% chance that ``'0x80'`` is being read as *negative* 128 under normal conditions. Furthermore the hideous rendering of positive 128 as ``'0x080'`` is useless for a program that should produce a uniformly spaced hexdump of bytes, agnostic of whether they are signed or unsigned; all bytes should be rendered in the form ``'0xNN'``. See the `examples <#modulo-precision>`__ section on how modulo-precision handles bytes in the correct sign-agnostic way. + +Contrapositively therefore ``z``'s purpose is to be used in environments where signed-ness is *not* critical, and more likely than not where it is even encouraged to treat the integers with respect to the modular arithmetic that arises in two's complement hardware of fixed register sizes. In the example above 128 and -128 are the same modulo 256, and the respectable rendering is ``'0x80'``. In general the purpose of ``z`` is to treat integers that are congruent modulo ``base ** precision`` as the same. 
So too 255 and -1 should both be rendered as ``'0xff'``, not ``'0x0ff'`` and ``'0xff'`` respectively; the truncation is not a hindrance, but the desired behavior. Formally we may say that the formatting should be a well defined bijection between the equivalence classes of ``Z/(base ** precision)Z`` and strings with ``precision`` digits. + +The remaining question is "[sic] is there no chance to communicate this truncation to user?" as a concern for the 'loss of information' arising from the effectively left-truncated strings. We reject this question's premise that there ever is such a case of unintentional loss of information, considering the two cases of hardware-aware integers and otherwise: + +So far we have played around with examples of bytes in ``range(-128, 256)``, the union of the signed and unsigned ranges, with respect to which the virtues of formatting ``x`` and ``x - 256`` as the same are clearly established. In the hardware-aware contexts that one expects to find ``z``, any integers corresponding to bytes that lie outside that range are likely a programming error. For example if a library sets a pixel brightness integer to be 257, and prints out ``'0x01'`` instead of ``'0x101'`` via ``f"{x:z#.2x}"``, that's not our problem or doing; string formatting shouldn't raise an exception, or even a ``SyntaxWarning`` as an invalid escape sequence ``"\y"`` would, because ``ValueError: bytes must be in range(0, 256)`` will be raised by ``bytes`` when trying to serialize that integer via ``bytes([257])``; let the appropriate 'layer' of code raise the exception, as that is more indicative of a defect in the library, not our string formatting. + +In the case of non-hardware aware integers one would have to intentionally opt to use ``z``, in which modular arithmetic is the chosen desired effect. It is for this reason also that we shall not raise a ``SyntaxWarning`` or ``ValueError`` for integers lying outside of ``range(-base ** precision / 2, base ** precision)``. 
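As a rough sketch of the lossy behavior defended above, the proposed ``z.`` formatting can be emulated in current Python by reducing the integer first and then zero-padding with ``width``. The function name ``modulo_precision`` and its signature are assumptions made for this illustration only; the sketch omits the ``#`` prefix, sign, and grouping options, and is not the PEP's reference implementation.

```python
# Presentation types for which the z flag is proposed, mapped to their bases.
_BASES = {"b": 2, "o": 8, "x": 16, "X": 16}


def modulo_precision(x: int, n: int, type_char: str = "x") -> str:
    """Emulate the proposed f"{x:z.{n}{type_char}}" (illustrative helper only)."""
    if type_char not in _BASES:
        raise ValueError(f"z flag not supported for presentation type {type_char!r}")
    base = _BASES[type_char]
    # Python's % always yields a result in range(base ** n) for a positive modulus.
    r = x % base ** n
    # r is non-negative and has at most n digits, so zero-padding to width n
    # produces exactly n digits.
    return format(r, f"0{n}{type_char}")


# 255 and -1 are the same modulo 256, as are 128 and -128.
assert modulo_precision(255, 2) == modulo_precision(-1, 2) == "ff"
assert modulo_precision(128, 2) == modulo_precision(-128, 2) == "80"
# An out-of-range 'byte' is silently reduced rather than rejected.
assert modulo_precision(257, 2) == "01"
```

No exception is raised for 257 here, matching the position taken above that range checking belongs to the layer that serializes the value, not to string formatting.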
+ +.. + XXX Give a good example of non-hardware aware use of modular arithmetic formatting like Minecraft buried treasure always being at 8,8 within a chunk. + +Thus we have defended the lossy behavior of ``z`` implemented as modulo-precision, and we have exhausted all reasonable use cases of lossless behavior. + +A final compromise to consider and reject is implementing ``z`` not as a flag *dependent* on ``.``, but as a flag that can be *combined* with ``.``. Specifically: ``z`` without ``.`` would turn on two's complement mode to render the minimal width representation of the formatted integer, ``.`` without ``z`` would implement precision as already explained, a minimum number of digits in the magnitude and a sign if necessary, and ``z`` combined with ``.`` would turn on the left-truncating modulo-precision. This labyrinth of combinations does not seem useful to anyone, as we have already discredited the ergonomics of minimal width representation convention, whence ``z`` would rarely be used on its own, and this behavior of two options that individually render a *minimum* number of digits combining together to render an *exact* number of digits seems counterintuitive. + + +Infinite Length Indication +'''''''''''''''''''''''''' + +Another, less popular, rejected alternative was for ``z`` to directly acknowledge the infinite prefix of ``0``\ s or ``1``\ s that precede a non-negative or negative number respectively. For example + +.. code-block:: python + + >>> f"{-1:z#.8b}" + '0b[...1]11111111' + >>> f"{300:z#.8b}" + '0b[...0]100101100' + +This is effectively the minimal width representation convention with an 'infinite' prefix attached to it. + +In the C programming language the machine-width dependent two's complement formatting of ``int`` data with precision exhibits excessive lengths of prefixes that arise from negative numbers, even those with small magnitude: + +.. 
code-block:: C + + printf("%#.2x\n", -19); // 0xffffffed + printf("%#.2llx\n", (long long unsigned int)-19); // 0xffffffffffffffed + +This prefix could continue on indefinitely if it were not limited by a maximum machine-width! + +Python's ``int`` type is indeed not limited by a maximum machine-width. Thus to avoid printing infinitely long two's complement strings we could use a similar approach to that of the builtin ``list``'s string formatting for printing a list that contains itself: + +.. code-block:: python + + >>> l = [] + >>> l.append(l) + >>> l + [[...]] + + >>> y = -1 + >>> f"{y:z#.8b}" + '0b[...1]11111111' + +This may have been useful to educate beginner users on how bitwise binary operations work, for example showing how ``-1 & x`` is always trivially equal to ``x``, or how the binary representation of the negation of a number can be obtained by adding one to its bitwise complement: + +.. code-block:: python + + >>> x = 42 + >>> f"{x:z#.8b}" + '0b[...0]00101010' + >>> f"{~x:z#.8b}" + '0b[...1]11010101' + >>> f"{x|~x:z#.8b}" + '0b[...1]11111111' + # x | ~x == -1 + # x | ~x == x + ~x because of their disjoint bitwise representations + # thus x + ~x == -1 + # thus -x == ~x + 1 + >>> y = ~x + 1 + >>> f"{y:z#.8b}" + '0b[...1]11010110' + >>> y == -x + True + +Its use case is just too narrow, and modulo-precision outshines it. + + +General +------- + +* What about ones' complement, or other binary representations? + + Two's complement is so dominant that no one really considers other representations. GCC only supports two's complement. + +* Could we do nothing? + + Programmers continue to hobble on using the ``width`` format specifier with ad-hoc corrections to mimic precision. This is intolerable, and the rationale of this PEP makes conclusive arguments for the addition and implementation choices of precision. + + Refusing to implement precision for integer fields using ``.`` reserves ``.`` for possible future uses. 
However in the ~20 year timespan since :pep:`3101` no alternatives have been accepted, and any alternate use of ``.`` takes it further out of sync with both old-style ``%`` formatting, and the C programming language. + + +Syntax +------ + +* ``!`` instead of ``z.`` for precision with modulo-precision, mutually exclusive with ``.``. + + Pros: + + - ``!`` is graphically related to ``.``, an extension if you will. Precision with the modulo-precision flag set is indeed an extension of precision. + - ``!`` in the English language is often used for imperative, commanding sentences. So too modulo-precision commands the *exact* number of digits to which its input shall be formatted, whereas precision is the *minimum* number of digits. This is idiomatic. + - ``!`` is only one symbol as opposed to ``z.``. This coupled with ``!`` being mutually exclusive with ``.`` leaves the overall length of one's written code unaffected when switching on modulo-precision. + - Using a new ``!`` symbol reserves ``z`` for other future uses, whatever that may be. + + Cons: + + - ``z.`` also conveys a sense of extension from ``.``, a flag attached to ``.``, and lexicographically flows left to right as 'modulo' (``z``) 'precision' (``.``). + - ``.`` and ``!`` being mutually exclusive to each other may give a beginner programmer analysis-paralysis over which to choose when looking at the `format specification `_ documentation. + - ``!`` would be another addition to the format specification for a single purpose. It would not have any implementation for ``str``, ``float``, or any other type. + - There also already exists a ``["!" conversion]`` "explicit conversion flag" in the `format string syntax `_ as laid out in :pep:`3101`. For example in ``f"{s!r}"`` the ``!r`` calls ``repr`` on ``s``. 
This would *not* syntactically clash with a ``!`` format specifier, the format specifiers ``[":" format_spec]`` being separated by a well-defined preceding colon, however users unfamiliar with the new modulo-precision mode may glance over format strings containing ``!`` and expect different behavior. + + Verdict: + + - Whilst graphically attractive, ``!`` would add too much more clutter to the format specification for a purpose that can be achieved by overloading the preexisting ``z`` flag. + + +Backwards Compatibility +======================= + +To quote :pep:`682` + + The new formatting behavior is opt-in, so numerical formatting of existing programs will not be affected + +unless someone out there is specifically relying upon ``.`` raising a ``ValueError`` for integers as it currently does, but to quote :pep:`475` + + The authors of this PEP don't think that such applications exist + + +Examples And Teaching +===================== + +Precision +--------- + +Documentation and tutorials in the Python sphere of influence should encourage the adoption of ``.``, precision, as the default format specifier for formatting ``int`` fields as opposed to ``width``, when it is clear a minimum number of *digits* is required, not a minimum length of the *whole replacement field*. + +Since the concept of precision is common in other languages such as C, and was already present in Python's old-style ``%`` formatting, we don't need to go *too* overboard, but a decent few examples as below may demonstrate its uses. + +.. code-block:: python + + >>> def hexdump(b: bytes) -> str: + ... return " ".join(f"{c:#.2x}" for c in b) + + >>> hexdump(b"GET /\r\n\r\n") + '0x47 0x45 0x54 0x20 0x2f 0x0d 0x0a 0x0d 0x0a' + # observe the CR and LF bytes padded to precision 2 + # in this basic HTTP/0.9 request + + >>> def unicode_dump(s: str) -> str: + ... 
return " ".join(f"U+{ord(c):.4X}" for c in s) + + >>> unicode_dump("USA 🦅") + 'U+0055 U+0053 U+0041 U+0020 U+1F985' + # observe the last character's Unicode codepoint has 5 digits; + # precision is only the minimum number of digits + + +Modulo-Precision +---------------- + +The clear area for encouraging the use of modulo-precision is when dealing with machine-width oriented integers such as those packed and unpacked by :mod:`struct`. We give an example of the consistent predictable two's complement formatting of signed and unsigned integers. + +.. code-block:: python + + >>> import struct + + >>> my_struct = b"\xff" + >>> (t,) = struct.unpack('b', my_struct) # signed char + >>> print(t, f"{t:#.2x}", f"{t:z#.2x}") + -1 -0x01 0xff + >>> (t,) = struct.unpack('B', my_struct) # unsigned char + >>> print(t, f"{t:#.2x}", f"{t:z#.2x}") + 255 0xff 0xff + + # observe in both the signed and unsigned unpacking the modulo-precision flag 'z' + # produces a predictable two's complement formatting + + +Thanks +====== + +Thank you to + +* Sergey B Kirpichev, for discussions and implementation code. +* Raymond Hettinger, for the initial suggestion of the two's complement behavior. + + +Copyright +========= + +This document is placed in the public domain or under the +CC0-1.0-Universal license, whichever is more permissive. + + +TODO AND REMOVE BEFORE MERGE +============================ + +* Format all lines to ~80 characters. I've left this formatting until we're happy with the contents. +* RFC 2119 Style Specification? After all is said and done here. +* Add Sergey and Raymond to the authors field, or is Thanks enough? +* Give a good example of non-hardware aware use of modular arithmetic formatting, my brain has gone blank... +* Possibly (probably not) add one more feature to the PEP: + + Loosening :pep:`682`\ 's strict ordering of ``[#][z]`` as they appear in the `format specification `_ for ``int`` fields for readability. 
(Or is this just my taste?: debate this) + + The existing `format specification `_ mandates that if both ``z`` and ``#`` are to be used, they must appear in that order, leading to ``z#.``, with ``z`` separated from its ``.``, however this could be changed to be more permissible if there are no syntax clashes, to permit ``#z.``, or is this just my taste? :pep:`682` proposed / uses ``[sign][z]`` instead of ``[sign[z]]``, which has given us the opportunity to reuse ``z``, and really has no strict need to be ``[sign][z][#]...[.precision]``, or are we doing too much voodoo by allowing ``z`` and ``#`` to commute with each other, even if it's just for ``int`` fields. I'm starting to think this might be too much voodoo. + + +.. + FOOTNOTES + +.. _formatstrings: https://docs.python.org/3/library/string.html#formatstrings +.. _formatspec: https://docs.python.org/3/library/string.html#formatspec From 4e0e75e4e03a864facd88116f0c2e6b2e4a61d39 Mon Sep 17 00:00:00 2001 From: Adam Turner <9087854+aa-turner@users.noreply.github.com> Date: Tue, 6 May 2025 22:13:42 +0100 Subject: [PATCH 02/19] 791 -> 786 --- peps/{pep-0791.rst => pep-0786.rst} | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) rename peps/{pep-0791.rst => pep-0786.rst} (99%) diff --git a/peps/pep-0791.rst b/peps/pep-0786.rst similarity index 99% rename from peps/pep-0791.rst rename to peps/pep-0786.rst index f9718208d39..cacca806114 100644 --- a/peps/pep-0791.rst +++ b/peps/pep-0786.rst @@ -1,4 +1,4 @@ -PEP: 791 +PEP: 786 Title: Precision and Modulo-Precision Flag format specifiers for integer fields Author: Jay Berry Sponsor: Pending From bd628a075640b2235dc60b3c100d82687449eaae Mon Sep 17 00:00:00 2001 From: jb2170 Date: Thu, 8 May 2025 01:42:48 +0100 Subject: [PATCH 03/19] Add Alyssa as the sponsor of PEP 786 --- peps/pep-0786.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/peps/pep-0786.rst b/peps/pep-0786.rst index cacca806114..6ebff6d36c0 100644 --- a/peps/pep-0786.rst +++ 
b/peps/pep-0786.rst @@ -1,7 +1,7 @@ PEP: 786 Title: Precision and Modulo-Precision Flag format specifiers for integer fields Author: Jay Berry -Sponsor: Pending +Sponsor: Alyssa Coghlan Status: Draft Type: Standards Track Created: 04-Apr-2025 From ac5024db885b574c0c31067316ba54a0462bc397 Mon Sep 17 00:00:00 2001 From: jb2170 Date: Thu, 8 May 2025 01:59:21 +0100 Subject: [PATCH 04/19] Update .github/CODEOWNERS adding @ncoghlan for PEP 786 --- .github/CODEOWNERS | 1 + 1 file changed, 1 insertion(+) diff --git a/.github/CODEOWNERS b/.github/CODEOWNERS index a4254d43e5a..78c3bc20ca4 100644 --- a/.github/CODEOWNERS +++ b/.github/CODEOWNERS @@ -663,6 +663,7 @@ peps/pep-0782.rst @vstinner peps/pep-0783.rst @hoodmane @ambv peps/pep-0784.rst @gpshead peps/pep-0785.rst @gpshead +peps/pep-0786.rst @ncoghlan # ... peps/pep-0787.rst @ncoghlan peps/pep-0788.rst @ZeroIntensity @vstinner From 3ca612051846f6389ba227487932d4b7729dc8fc Mon Sep 17 00:00:00 2001 From: jb2170 Date: Thu, 8 May 2025 03:29:22 +0100 Subject: [PATCH 05/19] Address AA-Turner's notes --- peps/pep-0786.rst | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/peps/pep-0786.rst b/peps/pep-0786.rst index 6ebff6d36c0..f74a09a24fc 100644 --- a/peps/pep-0786.rst +++ b/peps/pep-0786.rst @@ -5,8 +5,8 @@ Sponsor: Alyssa Coghlan Status: Draft Type: Standards Track Created: 04-Apr-2025 -Python-Version: Pending -Post-History: `14-Feb-2025 `__ +Python-Version: 3.15 +Post-History: `14-Feb-2025 `__, Abstract @@ -353,8 +353,8 @@ TODO AND REMOVE BEFORE MERGE The existing `format specification `_ mandates that if both ``z`` and ``#`` are to be used, they must appear in that order, leading to ``z#.``, with ``z`` separated from its ``.``, however this could be changed to be more permissible if there are no syntax clashes, to permit ``#z.``, or is this just my taste? 
:pep:`682` proposed / uses ``[sign][z]`` instead of ``[sign[z]]``, which has given us the opportunity to reuse ``z``, and really has no strict need to be ``[sign][z][#]...[.precision]``, or are we doing too much voodoo by allowing ``z`` and ``#`` to commute with each other, even if it's just for ``int`` fields. I'm starting to think this might be too much voodoo. -.. - FOOTNOTES +Footnotes +========= .. _formatstrings: https://docs.python.org/3/library/string.html#formatstrings .. _formatspec: https://docs.python.org/3/library/string.html#formatspec From bd29ea463001810179d7280ee1489203b4ab71c6 Mon Sep 17 00:00:00 2001 From: jb2170 Date: Thu, 8 May 2025 03:35:35 +0100 Subject: [PATCH 06/19] Remove TODO bullet point for `z` and `#` commutativity It's better to give the formatspec one canonical ordering than permit an overly liberal rearrangeability. If commutativity were added, then as well as the messy description required for the docs for the particular case of `int` data, two people could write two different format spec that result in the same output and not realise it or agree, because they've written different things, leading to confusion etc. --- peps/pep-0786.rst | 5 ----- 1 file changed, 5 deletions(-) diff --git a/peps/pep-0786.rst b/peps/pep-0786.rst index f74a09a24fc..667e58733c3 100644 --- a/peps/pep-0786.rst +++ b/peps/pep-0786.rst @@ -346,11 +346,6 @@ TODO AND REMOVE BEFORE MERGE * RFC 2119 Style Specification? After all is said and done here. * Add Sergey and Raymond to the authors field, or is Thanks enough? * Give a good example of non-hardware aware use of modular arithmetic formatting, my brain has gone blank... -* Possibly (probably not) add one more feature to the PEP: - - Loosening :pep:`682`\ 's strict ordering of ``[#][z]`` as they appear in the `format specification `_ for ``int`` fields for readability. 
(Or is this just my taste?: debate this) - - The existing `format specification `_ mandates that if both ``z`` and ``#`` are to be used, they must appear in that order, leading to ``z#.``, with ``z`` separated from its ``.``, however this could be changed to be more permissible if there are no syntax clashes, to permit ``#z.``, or is this just my taste? :pep:`682` proposed / uses ``[sign][z]`` instead of ``[sign[z]]``, which has given us the opportunity to reuse ``z``, and really has no strict need to be ``[sign][z][#]...[.precision]``, or are we doing too much voodoo by allowing ``z`` and ``#`` to commute with each other, even if it's just for ``int`` fields. I'm starting to think this might be too much voodoo. Footnotes From d7d9cf0f00677bfae4a5361d4ab15e3f4e7af9b9 Mon Sep 17 00:00:00 2001 From: jb2170 Date: Fri, 16 May 2025 07:47:19 +0100 Subject: [PATCH 07/19] Apply the more trivial suggestions from code review I'll address the other ones in a separate commit(s) Co-authored-by: Hugo van Kemenade <1324225+hugovk@users.noreply.github.com> --- .github/CODEOWNERS | 1 - peps/pep-0786.rst | 18 +++++++++--------- 2 files changed, 9 insertions(+), 10 deletions(-) diff --git a/.github/CODEOWNERS b/.github/CODEOWNERS index 78c3bc20ca4..038b2b35741 100644 --- a/.github/CODEOWNERS +++ b/.github/CODEOWNERS @@ -664,7 +664,6 @@ peps/pep-0783.rst @hoodmane @ambv peps/pep-0784.rst @gpshead peps/pep-0785.rst @gpshead peps/pep-0786.rst @ncoghlan -# ... peps/pep-0787.rst @ncoghlan peps/pep-0788.rst @ZeroIntensity @vstinner # ... 
diff --git a/peps/pep-0786.rst b/peps/pep-0786.rst index 667e58733c3..2a11990785f 100644 --- a/peps/pep-0786.rst +++ b/peps/pep-0786.rst @@ -1,5 +1,5 @@ PEP: 786 -Title: Precision and Modulo-Precision Flag format specifiers for integer fields +Title: Precision and modulo-precision flag format specifiers for integer fields Author: Jay Berry Sponsor: Alyssa Coghlan Status: Draft @@ -22,7 +22,7 @@ This PEP amends the clause of :pep:`3101` which states "[t]he precision is ignor Rationale ========= -When string formatting integers in binary octal and hexadecimal, one often desires the resulting string to contain a guaranteed minimum number of digits. For unsigned integers of known machine-width bounds (eg 8 bit bytes) this often also ends up the exact resulting number of digits. This has previously been implemented in the old-style ``%`` formatting using the ``.`` "precision" format specifier, closely related to that of the C programming language. +When string formatting integers in binary octal and hexadecimal, one often desires the resulting string to contain a guaranteed minimum number of digits. For unsigned integers of known machine-width bounds (for example, 8-bit bytes) this often also ends up the exact resulting number of digits. This has previously been implemented in the old-style ``%`` formatting using the ``.`` "precision" format specifier, closely related to that of the C programming language. .. code-block:: python @@ -31,7 +31,7 @@ When string formatting integers in binary octal and hexadecimal, one often desir >>> "0o%.3o" % 18 '0o022' # three octal digits, ideal for displaying a umask or file permissions -When :pep:`3101` new-style formatting was first introduced, used in ``str.format`` and ``f``\ strings, the `format specification `_ was simple enough that the behavior of "precision" could be trivially emulated with the ``width`` format specifier. Precision therefore was left unimplemented and forbidden for ``int`` fields. 
However, as time has progressed and new format specifiers have been added, whose interactions with ``width`` noticeably diverge its behavior away from emulating precision, the readmission of precision as its own format specifier, ``.``, is sufficiently warranted. +When :pep:`3101` new-style formatting was first introduced, used in ``str.format`` and f-strings, the `format specification `_ was simple enough that the behavior of "precision" could be trivially emulated with the ``width`` format specifier. Precision therefore was left unimplemented and forbidden for ``int`` fields. However, as time has progressed and new format specifiers have been added, whose interactions with ``width`` noticeably diverge its behavior away from emulating precision, the readmission of precision as its own format specifier, ``.``, is sufficiently warranted. The ``width`` format specifier guarantees a minimum length of the entire replacement field, not just the number of digits in a formatted integer. For example, the wonderful ``#`` specifier that prepends the prefix of the corresponding presentation type consumes from ``width``: @@ -54,7 +54,7 @@ One could attempt to argue that since the length of a prefix is known to always It is clear at this point that the reduction of complexity that would be provided by precision's implementation for ``int`` fields would be beneficial to any user. Nor is this proposal a new special-case behavior being demanded exclusively at the behest of ``int`` fields: the precision token ``.`` is already implemented as prescribed in :pep:`3101` for ``str`` data to truncate the field's length, and for ``float`` data to ensure that there are a fixed number of digits after the decimal point, eg ``f"{0.1+0.2: .4f}"`` producing ``' 0.3000'``. Thus no new tokens need adding to the `format specification `_ because of this proposal, maintaining its modest size. 
-For the sake of completion, and lack of any reasonable objection, we propose that precision shall work also in decimal, base 10. Explicitly, the integer presentation types laid out in :pep:`3101` that are permitted to implement precision are ``'b'``, ``'d'``, ``'o'``, ``'x'``, ``'X'``, ``'n'``, and ``'' (None)``. The only presentation type not permitted is ``c`` ('character'), whose purpose is to format an integer to a single Unicode character, or an appropriate replacement for non-printable characters, for which it does not make sense to implement precision. In the event that new integer presentation types are added in the future, such as ``'B'`` and ``'O'`` which mutatis-mutandis could provide the same behavior as ``'X'`` (that is a capitalized prefix and digits), their addition should appropriately consider whether precision should be implemented or not. In the case of ``'B'`` and ``'O'`` as described here it would be correct to implement precision. A ``ValueError`` shall be raised when precision is attempted to be used for invalid integer presentation types. +For the sake of completion, and lack of any reasonable objection, we propose that precision shall work also in decimal, base 10. Explicitly, the integer presentation types laid out in :pep:`3101` that are permitted to implement precision are ``'b'``, ``'d'``, ``'o'``, ``'x'``, ``'X'``, ``'n'``, and ``''`` (``None``). The only presentation type not permitted is ``c`` ('character'), whose purpose is to format an integer to a single Unicode character, or an appropriate replacement for non-printable characters, for which it does not make sense to implement precision. In the event that new integer presentation types are added in the future, such as ``'B'`` and ``'O'`` which mutatis-mutandis could provide the same behavior as ``'X'`` (that is a capitalized prefix and digits), their addition should appropriately consider whether precision should be implemented or not. 
In the case of ``'B'`` and ``'O'`` as described here it would be correct to implement precision. A ``ValueError`` shall be raised when precision is attempted to be used for invalid integer presentation types. Precision For Negative Numbers @@ -166,7 +166,7 @@ A final compromise to consider and reject is implementing ``z`` not as a flag *d Infinite Length Indication '''''''''''''''''''''''''' -Another, less popular, rejected alternative was for ``z`` to directly acknowledge the infinite prefix of ``0``\ s or ``1``\ s that precede a non-negative or negative number respectively. For example +Another, less popular, rejected alternative was for ``z`` to directly acknowledge the infinite prefix of ``0``\ s or ``1``\ s that precede a non-negative or negative number respectively. For example: .. code-block:: python @@ -199,7 +199,7 @@ Python's ``int`` type is indeed not limited by a maximum machine-width. Thus to >>> f"{y:z#.8b}" '0b[...1]11111111' -This may have been useful to educate beginner users on how bitwise binary operations work, for example showing how ``-1 & x`` is always trivially equal to ``x``, or how the binary representation of the negation of a number can be obtained by adding one to its bitwise complement: +This may have been useful to educate beginners on how bitwise binary operations work, for example showing how ``-1 & x`` is always trivially equal to ``x``, or how the binary representation of the negation of a number can be obtained by adding one to its bitwise complement: .. code-block:: python @@ -264,11 +264,11 @@ Syntax Backwards Compatibility ======================= -To quote :pep:`682` +To quote :pep:`682`: - The new formatting behavior is opt-in, so numerical formatting of existing programs will not be affected + The new formatting behavior is opt-in, so numerical formatting of existing programs will not be affected. 
-unless someone out there is specifically relying upon ``.`` raising a ``ValueError`` for integers as it currently does, but to quote :pep:`475` +unless someone out there is specifically relying upon ``.`` raising a ``ValueError`` for integers as it currently does, but to quote :pep:`475`: The authors of this PEP don't think that such applications exist From d1e47e47f4bdfaa4a007c913a6bc9b887e107d8b Mon Sep 17 00:00:00 2001 From: jb2170 Date: Fri, 16 May 2025 08:04:47 +0100 Subject: [PATCH 08/19] Hyphenate 8-bit etc when appropriate --- peps/pep-0786.rst | 18 +++++++++--------- 1 file changed, 9 insertions(+), 9 deletions(-) diff --git a/peps/pep-0786.rst b/peps/pep-0786.rst index 2a11990785f..2b6b485b7e3 100644 --- a/peps/pep-0786.rst +++ b/peps/pep-0786.rst @@ -90,12 +90,12 @@ Observe that one can always extend a signed number's binary representation by ex .. code-block:: text - 45 (8 bit) 00101101 - 45 (9 bit) 000101101 - -19 (8 bit) 11101101 - -19 (9 bit) 111101101 + 45 (8-bit) 00101101 + 45 (9-bit) 000101101 + -19 (8-bit) 11101101 + -19 (9-bit) 111101101 -For non-negative numbers this is obvious. For negative numbers this is because the erstwhile leading column of an ``n`` bit representation goes from having a value of ``-2 ** (n-1)``, to ``+2 ** (n-1)``, with a new ``n+1``\ th column of value ``-2 ** n`` prefixed on, the overall sum unaffected. +For non-negative numbers this is obvious. For negative numbers this is because the erstwhile leading column of an ``n``\ -bit representation goes from having a value of ``-2 ** (n-1)``, to ``+2 ** (n-1)``, with a new ``n+1``\ th column of value ``-2 ** n`` prefixed on, the overall sum unaffected. This is what C's ``printf`` does, working with powers of two as the numbers of digits: @@ -113,10 +113,10 @@ Conversely it should be clear that one can losslessly truncate a signed number's .. 
code-block:: text - 45 (8 bit) 00101101 - 45 (7 bit) 0101101 - -19 (8 bit) 11101101 - -19 (7 bit) 1101101 + 45 (8-bit) 00101101 + 45 (7-bit) 0101101 + -19 (8-bit) 11101101 + -19 (7-bit) 1101101 If one were to truncate another digit off of these examples, then both would end up as ``101101``, 45 indistinguishable from -19 when using only 6 binary digits because they are both the same modulo ``2 ** 6 = 64``. Therefore to losslessly and unambiguously represent a signed integer ``x`` as a binary string which is rendered to the end user, we have a defacto 'minimal width' representation convention, using ``n`` digits, where ``n`` is the smallest integer such that ``x`` is in ``range(-2 ** (n-1), 2 ** (n-1))``. From be6f1dd48745202cdbedc11c7aed9eea87ae6e93 Mon Sep 17 00:00:00 2001 From: jb2170 Date: Fri, 16 May 2025 08:06:46 +0100 Subject: [PATCH 09/19] Correct 'defacto' -> 'de facto' --- peps/pep-0786.rst | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/peps/pep-0786.rst b/peps/pep-0786.rst index 2b6b485b7e3..2a47dba316d 100644 --- a/peps/pep-0786.rst +++ b/peps/pep-0786.rst @@ -118,7 +118,7 @@ Conversely it should be clear that one can losslessly truncate a signed number's -19 (8-bit) 11101101 -19 (7-bit) 1101101 -If one were to truncate another digit off of these examples, then both would end up as ``101101``, 45 indistinguishable from -19 when using only 6 binary digits because they are both the same modulo ``2 ** 6 = 64``. Therefore to losslessly and unambiguously represent a signed integer ``x`` as a binary string which is rendered to the end user, we have a defacto 'minimal width' representation convention, using ``n`` digits, where ``n`` is the smallest integer such that ``x`` is in ``range(-2 ** (n-1), 2 ** (n-1))``. +If one were to truncate another digit off of these examples, then both would end up as ``101101``, 45 indistinguishable from -19 when using only 6 binary digits because they are both the same modulo ``2 ** 6 = 64``. 
Therefore to losslessly and unambiguously represent a signed integer ``x`` as a binary string which is rendered to the end user, we have a de facto 'minimal width' representation convention, using ``n`` digits, where ``n`` is the smallest integer such that ``x`` is in ``range(-2 ** (n-1), 2 ** (n-1))``. For rendering octal and hexadecimal strings one has to extend the definition of the 'minimal width' representation convention to be sufficiently unambiguous. 383's minimal width binary string is ``0101111111``, and -129's is ``101111111``, a suffix of the former's. A naive, incorrect, implementation of hexadecimal string formatting would render both as ``'0x17f'`` by *padding* both binary representations to ``000101111111``. The method was correct to desire a number of binary digits (12) that is divisible by the number of bits in the base (4 bits in base 16) so that the binary representation can be segmented up into (hex) digits, but it was incorrect in *padding*; the method should have instead *extended* as we have observed previously, 383 extended to ``000101111111``, and -129 extended to ``111101111111``, whence 383 is rendered as ``'0x17f'`` and -129 as ``0xf7f``. @@ -143,7 +143,7 @@ Minimal Width Representation Convention This idea was fiercely entertained only due to its lossless behavior, however it is a obstacle to ergonomics in every candidate use case. These arguments about the aesthetics of string rendering are not irrational or about personal taste, but rather they are crucial in how information is communicated to the end user. -In a program in which signed-ness of integers is critical to communicate, any implementation of ``z`` should not be used, as the average user will be expecting to see a negative sign ``-``. 
The alternative of using minimal width representation convention requires one to be uncomfortably vigilant looking for leading digits of numbers belonging to the upper half of the base's range whenever a negative number is present (``1`` for binary, ``4-7`` for octal, and ``8-f`` for hex). Any end user that is not aware of this defacto convention, and even those who are but are not expecting it to be present in a program, would have a hard time: +In a program in which signed-ness of integers is critical to communicate, any implementation of ``z`` should not be used, as the average user will be expecting to see a negative sign ``-``. The alternative of using minimal width representation convention requires one to be uncomfortably vigilant looking for leading digits of numbers belonging to the upper half of the base's range whenever a negative number is present (``1`` for binary, ``4-7`` for octal, and ``8-f`` for hex). Any end user that is not aware of this de facto convention, and even those who are but are not expecting it to be present in a program, would have a hard time: The formatting of 128 and -128 using ``f"{x:z#.2x}"`` would produce ``'0x080'`` and ``'0x80'`` respectively. It is the PEP author's opinion that there is a 0% chance that ``'0x80'`` is being read as *negative* 128 under normal conditions. Furthermore the hideous rendering of positive 128 as ``'0x080'`` is useless for a program that should produce a uniformly spaced hexdump of bytes, agnostic of whether they are signed or unsigned; all bytes should be rendered in the form ``'0xNN'``. See the `examples <#modulo-precision>`__ section on how modulo-precision handles bytes in the correct sign-agnostic way. From b5c3b244382ef0a30c37f07309bb76cbeaf7e725 Mon Sep 17 00:00:00 2001 From: jb2170 Date: Fri, 16 May 2025 08:14:59 +0100 Subject: [PATCH 10/19] Reword the verdict of '!' 
syntax to not sound like there already is clutter in the format specification --- peps/pep-0786.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/peps/pep-0786.rst b/peps/pep-0786.rst index 2a47dba316d..f2f0b5cfec5 100644 --- a/peps/pep-0786.rst +++ b/peps/pep-0786.rst @@ -258,7 +258,7 @@ Syntax Verdict: - - Whilst graphically attractive, ``!`` would add too much more clutter to the format specification for a purpose that can be achieved by overloading the preexisting ``z`` flag. + - Whilst graphically attractive, ``!`` would clutter the format specification for a single purpose that can be achieved by overloading the preexisting ``z`` flag. Backwards Compatibility From 17a5532f6ea047b9975b8dcf75814fd6d6e3a3a5 Mon Sep 17 00:00:00 2001 From: jb2170 Date: Fri, 16 May 2025 08:24:20 +0100 Subject: [PATCH 11/19] Remove TODO bullet point for authorship I'm the author of this PEP. Sergey and Raymond shouldn't be pestered over it. They are in the appropriate 'Thanks' section. --- peps/pep-0786.rst | 1 - 1 file changed, 1 deletion(-) diff --git a/peps/pep-0786.rst b/peps/pep-0786.rst index f2f0b5cfec5..d0bb734256c 100644 --- a/peps/pep-0786.rst +++ b/peps/pep-0786.rst @@ -344,7 +344,6 @@ TODO AND REMOVE BEFORE MERGE * Format all lines to ~80 characters. I've left this formatting until we're happy with the contents. * RFC 2119 Style Specification? After all is said and done here. -* Add Sergey and Raymond to the authors field, or is Thanks enough? * Give a good example of non-hardware aware use of modular arithmetic formatting, my brain has gone blank... 
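Reviewer note: the truncation tables quoted in the hunks above (45 and -19 at 8, 7 and then 6 bits) can be reproduced in current Python by reducing with ``%`` and zero-padding with ``width``. This is only a minimal sketch; the helper name ``twos_complement_bits`` is hypothetical and is not part of the PEP or of any proposed implementation.

```python
def twos_complement_bits(x: int, n: int) -> str:
    """Render x as exactly n binary digits, two's complement style.

    Hypothetical helper for illustration only: it reduces x into
    range(2 ** n) with the % operator, then zero-pads to n digits.
    """
    reduced = x % 2 ** n          # always non-negative for a positive modulus
    return f"{reduced:0{n}b}"     # no prefix or sign, so width == digit count


# Values taken from the tables in the quoted PEP text.
print(twos_complement_bits(45, 8))    # 00101101
print(twos_complement_bits(-19, 8))   # 11101101
print(twos_complement_bits(45, 7))    # 0101101
print(twos_complement_bits(-19, 7))   # 1101101
# At 6 bits the two values collide, since 45 % 64 == -19 % 64 == 45.
print(twos_complement_bits(45, 6))    # 101101
print(twos_complement_bits(-19, 6))   # 101101
```

The collision at 6 bits is the ambiguity the reworded sentence in the following patch describes.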
From a06e9824943719eb7299190b8b09846fa85da482 Mon Sep 17 00:00:00 2001 From: jb2170 Date: Fri, 16 May 2025 09:06:20 +0100 Subject: [PATCH 12/19] Reword 45 and -19 being indistinguishable modulo 64, '101101' --- peps/pep-0786.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/peps/pep-0786.rst b/peps/pep-0786.rst index d0bb734256c..d36594403dd 100644 --- a/peps/pep-0786.rst +++ b/peps/pep-0786.rst @@ -118,7 +118,7 @@ Conversely it should be clear that one can losslessly truncate a signed number's -19 (8-bit) 11101101 -19 (7-bit) 1101101 -If one were to truncate another digit off of these examples, then both would end up as ``101101``, 45 indistinguishable from -19 when using only 6 binary digits because they are both the same modulo ``2 ** 6 = 64``. Therefore to losslessly and unambiguously represent a signed integer ``x`` as a binary string which is rendered to the end user, we have a de facto 'minimal width' representation convention, using ``n`` digits, where ``n`` is the smallest integer such that ``x`` is in ``range(-2 ** (n-1), 2 ** (n-1))``. +If one were to truncate another digit off of these examples, then both would end up as ``101101``, 45 being indistinguishable from -19 when using only 6 binary digits because they are both the same modulo ``2 ** 6 = 64``. Therefore to losslessly and unambiguously represent a signed integer ``x`` as a binary string which is rendered to the end user, we have a de facto 'minimal width' representation convention, using ``n`` digits, where ``n`` is the smallest integer such that ``x`` is in ``range(-2 ** (n-1), 2 ** (n-1))``. For rendering octal and hexadecimal strings one has to extend the definition of the 'minimal width' representation convention to be sufficiently unambiguous. 383's minimal width binary string is ``0101111111``, and -129's is ``101111111``, a suffix of the former's. 
A naive, incorrect, implementation of hexadecimal string formatting would render both as ``'0x17f'`` by *padding* both binary representations to ``000101111111``. The method was correct to desire a number of binary digits (12) that is divisible by the number of bits in the base (4 bits in base 16) so that the binary representation can be segmented up into (hex) digits, but it was incorrect in *padding*; the method should have instead *extended* as we have observed previously, 383 extended to ``000101111111``, and -129 extended to ``111101111111``, whence 383 is rendered as ``'0x17f'`` and -129 as ``0xf7f``. From 2086547af696a30557c38efb3d8bb015a247cb1c Mon Sep 17 00:00:00 2001 From: jb2170 Date: Thu, 22 May 2025 08:02:32 +0100 Subject: [PATCH 13/19] Remove Sergey from Thanks as per his request --- peps/pep-0786.rst | 1 - 1 file changed, 1 deletion(-) diff --git a/peps/pep-0786.rst b/peps/pep-0786.rst index d36594403dd..fd797eff66c 100644 --- a/peps/pep-0786.rst +++ b/peps/pep-0786.rst @@ -328,7 +328,6 @@ Thanks Thank you to -* Sergey B Kirpichev, for discussions and implementation code. * Raymond Hettinger, for the initial suggestion of the two's complement behavior. 
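Reviewer note: the 'minimal width' convention discussed in the context lines above (a rejected alternative in the PEP) can be sketched as follows, assuming hexadecimal output and sign *extension* rather than zero padding. The function name ``minimal_width_hex`` is hypothetical; the sketch only mirrors the worked values in the quoted text and is not the PEP's proposed behavior.

```python
def minimal_width_hex(x: int) -> str:
    """Hex string using the 'minimal width' two's complement convention.

    Hypothetical illustration of the rejected alternative described in
    the PEP text: find the smallest n with x in
    range(-2 ** (n - 1), 2 ** (n - 1)), round n up to a whole number of
    hex digits, then sign-extend by reducing modulo 16 ** digits.
    """
    magnitude = x if x >= 0 else ~x       # ~x == -x - 1 for negative x
    n_bits = magnitude.bit_length() + 1   # +1 for the sign column
    digits = -(-n_bits // 4)              # ceiling division: 4 bits per hex digit
    return f"0x{x % 16 ** digits:0{digits}x}"


# Values taken from the quoted PEP text.
print(minimal_width_hex(383))    # 0x17f
print(minimal_width_hex(-129))   # 0xf7f
print(minimal_width_hex(128))    # 0x080
print(minimal_width_hex(-128))   # 0x80
```

The last two lines show the ergonomic problem raised in the quoted text: positive 128 gains an extra digit while -128 renders as ``0x80``.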
From d477f4cea14bac021ba40f0bacc8fbbe505016ba Mon Sep 17 00:00:00 2001 From: jb2170 Date: Thu, 22 May 2025 10:39:17 +0100 Subject: [PATCH 14/19] Restructure Abstract and remove mentions to divmod --- peps/pep-0786.rst | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-) diff --git a/peps/pep-0786.rst b/peps/pep-0786.rst index fd797eff66c..9f87d38c2c7 100644 --- a/peps/pep-0786.rst +++ b/peps/pep-0786.rst @@ -12,9 +12,11 @@ Post-History: `14-Feb-2025 `__, Abstract ======== -This PEP proposes implementing the standard format specifier ``.`` of :pep:`3101` as "precision" for integers formatted with the binary, octal, decimal, and hexadecimal presentation types, and implementing the standard format specifier ``z`` as a "modulo" flag for integers formatted with precision and the binary, octal, and hexadecimal presentation types, which first reduces the integer into ``range(base ** precision)``, resulting in a predictable two's complement style formatting. +This PEP proposes implementing the standard format specifiers ``.`` and ``z`` of :pep:`3101` for integer fields as "precision" and "modulo-precision" respectively. Both are presented together in this PEP as the alternative rejected implementations entail intertwined combinations of both. -Both "precision" (``.``), and the "modulo-precision" flag (``z``) are presented in this PEP, as the alternative rejected implementations entail combinations of both. +``.`` ("precision") shall format an integer to a specified *minimum* number of digits, identical to the behavior of old-style ``%`` formatting. This shall be implemented for all integer presentation types except ``'c'``. + +``z`` ("modulo-precision") shall be permitted as an optional "modulo" flag when formatting an integer with precision and one of the binary, octal, or hexadecimal presentation types (bases that are powers of two). This first reduces the integer into ``range(base ** precision)`` using the ``%`` operator. 
The result is a predictable two's complement style formatting with the *exact* number of digits equal to the precision. This PEP amends the clause of :pep:`3101` which states "[t]he precision is ignored for integer conversions". @@ -72,9 +74,9 @@ We desire two behaviors, which motivates the implementation of a flag ``z`` to t For example ``f"{-12:#.2x}"`` shall produce ``'-0x0c'``, equivalent to ``"%#.2x" % -12``. -* For precision with the ``z`` flag, ``q, r = divmod(x, base ** n)`` is first taken when formatting ``f"{x:z.{n}{base_char}}"``, and ``r`` is passed on to precision, the resulting string being equivalent to ``f"{r:.{n}{base_char}}"``. Because ``r`` is in ``range(base ** n)`` the number of digits will always be exactly ``n``, resulting in a predictable two's complement style formatting, which is useful to the end user in environments that deal with machine-width oriented integers such as :mod:`struct`. +* For precision with the ``z`` flag, ``r = x % base ** n`` is first taken when formatting ``f"{x:z.{n}{base_char}}"``, and ``r`` is passed on to precision, the resulting string being equivalent to ``f"{r:.{n}{base_char}}"``. Because ``r`` is in ``range(base ** n)`` the number of digits will always be exactly ``n``, resulting in a predictable two's complement style formatting, which is useful to the end user in environments that deal with machine-width oriented integers such as :mod:`struct`. - For example in formatting ``f"{-1:z#.2x}"``, ``-1`` is reduced modulo ``256`` via ``-1, 255 = divmod(-1, 256)``, the resulting string being equivalent to ``f"{255:#.2x}"``, which is ``'0xff'``. + For example in formatting ``f"{-1:z#.2x}"``, ``-1`` is reduced modulo ``256`` via ``255 = -1 % 256``, the resulting string being equivalent to ``f"{255:#.2x}"``, which is ``'0xff'``. The ``z`` flag shall only be implemented for presentation types corresponding to bases that are powers of two, specifically at present binary, octal, and hexadecimal. 
Whilst reduction of integers modulo by powers of ten is computationally possible, a 'ten's complement?' has no demand and so precision is unimplemented for decimal presentation types. The ``z`` flag shall work for all integers, not just negatives. From e49ebdf5194fc90ddf11d380fb777e26d42ae719 Mon Sep 17 00:00:00 2001 From: jb2170 Date: Thu, 22 May 2025 10:48:59 +0100 Subject: [PATCH 15/19] Fix quoting style; when quoting a whole sentence one should keep the capitalisation --- peps/pep-0786.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/peps/pep-0786.rst b/peps/pep-0786.rst index 9f87d38c2c7..1a199228652 100644 --- a/peps/pep-0786.rst +++ b/peps/pep-0786.rst @@ -18,7 +18,7 @@ This PEP proposes implementing the standard format specifiers ``.`` and ``z`` of ``z`` ("modulo-precision") shall be permitted as an optional "modulo" flag when formatting an integer with precision and one of the binary, octal, or hexadecimal presentation types (bases that are powers of two). This first reduces the integer into ``range(base ** precision)`` using the ``%`` operator. The result is a predictable two's complement style formatting with the *exact* number of digits equal to the precision. -This PEP amends the clause of :pep:`3101` which states "[t]he precision is ignored for integer conversions". +This PEP amends the clause of :pep:`3101` which states "The precision is ignored for integer conversions". Rationale From 9cd0077cb7d50b5aaaadb87c97ae1aca06a2a31c Mon Sep 17 00:00:00 2001 From: jb2170 Date: Thu, 22 May 2025 10:52:22 +0100 Subject: [PATCH 16/19] Remove TODO bullet point about RFC 2119 The new Abstract given by d477f4cea14bac021ba40f0bacc8fbbe505016ba seems sufficient. 
--- peps/pep-0786.rst | 1 - 1 file changed, 1 deletion(-) diff --git a/peps/pep-0786.rst b/peps/pep-0786.rst index 1a199228652..d37fe12ee65 100644 --- a/peps/pep-0786.rst +++ b/peps/pep-0786.rst @@ -344,7 +344,6 @@ TODO AND REMOVE BEFORE MERGE ============================ * Format all lines to ~80 characters. I've left this formatting until we're happy with the contents. -* RFC 2119 Style Specification? After all is said and done here. * Give a good example of non-hardware aware use of modular arithmetic formatting, my brain has gone blank... From 565466d836fa7f4efac4a75d6ee4af5ce2935f4f Mon Sep 17 00:00:00 2001 From: jb2170 Date: Thu, 22 May 2025 11:57:59 +0100 Subject: [PATCH 17/19] Format to ~80 chars --- peps/pep-0786.rst | 378 ++++++++++++++++++++++++++++++++++++---------- 1 file changed, 301 insertions(+), 77 deletions(-) diff --git a/peps/pep-0786.rst b/peps/pep-0786.rst index d37fe12ee65..e91aeb26057 100644 --- a/peps/pep-0786.rst +++ b/peps/pep-0786.rst @@ -12,19 +12,36 @@ Post-History: `14-Feb-2025 `__, Abstract ======== -This PEP proposes implementing the standard format specifiers ``.`` and ``z`` of :pep:`3101` for integer fields as "precision" and "modulo-precision" respectively. Both are presented together in this PEP as the alternative rejected implementations entail intertwined combinations of both. +This PEP proposes implementing the standard format specifiers ``.`` and ``z`` +of :pep:`3101` for integer fields as "precision" and "modulo-precision" +respectively. Both are presented together in this PEP as the alternative +rejected implementations entail intertwined combinations of both. -``.`` ("precision") shall format an integer to a specified *minimum* number of digits, identical to the behavior of old-style ``%`` formatting. This shall be implemented for all integer presentation types except ``'c'``. +``.`` ("precision") shall format an integer to a specified *minimum* number of +digits, identical to the behavior of old-style ``%`` formatting. 
This shall be +implemented for all integer presentation types except ``'c'``. -``z`` ("modulo-precision") shall be permitted as an optional "modulo" flag when formatting an integer with precision and one of the binary, octal, or hexadecimal presentation types (bases that are powers of two). This first reduces the integer into ``range(base ** precision)`` using the ``%`` operator. The result is a predictable two's complement style formatting with the *exact* number of digits equal to the precision. +``z`` ("modulo-precision") shall be permitted as an optional "modulo" flag +when formatting an integer with precision and one of the binary, octal, or +hexadecimal presentation types (bases that are powers of two). This first +reduces the integer into ``range(base ** precision)`` using the ``%`` operator. +The result is a predictable two's complement style formatting with the *exact* +number of digits equal to the precision. -This PEP amends the clause of :pep:`3101` which states "The precision is ignored for integer conversions". +This PEP amends the clause of :pep:`3101` which states "The precision is +ignored for integer conversions". Rationale ========= -When string formatting integers in binary octal and hexadecimal, one often desires the resulting string to contain a guaranteed minimum number of digits. For unsigned integers of known machine-width bounds (for example, 8-bit bytes) this often also ends up the exact resulting number of digits. This has previously been implemented in the old-style ``%`` formatting using the ``.`` "precision" format specifier, closely related to that of the C programming language. +When string formatting integers in binary octal and hexadecimal, one often +desires the resulting string to contain a guaranteed minimum number of digits. +For unsigned integers of known machine-width bounds (for example, 8-bit bytes) +this often also ends up the exact resulting number of digits. 
This has +previously been implemented in the old-style ``%`` formatting using the +``.`` "precision" format specifier, closely related to that of the C +programming language. .. code-block:: python @@ -33,9 +50,19 @@ When string formatting integers in binary octal and hexadecimal, one often desir >>> "0o%.3o" % 18 '0o022' # three octal digits, ideal for displaying a umask or file permissions -When :pep:`3101` new-style formatting was first introduced, used in ``str.format`` and f-strings, the `format specification `_ was simple enough that the behavior of "precision" could be trivially emulated with the ``width`` format specifier. Precision therefore was left unimplemented and forbidden for ``int`` fields. However, as time has progressed and new format specifiers have been added, whose interactions with ``width`` noticeably diverge its behavior away from emulating precision, the readmission of precision as its own format specifier, ``.``, is sufficiently warranted. +When :pep:`3101` new-style formatting was first introduced, used in +``str.format`` and f-strings, the `format specification `_ was +simple enough that the behavior of "precision" could be trivially emulated with +the ``width`` format specifier. Precision therefore was left unimplemented and +forbidden for ``int`` fields. However, as time has progressed and new format +specifiers have been added, whose interactions with ``width`` noticeably +diverge its behavior away from emulating precision, the readmission of +precision as its own format specifier, ``.``, is sufficiently warranted. -The ``width`` format specifier guarantees a minimum length of the entire replacement field, not just the number of digits in a formatted integer. For example, the wonderful ``#`` specifier that prepends the prefix of the corresponding presentation type consumes from ``width``: +The ``width`` format specifier guarantees a minimum length of the entire +replacement field, not just the number of digits in a formatted integer. 
+For example, the wonderful ``#`` specifier that prepends the prefix of the +corresponding presentation type consumes from ``width``: .. code-block:: python @@ -47,48 +74,109 @@ The ``width`` format specifier guarantees a minimum length of the entire replace >>> f"{x:#08b}" '0b001100' # we wanted 8 bits, not 6 :( -One could attempt to argue that since the length of a prefix is known to always be 2, it can be accounted for manually by adding 2 to the desired number of digits. Consider however the following demonstrations of why this is a bad idea: - -* By correcting the second example to ``f"{x:#04x}"``, at a glance this looks like it may produce four hex digits, but it only produces two. This is bad for readability. ``4`` is thus too much of a 'magic number', and trying to counter that by being overly explicit with ``f"{x:#0{2+2}x}"`` looks ridiculous. -* In the future it is possible that a type specifier may be added with a prefix not of length 2, meaning the programmer has to calculate the prefix length, rather than Python's internal string formatting code handling that automatically. -* Things get more complicated when using the ``sign`` format specifier, ``f"{x: #0{1+2+2}x}"`` required to produce ``' 0x0c'``. -* Things get *even more* complicated when introducing a ``grouping_option``, for example formatting an integer into ``k`` 'word' segments joined by ``_``: ``x = 3735928559; k = 2; f"{x: #0{1+2+4*k+(k - 1)}_x}"`` is required to produce ``' 0xdead_beef'``. Surely this would be easier to write with precision as ``f"{x: #_.8x}"``? - -It is clear at this point that the reduction of complexity that would be provided by precision's implementation for ``int`` fields would be beneficial to any user. 
Nor is this proposal a new special-case behavior being demanded exclusively at the behest of ``int`` fields: the precision token ``.`` is already implemented as prescribed in :pep:`3101` for ``str`` data to truncate the field's length, and for ``float`` data to ensure that there are a fixed number of digits after the decimal point, eg ``f"{0.1+0.2: .4f}"`` producing ``' 0.3000'``. Thus no new tokens need adding to the `format specification `_ because of this proposal, maintaining its modest size. - -For the sake of completion, and lack of any reasonable objection, we propose that precision shall work also in decimal, base 10. Explicitly, the integer presentation types laid out in :pep:`3101` that are permitted to implement precision are ``'b'``, ``'d'``, ``'o'``, ``'x'``, ``'X'``, ``'n'``, and ``''`` (``None``). The only presentation type not permitted is ``c`` ('character'), whose purpose is to format an integer to a single Unicode character, or an appropriate replacement for non-printable characters, for which it does not make sense to implement precision. In the event that new integer presentation types are added in the future, such as ``'B'`` and ``'O'`` which mutatis-mutandis could provide the same behavior as ``'X'`` (that is a capitalized prefix and digits), their addition should appropriately consider whether precision should be implemented or not. In the case of ``'B'`` and ``'O'`` as described here it would be correct to implement precision. A ``ValueError`` shall be raised when precision is attempted to be used for invalid integer presentation types. +One could attempt to argue that since the length of a prefix is known to +always be 2, it can be accounted for manually by adding 2 to the desired +number of digits. Consider however the following demonstrations of why this is +a bad idea: + +* By correcting the second example to ``f"{x:#04x}"``, at a glance this looks + like it may produce four hex digits, but it only produces two. 
This is bad + for readability. ``4`` is thus too much of a 'magic number', and trying to + counter that by being overly explicit with ``f"{x:#0{2+2}x}"`` looks ridiculous. +* In the future it is possible that a type specifier may be added with a prefix + not of length 2, meaning the programmer has to calculate the prefix length, + rather than Python's internal string formatting code handling that automatically. +* Things get more complicated when using the ``sign`` format specifier, + ``f"{x: #0{1+2+2}x}"`` required to produce ``' 0x0c'``. +* Things get *even more* complicated when introducing a ``grouping_option``, + for example formatting an integer into ``k`` 'word' segments joined by ``_``: + ``x = 3735928559; k = 2; f"{x: #0{1+2+4*k+(k - 1)}_x}"`` is required to + produce ``' 0xdead_beef'``. Surely this would be easier to write + with precision as ``f"{x: #_.8x}"``? + +It is clear at this point that the reduction of complexity that would be +provided by precision's implementation for ``int`` fields would be beneficial +to any user. Nor is this proposal a new special-case behavior being demanded +exclusively at the behest of ``int`` fields: the precision token ``.`` is +already implemented as prescribed in :pep:`3101` for ``str`` data to truncate +the field's length, and for ``float`` data to ensure that there are a fixed +number of digits after the decimal point, eg ``f"{0.1+0.2: .4f}"`` producing +``' 0.3000'``. Thus no new tokens need adding to the `format specification `_ +because of this proposal, maintaining its modest size. + +For the sake of completion, and lack of any reasonable objection, we propose +that precision shall work also in decimal, base 10. Explicitly, the integer +presentation types laid out in :pep:`3101` that are permitted to implement +precision are ``'b'``, ``'d'``, ``'o'``, ``'x'``, ``'X'``, ``'n'``, +and ``''`` (``None``). 
The only presentation type not permitted is +``c`` ('character'), whose purpose is to format an integer to a single Unicode +character, or an appropriate replacement for non-printable characters, for +which it does not make sense to implement precision. In the event that new +integer presentation types are added in the future, such as ``'B'`` and ``'O'`` +which mutatis-mutandis could provide the same behavior as ``'X'`` (that is a +capitalized prefix and digits), their addition should appropriately consider +whether precision should be implemented or not. In the case of ``'B'`` and ``'O'`` +as described here it would be correct to implement precision. A ``ValueError`` +shall be raised when precision is attempted to be used for invalid integer +presentation types. Precision For Negative Numbers ------------------------------ -So far in this PEP we have cautiously avoided talking about the formatting of negative numbers with precision, which we shall now discuss. +So far in this PEP we have cautiously avoided talking about the formatting of +negative numbers with precision, which we shall now discuss. Short Verdict ''''''''''''' -We desire two behaviors, which motivates the implementation of a flag ``z`` to toggle on the latter's behavior: +We desire two behaviors, which motivates the implementation of a flag ``z`` to +toggle on the latter's behavior: -* For precision without the ``z`` flag, a negative integer ``x`` shall be formatted with a negative sign and the digits of ``-x``'s formatting. This is the same friendly behavior as old-style ``%`` formatting. +* For precision without the ``z`` flag, a negative integer ``x`` shall be + formatted with a negative sign and the digits of ``-x``'s formatting. This is + the same friendly behavior as old-style ``%`` formatting. For example ``f"{-12:#.2x}"`` shall produce ``'-0x0c'``, equivalent to ``"%#.2x" % -12``. 
-* For precision with the ``z`` flag, ``r = x % base ** n`` is first taken when formatting ``f"{x:z.{n}{base_char}}"``, and ``r`` is passed on to precision, the resulting string being equivalent to ``f"{r:.{n}{base_char}}"``. Because ``r`` is in ``range(base ** n)`` the number of digits will always be exactly ``n``, resulting in a predictable two's complement style formatting, which is useful to the end user in environments that deal with machine-width oriented integers such as :mod:`struct`. - - For example in formatting ``f"{-1:z#.2x}"``, ``-1`` is reduced modulo ``256`` via ``255 = -1 % 256``, the resulting string being equivalent to ``f"{255:#.2x}"``, which is ``'0xff'``. - - The ``z`` flag shall only be implemented for presentation types corresponding to bases that are powers of two, specifically at present binary, octal, and hexadecimal. Whilst reduction of integers modulo by powers of ten is computationally possible, a 'ten's complement?' has no demand and so precision is unimplemented for decimal presentation types. The ``z`` flag shall work for all integers, not just negatives. - - The syntax choice of ``z`` is again out of respect for maintaining the modest size of the `format specification `_. ``z`` was introduced to the format specification in :pep:`682` as a flag for normalizing negative zero to positive zero for the ``float`` and ``Decimal`` types. It is currently unimplemented for the ``int`` type, and since integers never have a 'negative zero' situation it seems uncontroversial to repurpose ``z``, again as a flag. If one squints hard enough, the ``z`` looks like a ``2`` for two's complement! +* For precision with the ``z`` flag, ``r = x % base ** n`` is first taken when + formatting ``f"{x:z.{n}{base_char}}"``, and ``r`` is passed on to precision, + the resulting string being equivalent to ``f"{r:.{n}{base_char}}"``. 
Because + ``r`` is in ``range(base ** n)`` the number of digits will always be exactly + ``n``, resulting in a predictable two's complement style formatting, which is + useful to the end user in environments that deal with machine-width oriented + integers such as :mod:`struct`. + + For example in formatting ``f"{-1:z#.2x}"``, ``-1`` is reduced modulo ``256`` + via ``255 = -1 % 256``, the resulting string being equivalent to ``f"{255:#.2x}"``, + which is ``'0xff'``. + + The ``z`` flag shall only be implemented for presentation types corresponding + to bases that are powers of two, specifically at present binary, octal, and + hexadecimal. Whilst reduction of integers modulo powers of ten is computationally + possible, a 'ten's complement' has no demand, and so the ``z`` flag is unimplemented + for decimal presentation types. The ``z`` flag shall work for all integers, + not just negatives. + + The syntax choice of ``z`` is again out of respect for maintaining the modest + size of the `format specification `_. ``z`` was introduced to the + format specification in :pep:`682` as a flag for normalizing negative zero to + positive zero for the ``float`` and ``Decimal`` types. It is currently + unimplemented for the ``int`` type, and since integers never have a 'negative zero' + situation it seems uncontroversial to repurpose ``z``, again as a flag. If one + squints hard enough, the ``z`` looks like a ``2`` for two's complement! Long Introspection '''''''''''''''''' -We first present some observations about the binary representations of *signed* integers in two's complement. This leads us to a couple of alternative formulations of formatting negative numbers. +We first present some observations about the binary representations of *signed* +integers in two's complement. This leads us to a couple of alternative formulations +of formatting negative numbers.
-Observe that one can always extend a signed number's binary representation by extending the the leading digit as a prefix: +Observe that one can always extend a signed number's binary representation by +extending the leading digit as a prefix: .. code-block:: text @@ -97,7 +185,10 @@ Observe that one can always extend a signed number's binary representation by ex -19 (8-bit) 11101101 -19 (9-bit) 111101101 -For non-negative numbers this is obvious. For negative numbers this is because the erstwhile leading column of an ``n``\ -bit representation goes from having a value of ``-2 ** (n-1)``, to ``+2 ** (n-1)``, with a new ``n+1``\ th column of value ``-2 ** n`` prefixed on, the overall sum unaffected. +For non-negative numbers this is obvious. For negative numbers this is because +the erstwhile leading column of an ``n``\ -bit representation goes from having a +value of ``-2 ** (n-1)``, to ``+2 ** (n-1)``, with a new ``n+1``\ th column of +value ``-2 ** n`` prefixed on, the overall sum unaffected. This is what C's ``printf`` does, working with powers of two as the numbers of digits: @@ -111,7 +202,9 @@ This is what C's ``printf`` does, working with powers of two as the numbers of d printf("%#o\n", -19); // 037777777755 printf("%#x\n", -19); // 0xffffffed -Conversely it should be clear that one can losslessly truncate a signed number's binary representation to have only one leading ``0`` if it is non-negative, and one leading ``1`` if it is negative: +Conversely it should be clear that one can losslessly truncate a signed number's +binary representation to have only one leading ``0`` if it is non-negative, and +one leading ``1`` if it is negative: ..
code-block:: text @@ -120,11 +213,31 @@ Conversely it should be clear that one can losslessly truncate a signed number's -19 (8-bit) 11101101 -19 (7-bit) 1101101 -If one were to truncate another digit off of these examples, then both would end up as ``101101``, 45 being indistinguishable from -19 when using only 6 binary digits because they are both the same modulo ``2 ** 6 = 64``. Therefore to losslessly and unambiguously represent a signed integer ``x`` as a binary string which is rendered to the end user, we have a de facto 'minimal width' representation convention, using ``n`` digits, where ``n`` is the smallest integer such that ``x`` is in ``range(-2 ** (n-1), 2 ** (n-1))``. - -For rendering octal and hexadecimal strings one has to extend the definition of the 'minimal width' representation convention to be sufficiently unambiguous. 383's minimal width binary string is ``0101111111``, and -129's is ``101111111``, a suffix of the former's. A naive, incorrect, implementation of hexadecimal string formatting would render both as ``'0x17f'`` by *padding* both binary representations to ``000101111111``. The method was correct to desire a number of binary digits (12) that is divisible by the number of bits in the base (4 bits in base 16) so that the binary representation can be segmented up into (hex) digits, but it was incorrect in *padding*; the method should have instead *extended* as we have observed previously, 383 extended to ``000101111111``, and -129 extended to ``111101111111``, whence 383 is rendered as ``'0x17f'`` and -129 as ``0xf7f``. - -Thus the generalized definition of our 'minimal width' representation convention is: for an integer ``x`` to rendered in base ``base``, produce ``n`` digits, where ``n`` is the smallest integer such that ``x`` is in ``range(-base ** n / 2, base ** n / 2)``. 
+If one were to truncate another digit off of these examples, then both would +end up as ``101101``, 45 being indistinguishable from -19 when using only 6 binary +digits because they are both the same modulo ``2 ** 6 = 64``. Therefore to +losslessly and unambiguously represent a signed integer ``x`` as a binary string +which is rendered to the end user, we have a de facto 'minimal width' representation +convention, using ``n`` digits, where ``n`` is the smallest integer such that +``x`` is in ``range(-2 ** (n-1), 2 ** (n-1))``. + +For rendering octal and hexadecimal strings one has to extend the definition of +the 'minimal width' representation convention to be sufficiently unambiguous. +383's minimal width binary string is ``0101111111``, and -129's is ``101111111``, +a suffix of the former's. A naive, incorrect, implementation of hexadecimal +string formatting would render both as ``'0x17f'`` by *padding* both binary +representations to ``000101111111``. The method was correct to desire a number +of binary digits (12) that is divisible by the number of bits in the base +(4 bits in base 16) so that the binary representation can be segmented up into +(hex) digits, but it was incorrect in *padding*; the method should have instead +*extended* as we have observed previously, 383 extended to ``000101111111``, +and -129 extended to ``111101111111``, whence 383 is rendered as ``'0x17f'`` +and -129 as ``'0xf7f'``. + +Thus the generalized definition of our 'minimal width' representation convention +is: for an integer ``x`` to be rendered in base ``base``, produce ``n`` digits, +where ``n`` is the smallest integer such that ``x`` is in +``range(-base ** n / 2, base ** n / 2)``. This leads onto the rejected alternatives. @@ -135,40 +248,109 @@ Rejected Alternatives Behavior of ``z`` ----------------- -The desired implementation of ``z``, the two's complement style formatting flag, has split into two main camps of opinions, disagreeing over lossless vs lossy presentation.
The lossless camp believes that the formatted strings corresponding to integers should all be distinct from each other, uniqueness preserved by the minimal width representation convention; precision with ``z`` enabled should still be only a *minimum* number of digits requested, as it is without ``z``. The lossy camp believes that precision with ``z`` enabled should first reduce the integer using modular arithmetic, which then produces *exactly* the number of digits requested, equivalent to left-truncating the minimal width representation string. +The desired implementation of ``z``, the two's complement style formatting flag, +has split into two main camps of opinions, disagreeing over lossless vs lossy +presentation. The lossless camp believes that the formatted strings corresponding +to integers should all be distinct from each other, uniqueness preserved by the +minimal width representation convention; precision with ``z`` enabled should still +be only a *minimum* number of digits requested, as it is without ``z``. The lossy +camp believes that precision with ``z`` enabled should first reduce the integer +using modular arithmetic, which then produces *exactly* the number of digits +requested, equivalent to left-truncating the minimal width representation string. -We endeavor to conclude in the following section that the former camp, lossless formatting, has no use cases, and is thus a rejected idea, whence this PEP proposes the latter, lossy, behavior. +We endeavor to conclude in the following section that the former camp, lossless +formatting, has no use cases, and is thus a rejected idea, whence this PEP +proposes the latter, lossy, behavior. Minimal Width Representation Convention ''''''''''''''''''''''''''''''''''''''' -This idea was fiercely entertained only due to its lossless behavior, however it is a obstacle to ergonomics in every candidate use case. 
These arguments about the aesthetics of string rendering are not irrational or about personal taste, but rather they are crucial in how information is communicated to the end user. - -In a program in which signed-ness of integers is critical to communicate, any implementation of ``z`` should not be used, as the average user will be expecting to see a negative sign ``-``. The alternative of using minimal width representation convention requires one to be uncomfortably vigilant looking for leading digits of numbers belonging to the upper half of the base's range whenever a negative number is present (``1`` for binary, ``4-7`` for octal, and ``8-f`` for hex). Any end user that is not aware of this de facto convention, and even those who are but are not expecting it to be present in a program, would have a hard time: - -The formatting of 128 and -128 using ``f"{x:z#.2x}"`` would produce ``'0x080'`` and ``'0x80'`` respectively. It is the PEP author's opinion that there is a 0% chance that ``'0x80'`` is being read as *negative* 128 under normal conditions. Furthermore the hideous rendering of positive 128 as ``'0x080'`` is useless for a program that should produce a uniformly spaced hexdump of bytes, agnostic of whether they are signed or unsigned; all bytes should be rendered in the form ``'0xNN'``. See the `examples <#modulo-precision>`__ section on how modulo-precision handles bytes in the correct sign-agnostic way. - -Contrapositively therefore ``z``'s purpose is to be used in environments where signed-ness is *not* critical, and more likely than not where it is even encouraged to treat the integers with respect to the modular arithmetic that arises in two's complement hardware of fixed register sizes. In the example above 128 and -128 are the same modulo 256, and the respectable rendering is ``'0x80'``. In general the purpose of ``z`` is to treat integers modulo ``base ** precision`` as the same. 
So too 255 and -1 should both be rendered as ``'0xff'``, not ``'0x0ff'`` and ``'0xff'`` respectively; the truncation is not a hindrance, but the desired behavior. Formally we may say that the formatting should be a well defined bijection between the equivalence classes of ``Z/(base ** precision)Z`` and strings with ``precision`` digits. - -The remaining question is "[sic] is there no chance to communicate this truncation to user?" as a concern for the 'loss of information' arising from the effectively left-truncated strings. We reject this question's premise that there ever is such a case of unintentional loss of information, considering the two cases of hardware-aware integers and otherwise: - -So far we have played around with examples of bytes in ``range(-128, 256)``, the union of the signed and unsigned ranges, with respect to which the virtues of formatting ``x`` and ``x - 256`` as the same are clearly established. In the hardware-aware contexts that one expects to find ``z``, any integers corresponding to bytes that lie outside that range are likely a programming error. For example if a library sets a pixel brightness integer to be 257, and prints out ``'0x01'`` instead of ``'0x101'`` via ``f"{x:z#.2x}"``, that's not our problem or doing; string formatting shouldn't raise an exception, or even a ``SyntaxWarning`` as an invalid escape sequence ``"\y"`` would, because ``ValueError: bytes must be in range(0, 256)`` will be raised by ``bytes`` when trying to serialize that integer via ``bytes([257])``; let the appropriate 'layer' of code raise the exception, as that is more indicative of a defect in the library, not our string formatting. - -In the case of non-hardware aware integers one would have to intentionally opt to use ``z``, in which modular arithmetic is the chosen desired effect. It is for this reason also that we shall not raise a ``SyntaxWarning`` or ``ValueError`` for integers lying outside of ``range(-base ** precision / 2, base ** precision)``. 
+This idea was fiercely entertained only due to its lossless behavior; however, it +is an obstacle to ergonomics in every candidate use case. These arguments about +the aesthetics of string rendering are not irrational or about personal taste, +but rather they are crucial in how information is communicated to the end user. + +In a program in which signed-ness of integers is critical to communicate, any +implementation of ``z`` should not be used, as the average user will be expecting +to see a negative sign ``-``. The alternative of using minimal width representation +convention requires one to be uncomfortably vigilant looking for leading digits +of numbers belonging to the upper half of the base's range whenever a negative +number is present (``1`` for binary, ``4-7`` for octal, and ``8-f`` for hex). +Any end user that is not aware of this de facto convention, and even those who +are but are not expecting it to be present in a program, would have a hard time: + +The formatting of 128 and -128 using ``f"{x:z#.2x}"`` would produce ``'0x080'`` +and ``'0x80'`` respectively. It is the PEP author's opinion that there is a 0% +chance that ``'0x80'`` is being read as *negative* 128 under normal conditions. +Furthermore the hideous rendering of positive 128 as ``'0x080'`` is useless for +a program that should produce a uniformly spaced hexdump of bytes, agnostic of +whether they are signed or unsigned; all bytes should be rendered in the form +``'0xNN'``. See the `examples <#modulo-precision>`__ section on how modulo-precision +handles bytes in the correct sign-agnostic way. + +Contrapositively therefore ``z``'s purpose is to be used in environments where +signed-ness is *not* critical, and more likely than not where it is even +encouraged to treat the integers with respect to the modular arithmetic that +arises in two's complement hardware of fixed register sizes. In the example above +128 and -128 are the same modulo 256, and the respectable rendering is ``'0x80'``.
+In general the purpose of ``z`` is to treat integers modulo ``base ** precision`` +as the same. So too 255 and -1 should both be rendered as ``'0xff'``, not +``'0x0ff'`` and ``'0xff'`` respectively; the truncation is not a hindrance, but +the desired behavior. Formally we may say that the formatting should be a well +defined bijection between the equivalence classes of ``Z/(base ** precision)Z`` +and strings with ``precision`` digits. + +The remaining question is "[sic] is there no chance to communicate this +truncation to user?" as a concern for the 'loss of information' arising from the +effectively left-truncated strings. We reject this question's premise that there +ever is such a case of unintentional loss of information, considering the two +cases of hardware-aware integers and otherwise: + +So far we have played around with examples of bytes in ``range(-128, 256)``, +the union of the signed and unsigned ranges, with respect to which the virtues +of formatting ``x`` and ``x - 256`` as the same are clearly established. In the +hardware-aware contexts that one expects to find ``z``, any integers corresponding +to bytes that lie outside that range are likely a programming error. For example +if a library sets a pixel brightness integer to be 257, and prints out ``'0x01'`` +instead of ``'0x101'`` via ``f"{x:z#.2x}"``, that's not our problem or doing; string +formatting shouldn't raise an exception, or even a ``SyntaxWarning`` as an invalid +escape sequence ``"\y"`` would, because ``ValueError: bytes must be in range(0, 256)`` +will be raised by ``bytes`` when trying to serialize that integer via ``bytes([257])``; +let the appropriate 'layer' of code raise the exception, as that is more indicative +of a defect in the library, not our string formatting. + +In the case of non-hardware aware integers one would have to intentionally opt to +use ``z``, in which modular arithmetic is the chosen desired effect. 
It is for +this reason also that we shall not raise a ``SyntaxWarning`` or ``ValueError`` +for integers lying outside of ``range(-base ** precision / 2, base ** precision)``. .. - XXX Give a good example of non-hardware aware use of modular arithmetic formatting like Minecraft buried treasure always being at 8,8 within a chunk. + XXX Give a good example of non-hardware aware use of modular arithmetic + formatting like Minecraft buried treasure always being at 8,8 within a chunk. -Thus we have defended the lossy behavior of ``z`` implemented as modulo-precision, and we have exhausted all reasonable use cases of lossless behavior. +Thus we have defended the lossy behavior of ``z`` implemented as modulo-precision, +and we have exhausted all reasonable use cases of lossless behavior. -A final compromise to consider and reject is implementing ``z`` not as a flag *dependent* on ``.``, but as a flag that can be *combined* with ``.``. Specifically: ``z`` without ``.`` would turn on two's complement mode to render the minimal width representation of the formatted integer, ``.`` without ``z`` would implement precision as already explained, a minimum number of digits in the magnitude and a sign if necessary, and ``z`` combined with ``.`` would turn on the left-truncating modulo-precision. This labyrinth of combinations does not seem useful to anyone, as we have already discredited the ergonomics of minimal width representation convention, whence ``z`` would rarely be used on its own, and this behavior of two options that individually render a *minimum* number of digits combining together to render an *exact* number of digits seems counterintuitive. +A final compromise to consider and reject is implementing ``z`` not as a flag +*dependent* on ``.``, but as a flag that can be *combined* with ``.``. 
+Specifically: ``z`` without ``.`` would turn on two's complement mode to render +the minimal width representation of the formatted integer, ``.`` without ``z`` +would implement precision as already explained, a minimum number of digits in the +magnitude and a sign if necessary, and ``z`` combined with ``.`` would turn on the +left-truncating modulo-precision. This labyrinth of combinations does not seem +useful to anyone, as we have already discredited the ergonomics of minimal width +representation convention, whence ``z`` would rarely be used on its own, and this +behavior of two options that individually render a *minimum* number of digits +combining together to render an *exact* number of digits seems counterintuitive. Infinite Length Indication '''''''''''''''''''''''''' -Another, less popular, rejected alternative was for ``z`` to directly acknowledge the infinite prefix of ``0``\ s or ``1``\ s that precede a non-negative or negative number respectively. For example: +Another, less popular, rejected alternative was for ``z`` to directly acknowledge +the infinite prefix of ``0``\ s or ``1``\ s that precede a non-negative or negative +number respectively. For example: .. code-block:: python @@ -177,9 +359,12 @@ Another, less popular, rejected alternative was for ``z`` to directly acknowledg >>> f"{300:z#.8b}" '0b[...0]100101100' -This is effectively the minimal width representation convention with an 'infinite' prefix attached to it. +This is effectively the minimal width representation convention with an 'infinite' +prefix attached to it. 
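For illustration of the rejected convention only, the 'minimal width' rule
defined earlier (the smallest ``n`` such that ``x`` lies in
``range(-base ** n / 2, base ** n / 2)``) can be sketched in current Python.
The name ``minimal_width`` and its parameters are hypothetical; they are not
part of this proposal or of any existing API, and the sketch omits the ``0x``
style prefix.

.. code-block:: python

    DIGITS = "0123456789abcdef"

    def minimal_width(x, base, min_digits=1):
        """Digits of x in the rejected 'minimal width' convention (sketch)."""
        n = max(min_digits, 1)
        # Grow n until x fits in the signed range of n base-`base` digits.
        while True:
            half = base ** n // 2
            if -half <= x < half:
                break
            n += 1
        # Reduce into range(base ** n): this *extends* the sign digit
        # rather than zero-padding the magnitude.
        r = x % base ** n
        out = []
        for _ in range(n):
            out.append(DIGITS[r % base])
            r //= base
        return "".join(reversed(out))

    print(minimal_width(383, 2))       # 0101111111
    print(minimal_width(-129, 2))      # 101111111
    print(minimal_width(383, 16))      # 17f
    print(minimal_width(-129, 16))     # f7f
    print(minimal_width(128, 16, 2))   # 080
    print(minimal_width(-128, 16, 2))  # 80

The last two lines reproduce the uneven ``'0x080'`` versus ``'0x80'`` widths
criticized above, which is the ergonomic cost of keeping the output lossless.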
-In the C programming language the machine-width dependent two's complement formatting of ``int`` data with precision exhibits excessive lengths of prefixes that arise from negative numbers, even those with small magnitude: +In the C programming language the machine-width dependent two's complement +formatting of ``int`` data with precision exhibits excessive lengths of prefixes +that arise from negative numbers, even those with small magnitude: .. code-block:: C @@ -188,7 +373,10 @@ In the C programming language the machine-width dependent two's complement forma This prefix could continue on indefinitely if it were not limited by a maximum machine-width! -Python's ``int`` type is indeed not limited by a maximum machine-width. Thus to avoid printing infinitely long two's complement strings we could use a similar approach to that of the builtin ``list``'s string formatting for printing a list that contains itself: +Python's ``int`` type is indeed not limited by a maximum machine-width. Thus to +avoid printing infinitely long two's complement strings we could use a similar +approach to that of the builtin ``list``'s string formatting for printing a list +that contains itself: .. code-block:: python @@ -201,7 +389,10 @@ Python's ``int`` type is indeed not limited by a maximum machine-width. Thus to >>> f"{y:z#.8b}" '0b[...1]11111111' -This may have been useful to educate beginners on how bitwise binary operations work, for example showing how ``-1 & x`` is always trivially equal to ``x``, or how the binary representation of the negation of a number can be obtained by adding one to its bitwise complement: +This may have been useful to educate beginners on how bitwise binary operations +work, for example showing how ``-1 & x`` is always trivially equal to ``x``, or +how the binary representation of the negation of a number can be obtained by +adding one to its bitwise complement: .. 
code-block:: python @@ -230,13 +421,19 @@ General * What about ones's complement, or other binary representations? - Two's complement is so dominant that no one really considers other representations. GCC only supports two's complement. + Two's complement is so dominant that no one really considers other representations. + GCC only supports two's complement. * Could we do nothing? - Programmers continue to hobble on using the ``width`` format specifier with ad-hoc corrections to mimic precision. This is intolerable, and the rationale of this PEP makes conclusive arguments for the addition and implementation choices of precision. + Programmers continue to hobble on using the ``width`` format specifier with ad-hoc + corrections to mimic precision. This is intolerable, and the rationale of this PEP + makes conclusive arguments for the addition and implementation choices of precision. - Refusing to implement precision for integer fields using ``.`` reserves ``.`` for possible future uses. However in the ~20 year timespan since :pep:`3101` no alternatives have been accepted, and any alternate use of ``.`` takes it further out of sync with both old-style ``%`` formatting, and the C programming language. + Refusing to implement precision for integer fields using ``.`` reserves ``.`` for + possible future uses. However in the ~20 year timespan since :pep:`3101` no + alternatives have been accepted, and any alternate use of ``.`` takes it further + out of sync with both old-style ``%`` formatting, and the C programming language. Syntax @@ -246,21 +443,38 @@ Syntax Pros: - - ``!`` is graphically related to ``.``, an extension if you will. Precision with the modulo-precision flag set is indeed an extension of precision. - - ``!`` in the English language is often used for imperative, commanding sentences. So too modulo-precision commands the *exact* number of digits to which its input shall be formatted, whereas precision is the *minimum* number of digits. This is idiomatic. 
- - ``!`` is only one symbol as opposed to ``z.``. This coupled with ``!`` being mutually exclusive with ``.`` leaves the overall length of one's written code unaffected when switching on modulo-precision. + - ``!`` is graphically related to ``.``, an extension if you will. Precision + with the modulo-precision flag set is indeed an extension of precision. + - ``!`` in the English language is often used for imperative, commanding sentences. + So too modulo-precision commands the *exact* number of digits to which its input + shall be formatted, whereas precision is the *minimum* number of digits. + This is idiomatic. + - ``!`` is only one symbol as opposed to ``z.``. This coupled with ``!`` being + mutually exclusive with ``.`` leaves the overall length of one's written code + unaffected when switching on modulo-precision. - Using a new ``!`` symbol reserves ``z`` for other future uses, whatever that may be. Cons: - - ``z.`` also conveys a sense of extension from ``.``, a flag attached to ``.``, and lexicographically flows left to right as 'modulo' (``z``) 'precision' (``.``). - - ``.`` and ``!`` being mutually exclusive to each other may give a beginner programmer analysis-paralysis over which to choose when looking at the `format specification `_ documentation. - - ``!`` would be another addition to the format specification for a single purpose. It would not have any implementation for ``str``, ``float``, or any other type. - - There also already exists a ``["!" conversion]`` "explicit conversion flag" in the `format string syntax `_ as laid out in :pep:`3101`. For example in ``f"{s!r}"`` the ``!r`` calls ``repr`` on ``s``. This would *not* syntactically clash with a ``!`` format specifier, the format specifiers ``[":" format_spec]`` being separated by a well-defined preceding colon, however users unfamiliar with the new modulo-precision mode may glance over format strings containing ``!`` and expect different behavior. 
+ - ``z.`` also conveys a sense of extension from ``.``, a flag attached to ``.``, + and lexicographically flows left to right as 'modulo' (``z``) 'precision' (``.``). + - ``.`` and ``!`` being mutually exclusive to each other may give a beginner + programmer analysis-paralysis over which to choose when looking at the + `format specification `_ documentation. + - ``!`` would be another addition to the format specification for a single purpose. + It would not have any implementation for ``str``, ``float``, or any other type. + - There also already exists a ``["!" conversion]`` "explicit conversion flag" + in the `format string syntax `_ as laid out in :pep:`3101`. + For example in ``f"{s!r}"`` the ``!r`` calls ``repr`` on ``s``. This would + *not* syntactically clash with a ``!`` format specifier, the format specifiers + ``[":" format_spec]`` being separated by a well-defined preceding colon, + however users unfamiliar with the new modulo-precision mode may glance over + format strings containing ``!`` and expect different behavior. Verdict: - - Whilst graphically attractive, ``!`` would clutter the format specification for a single purpose that can be achieved by overloading the preexisting ``z`` flag. + - Whilst graphically attractive, ``!`` would clutter the format specification for + a single purpose that can be achieved by overloading the preexisting ``z`` flag. Backwards Compatibility @@ -268,9 +482,11 @@ Backwards Compatibility To quote :pep:`682`: - The new formatting behavior is opt-in, so numerical formatting of existing programs will not be affected. + The new formatting behavior is opt-in, so numerical formatting of existing + programs will not be affected. 
-unless someone out there is specifically relying upon ``.`` raising a ``ValueError`` for integers as it currently does, but to quote :pep:`475`: +unless someone out there is specifically relying upon ``.`` raising a ``ValueError`` +for integers as it currently does, but to quote :pep:`475`: The authors of this PEP don't think that such applications exist @@ -281,9 +497,14 @@ Examples And Teaching Precision --------- -Documentation and tutorials in the Python sphere of influence should encourage the adoption of ``.``, precision, as the default format specifier for formatting ``int`` fields as opposed to ``width``, when it is clear a minimum number of *digits* is required, not a minimum length of the *whole replacement field*. +Documentation and tutorials in the Python sphere of influence should encourage +the adoption of ``.``, precision, as the default format specifier for formatting +``int`` fields as opposed to ``width``, when it is clear a minimum number of *digits* +is required, not a minimum length of the *whole replacement field*. -Since the concept of precision is common in other languages such as C, and was already present in Python's old-style ``%`` formatting, we don't need to go *too* overboard, but a decent few examples as below may demonstrate its uses. +Since the concept of precision is common in other languages such as C, and was +already present in Python's old-style ``%`` formatting, we don't need to go *too* +overboard, but a decent few examples as below may demonstrate its uses. .. code-block:: python @@ -307,7 +528,10 @@ Since the concept of precision is common in other languages such as C, and was a Modulo-Precision ---------------- -The clear area for encouraging the use of modulo-precision is when dealing with machine-width oriented integers such as those packed and unpacked by :mod:`struct`. We give an example of the consistent predictable two's complement formatting of signed and unsigned integers. 
+The clear area for encouraging the use of modulo-precision is when dealing with +machine-width oriented integers such as those packed and unpacked by :mod:`struct`. +We give an example of the consistent predictable two's complement formatting of +signed and unsigned integers. .. code-block:: python @@ -343,8 +567,8 @@ CC0-1.0-Universal license, whichever is more permissive. TODO AND REMOVE BEFORE MERGE ============================ -* Format all lines to ~80 characters. I've left this formatting until we're happy with the contents. -* Give a good example of non-hardware aware use of modular arithmetic formatting, my brain has gone blank... +* Give a good example of non-hardware aware use of modular arithmetic formatting, + my brain has gone blank... Footnotes From 045df95ef5c88ed625758e5b3691b12ab7bf40cf Mon Sep 17 00:00:00 2001 From: jb2170 Date: Thu, 22 May 2025 12:08:08 +0100 Subject: [PATCH 18/19] Restructure section about two cases of truncation No example needed for non-hardware aware modulo Remove the '[sic]' since we're not quoting Sergey. --- peps/pep-0786.rst | 45 +++++++++++++++++---------------------------- 1 file changed, 17 insertions(+), 28 deletions(-) diff --git a/peps/pep-0786.rst b/peps/pep-0786.rst index e91aeb26057..20feeefe398 100644 --- a/peps/pep-0786.rst +++ b/peps/pep-0786.rst @@ -301,39 +301,35 @@ the desired behavior. Formally we may say that the formatting should be a well defined bijection between the equivalence classes of ``Z/(base ** precision)Z`` and strings with ``precision`` digits. -The remaining question is "[sic] is there no chance to communicate this -truncation to user?" as a concern for the 'loss of information' arising from the -effectively left-truncated strings. 
We reject this question's premise that there -ever is such a case of unintentional loss of information, considering the two -cases of hardware-aware integers and otherwise: - -So far we have played around with examples of bytes in ``range(-128, 256)``, -the union of the signed and unsigned ranges, with respect to which the virtues -of formatting ``x`` and ``x - 256`` as the same are clearly established. In the -hardware-aware contexts that one expects to find ``z``, any integers corresponding -to bytes that lie outside that range are likely a programming error. For example -if a library sets a pixel brightness integer to be 257, and prints out ``'0x01'`` -instead of ``'0x101'`` via ``f"{x:z#.2x}"``, that's not our problem or doing; string -formatting shouldn't raise an exception, or even a ``SyntaxWarning`` as an invalid -escape sequence ``"\y"`` would, because ``ValueError: bytes must be in range(0, 256)`` +The remaining question is "is there no chance to communicate this truncation to +the user?" as a concern for the 'loss of information' arising from the effectively +left-truncated strings. We reject this question's premise that there ever is such +a case of unintentional loss of information, by considering the two cases of +hardware-aware integers and otherwise: + +With respect to hardware-aware integers we have so far played around with examples +of integers in ``range(-128, 256)``, the union of the signed and unsigned ranges +for bytes. The virtues of formatting ``x`` and ``x - 256`` as the same are clearly +established. In these contexts that one expects to find ``z``, any erroneous integers +corresponding to bytes that lie outside that range are likely a programming error. 
+For example if a library sets a pixel brightness integer to be 257, and prints out +``'0x01'`` instead of ``'0x101'`` via ``f"{x:z#.2x}"``, that's not our problem or +doing; string formatting shouldn't raise an exception, or even a ``SyntaxWarning`` +as an invalid escape sequence ``"\y"`` would, because ``ValueError: bytes must be in range(0, 256)`` will be raised by ``bytes`` when trying to serialize that integer via ``bytes([257])``; let the appropriate 'layer' of code raise the exception, as that is more indicative of a defect in the library, not our string formatting. -In the case of non-hardware aware integers one would have to intentionally opt to +In the case of non-hardware aware integers, one would have to intentionally opt to use ``z``, in which modular arithmetic is the chosen desired effect. It is for this reason also that we shall not raise a ``SyntaxWarning`` or ``ValueError`` for integers lying outside of ``range(-base ** precision / 2, base ** precision)``. -.. - XXX Give a good example of non-hardware aware use of modular arithmetic - formatting like Minecraft buried treasure always being at 8,8 within a chunk. - Thus we have defended the lossy behavior of ``z`` implemented as modulo-precision, and we have exhausted all reasonable use cases of lossless behavior. A final compromise to consider and reject is implementing ``z`` not as a flag -*dependent* on ``.``, but as a flag that can be *combined* with ``.``. +*contingent* on ``.``, but as a flag that can be *combined* with ``.``. Specifically: ``z`` without ``.`` would turn on two's complement mode to render the minimal width representation of the formatted integer, ``.`` without ``z`` would implement precision as already explained, a minimum number of digits in the @@ -564,13 +560,6 @@ This document is placed in the public domain or under the CC0-1.0-Universal license, whichever is more permissive. 
-TODO AND REMOVE BEFORE MERGE -============================ - -* Give a good example of non-hardware aware use of modular arithmetic formatting, - my brain has gone blank... - - Footnotes ========= From 5325a70799924029a65af345b3b4dd2b32725d33 Mon Sep 17 00:00:00 2001 From: jb2170 Date: Thu, 22 May 2025 12:10:36 +0100 Subject: [PATCH 19/19] Ready for review