From a56079135741acee48a2461edd4ba8a73debdb93 Mon Sep 17 00:00:00 2001 From: Stan Ulbrych Date: Tue, 15 Apr 2025 20:33:31 +0100 Subject: [PATCH 1/5] Add whitespace term to glossary and ref in stdtypes --- Doc/glossary.rst | 11 +++++++++++ Doc/library/stdtypes.rst | 41 ++++++++++++++++++++-------------------- 2 files changed, 32 insertions(+), 20 deletions(-) diff --git a/Doc/glossary.rst b/Doc/glossary.rst index 0b26e18efd7f1b..e96a669886c09c 100644 --- a/Doc/glossary.rst +++ b/Doc/glossary.rst @@ -1443,6 +1443,17 @@ Glossary A computer defined entirely in software. Python's virtual machine executes the :term:`bytecode` emitted by the bytecode compiler. + whitespace + Characters that represent horizontal or vertical space. + In ASCII context, Python recognizes these characters as whitespace: + `` \t\n\v\f\r`` (space, tab, newline, vertical tab, form feed, carriage return). + + In Unicode context, whitespace characters are those + characters defined in the Unicode character database as "Other" or "Separator" + and those with bidirectional property being one of "WS", "B", or "S". + + This is used, for example, to :func:`split` or :func:`strip` strings. + Zen of Python Listing of Python design principles and philosophies that are helpful in understanding and using the language. The listing can be found by typing diff --git a/Doc/library/stdtypes.rst b/Doc/library/stdtypes.rst index 48d179c270378c..8533df444b934e 100644 --- a/Doc/library/stdtypes.rst +++ b/Doc/library/stdtypes.rst @@ -2092,8 +2092,9 @@ expression support in the :mod:`re` module). Return a copy of the string with leading characters removed. The *chars* argument is a string specifying the set of characters to be removed. If omitted - or ``None``, the *chars* argument defaults to removing whitespace. The *chars* - argument is not a prefix; rather, all combinations of its values are stripped:: + or ``None``, the *chars* argument defaults to removing :term:`whitespace`. 
+ The *chars* argument is not a prefix; rather, all combinations of its values + are stripped:: >>> ' spacious '.lstrip() 'spacious ' @@ -2211,8 +2212,9 @@ expression support in the :mod:`re` module). Return a copy of the string with trailing characters removed. The *chars* argument is a string specifying the set of characters to be removed. If omitted - or ``None``, the *chars* argument defaults to removing whitespace. The *chars* - argument is not a suffix; rather, all combinations of its values are stripped:: + or ``None``, the *chars* argument defaults to removing :term:`whitespace`. + The *chars* argument is not a suffix; rather, all combinations of its values + are stripped:: >>> ' spacious '.rstrip() ' spacious' @@ -2348,7 +2350,7 @@ expression support in the :mod:`re` module). Return a copy of the string with the leading and trailing characters removed. The *chars* argument is a string specifying the set of characters to be removed. - If omitted or ``None``, the *chars* argument defaults to removing whitespace. + If omitted or ``None``, the *chars* argument defaults to removing :term:`whitespace`. The *chars* argument is not a prefix or suffix; rather, all combinations of its values are stripped:: @@ -2735,7 +2737,7 @@ data and are closely related to string objects in a variety of other ways. This :class:`bytes` class method returns a bytes object, decoding the given string object. The string must contain two hexadecimal digits per - byte, with ASCII whitespace being ignored. + byte, with :term:`ASCII whitespace ` being ignored. >>> bytes.fromhex('2Ef0 F1f2 ') b'.\xf0\xf1\xf2' @@ -2824,7 +2826,7 @@ objects. This :class:`bytearray` class method returns bytearray object, decoding the given string object. The string must contain two hexadecimal digits - per byte, with ASCII whitespace being ignored. + per byte, with :term:`ASCII whitespace ` being ignored. 
>>> bytearray.fromhex('2Ef0 F1f2 ') bytearray(b'.\xf0\xf1\xf2') @@ -3243,8 +3245,8 @@ produce new objects. *chars* argument is a binary sequence specifying the set of byte values to be removed - the name refers to the fact this method is usually used with ASCII characters. If omitted or ``None``, the *chars* argument defaults - to removing ASCII whitespace. The *chars* argument is not a prefix; - rather, all combinations of its values are stripped:: + to removing :term:`ASCII whitespace `. The *chars* argument is + not a prefix; rather, all combinations of its values are stripped:: >>> b' spacious '.lstrip() b'spacious ' @@ -3287,8 +3289,8 @@ produce new objects. Split the binary sequence into subsequences of the same type, using *sep* as the delimiter string. If *maxsplit* is given, at most *maxsplit* splits are done, the *rightmost* ones. If *sep* is not specified or ``None``, - any subsequence consisting solely of ASCII whitespace is a separator. - Except for splitting from the right, :meth:`rsplit` behaves like + any subsequence consisting solely of :term:`ASCII whitespace ` + is a separator. Except for splitting from the right, :meth:`rsplit` behaves like :meth:`split` which is described in detail below. @@ -3299,8 +3301,8 @@ produce new objects. *chars* argument is a binary sequence specifying the set of byte values to be removed - the name refers to the fact this method is usually used with ASCII characters. If omitted or ``None``, the *chars* argument defaults to - removing ASCII whitespace. The *chars* argument is not a suffix; rather, - all combinations of its values are stripped:: + removing :term:`ASCII whitespace `. The *chars* argument is not + a suffix; rather, all combinations of its values are stripped:: >>> b' spacious '.rstrip() b' spacious' @@ -3352,7 +3354,8 @@ produce new objects. 
[b'1', b'2', b'3<4'] If *sep* is not specified or is ``None``, a different splitting algorithm - is applied: runs of consecutive ASCII whitespace are regarded as a single + is applied: runs of consecutive :term:`ASCII whitespace ` are + regarded as a single separator, and the result will contain no empty strings at the start or end if the sequence has leading or trailing whitespace. Consequently, splitting an empty sequence or a sequence consisting solely of ASCII @@ -3376,8 +3379,8 @@ produce new objects. removed. The *chars* argument is a binary sequence specifying the set of byte values to be removed - the name refers to the fact this method is usually used with ASCII characters. If omitted or ``None``, the *chars* - argument defaults to removing ASCII whitespace. The *chars* argument is - not a prefix or suffix; rather, all combinations of its values are + argument defaults to removing :term:`ASCII whitespace `. The *chars* + argument is not a prefix or suffix; rather, all combinations of its values are stripped:: >>> b' spacious '.strip() @@ -3519,10 +3522,8 @@ place, and instead produce new objects. .. method:: bytes.isspace() bytearray.isspace() - Return ``True`` if all bytes in the sequence are ASCII whitespace and the - sequence is not empty, ``False`` otherwise. ASCII whitespace characters are - those byte values in the sequence ``b' \t\n\r\x0b\f'`` (space, tab, newline, - carriage return, vertical tab, form feed). + Return ``True`` if all bytes in the sequence are :term:`ASCII whitespace ` + and the sequence is not empty, ``False`` otherwise. .. 
method:: bytes.istitle() From 9e9095694700e5109375170b0789374184b13015 Mon Sep 17 00:00:00 2001 From: Stan Ulbrych Date: Tue, 15 Apr 2025 20:43:35 +0100 Subject: [PATCH 2/5] Lint/ref fix --- Doc/glossary.rst | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/Doc/glossary.rst b/Doc/glossary.rst index e96a669886c09c..cfafc0b89f66b7 100644 --- a/Doc/glossary.rst +++ b/Doc/glossary.rst @@ -1446,13 +1446,14 @@ Glossary whitespace Characters that represent horizontal or vertical space. In ASCII context, Python recognizes these characters as whitespace: - `` \t\n\v\f\r`` (space, tab, newline, vertical tab, form feed, carriage return). + ``' \t\n\v\f\r'`` (space, tab, newline, vertical tab, form feed, carriage return). In Unicode context, whitespace characters are those characters defined in the Unicode character database as "Other" or "Separator" and those with bidirectional property being one of "WS", "B", or "S". - This is used, for example, to :func:`split` or :func:`strip` strings. + This is used, for example, to :meth:`split ` or + :meth:`strip ` strings. Zen of Python Listing of Python design principles and philosophies that are helpful in From 448fd5f4973de84d1156cbc1ddf9f05dbfacfd24 Mon Sep 17 00:00:00 2001 From: Stan Ulbrych <89152624+StanFromIreland@users.noreply.github.com> Date: Wed, 16 Apr 2025 09:04:37 +0100 Subject: [PATCH 3/5] Apply suggestions from code review Co-authored-by: Peter Bierma --- Doc/glossary.rst | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/Doc/glossary.rst b/Doc/glossary.rst index cfafc0b89f66b7..d38586ab07ed7c 100644 --- a/Doc/glossary.rst +++ b/Doc/glossary.rst @@ -1445,15 +1445,15 @@ Glossary whitespace Characters that represent horizontal or vertical space. - In ASCII context, Python recognizes these characters as whitespace: + In an ASCII context, Python recognizes these characters as whitespace: ``' \t\n\v\f\r'`` (space, tab, newline, vertical tab, form feed, carriage return). 
- In Unicode context, whitespace characters are those + In a Unicode context, whitespace characters are the characters defined in the Unicode character database as "Other" or "Separator" and those with bidirectional property being one of "WS", "B", or "S". - This is used, for example, to :meth:`split ` or - :meth:`strip ` strings. + This is used, for example, to :meth:`~str.split` or + :meth:`~str.strip` strings. Zen of Python Listing of Python design principles and philosophies that are helpful in From 240cbae5d5554acc2ec5f057b6bed877ce3a5cca Mon Sep 17 00:00:00 2001 From: Stan Ulbrych Date: Wed, 16 Apr 2025 09:21:15 +0100 Subject: [PATCH 4/5] Peters suggestions --- Doc/glossary.rst | 20 +++++++++++++++++--- Doc/library/stdtypes.rst | 5 +++-- 2 files changed, 20 insertions(+), 5 deletions(-) diff --git a/Doc/glossary.rst b/Doc/glossary.rst index d38586ab07ed7c..63b127abf5ac08 100644 --- a/Doc/glossary.rst +++ b/Doc/glossary.rst @@ -1446,11 +1446,25 @@ Glossary whitespace Characters that represent horizontal or vertical space. In an ASCII context, Python recognizes these characters as whitespace: - ``' \t\n\v\f\r'`` (space, tab, newline, vertical tab, form feed, carriage return). + + +-----------+-----------------+ + | ``' '`` | space | + +-----------+-----------------+ + | ``'\t'`` | tab | + +-----------+-----------------+ + | ``'\n'`` | newline | + +-----------+-----------------+ + | ``'\v'`` | vertical tab | + +-----------+-----------------+ + | ``'\f'`` | form feed | + +-----------+-----------------+ + | ``'\r'`` | carriage return | + +-----------+-----------------+ In a Unicode context, whitespace characters are the - characters defined in the Unicode character database as "Other" or "Separator" - and those with bidirectional property being one of "WS", "B", or "S". + characters defined in the `Unicode Character Database + `_ as "Other" or "Separator" + and those with bidirectional property being one of "WS," "B," or "S." 
This is used, for example, to :meth:`~str.split` or :meth:`~str.strip` strings. diff --git a/Doc/library/stdtypes.rst b/Doc/library/stdtypes.rst index 8533df444b934e..600a307d69773f 100644 --- a/Doc/library/stdtypes.rst +++ b/Doc/library/stdtypes.rst @@ -3522,8 +3522,9 @@ place, and instead produce new objects. .. method:: bytes.isspace() bytearray.isspace() - Return ``True`` if all bytes in the sequence are :term:`ASCII whitespace ` - and the sequence is not empty, ``False`` otherwise. + Return ``True`` if all bytes in the sequence are + :term:`ASCII whitespace ` and the sequence is not empty, + ``False`` otherwise. .. method:: bytes.istitle() From dd9ddd004a4b5e088ef1ad3f0738c0d97b23993f Mon Sep 17 00:00:00 2001 From: Stan Ulbrych Date: Wed, 16 Apr 2025 13:10:23 +0100 Subject: [PATCH 5/5] Wrap some more lines... --- Doc/glossary.rst | 4 ++-- Doc/library/stdtypes.rst | 16 ++++++++-------- 2 files changed, 10 insertions(+), 10 deletions(-) diff --git a/Doc/glossary.rst b/Doc/glossary.rst index 63b127abf5ac08..cb407724029b22 100644 --- a/Doc/glossary.rst +++ b/Doc/glossary.rst @@ -1466,8 +1466,8 @@ Glossary `_ as "Other" or "Separator" and those with bidirectional property being one of "WS," "B," or "S." - This is used, for example, to :meth:`~str.split` or - :meth:`~str.strip` strings. + For example, this is used to :meth:`~str.split` or :meth:`~str.strip` + strings. Zen of Python Listing of Python design principles and philosophies that are helpful in diff --git a/Doc/library/stdtypes.rst b/Doc/library/stdtypes.rst index 600a307d69773f..c5a57701715e25 100644 --- a/Doc/library/stdtypes.rst +++ b/Doc/library/stdtypes.rst @@ -2350,9 +2350,9 @@ expression support in the :mod:`re` module). Return a copy of the string with the leading and trailing characters removed. The *chars* argument is a string specifying the set of characters to be removed. - If omitted or ``None``, the *chars* argument defaults to removing :term:`whitespace`. 
- The *chars* argument is not a prefix or suffix; rather, all combinations of its - values are stripped:: + If omitted or ``None``, the *chars* argument defaults to removing + :term:`whitespace`. The *chars* argument is not a prefix or suffix; rather, + all combinations of its values are stripped:: >>> ' spacious '.strip() 'spacious' @@ -3290,8 +3290,8 @@ produce new objects. as the delimiter string. If *maxsplit* is given, at most *maxsplit* splits are done, the *rightmost* ones. If *sep* is not specified or ``None``, any subsequence consisting solely of :term:`ASCII whitespace ` - is a separator. Except for splitting from the right, :meth:`rsplit` behaves like - :meth:`split` which is described in detail below. + is a separator. Except for splitting from the right, :meth:`rsplit` behaves + like :meth:`split` which is described in detail below. .. method:: bytes.rstrip([chars]) @@ -3379,9 +3379,9 @@ produce new objects. removed. The *chars* argument is a binary sequence specifying the set of byte values to be removed - the name refers to the fact this method is usually used with ASCII characters. If omitted or ``None``, the *chars* - argument defaults to removing :term:`ASCII whitespace `. The *chars* - argument is not a prefix or suffix; rather, all combinations of its values are - stripped:: + argument defaults to removing :term:`ASCII whitespace `. + The *chars* argument is not a prefix or suffix; rather, all combinations of + its values are stripped:: >>> b' spacious '.strip() b'spacious'
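The distinction this series documents, Unicode whitespace for `str` methods versus ASCII whitespace for `bytes` and `bytearray` methods, can be checked with a short sketch. It is not part of the patch; the variable names are hypothetical, and only built-in methods already named in the diff are used.

```python
# Illustrative sketch only; variable names are hypothetical, not from the patch.

# str methods use Unicode whitespace: EM SPACE (U+2003) and
# NO-BREAK SPACE (U+00A0) are both stripped by default.
text = '\u2003 spacious \u00a0'
stripped_text = text.strip()

# bytes methods use ASCII whitespace only: byte 0xA0 is kept,
# while the trailing ASCII space is removed.
data = b'\xa0 spacious '
stripped_data = data.strip()

# *chars* is a set of characters, not a prefix or suffix.
stripped_chars = 'www.example.com'.strip('cmowz.')

# With no *sep*, str.split() treats U+2003 as a separator, but the
# UTF-8 bytes of U+2003 (E2 80 83) are not ASCII whitespace.
str_fields = '1\u20032'.split()
bytes_fields = '1\u20032'.encode('utf-8').split()

# bytes.isspace() is true only for non-empty, all-ASCII-whitespace input.
all_ascii_ws = b' \t\n\r\x0b\f'.isspace()
empty_is_space = b''.isspace()

# bytes.fromhex() ignores ASCII whitespace between byte pairs.
decoded = bytes.fromhex('2Ef0 F1f2  ')

print(repr(stripped_text), stripped_data, stripped_chars)
print(str_fields, bytes_fields, all_ascii_ws, empty_is_space, decoded)
```

Encoding the same text to bytes changes which characters count as whitespace, which is why the patch links the `bytes` methods to the ASCII part of the glossary entry.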