python · gpshead · Feb 5, 2022 · Jan 30, 2023 · Jan 30, 2023 · Jan 30, 2023
diff --git a/Doc/library/re.rst b/Doc/library/re.rst
@@ -831,7 +831,7 @@ Flags
    value::
 
       def myfunc(text, flag=re.NOFLAG):
-          return re.match(text, flag)
+          return re.search(text, flag)
 
    .. versionadded:: 3.11
 
@@ -887,8 +887,8 @@ Functions
 
    Compile a regular expression pattern into a :ref:`regular expression object
    <re-objects>`, which can be used for matching using its
-   :func:`~Pattern.match`, :func:`~Pattern.search` and other methods, described
-   below.
+   :func:`~Pattern.prefixmatch` (:func:`~Pattern.match`),
+   :func:`~Pattern.search`, and other methods, described below.
 
    The expression's behaviour can be modified by specifying a *flags* value.
    Values can be any of the `flags`_ variables, combined using bitwise OR
@@ -897,11 +897,11 @@ Functions
    The sequence ::
 
       prog = re.compile(pattern)
-      result = prog.match(string)
+      result = prog.search(string)
 
    is equivalent to ::
 
-      result = re.match(pattern, string)
+      result = re.search(pattern, string)
 
    but using :func:`re.compile` and saving the resulting regular expression
    object for reuse is more efficient when the expression will be used several
@@ -928,14 +928,15 @@ Functions
 
 
 .. function:: match(pattern, string, flags=0)
+.. function:: prefixmatch(pattern, string, flags=0)
 
    If zero or more characters at the beginning of *string* match the regular
    expression *pattern*, return a corresponding :class:`~re.Match`.  Return
    ``None`` if the string does not match the pattern; note that this is
    different from a zero-length match.
 
-   Note that even in :const:`MULTILINE` mode, :func:`re.match` will only match
-   at the beginning of the string and not at the beginning of each line.
+   Note that even in :const:`MULTILINE` mode, this will only match at the
+   beginning of the string and not at the beginning of each line.
 
    If you want to locate a match anywhere in *string*, use :func:`search`
    instead (see also :ref:`search-vs-match`).
@@ -944,6 +945,18 @@ Functions
    Values can be any of the `flags`_ variables, combined using bitwise OR
    (the ``|`` operator).
 
+   This function now has two names and has long been known as
+   :func:`~re.match`.  Use that name when you need to retain compatibility with
+   older Python versions.
+
+   .. versionchanged:: next
+      The alternate name :func:`~re.prefixmatch` was added for this API as a
+      more descriptive, explicit name for the behavior of :func:`~re.match`.
+      Use it to express intent more clearly. The norm in other languages and
+      regular expression implementations is to use the term *match* to refer to
+      the behavior of what Python has always called :func:`~re.search`.  See
+      :ref:`prefixmatch-vs-match`.
+
 
 .. function:: fullmatch(pattern, string, flags=0)
 
@@ -1264,23 +1277,42 @@ Regular Expression Objects
 
 
 .. method:: Pattern.match(string[, pos[, endpos]])
+.. method:: Pattern.prefixmatch(string[, pos[, endpos]])
 
    If zero or more characters at the *beginning* of *string* match this regular
    expression, return a corresponding :class:`~re.Match`. Return ``None`` if the
    string does not match the pattern; note that this is different from a
    zero-length match.
 
+   Note that even in :const:`MULTILINE` mode, this will only match at the
+   beginning of the string and not at the beginning of each line.
+
    The optional *pos* and *endpos* parameters have the same meaning as for the
    :meth:`~Pattern.search` method. ::
 
       >>> pattern = re.compile("o")
-      >>> pattern.match("dog")      # No match as "o" is not at the start of "dog".
-      >>> pattern.match("dog", 1)   # Match as "o" is the 2nd character of "dog".
+      >>> pattern.prefixmatch("dog")     # No match as "o" is not at the start of "dog".
+      >>> pattern.prefixmatch("dog", 1)  # Match as "o" is the 2nd character of "dog".
+      <re.Match object; span=(1, 2), match='o'>
+      >>> pattern.match("dog")           # Same as above.
+      >>> pattern.match("dog", 1)        # Same as above.
       <re.Match object; span=(1, 2), match='o'>
 
    If you want to locate a match anywhere in *string*, use
    :meth:`~Pattern.search` instead (see also :ref:`search-vs-match`).
 
+   This method now has two names and has long been known as
+   :meth:`~Pattern.match`.  Use that name when you need to retain compatibility
+   with older Python versions.
+
+   .. versionchanged:: next
+      The alternate name :meth:`~Pattern.prefixmatch` was added for this API
+      as a more descriptive, explicit name for the behavior of
+      :meth:`~Pattern.match`. Use it to express intent more clearly. The norm
+      in other languages and regular expression implementations is to use the
+      term *match* to refer to the behavior of what Python has always called
+      :meth:`~Pattern.search`.  See :ref:`prefixmatch-vs-match`.
+
 
 .. method:: Pattern.fullmatch(string[, pos[, endpos]])
 
@@ -1368,8 +1400,7 @@ Since :meth:`~Pattern.match` and :meth:`~Pattern.search` return ``None``
 when there is no match, you can test whether there was a match with a simple
 ``if`` statement::
 
-   match = re.search(pattern, string)
-   if match:
+   if match := re.search(pattern, string):
        process(match)
 
 .. class:: Match
@@ -1407,7 +1438,7 @@ when there is no match, you can test whether there was a match with a simple
    If a group is contained in a part of the pattern that matched multiple times,
    the last match is returned. ::
 
-      >>> m = re.match(r"(\w+) (\w+)", "Isaac Newton, physicist")
+      >>> m = re.search(r"(\w+) (\w+)", "Isaac Newton, physicist")
       >>> m.group(0)       # The entire match
       'Isaac Newton'
       >>> m.group(1)       # The first parenthesized subgroup.
@@ -1424,7 +1455,7 @@ when there is no match, you can test whether there was a match with a simple
 
    A moderately complicated example::
 
-      >>> m = re.match(r"(?P<first_name>\w+) (?P<last_name>\w+)", "Malcolm Reynolds")
+      >>> m = re.search(r"(?P<first_name>\w+) (?P<last_name>\w+)", "Malcolm Reynolds")
       >>> m.group('first_name')
       'Malcolm'
       >>> m.group('last_name')
@@ -1439,8 +1470,8 @@ when there is no match, you can test whether there was a match with a simple
 
    If a group matches multiple times, only the last match is accessible::
 
-      >>> m = re.match(r"(..)+", "a1b2c3")  # Matches 3 times.
-      >>> m.group(1)                        # Returns only the last match.
+      >>> m = re.search(r"(..)+", "a1b2c3")  # Matches 3 times.
+      >>> m.group(1)                         # Returns only the last match.
       'c3'
 
 
@@ -1449,7 +1480,7 @@ when there is no match, you can test whether there was a match with a simple
    This is identical to ``m.group(g)``.  This allows easier access to
    an individual group from a match::
 
-      >>> m = re.match(r"(\w+) (\w+)", "Isaac Newton, physicist")
+      >>> m = re.search(r"(\w+) (\w+)", "Isaac Newton, physicist")
       >>> m[0]       # The entire match
       'Isaac Newton'
       >>> m[1]       # The first parenthesized subgroup.
@@ -1459,7 +1490,7 @@ when there is no match, you can test whether there was a match with a simple
 
    Named groups are supported as well::
 
-      >>> m = re.match(r"(?P<first_name>\w+) (?P<last_name>\w+)", "Isaac Newton")
+      >>> m = re.search(r"(?P<first_name>\w+) (?P<last_name>\w+)", "Isaac Newton")
       >>> m['first_name']
       'Isaac'
       >>> m['last_name']
@@ -1476,15 +1507,15 @@ when there is no match, you can test whether there was a match with a simple
 
    For example::
 
-      >>> m = re.match(r"(\d+)\.(\d+)", "24.1632")
+      >>> m = re.search(r"(\d+)\.(\d+)", "24.1632")
       >>> m.groups()
       ('24', '1632')
 
    If we make the decimal place and everything after it optional, not all groups
    might participate in the match.  These groups will default to ``None`` unless
    the *default* argument is given::
 
-      >>> m = re.match(r"(\d+)\.?(\d+)?", "24")
+      >>> m = re.search(r"(\d+)\.?(\d+)?", "24")
       >>> m.groups()      # Second group defaults to None.
       ('24', None)
       >>> m.groups('0')   # Now, the second group defaults to '0'.
@@ -1497,7 +1528,7 @@ when there is no match, you can test whether there was a match with a simple
    the subgroup name.  The *default* argument is used for groups that did not
    participate in the match; it defaults to ``None``.  For example::
 
-      >>> m = re.match(r"(?P<first_name>\w+) (?P<last_name>\w+)", "Malcolm Reynolds")
+      >>> m = re.search(r"(?P<first_name>\w+) (?P<last_name>\w+)", "Malcolm Reynolds")
       >>> m.groupdict()
       {'first_name': 'Malcolm', 'last_name': 'Reynolds'}
 
@@ -1603,38 +1634,38 @@ representing the card with that value.
 To see if a given string is a valid hand, one could do the following::
 
    >>> valid = re.compile(r"^[a2-9tjqk]{5}$")
-   >>> displaymatch(valid.match("akt5q"))  # Valid.
+   >>> displaymatch(valid.search("akt5q"))  # Valid.
    "<Match: 'akt5q', groups=()>"
-   >>> displaymatch(valid.match("akt5e"))  # Invalid.
-   >>> displaymatch(valid.match("akt"))    # Invalid.
-   >>> displaymatch(valid.match("727ak"))  # Valid.
+   >>> displaymatch(valid.search("akt5e"))  # Invalid.
+   >>> displaymatch(valid.search("akt"))    # Invalid.
+   >>> displaymatch(valid.search("727ak"))  # Valid.
    "<Match: '727ak', groups=()>"
 
 That last hand, ``"727ak"``, contained a pair, or two of the same valued cards.
 To match this with a regular expression, one could use backreferences as such::
 
-   >>> pair = re.compile(r".*(.).*\1")
-   >>> displaymatch(pair.match("717ak"))     # Pair of 7s.
+   >>> pair = re.compile(r"^.*(.).*\1")
+   >>> displaymatch(pair.search("717ak"))     # Pair of 7s.
    "<Match: '717', groups=('7',)>"
-   >>> displaymatch(pair.match("718ak"))     # No pairs.
-   >>> displaymatch(pair.match("354aa"))     # Pair of aces.
+   >>> displaymatch(pair.search("718ak"))     # No pairs.
+   >>> displaymatch(pair.search("354aa"))     # Pair of aces.
    "<Match: '354aa', groups=('a',)>"
 
 To find out what card the pair consists of, one could use the
 :meth:`~Match.group` method of the match object in the following manner::
 
-   >>> pair = re.compile(r".*(.).*\1")
-   >>> pair.match("717ak").group(1)
+   >>> pair = re.compile(r"^.*(.).*\1")
+   >>> pair.search("717ak").group(1)
    '7'
 
-   # Error because re.match() returns None, which doesn't have a group() method:
-   >>> pair.match("718ak").group(1)
+   # Error because re.search() returns None, which doesn't have a group() method:
+   >>> pair.search("718ak").group(1)
    Traceback (most recent call last):
      File "<pyshell#23>", line 1, in <module>
-       re.match(r".*(.).*\1", "718ak").group(1)
+       re.search(r"^.*(.).*\1", "718ak").group(1)
    AttributeError: 'NoneType' object has no attribute 'group'
 
-   >>> pair.match("354aa").group(1)
+   >>> pair.search("354aa").group(1)
    'a'
 
 
@@ -1693,16 +1724,17 @@ search() vs. match()
 
 Python offers different primitive operations based on regular expressions:
 
-+ :func:`re.match` checks for a match only at the beginning of the string
++ :func:`re.prefixmatch`, also known under the less explicit name
+  :func:`re.match`, checks for a match only at the beginning of the string
 + :func:`re.search` checks for a match anywhere in the string
   (this is what Perl does by default)
 + :func:`re.fullmatch` checks for entire string to be a match
 
-
 For example::
 
-   >>> re.match("c", "abcdef")    # No match
-   >>> re.search("c", "abcdef")   # Match
+   >>> re.match("c", "abcdef")        # No match
+   >>> re.prefixmatch("c", "abcdef")  # No match
+   >>> re.search("c", "abcdef")       # Match
    <re.Match object; span=(2, 3), match='c'>
    >>> re.fullmatch("p.*n", "python") # Match
    <re.Match object; span=(0, 6), match='python'>
@@ -1711,19 +1743,47 @@ For example::
 Regular expressions beginning with ``'^'`` can be used with :func:`search` to
 restrict the match at the beginning of the string::
 
-   >>> re.match("c", "abcdef")    # No match
-   >>> re.search("^c", "abcdef")  # No match
-   >>> re.search("^a", "abcdef")  # Match
+   >>> re.match("c", "abcdef")        # No match
+   >>> re.prefixmatch("c", "abcdef")  # No match
+   >>> re.search("^c", "abcdef")      # No match
+   >>> re.search("^a", "abcdef")      # Match
    <re.Match object; span=(0, 1), match='a'>
 
 Note however that in :const:`MULTILINE` mode :func:`match` only matches at the
 beginning of the string, whereas using :func:`search` with a regular expression
 beginning with ``'^'`` will match at the beginning of each line. ::
 
+   >>> re.prefixmatch("X", "A\nB\nX", re.MULTILINE)  # No match
    >>> re.match("X", "A\nB\nX", re.MULTILINE)  # No match
    >>> re.search("^X", "A\nB\nX", re.MULTILINE)  # Match
    <re.Match object; span=(4, 5), match='X'>
 
+.. _prefixmatch-vs-match:
+
+prefixmatch() vs. match()
+^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Why is the :func:`~re.match` function and method name being discouraged in
+favor of the longer :func:`~re.prefixmatch` spelling in very recent Python?
+
+Many other languages have gained regex support libraries since regular
+expressions were added to Python. However, the most popular of those use
+the term *match* in their APIs to mean the unanchored behavior provided in
+Python by :func:`~re.search`. Thus the plain term *match* can be unclear to
+readers and writers of code who are used to other languages and are not
+familiar with the Python API's divergence from what has otherwise become the
+industry norm.
+
+Quoting from the Zen of Python (``python3 -m this``): *"Explicit is better than
+implicit"*. Anyone reading the name :func:`~re.prefixmatch` is likely to
+understand the intended semantics. When reading :func:`~re.match` there remains
+a seed of doubt about the intended behavior for anyone not already familiar with
+this old Python gotcha.
+
+We **do not** plan to deprecate and remove the older *match* name in this
+decade, if ever, as it has been used in code for over 25 years.
+
+.. versionadded:: next
 
 Making a Phonebook
 ^^^^^^^^^^^^^^^^^^
@@ -1843,19 +1903,19 @@ every backslash (``'\'``) in a regular expression would have to be prefixed with
 another one to escape it.  For example, the two following lines of code are
 functionally identical::
 
-   >>> re.match(r"\W(.)\1\W", " ff ")
+   >>> re.search(r"\W(.)\1\W", " ff ")
    <re.Match object; span=(0, 4), match=' ff '>
-   >>> re.match("\\W(.)\\1\\W", " ff ")
+   >>> re.search("\\W(.)\\1\\W", " ff ")
    <re.Match object; span=(0, 4), match=' ff '>
 
 When one wants to match a literal backslash, it must be escaped in the regular
 expression.  With raw string notation, this means ``r"\\"``.  Without raw string
 notation, one must use ``"\\\\"``, making the following lines of code
 functionally identical::
 
-   >>> re.match(r"\\", r"\\")
+   >>> re.search(r"\\", r"\\")
    <re.Match object; span=(0, 1), match='\\'>
-   >>> re.match("\\\\", r"\\")
+   >>> re.search("\\\\", r"\\")
    <re.Match object; span=(0, 1), match='\\'>
 
 

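A minimal sketch of how calling code might adopt the new name while still running on interpreters that predate it. The local `prefixmatch` alias and the `getattr` fallback are assumptions made for illustration and are not part of this patch; on older versions the alias simply resolves to `re.match`, which the documentation above describes as the same function under its long-standing name.

```python
import re

# Hypothetical compatibility alias (not part of the patch): use
# re.prefixmatch where it exists, otherwise fall back to re.match.
prefixmatch = getattr(re, "prefixmatch", re.match)

text = "A\nB\nX"

# Anchored at the start of the string only, even with re.MULTILINE.
at_start = prefixmatch("X", text, re.MULTILINE)

# search() with '^' in MULTILINE mode may match at the start of any line.
any_line = re.search("^X", text, re.MULTILINE)

# fullmatch() requires the whole string to match the pattern.
whole = re.fullmatch("p.*n", "python")

print(at_start)          # None
print(any_line.span())   # (4, 5)
print(whole.group(0))    # python
```

Resolving the alias once at import time keeps call sites readable and avoids repeating the version check wherever a prefix match is needed.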
diff --git a/Doc/whatsnew/3.14.rst b/Doc/whatsnew/3.14.rst
@@ -1056,6 +1056,19 @@ pydoc
   (Contributed by Jelle Zijlstra in :gh:`101552`.)
 
 
+re
+--
+
+* :func:`re.prefixmatch` and a corresponding :meth:`~re.Pattern.prefixmatch`
+  have been added as alternate, more explicit names for the existing
+  :func:`re.match` and :meth:`~re.Pattern.match` APIs. These are intended
+  to alleviate confusion around what *match* means by following the
+  Zen of Python's *"Explicit is better than implicit"* mantra. Most regular
+  expression libraries in other languages use an API named *match* to mean
+  what Python has always called *search*.
+  (Contributed by Gregory P. Smith in :gh:`86519`.)
+
+
 ssl
 ---
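To make the relationship between the two behaviors concrete, here is a rough sketch that emulates prefix matching on top of `re.search`. The helper name `search_at_start` is hypothetical and is not something the patch adds. It assumes the pattern is a `str` that does not begin with global inline flags such as `(?i)`, since wrapping such a pattern in a group would move the flags away from the start of the expression.

```python
import re

def search_at_start(pattern, string, flags=0):
    """Hypothetical helper: emulate prefix matching with re.search()."""
    # \A anchors at the very start of the string regardless of MULTILINE;
    # the non-capturing group keeps alternations inside the anchor and
    # leaves group numbering unchanged.
    return re.search(r"\A(?:%s)" % pattern, string, flags)

samples = [("c", "abcdef"), ("a|b", "banana"), ("X", "A\nX")]
agree = []
for pattern, string in samples:
    via_match = re.match(pattern, string, re.MULTILINE)
    via_search = search_at_start(pattern, string, re.MULTILINE)
    span_m = via_match.span() if via_match else None
    span_s = via_search.span() if via_search else None
    agree.append(span_m == span_s)

print(agree)  # [True, True, True]
```

Using `\A` rather than `^` matters here because `^` also matches after each newline when `re.MULTILINE` is set, which is exactly the difference the documentation calls out.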