Skip to content

gh-125916: Allow functools.reduce 'initial' to be a keyword argument #125917

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 21 commits into from
Nov 12, 2024

Conversation

sayandipdutta
Copy link
Contributor

@sayandipdutta sayandipdutta commented Oct 24, 2024

Before:

from functools import reduce
from operator import sub
>>> reduce(sub, [1, 1, 2, 3, 5, 8], 21)
1
>>> reduce(sub, [1, 1, 2, 3, 5, 8], initial=21)
TypeError: reduce() takes no keyword arguments

After:

from functools import reduce
from operator import sub
>>> reduce(sub, [1, 1, 2, 3, 5, 8], 21)
1
>>> reduce(sub, [1, 1, 2, 3, 5, 8], initial=21)
1

Issue: gh-125916


📚 Documentation preview 📚: https://cpython-previews--125917.org.readthedocs.build/

@ghost
Copy link

ghost commented Oct 24, 2024

All commit authors signed the Contributor License Agreement.
CLA signed

Copy link
Member

@skirpichev skirpichev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please also benchmark your implementation.

Kwargs handling will affect performance even if keyword will not be actually used (e.g. calls like reduce(f, seq, init)). IIUC, PyArg_ParseTupleAndKeywords is much slower in general than PyArg_UnpackTuple.

@skirpichev
Copy link
Member

CC @sobolevn, as you have added AC comment.

@skirpichev

This comment was marked as outdated.

Co-authored-by: Sergey B Kirpichev <skirpichev@gmail.com>
@skirpichev
Copy link
Member

skirpichev commented Oct 24, 2024

Now with AC (patch2).

Patch with AC (a draft).
diff --git a/Modules/_functoolsmodule.c b/Modules/_functoolsmodule.c
index 802b1cf792..8faa8ad1ac 100644
--- a/Modules/_functoolsmodule.c
+++ b/Modules/_functoolsmodule.c
@@ -932,15 +932,30 @@ _functools_cmp_to_key_impl(PyObject *module, PyObject *mycmp)
 
 /* reduce (used to be a builtin) ********************************************/
 
-// Not converted to argument clinic, because of `args` in-place modification.
-// AC will affect performance.
+/*[clinic input]
+_functools.reduce
+
+    function as func: object
+    iterable as seq: object
+    /
+    initial as result: object(c_default="NULL") = None
+
+Apply a function of two arguments cumulatively.
+
+Apply it to the items of a sequence or iterable, from left to right, so as to
+reduce the iterable to a single value.  For example, reduce(lambda x, y: x+y,
+[1, 2, 3, 4, 5]) calculates ((((1+2)+3)+4)+5).  If initial is present, it is
+placed before the items of the iterable in the calculation, and serves as a
+default when the iterable is empty.
+[clinic start generated code]*/
+
 static PyObject *
-functools_reduce(PyObject *self, PyObject *args)
+_functools_reduce_impl(PyObject *module, PyObject *func, PyObject *seq,
+                       PyObject *result)
+/*[clinic end generated code: output=30d898fe1267c79d input=b7082b8b1473fdc2]*/
 {
-    PyObject *seq, *func, *result = NULL, *it;
+    PyObject *args, *it;
 
-    if (!PyArg_UnpackTuple(args, "reduce", 2, 3, &func, &seq, &result))
-        return NULL;
     if (result != NULL)
         Py_INCREF(result);
 
@@ -1006,16 +1021,6 @@ functools_reduce(PyObject *self, PyObject *args)
     return NULL;
 }
 
-PyDoc_STRVAR(functools_reduce_doc,
-"reduce(function, iterable[, initial], /) -> value\n\
-\n\
-Apply a function of two arguments cumulatively to the items of a sequence\n\
-or iterable, from left to right, so as to reduce the iterable to a single\n\
-value.  For example, reduce(lambda x, y: x+y, [1, 2, 3, 4, 5]) calculates\n\
-((((1+2)+3)+4)+5).  If initial is present, it is placed before the items\n\
-of the iterable in the calculation, and serves as a default when the\n\
-iterable is empty.");
-
 /* lru_cache object **********************************************************/
 
 /* There are four principal algorithmic differences from the pure python version:
@@ -1720,7 +1725,7 @@ PyDoc_STRVAR(_functools_doc,
 "Tools that operate on functions.");
 
 static PyMethodDef _functools_methods[] = {
-    {"reduce",          functools_reduce,     METH_VARARGS, functools_reduce_doc},
+    _FUNCTOOLS_REDUCE_METHODDEF
     _FUNCTOOLS_CMP_TO_KEY_METHODDEF
     {NULL,              NULL}           /* sentinel */
 };

You should run ./python Tools/clinic/clinic.py Modules/_functoolsmodule.c to update autogenerated code.

I did some benchmarks.

# a.py
import pyperf
from functools import reduce

f = lambda x, y: x + y
lst = list(range(10))
init = 123

runner = pyperf.Runner()
runner.bench_func('reduce(f, lst)', reduce, f, lst)
runner.bench_func('reduce(f, lst, init)', reduce, f, lst, init)

Run e.g. with: python a.py -q -o ref.json.

with results:

Benchmark ref patch patch2
reduce(f, lst) 2.18 us 2.42 us: 1.11x slower 2.11 us: 1.03x faster
reduce(f, lst, init) 2.35 us 2.64 us: 1.12x slower 2.27 us: 1.04x faster
Geometric mean (ref) 1.12x slower 1.03x faster

Looks the patch with AC even slightly faster than in the main.

@sayandipdutta
Copy link
Contributor Author

@skirpichev Is initial=None safe for backward compatibility? Does this mean reduce(Callable[[None, T], None], Iterable[T], None) will behave differently in 3.13 and 3.14?

Copy link
Member

@Eclips4 Eclips4 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR! I think that if we want to add keyword support for functools.reduce, it should be done for all parameters, not just the initial. If so, this would match the behavior of the pure Python version of functools.

@Eclips4
Copy link
Member

Eclips4 commented Oct 24, 2024

@skirpichev Is initial=None safe for backward compatibility? Does this mean reduce(Callable[[None, T], None], Iterable[T], None) will behave differently in 3.13 and 3.14?

No, it doesn't. Someone can use None as initial value.

Taken from patch by Sergey B Kirpichev <skirpichev@gmail.com>
@skirpichev
Copy link
Member

skirpichev commented Oct 24, 2024

it should be done for all parameters, not just the initial

That might slowdown the patch v2.

No, it doesn't. Someone can use None as initial value.

That was a draft;) I think we could use same trick as for the Python version.

BTW, it seems the PEP 661 doesn't cover this at all.

Edit:

Updated AC patch with a sentinel value.
diff --git a/Modules/_functoolsmodule.c b/Modules/_functoolsmodule.c
index 802b1cf792..00b4a5e6cc 100644
--- a/Modules/_functoolsmodule.c
+++ b/Modules/_functoolsmodule.c
@@ -932,15 +932,31 @@ _functools_cmp_to_key_impl(PyObject *module, PyObject *mycmp)
 
 /* reduce (used to be a builtin) ********************************************/
 
-// Not converted to argument clinic, because of `args` in-place modification.
-// AC will affect performance.
+/*[clinic input]
+_functools.reduce
+
+    function as func: object
+    iterable as seq: object
+    /
+    initial as result: object(c_default="NULL") = _functools._initial_missing
+
+Apply a function of two arguments cumulatively to an iterable, from left to right.
+
+This efficiently reduce the iterable to a single value.  If initial is present,
+it is placed before the items of the iterable in the calculation, and serves as
+a default when the iterable is empty.
+
+For example, reduce(lambda x, y: x+y, [1, 2, 3, 4, 5])
+calculates ((((1+2)+3)+4)+5).
+[clinic start generated code]*/
+
 static PyObject *
-functools_reduce(PyObject *self, PyObject *args)
+_functools_reduce_impl(PyObject *module, PyObject *func, PyObject *seq,
+                       PyObject *result)
+/*[clinic end generated code: output=30d898fe1267c79d input=40be8069bcbc1a75]*/
 {
-    PyObject *seq, *func, *result = NULL, *it;
+    PyObject *args, *it;
 
-    if (!PyArg_UnpackTuple(args, "reduce", 2, 3, &func, &seq, &result))
-        return NULL;
     if (result != NULL)
         Py_INCREF(result);
 
@@ -1006,16 +1022,6 @@ functools_reduce(PyObject *self, PyObject *args)
     return NULL;
 }
 
-PyDoc_STRVAR(functools_reduce_doc,
-"reduce(function, iterable[, initial], /) -> value\n\
-\n\
-Apply a function of two arguments cumulatively to the items of a sequence\n\
-or iterable, from left to right, so as to reduce the iterable to a single\n\
-value.  For example, reduce(lambda x, y: x+y, [1, 2, 3, 4, 5]) calculates\n\
-((((1+2)+3)+4)+5).  If initial is present, it is placed before the items\n\
-of the iterable in the calculation, and serves as a default when the\n\
-iterable is empty.");
-
 /* lru_cache object **********************************************************/
 
 /* There are four principal algorithmic differences from the pure python version:
@@ -1720,7 +1726,7 @@ PyDoc_STRVAR(_functools_doc,
 "Tools that operate on functions.");
 
 static PyMethodDef _functools_methods[] = {
-    {"reduce",          functools_reduce,     METH_VARARGS, functools_reduce_doc},
+    _FUNCTOOLS_REDUCE_METHODDEF
     _FUNCTOOLS_CMP_TO_KEY_METHODDEF
     {NULL,              NULL}           /* sentinel */
 };
@@ -1789,6 +1795,10 @@ _functools_exec(PyObject *module)
     // lru_list_elem is used only in _lru_cache_wrapper.
     // So we don't expose it in module namespace.
 
+    if (PyModule_Add(module, "_initial_missing", _PyObject_New(&PyBaseObject_Type)) < 0) {
+        return -1;
+    }
+
     return 0;
 }
 

@sayandipdutta
Copy link
Contributor Author

sayandipdutta commented Oct 24, 2024

@skirpichev should the default be specified at all? I think reduce(function, iterable, /, initial) is closer representation of internal working than reduce(function, iterable, /, initial=None). Or is there some sort of a convention?

EDIT: Ah scratch that. That makes initial required. I meant reduce(function, iterable, /[, initial])

@skirpichev
Copy link
Member

I think reduce(function, iterable, /, initial) is closer representation

No. Current code in the main more accurately can be described as function with multiple signatures. Funny notation reduce(function, iterable[, initial], /) means it's possible to have two signature:

reduce(function, iterable, /)
reduce(function, iterable, initial, /)

The AC can't represent multiple signatures yet. The only way - using the sentinel value _initial_missing, like pure-Python version does. See updated patch above. You shouldn't use None as default value.

Lib/functools.py Outdated
@@ -236,7 +236,7 @@ def __ge__(self, other):

def reduce(function, sequence, initial=_initial_missing):
"""
reduce(function, iterable[, initial], /) -> value
reduce(function, iterable, /, initial=None) -> value
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe use ellipsis:

Suggested change
reduce(function, iterable, /, initial=None) -> value
reduce(function, iterable, /, initial=...) -> value

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See PEP 661:)

Copy link
Contributor

@nineteendo nineteendo Oct 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But the sentinel is private and doesn't even exist in the C implementation. Ellipsis is frequently used for unspecified default values in typeshed. We could use multiple signatures though.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But the sentinel is private and doesn't even exist in the C implementation.

It's easy to add, see #125917 (comment)

Ellipsis is frequently used for unspecified default values in typeshed.

@Eclips4?

We could use multiple signatures though.

Yes, I think it's fine for the sphinx docs. But help will looks like this (as for pure-Python version):

>>> help(functools.reduce)
Help on built-in function reduce in module _functools:

reduce(function, iterable, /,
       initial=_functools._initial_missing)
    Apply a function of two arguments cumulatively to an iterable, from left to right.

    [...]

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think I've ever seen =... in the docs. Do we have precedent for that?

Copy link
Contributor Author

@sayandipdutta sayandipdutta Oct 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like the signature is giving inspect a hard time. But it is autogenerated by AC. Did I do something wrong?

reduce(function, iterable, /,
       initial=_functools._initial_missing)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But the sentinel is private and doesn't even exist in the C implementation. Ellipsis is frequently used for unspecified default values in typeshed. We could use multiple signatures though.

Multiple signatures for a docs sounds like a good solution.
Using ... for default values is essentially the same as using None, and it's just wrong since users can pass ... as the initial value.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think I've ever seen =... in the docs. Do we have precedent for that?

Yeah, e.g. for the int.from_bytes, for example.

it seems like the signature is giving inspect a hard time. But it is autogenerated by AC. Did I do something wrong?

First, note that reduce() has no correct signature in the current main.

Now AC adds one, but it can't be parsed by inspect._signature_fromstr(): this helper has own opinion on what can be specified as a default value (e.g. it can't be a complex number).

Co-authored-by: Peter Bierma <zintensitydev@gmail.com>
sayandipdutta and others added 2 commits October 24, 2024 19:42
@nineteendo
Copy link
Contributor

Do you update this test?

def test_functools_module_has_signatures(self):
no_signature = {'reduce'}
self._test_module_has_signatures(functools, no_signature)

@bedevere-app

This comment was marked as outdated.

@bedevere-app bedevere-app bot requested a review from erlend-aasland November 1, 2024 22:19
…Agz6D.rst

Co-authored-by: Erlend E. Aasland <erlend.aasland@protonmail.com>
@sayandipdutta
Copy link
Contributor Author

sayandipdutta commented Nov 1, 2024

Checked against debug build. Followed script from #125917 (comment)

call Main branch This branch
reduce(f, lst) 3.59 us +- 0.13 us 3.51 us +- 0.14 us
reduce(f, lst, initial) 3.82 us +- 0.25 us 3.78 us +- 0.13 us

@erlend-aasland

EDIT: On release:

call Main branch This branch
reduce(f, lst) 912 ns +- 51 ns 915 ns +- 50 ns
reduce(f, lst, initial) 987 ns +- 49 ns 979 ns +- 34 ns

Copy link
Contributor

@erlend-aasland erlend-aasland left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@erlend-aasland
Copy link
Contributor

BTW, you need a What's New entry for this.

@skirpichev
Copy link
Member

@sayandipdutta, this now has merge conflicts.

@skirpichev
Copy link
Member

I think that most people are ok with adding only one positional-or-keyword parameter (not 3). Is there something to do, besides simple conflict resolution?

@Eclips4, are you ok with this (as your pr depends on the current one)?

@skirpichev skirpichev requested a review from Eclips4 November 11, 2024 02:35
@erlend-aasland erlend-aasland linked an issue Nov 12, 2024 that may be closed by this pull request
Copy link
Member

@Eclips4 Eclips4 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Thank you!

@erlend-aasland erlend-aasland enabled auto-merge (squash) November 12, 2024 11:23
@erlend-aasland
Copy link
Contributor

Congrats with landing your PR, @sayandipdutta; good job, and thanks for your contribution! Thanks for reviewing, y'all!

@ZeroIntensity
Copy link
Member

It looks like bedevere is stuck.

@erlend-aasland erlend-aasland merged commit abb90ba into python:main Nov 12, 2024
39 checks passed
@sayandipdutta
Copy link
Contributor Author

Thanks a lot everyone, for your help! Hope to be a regular contributor 🤞🏾

picnixz pushed a commit to picnixz/cpython that referenced this pull request Dec 8, 2024
ebonnal pushed a commit to ebonnal/cpython that referenced this pull request Jan 12, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Allow functools.reduces 'initial' to be a keyword argument
7 participants